Ali Bangash
@alibangash4
Data Solutions Architect building cloud-native lakehouse and ETL/streaming pipelines with AI-driven analytics.
What I'm looking for
I’m a Data Solutions Architect with 11+ years of experience designing and building scalable, AI-driven data platforms and ETL/ELT pipelines. I translate business needs into architectures that support efficient ingestion, transformation, and analytics for both batch and real-time use cases.
My work focuses on cloud-native lakehouse architectures and enterprise-grade data platforms using AWS, Azure, Databricks, Snowflake, and BigQuery. I integrate LLM workflows, including retrieval-augmented generation (RAG) with LangChain and LlamaIndex, to deliver intelligent analytics and data access.
I’ve led end-to-end solution delivery across healthcare, financial services, and retail, including Kafka/Spark Streaming-based processing for high-volume systems. I also establish data governance, security, and compliance foundations such as data lineage, access control, and metadata management to keep platforms trustworthy.
Throughout my career, I’ve mentored engineers and aligned AI and data strategies with organizational goals. Whether modeling dimensional warehouses (Kimball star schema) or implementing Data Vault approaches, I aim to improve system performance, scalability, and reliability so that teams can act on insights faster.
Experience
Work history, roles, and key accomplishments
Data Solutions Architect
ScienceSoft
Feb 2024 - Present (2 years 2 months)
Designed end-to-end data solutions on AWS and Azure, translating business needs into scalable lakehouse architectures for healthcare and financial analytics. Built batch and real-time pipelines using Databricks, Snowflake, Kafka, and Spark Streaming, and integrated LLM-based querying with governance and compliance controls.
Lead Data Engineer
NexHealth
Apr 2021 - Jan 2024 (2 years 9 months)
Engineered scalable healthcare pipelines for EHR and claims data, enabling near real-time analytics for clinical reporting and population health. Orchestrated HL7/FHIR ingestion and delivered HIPAA-compliant lakehouse and warehousing solutions, automating ETL monitoring/testing and reducing pipeline failures by 35%.
Senior Data Engineer
SentiLink
Developed large-scale batch and streaming pipelines with Spark, Kafka, and Hadoop to process millions of transactions for fraud detection and risk analysis. Built data storage and warehousing solutions and collaborated with data science to deploy ML models using Python, scikit-learn, and MLflow.
ETL & Data Warehouse Engineer
FourKites
Developed and maintained enterprise ETL pipelines to integrate retail and POS datasets into centralized warehouse systems using Informatica, Talend, and SSIS. Migrated on-prem data to AWS and BigQuery, built near real-time ingestion with NiFi, optimized SQL/Spark performance, and delivered BI dashboards for supply chain and sales decision-making.
Tech stack
Software and tools used professionally
Amazon Redshift
Apache Spark
AWS Glue
Apache Flink
Talend
Amazon QuickSight
Microsoft Azure
Google Cloud Platform
Amazon S3
GitHub
Kubernetes
NumPy
Pandas
PySpark
dbt
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
HBase
Gmail
Databricks
Redis
Jira
Java
Azure Machine Learning
TensorFlow
PyTorch
MLflow
scikit-learn
Kafka
Apache NiFi
Vercel
Kafka Streams
Airflow
Time Analytics
SQL
LangChain
LlamaIndex
Weaviate
Pinecone
Delta Lake
Trino
Bash
Faiss
Microsoft Fabric
Factory
Jan
SentiLink
Portfolio
alibangash.vercel.app
