A Shaw
@ashaw1
I’m a senior data engineer and data architect specializing in cloud-native streaming, lakehouses, and analytics.
What I'm looking for
I’m a Data Engineer and Data Architect with 9+ years of experience delivering end-to-end big data solutions across AWS, Azure, and GCP. I build cloud-native data platforms that turn operational complexity into reliable, scalable analytics.
I specialize in real-time streaming pipelines with Kafka, Flink, and Spark, and I orchestrate ETL/ELT workflows using Databricks, Airflow, and Python. My work centers on SQL, data modeling, and modern data warehouses such as Snowflake, Redshift, and BigQuery.
I design high-performance architectures that unify structured and unstructured data from RDBMS and NoSQL systems so that teams can power BI, machine learning, and executive analytics. I also lead data governance and schema standardization to support compliance with regulations such as HIPAA and GDPR.
As a team lead, I mentor junior engineers and partner cross-functionally to align data architecture with AI/ML and business priorities. I love building lakehouse patterns, optimizing cost and performance, and delivering trustworthy pipelines that reduce latency and improve decision-making.
Experience
Work history, roles, and key accomplishments
Lead Data Engineer (Arch)
StealthAI (StitchVision)
Aug 2023 - Present (2 years 11 months)
Architected a real-time inventory and supply chain data platform using Kafka, Flink, and Databricks to enable sub-second product availability updates. Led multi-cloud (AWS and GCP) lakehouse and streaming ETL/ELT designs, including governance aligned to HIPAA and GDPR.
Data Engineering Team Lead
Horizon Technologies (Addo AI)
Jan 2022 - Jul 2023 (1 year 6 months)
Conceptualized and built a user health data platform using Kafka streams and cloud data lakes on Google Cloud. Led enterprise data architecture and migration to Amazon Redshift, Snowflake, and BigQuery, supporting machine learning and business intelligence.
Data Engineer (Remote Contract)
Amazon Web Services (AWS)
May 2020 - Dec 2021 (1 year 7 months)
Built scalable ETL pipelines with AWS Glue, PySpark, and Lambda to ingest and transform customer behavior and transaction data. Designed Redshift data marts and a centralized S3 data lake with Glue Data Catalog and schema versioning, supporting analytics via QuickSight and Athena.
Data Engineer
Mercurial Minds
Jan 2015 - Apr 2020 (5 years 3 months)
Developed a document management system with real-time indexing and retrieval using Kafka and Flink, including OCR for searchable content. Built end-to-end data architectures integrating relational and NoSQL systems, and implemented collaboration analytics and data backup using AWS and Azure storage services.
Education
Degrees, certifications, and relevant coursework
University of Management
Bachelor of Computer Science and Technology
Tech stack
Software and tools used professionally
Azure Synapse
Apache Spark
AWS Glue
Blockchain
Superset
Google Cloud Platform
Amazon S3
GitHub
Kubernetes
Jenkins
GitHub Actions
Pandas
PySpark
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
InfluxDB
HBase
Gmail
Databricks
Neo4j
Redis
Terraform
Java
Julia
TensorFlow
PyTorch
MLflow
scikit-learn
Kafka
Apache NiFi
Elasticsearch
AWS Lambda
Serverless
Kafka Streams
Airflow
Time Analytics
Root Cause
Amazon Web Services (AWS)
SQL
Azure Blob Storage
XGBoost
LightGBM
Dagster
Delta Lake
Trino
Bash
Transform
Factory
Portfolio
github.com/Shaw-code
