Open to opportunities

Shikha Sharma

@shikhasharma4

Data engineer and analyst building scalable cloud data platforms, lakehouses, and GenAI-enabled analytics.

United States

What I'm looking for

I’m looking for a team where I can build secure, scalable cloud data lakehouses and real-time pipelines, strengthen data governance (HIPAA/RBAC), and apply GenAI/ML to deliver measurable analytics with strong engineering practices.

I’m a Data Engineer/Data Analyst with 6+ years of experience building scalable batch and real-time data platforms across healthcare, fintech, and retail. I design high-performance data lakehouse and data warehouse architectures that process multi-terabyte datasets end-to-end—from ingestion to governance and analytics.

I specialize in AWS and Azure ecosystems, Apache Spark (PySpark, Spark-SQL), Databricks, Snowflake, Delta Lake, and Kafka, with strong data modeling foundations in dimensional modeling (Star/Snowflake), Data Vault, and medallion architecture. In HIPAA-governed environments, I implement RBAC, IAM policies, KMS encryption, row-level security, dynamic/column-level masking, and audit logging to ensure secure PHI handling.

I also enable Machine Learning and GenAI solutions using MLflow, SageMaker, Azure ML, and LLM integrations (AWS Bedrock, Hugging Face, LangChain). I build production-grade pipelines with CI/CD, Infrastructure as Code (Terraform, CloudFormation), and observability (CloudWatch, Grafana, OpenTelemetry, Elasticsearch) so teams can move faster without sacrificing reliability.

Experience

Work history, roles, and key accomplishments

UG
Current

Data Engineer / Data Analyst

Oct 2024 - Present (1 year 8 months)

Designed and implemented an AWS/Azure data platform processing 15+ TB/day of healthcare claims and provider data. Built PySpark/Databricks lakehouse pipelines (Delta Lake, Hudi) and Kafka streaming that reduced downstream data latency to under 30 minutes, and enabled HIPAA-compliant PHI governance across multi-tenant environments.

Mayo Clinic

Data Engineer

Apr 2022 - Sep 2024 (2 years 5 months)

Engineered Azure-based clinical and research data pipelines using ADF/Synapse/ADLS Gen2 and optimized PySpark workloads on Databricks for HL7/JSON/XML datasets. Migrated workloads to Synapse and Delta Lake, supporting 300M+ patient records, and implemented PHI-compliant access controls and streaming ingestion for near real-time monitoring dashboards.

Data Engineer

Amount

May 2020 - Mar 2022 (1 year 10 months)

Architected batch and streaming data pipelines with PySpark and Databricks to process 2+ TB of lending and credit risk data daily. Implemented Kafka streaming and Airflow-based ELT to Snowflake, reducing pipeline latency from hours to a sub-hour SLA while enforcing ACID storage with Delta Lake and applying data validation for governance.

HD

Data Analyst

Feb 2019 - Apr 2020 (1 year 2 months)

Analyzed retail and supply chain datasets using advanced SQL across Teradata, Oracle, and SQL Server to support merchandising and inventory planning. Built automated Power BI/Tableau dashboards and ETL workflows (Hive, Sqoop, Hadoop) to consolidate POS and logistics data, enabling SKU-level trend analysis across thousands of stores.

Education

Degrees, certifications, and relevant coursework

University of the Cumberlands

Master's degree in Business Analytics

Completed a master's program in business analytics at the University of the Cumberlands.
