Open to opportunities

Kalveen Joseph

@kalveenjoseph

Senior Big Data Engineer architecting petabyte-scale batch and real-time platforms with 99.99% uptime.

United States

What I'm looking for

I’m looking to own end-to-end big-data platforms at petabyte scale: building dependable batch and real-time pipelines, modernizing lakehouses, and establishing data quality and governance that teams can trust.

I’ve built and led big-data ecosystems for nine years, focused on turning complex pipelines into reliable, high-performance platforms. At Oznolo, I architected and owned a healthcare data platform that processes 28B events daily across 38 hospital network clients, runs on a 2,400-node Spark cluster, and delivers 4.7PB of data with 99.99% uptime.

I’m especially strong at modernizing lakehouse and streaming architectures. I redesigned batch processing from MapReduce to Spark 3.4, cutting nightly ETL from 14 hours to 47 minutes, and built real-time streaming with Kafka and Apache Flink for sub-60-second sepsis and deterioration alerts. I’ve also driven measurable wins across governance, quality, and cost: migrating HDFS to a Delta Lake lakehouse on S3 (68% storage cost reduction, 12x query performance lift), enforcing automated data quality at scale with Great Expectations (18,000 contracts nightly), and establishing HIPAA-compliant governance with Unity Catalog to support 280 analysts and 45 ML engineers.

Experience

Work history, roles, and key accomplishments

Current

Senior Big Data Engineer

Oznolo

Mar 2022 - Present (4 years 2 months)

Architected and owned a healthcare big data platform processing 28B clinical and claims events daily across 38 hospital clients, managing a 2,400-node AWS EMR Spark cluster serving 4.7PB with 99.99% uptime. Redesigned batch and real-time pipelines (MapReduce to Spark; Kafka/Flink) and migrated HDFS to Delta Lake on S3, cutting ETL runtime from 14 hours to 47 minutes and reducing storage costs by 6

Humana

Big Data Engineer

Jun 2019 - Feb 2022 (2 years 8 months)

Built and operated Hadoop/Spark claims analytics processing 820M records quarterly and led a 6-person data platform team. Delivered real-time fraud detection across 2.4M transactions daily and led a Cloudera-to-AWS EMR/S3 migration of 180 Hive jobs, cutting infrastructure costs by 52% and improving throughput by 8x.

Education

Degrees, certifications, and relevant coursework

Georgia Institute of Technology

Master of Science, Computer Science

Earned an M.S. in Computer Science (Computing Systems) from Georgia Institute of Technology.
