Samuel Ekanem

Open to opportunities

@samuelekanem

Senior Data Engineer specializing in scalable lakehouse and streaming platforms for high-volume telemetry.

What I'm looking for

I am seeking senior engineering roles where I can build scalable lakehouse and real-time data platforms, collaborate with ML and analytics teams, and drive architecture and performance improvements.

I am a Senior Data Engineer with 10+ years building large-scale data platforms and distributed pipelines, specializing in Spark, Databricks, Kafka, Python, and AWS data ecosystems. I design both real-time and batch pipelines, lakehouse architectures, and ML training data platforms that process billions of events and multi-petabyte datasets.

At Tesla I led the architecture of a vehicle telemetry lakehouse platform processing over 5B daily events and designed the Autopilot training data platform to transform multi-petabyte sensor and camera datasets. I built large-scale streaming pipelines with Kafka and Spark Structured Streaming, implemented enterprise lakehouse patterns with Delta Lake and medallion modeling, and improved distributed pipeline performance by 40% through Spark optimization and partitioning strategies.

I also developed orchestration frameworks with Apache Airflow and CI/CD automation, collaborated with ML engineers on scalable feature and training data pipelines, mentored data engineers, and led architecture decisions for next-generation data platforms supporting analytics, ML, and operational intelligence.

Experience

Work history, roles, and key accomplishments

Current

Senior Data Engineer

Jan 2021 - Present (5 years 6 months)

Led architecture of the Vehicle Telemetry Lakehouse processing over 5B daily events and designed Autopilot training data pipelines scaling to multi-petabyte datasets, improving pipeline performance by 40% and enabling near-real-time analytics for diagnostics and fleet monitoring.

S3, Spark, Databricks, Delta Lake, Kafka, Python, EMR, Redshift, Spark Structured Streaming, Airflow, CI/CD

Data Engineer

Jan 2017 - Dec 2020 (3 years 11 months)

Developed the Fleet Analytics Data Lake and large-scale ETL pipelines using Spark, Python, and SQL to process high-volume telemetry for fleet performance analytics, and implemented streaming ingestion for vehicle diagnostics.

S3, Spark, Python, SQL, Kafka, Redshift, Airflow, ETL, Data Warehouse

Big Data Engineer

Jun 2014 - Dec 2016 (2 years 6 months)

Built the initial vehicle data pipeline framework to ingest large-scale telemetry with Hadoop, Spark, and Python, optimized batch processing to reduce compute costs, and produced SQL analytics datasets for engineering insights.

S3, Performance Optimization, Hadoop, Spark, Python, HDFS, Batch Processing, SQL, Data Ingestion

Education

Degrees, certifications, and relevant coursework

Stanford University

Master's degree, Computer Science

2009 - 2011

Completed a Master's degree in Computer Science at Stanford University, with a focus on advanced computing topics relevant to large-scale data systems.

Tech stack

Software and tools used professionally

GitHub

Hadoop

Databricks

Terraform

Kafka

SQL

Availability

Open to opportunities

Location

United States

Portfolio

linkedin.com/in/sisi-zheng-0a5418a3

Salary expectations

140k-150k USD

Job categories

Data Engineer, Big Data Engineer, Data Platform Architecture Manager, Machine Learning Infrastructure Engineer, Streaming Data Engineer, Data Architect, Senior Staff Data Engineer, Senior Principal Data Engineer, Senior Data Engineer Positions, Senior Data Analytics Engineer, Senior Data Platform Engineer, Senior Data Infrastructure Engineer
