David Johnson
@davidjohnson2
Senior AI Data Engineer specializing in scalable, real-time data and ML pipelines.
What I'm looking for
I am a Senior AI Data Engineer with over a decade of experience designing and operating distributed data systems for blockchain marketplaces, high-growth SaaS analytics, and cloud-native AI infrastructure. I focus on building reliable real-time pipelines and feature stores that give product and trading platforms actionable intelligence.
At Courtyard.io I architected the core data backbone for a tokenized collectibles marketplace, integrating on-chain events, custodial vaults, and fiat processors while delivering AI-driven pricing intelligence and compliance workflows. I built hybrid Web2/Web3 reconciliation, anomaly detection, and low-latency analytics that improved visibility and reduced marketplace volatility.
Previously, at Mixpanel and Druva, I scaled event ingestion systems to billions of events, implemented AI-ready feature pipelines, and migrated legacy ETL to Spark-based distributed workflows, reducing processing windows and query costs while enforcing secure multi-tenant data isolation. I optimize storage, partitioning, and query strategies to lower latency and cost across multi-terabyte datasets.
I work with Spark, PySpark, Kafka, Snowflake, Databricks, MLflow, and AWS, and I emphasize robust governance, compliance, and reproducible ML pipelines. I bring a pragmatic, product-oriented approach to data engineering that balances performance, security, and business impact.
Experience
Work history, roles, and key accomplishments
Senior AI Data Engineer
Courtyard.io
May 2022 - Present (3 years 9 months)
Architected the core data backbone for a tokenized collectibles marketplace, building blockchain-synchronized pipelines and AI pricing models that enabled sub-minute visibility into liquidity and reduced query latency by 45%.
Scaled distributed event ingestion to process billions of events daily, built AI-ready feature pipelines and low-latency event models, and reduced high-cardinality query costs by 32% through storage and partitioning optimizations.
Engineered cloud-native telemetry and compliance data pipelines for SaaS backup products, migrated batch ETL to Spark, reducing nightly processing windows by 40%, and built predictive capacity-planning models for storage forecasting.
Education
Degrees, certifications, and relevant coursework
California State University, East Bay
Master of Science, Computer Science
2010 - 2016
Completed a Master of Science in Computer Science at California State University, East Bay from August 2010 to May 2016.
