Lyle P
@lylep
Senior data engineer building scalable cloud ETL pipelines and strategic enterprise data platforms.
What I'm looking for
I’m a Senior Data Engineer who turns messy, multi-domain data into reliable, decision-ready pipelines. I focus on scalable workflows, strong data modeling, and end-to-end pipeline design and management, so teams can move faster with confidence.
Most recently, at Proxify and Independent Consulting, I designed, built, and maintained serverless pipelines and Streamlit interfaces to process ~500GB of multi-domain data, directly supporting analyst and stakeholder decisions at Trafigura. I also partnered with a CEO at a pre-seed startup to architect a greenfield financial data pipeline ingesting 20 years of SEC filing data across 4,000+ companies into clean, LLM-queryable financial statements.
Previously, as Lead Data Engineer at DataPraxis, I automated a once-manual survey ingestion workflow—saving over 6 hours per survey—and built a self-hosted container-based ingestion and analytics platform from scratch using Google Compute Engine, Docker/Kubernetes, and the Windmill workflow engine. By enforcing team-wide engineering best practices, automated tests, and toolsets, I helped the Analytics team write consistent, bug-free code, and I leveraged AI tools like ChatGPT and Claude Code to improve prototyping speed by 70%.
Earlier, I led mission-critical data engineering at the Municipal Securities Rulemaking Board, overseeing 30 ETL pipelines to improve transparency in the $4 trillion municipal bonds market and spearheading a $50K pipeline upgrade that increased pricing yield curve data availability on 2.7M securities by 800%. Before that, at Catalist LLC, I led a full restructure of analytics models, improving performance by 60–70%, productionized machine learning to run at scale, and served as a subject matter expert across infrastructure, processes, and analytics. I began my path as a Data Science Fellow, where I delivered ML-driven improvements to patient intake compliance.
Experience
Work history, roles, and key accomplishments
Senior Data Engineer
Proxify and Independent Consulting
Jan 2025 - Present (1 year 5 months)
Designed and maintained serverless data pipelines and Streamlit interfaces to process ~500GB of multi-domain data, enabling stakeholder decision-making. Partnered with a CEO to architect and deliver a greenfield financial data pipeline ingesting 20 years of SEC filings across 4,000+ companies into LLM-queryable financial statements.
Lead Data Engineer
DataPraxis
Jan 2024 - Jan 2025 (1 year)
Automated an ad hoc survey analytics workflow, saving 6+ hours per ingestion and reducing ingest time from 45 minutes to 3 minutes. Built a self-hosted container-based ingestion and analytics platform (GCE, Docker/Kubernetes, Windmill), implemented team-wide engineering best practices, and improved prototyping speed by 70% using AI tools.
Data Engineer
Municipal Securities Rulemaking Board
Jan 2022 - Jan 2024 (2 years)
Oversaw data engineering for a $47M+ annual operating budget, creating 30 ETL pipelines that improved transparency across the $4T municipal bonds market and supported $9B+ in trades per day. Led a $50K pipeline upgrade as the sole engineer, increasing yield-curve data availability for 2.7M securities by 800% and reducing full data loads from 3 weeks to 6 hours while cutting operating expenses by 95%.
Data Engineer
Catalist LLC
Jan 2019 - Jan 2022 (3 years)
Led a restructure of a suite of 9 analytics models, improving performance by 60–70% and keeping pipelines resilient for 3+ years while saving $200K+ in development time and costs. Productionized ML and analytics models at 1,000,000x scale, improved phone number matching by 4x for 85M records, and mentored a new hire to boost performance by 60%.
Data Science Fellow
MATClinics
Jan 2019 - Present (7 years 5 months)
Developed and executed a machine learning study on 50+ patient intake data columns, identifying key metrics and improving intake compliance by 30%. Built a business dashboard and refactored intake workflows to increase data usability by 20 and save $5K+ in operating expenses.
Education
Degrees, certifications, and relevant coursework
Johns Hopkins University
Master of Science in Applied Mathematics and Statistics
Earned a Master of Science in Applied Mathematics and Statistics from Johns Hopkins University (Whiting School of Engineering).
Johns Hopkins University
Bachelor of Science in Applied Mathematics
Earned a Bachelor of Science in Applied Mathematics from Johns Hopkins University.
Johns Hopkins University
Bachelor of Arts in Mathematics
Earned a Bachelor of Arts in Mathematics from Johns Hopkins University.
Tech stack
Software and tools used professionally
Apache Hive
Google Cloud Platform
Google Compute Engine
Kubernetes
Jupyter
NumPy
Pandas
PySpark
dbt
MySQL
PostgreSQL
Vertica
Gmail
Databricks
Jira
JavaScript
scikit-learn
Streamlit
NLTK
SQLAlchemy
Linux
macOS
Windows
Serverless
Airflow
Google BigQuery
Amazon Web Services (AWS)
Amazon Athena
SQL
Rundeck
Windmill
Bash
Claude Code
Matter
ChatGPT
