Open to opportunities

Dariusz Dyrga

@dariuszdyrga

Senior Big Data Engineer building scalable AWS/GCP data platforms and ML-ready datasets end to end.

Poland

What I'm looking for

I’m looking for a role where I can architect scalable AWS/GCP data platforms—building medallion lakes, streaming pipelines, and ML-ready datasets—while partnering with stakeholders, improving performance, and mentoring engineers.

I’m a Senior Big Data Engineer with 5+ years of experience designing and delivering scalable data platforms on AWS and GCP. I architect medallion data lakes and build end-to-end pipelines from raw ingestion to BI dashboards and ML-ready datasets, often replacing costly third-party ETL tools with custom, production-grade solutions.

I’ve led high-stakes legacy system migrations with complex business logic, working closely with non-technical stakeholders to clarify semantics and validate outcomes. In recent roles, I delivered executive-facing Looker dashboards, built BigQuery ingestion pipelines from GCP APIs and legacy sources, and eliminated critical BigQuery performance bottlenecks—reducing runtime from 30 minutes to 20 seconds.

I’m hands-on across the modern data stack (PySpark, Kafka, Airflow, Delta Lake, Iceberg, and Terraform/Terragrunt), and I enjoy designing streaming architectures and governance patterns (e.g., schema registries). I also mentor engineers, run internal tech talks, and bring an ML perspective from my MSc thesis on transfer learning applied to medical imaging.

Experience

Work history, roles, and key accomplishments

Senior Big Data Engineer

Fortune 500 (NDA)

Jan 2025 - Jan 2026 (1 year)

Led a full-system cutover migration of cost, sales, and client data from a legacy SQL system to BigQuery, translating complex business logic into the target schema. Built Cloud Run Java ETL jobs, BigQuery views, and Informatica IICS workflows to orchestrate and support the migration, working with business stakeholders throughout.

Senior Big Data Engineer

MBH Bank

Jan 2024 - Jan 2025 (1 year)

Architected and delivered an end-to-end streaming data lake for a financial institution processing hundreds of Kafka topics with under 1 TB/month of net new data across Bronze, Silver, and Gold layers. Implemented Kafka and AWS Glue schema governance, built an Airflow DagFactory orchestration pattern for hundreds of pipelines, and defined repeatable infrastructure using Terraform and Terragrunt.

Senior Big Data Engineer

Grid Dynamics

Jan 2023 - Jan 2024 (1 year)

Led end-to-end design and delivery of a custom AWS data platform integrating approximately 45–50 data sources to replace Fivetran, enabling lower-cost ETL and a foundation for future ML workloads. Built distributed PySpark ETL on AWS Glue, migrated GA3/GA4 data into Redshift with a redesigned schema and flattened nested fields, and orchestrated workflows using Glue Workflows and Step Functions.

Junior Big Data Engineer

Grid Dynamics

Jan 2021 - Jan 2022 (1 year)

Developed Java integration code connecting internal systems to Databricks for batch data cataloguing from BigQuery. Optimized BigQuery SQL queries and built/maintained Databricks ETL notebooks using GitLab for CI/CD.

Senior Big Data Engineer

HSBC & Google

Jan 2026 - Present (5 months)

Designed and implemented GCP API and legacy-system ingestion pipelines into BigQuery for executive-facing Looker dashboards covering cost reporting, resource metrics, and security compliance. Resolved BigQuery performance bottlenecks by partitioning/clustering and splitting a large table, reducing a key query from 30 minutes to 20 seconds and cutting monthly BigQuery costs from $6,000+ to under $1

Senior Big Data Engineer

Grid Dynamics

Jan 2025 - Present (1 year 5 months)

Supported training of a two-tower AI recommendation model by building SQL-driven feature tables and orchestrating data movement between BigQuery and Bigtable. Implemented Kubeflow pipelines with logging, staging, and validation steps, and added data-quality checks with unit tests and GitLab CI/CD automation.

Big Data Engineer

Grid Dynamics

Jan 2022 - Present (4 years 5 months)

Implemented Fivetran-based ingestion combined with BigQuery to address growing multi-source data volumes and prepare datasets for analytics consumption. Refactored Airflow DAG management by introducing a DagFactory pattern with YAML configurations, and built pipelines migrating data from BigQuery to Bigtable.

Education

Degrees, certifications, and relevant coursework

Jagiellonian University

Master of Science in Computer Science

2021 - 2023

MSc in Computer Science at Jagiellonian University with a thesis on transfer learning applied to coronary angiography image analysis.

Jagiellonian University

Bachelor of Science in Computer Science

2018 - 2021

BSc in Computer Science at Jagiellonian University with a thesis on a patient findings data platform built as a REST API (Java/Spring) with a React/Material UI frontend.
