Himalayas
@jaybergen
Senior Data Engineer specializing in machine learning and data pipelines.
With over 9 years of experience in data engineering and machine learning, I excel at transforming messy data into actionable insights. My expertise lies in building robust data pipelines and real-time systems that enhance operational efficiency across various industries, including healthcare and fintech.
At CitiusTech, I designed and optimized terabyte-scale data pipelines, developed LLM-powered document parsing systems, and engineered fraud detection models that prevented millions in fraudulent payouts. My technical skills span a wide range of tools and platforms, including Apache Spark, Kafka, and AWS, enabling me to deliver high-quality data solutions that drive business success.
I am passionate about mentoring junior engineers and collaborating with cross-functional teams to implement AI-powered solutions. I thrive in environments where data integrity and reliability are paramount, and I am committed to ensuring that data works effectively for all stakeholders.
Work history, roles, and key accomplishments
CitiusTech
May 2021 - Jun 2025 (4 years 1 month)
Designed, built, and optimized terabyte-scale data pipelines using Databricks, Apache Spark, and Azure Data Factory, ensuring seamless ingestion, transformation, and storage of structured and unstructured healthcare data. Developed and deployed LLM-powered document parsing systems leveraging OCR, deep learning models, and Graph Convolutional Networks (GCNs), improving data extraction accuracy by 4
Sift
May 2019 - Apr 2021 (1 year 11 months)
Designed and implemented real-time fraud detection pipelines using Azure Synapse, Apache Spark, and Kafka, analyzing millions of transactions daily and reducing fraudulent activity by 30%. Developed high-performance ETL workflows in PySpark and SQL, increasing data processing efficiency by 50% for e-commerce datasets.
Rivery
Nov 2017 - Apr 2019 (1 year 5 months)
Developed financial ETL pipelines using Azure Data Factory, Python, and SQL, ensuring seamless aggregation of transactional data from multiple banking systems. Built and deployed fraud detection models using Isolation Forests and One-Class SVM, successfully reducing financial fraud risks.
Fivetran
Feb 2017 - Oct 2017 (8 months)
Assisted in building Azure-based data ingestion pipelines, supporting large-scale ML projects. Developed ETL scripts for data normalization, improving query performance.
Degrees, certifications, and relevant coursework
Master's degree, Information Science
2015 - 2016
Completed a Master's degree in Information Science at the National University of Singapore, deepening expertise in advanced topics and research methodologies.
Bachelor of Science, Information Science
2011 - 2015
Studied Information Science at the National University of Singapore, focusing on foundational concepts and applications within the field.
Software and tools used professionally
Fivetran
Azure Synapse
Apache Spark
AWS Glue
Apache Flink
AtScale
GitHub
Bitbucket
Azure Repos
Kubernetes
Jenkins
GitHub Actions
Salesforce
PySpark
Debezium
dbt
DB
MySQL
PostgreSQL
MongoDB
Cassandra
Hadoop
Gmail
Databricks
Zendesk
Redis
Terraform
Java
TensorFlow
PyTorch
MLflow
scikit-learn
HubSpot
Kafka
Apache NiFi
Grafana
Prometheus
OpenTelemetry
Azure Monitor
Windows
Datadog
ClickUp
GraphQL
Google Cloud Pub/Sub
pytest
OAuth2
Airflow
Apache Beam
SQL
Dagster