Mahela Pradeep
@mahelapradeep
Backend engineer specializing in low-latency Kafka distributed systems for financial markets.
What I'm looking for
I’m a backend engineer with 4+ years of experience building low-latency, high-throughput distributed systems for financial markets at London Stock Exchange Group. My work centers on Kafka-based event-driven architectures, exactly-once processing, and cross-region disaster recovery where correctness and performance both matter.
I’ve delivered systems processing 10,000 messages/sec with sub-100ms latency by tuning streaming pipelines: increasing Kafka partition counts, removing redundant state stores, and addressing partition hotspots. I also improved reliability and latency by reworking publishing and consumption semantics, including careful handling of deduplication to preserve correctness.
Operationally, I focus on making production behavior predictable: I diagnosed and fixed a critical gRPC concurrency bug that caused production hangs during end-of-day processing, and I resolved intermittent data loss by enforcing deterministic SQL pagination (adding the missing ORDER BY). I also built backpressure control using watermark buffering, stabilizing throughput under variable load.
For resilience, I led a cross-region disaster recovery architecture for a 15+ microservice trading platform, enabling automated failover within ~20 minutes (RTO) while guaranteeing RPO = 0 for acknowledged messages. Earlier, I created a type-safe FIX-to-Protobuf translation framework and a Kafka-based health checker to prevent cascading failures when downstream systems fell behind.
Experience
Work history, roles, and key accomplishments
Optimized a Kafka-based streaming architecture, reducing market data latency from ~150ms to <50ms (p95) by increasing the partition count from 12 to 18 and removing redundant state stores. Improved downstream processing latency from ~150ms to ~97ms and resolved a production gRPC concurrency bug to eliminate end-of-day hangs.
Improved performance to achieve <100ms latency and 10,000 msg/sec throughput across 12 partitions using profiling and concurrency fixes. Re-architected Kafka-to-database ingestion (single-threaded to partition-aware multi-threading) to raise write throughput from ~300 to ~3,500 records/sec, and implemented backpressure and RPO=0 recovery across pod restarts and failover.
Designed cross-region disaster recovery for a 15+ microservice trading platform, enabling automated failover to a secondary AWS region in ~20 minutes (RTO) with RPO=0. Implemented replay-based Kafka recovery using MirrorMaker, region-aware Helm deployments, and a crash-resumable Kafka Streams initialization strategy that reduced startup time from ~20–40 minutes to ~1–3 minutes.
Built a type-safe FIX-to-Protobuf translation framework in Java to support bidirectional transformation across trading gateways, including reusable mappings for enums and complex FIX groups. Developed a Kafka-based system health checker using partition-level lag analysis to gate ingestion during sustained degradation and prevent cascading failures.
Education
Degrees, certifications, and relevant coursework
University of Moratuwa
Bachelor of Science Engineering (Hons.), Computer Science & Engineering
2018 - 2023
Grade: First class (GPA 3.74); Dean’s list (3 semesters)
Activities and societies: Dean’s list in 3 semesters.
B.Sc. Engineering (Hons.) program specializing in Computer Science & Engineering at the University of Moratuwa, completed in 2023.
