Mahela Pradeep
@mahelapradeep
Backend engineer specializing in low-latency, Kafka-based distributed systems for financial markets.
What I'm looking for
I’m a backend engineer with 4+ years of experience building low-latency, high-throughput distributed systems for financial markets at London Stock Exchange Group. My work centers on Kafka-based event-driven architectures, exactly-once processing, and cross-region disaster recovery where correctness and performance both matter.
I’ve delivered systems processing 10,000 messages/sec with sub-100ms latency by tuning streaming pipelines: increasing Kafka partition counts, removing redundant state stores, and addressing partition hotspots. I also improved reliability and latency by reworking publishing and consumption semantics, including careful handling of deduplication to preserve correctness.
Operationally, I focus on making production behavior predictable: I diagnosed and fixed a critical gRPC concurrency bug that caused production hangs during end-of-day processing, and I resolved intermittent data loss by enforcing deterministic SQL pagination (adding the missing ORDER BY). I also built backpressure control using watermark buffering, stabilizing throughput under variable load.
For resilience, I led a cross-region disaster recovery architecture for a 15+ microservice trading platform, enabling automated failover within ~20 minutes (RTO) while guaranteeing RPO = 0 for acknowledged messages. Earlier, I created a type-safe FIX-to-Protobuf translation framework and a Kafka-based health checker to prevent cascading failures when downstream systems fell behind.
Experience
Work history, roles, and key accomplishments
Optimized Kafka-based streaming architecture to reduce market data latency from ~150ms to <50ms (p95) by increasing the partition count from 12 to 18 and removing redundant state stores. Improved downstream processing latency from ~150ms to ~97ms and resolved a production gRPC concurrency bug to eliminate end-of-day hangs.
Achieved <100ms latency and 10,000 msg/sec throughput across 12 partitions through profiling and concurrency fixes. Re-architected Kafka-to-database ingestion from single-threaded to partition-aware multi-threaded processing, raising write throughput from ~300 to ~3,500 records/sec, and implemented backpressure and RPO=0 recovery across pod restarts and failover.
Designed cross-region disaster recovery for a 15+ microservice trading platform, enabling automated failover to a secondary AWS region in ~20 minutes (RTO) with RPO=0. Implemented replay-based Kafka recovery using MirrorMaker, region-aware Helm deployments, and a crash-resumable Kafka Streams initialization strategy that reduced startup time from ~20–40 minutes to ~1–3 minutes.
Built a type-safe FIX-to-Protobuf translation framework in Java to support bidirectional transformation across trading gateways, including reusable mappings for enums and complex FIX groups. Developed a Kafka-based system health checker using partition-level lag analysis to gate ingestion during sustained degradation and prevent cascading failures.
Education
Degrees, certifications, and relevant coursework
University of Moratuwa
Bachelor of Science Engineering (Hons.), Computer Science & Engineering
2018 - 2023
Grade: First class (GPA 3.74); Dean’s list (3 semesters)
B.Sc. Engineering (Hons.) program specializing in Computer Science & Engineering at the University of Moratuwa, completed in 2023.
