About Luma Financial Technologies
Founded in 2018, Luma Financial Technologies (“Luma”) has pioneered a cutting-edge fintech software platform that has been adopted by broker/dealer firms, RIA offices, and private banks around the world. With Luma, institutional and retail investors have a fully customizable, independent, buy-side technology platform that helps financial teams more efficiently learn about, research, purchase, and manage alternative investments as well as annuities. Luma gives these users the ability to oversee the full, end-to-end lifecycle through a suite of solutions: education resources and training materials; creation and pricing of custom structured products; electronic order entry; and post-trade management. Built to prioritize transparency and ease of use, Luma is a multi-issuer, multi-wholesaler, and multi-product option that advisors can use to best meet their clients’ specific portfolio needs. Headquartered in Cincinnati, OH, Luma also has offices in New York, NY; Miami, FL; Zurich, Switzerland; and Lisbon, Portugal. For more information, please visit Luma’s website.
About the role
At Luma, our Site Reliability Engineering (SRE) team keeps our platform reliable, secure, and lightning fast. The team owns everything from AWS infrastructure and Kubernetes clusters to CI/CD pipelines, monitoring, and alerting. If you’re passionate about tackling big challenges, automating at scale, and making systems more resilient, we’d love to have you on the team.
What you'll do
- Collaborate with product engineering teams to design and build the infrastructure their services run on.
- Keep our Kubernetes clusters on AWS EKS stable, secure, and ready to scale.
- Design and deliver resilience strategies that cover multi-region architecture, backups, disaster recovery, and failover.
- Automate infrastructure with Terraform and other Infrastructure-as-Code tooling, reducing manual effort and human error.
- Help teams ship faster by improving CI/CD pipelines and deployment practices.
- Monitor performance and reliability using modern observability tools.
- Support on-call rotations and lead incident response with a focus on long-term fixes.
What we're looking for
- You code to solve problems and are comfortable in at least one of the following languages: Python, Bash, Go, Java, or similar.
- You have strong experience with AWS (RDS, CloudFront, IAM, VPCs), Terraform, and Kubernetes.
- You are resilience-focused, with experience designing and running systems that remain dependable during failures and recover seamlessly.
- You have hands-on experience improving and operating CI/CD pipelines (e.g., CircleCI, GitHub Actions, or similar) to help teams ship faster with confidence.
- You stay calm under pressure, bringing incident response expertise and strong root-cause analysis skills.
- Most importantly, you are a team player who brings clear communication, strong collaboration, and a mindset of continuous improvement.