Open to opportunities

Vasileios Martos

@vasileiosmartos

Senior Site Reliability Engineer specializing in multi-cloud infrastructure, automation, and AI-driven platform operations.

Sweden

What I'm looking for

I am looking for a senior platform or SRE role focused on platform automation, AI-driven operations, and observability at enterprise scale, with an emphasis on reliability and team mentorship.

I am a Senior Site Reliability Engineer with 10+ years of experience designing and operating high-scale, multi-cloud platforms across AWS, Azure, and GCP. I combine deep software engineering skills with infrastructure expertise to build resilient systems for payments and high-throughput gaming.

I've led enterprise observability and governance efforts, standardizing monitoring for thousands of services and delivering Terraform-based frameworks adopted as company standards. I also co-led progressive delivery initiatives and implemented automated canary releases and self-healing rollouts.

My recent work focuses on AI-driven operations: building RAG utilities, LangGraph-based agentic workflows, and self-service bots that automate deployment approvals and daily operational tasks. I take pride in improving reliability metrics, including maintaining 99.999% availability for critical payment systems and dramatically reducing OpEx through autoscaling.

I thrive on automating complex operational workflows, mentoring teams, and architecting scalable infrastructure that balances reliability, cost, and developer velocity. I seek roles where I can drive platform automation and operational excellence at enterprise scale.

Experience

Work history, roles, and key accomplishments

Current

Senior Site Reliability Engineer

MasterCard

Feb 2024 - Present (2 years 5 months)

Leading AWS EKS platform governance and GitOps (Flux CD) for high-scale financial services. Enthusiastic about AI-driven automation, using RAG and agentic workflows to streamline deployments and compliance. Expert in Monitoring-as-Code (Dynatrace/Terraform) and in managing large-scale migrations (Azure to AWS Aurora) for 4,000+ services.

Kubernetes, Flux CD, GitOps, Azure OpenAI, Terraform, Observability, Incident Response

Senior Site Reliability Engineer

Ubisoft Entertainment SA

Sep 2020 - Feb 2024 (3 years 5 months)

Managed global multi-cloud live-game infrastructure across GCP, Azure, and Ubi-Cloud; standardized observability with a GKE Prometheus Operator stack; and supported high-traffic titles with 24/7 on-call incident response.

GCP, Terraform, Ansible, Prometheus, Grafana, SaltStack, Observability

Senior Software Engineer - DevOps

MAX IV Laboratory

Jan 2015 - Sep 2020 (5 years 8 months)

Led the modernization of particle accelerator control systems. Engineered a fleet management system for 400+ FPGA servers using Ansible and C++, and architected GPU-accelerated data pipelines for real-time scientific analysis. Standardized facility-wide observability for 500+ entities and transitioned legacy infrastructure to an agile IaC model using Terraform and VMware.

Python, C, Terraform, Ansible, Prometheus, Grafana, ELK

Software Engineer

CERN

Sep 2011 - Apr 2014 (2 years 7 months)

Developed and maintained C++ control-system applications and Python automation frameworks for LHC alarm/archive systems, supporting large-scale deployment and monitoring for accelerator operations.

C, Python, Distributed Systems, Automation, Monitoring

Data Acquisition Engineer

KaHo Sint-Lieven

Feb 2008 - Aug 2008 (6 months)

Developed a LabVIEW-based data acquisition system to detect and analyze vibrations in laboratory environments as part of an engineering exchange program.

Signal Processing, Testing, Embedded Systems