This is a remote position.
We are looking for Data Engineers (Junior, Mid or Senior)!
Responsibilities
- Design, build, and maintain end-to-end data pipelines (batch and/or streaming), from ingestion to transformation and delivery.
- Develop and operate ETL/ELT workflows, ensuring reliability, scalability, and performance.
- Write efficient, production-grade SQL queries for data extraction, transformation, and analytics use cases.
- Implement and maintain data models (e.g., star schemas, incremental models) optimized for analytics and reporting.
- Develop reusable and modular Python code for data transformations and pipeline logic.
- Monitor data pipelines, troubleshoot failures, and perform root cause analysis across code, orchestration, data sources, and cloud services.
- Ensure data quality by implementing automated validation checks (schema validation, freshness checks, row-level assertions).
- Translate business and analytical requirements into robust technical data solutions.
- Collaborate with analysts, backend engineers, and other stakeholders to define data contracts and ensure data availability.
- Actively participate in planning, estimation, and prioritization of data engineering tasks.
- Proactively identify risks related to performance, scalability, or data integrity and propose mitigation strategies.
- Contribute to continuous improvement of data platforms, processes, and team practices.
- Write and maintain technical documentation for pipelines, schemas, and data lineage.
- Communicate clearly with team members and clients, raising questions and concerns when requirements or priorities are unclear.
- Support and mentor other team members when appropriate, contributing to overall team delivery.
Requirements
- Professional experience as a Data Engineer working with production data pipelines.
- Strong experience with SQL, including query optimization, indexing, partitioning, and performance trade-offs.
- Professional experience writing Python for data transformations, following good design and modularization practices.
- Experience designing and implementing data models for analytics use cases.
- Experience building and operating pipelines using cloud-based data platforms.
- Hands-on experience with Azure, Databricks, and Data Lake environments.
- Experience operating data pipelines, including error handling, monitoring, and data quality processes.
- Familiarity with Git for version control, including branching and resolving merge conflicts.
- Experience working with Kubernetes or containerized data workloads.
- Understanding of data formats such as Parquet or ORC, including cost and performance considerations.
- Knowledge of basic data security and governance practices (access control, masking, PII handling).
- Ability to deliver less complex tasks independently and more complex tasks with guidance.
- Strong sense of ownership, responsibility, and accountability for data workflows.
- Good organizational and time management skills, with the ability to estimate and meet delivery deadlines.
- Advanced English proficiency for collaboration with global clients.
- Team-oriented mindset with strong communication and problem-solving skills.
- Based in the Latin America region.
- Experience with data orchestration tools (e.g., Airflow, Azure Data Factory, or similar).
- Exposure to CI/CD for data pipelines and deployment automation.
- Experience with streaming data (e.g., Kafka, Event Hubs).
- Familiarity with data observability and monitoring tools.
- Experience collaborating with Machine Learning or advanced analytics teams.
- Experience working with Java Spring Boot in data engineering projects.
