The Data Engineer is responsible for designing, building, and maintaining the infrastructure and systems required to collect, store, and process large datasets efficiently.
Education: Bachelor's degree in Computer Science with 8+ years of experience
Experience:
- Technical Skills
- Programming Languages: Proficiency in Python, SQL, Java, or Scala for data manipulation and pipeline development.
- Data Processing Frameworks: Experience with tools like Apache Spark, Hadoop, or Apache Kafka for large-scale data processing.
- Data Systems and Platforms
- Databases: Knowledge of both relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
- Data Warehousing: Experience with platforms like Snowflake, Amazon Redshift, and Azure Synapse.
- Cloud Platforms: Familiarity with AWS and Azure for deploying and managing data pipelines; strong experience with Microsoft Fabric is advantageous.
- Experience working with distributed computing systems such as Hadoop HDFS, Hive, or Spark.
- Managing and optimizing data lakes and Delta Lake tables for structured and unstructured data.
- Data Modeling and Architecture
- Expertise in designing efficient data models (e.g., star schema, snowflake schema) and maintaining data integrity (see the star-schema sketch after this list).
- Understanding of modern data architectures like Data Mesh or Lambda Architecture.
- Data Pipeline Development
- Building and automating ETL/ELT pipelines that extract data from diverse sources, transform it, and load it into target systems (see the ETL sketch after this list).
- Monitoring and troubleshooting pipeline performance and failures.
- Workflow Orchestration
- Hands-on experience with orchestration tools such as Azure Data Factory, AWS Glue, AWS DMS, or Prefect to schedule and manage workflows.
- Version Control and CI/CD
- Utilizing Git for version control and implementing CI/CD practices for data pipeline deployments (a pytest sketch that a CI stage could run appears after this list).
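To make the star-schema item above concrete, here is a minimal sketch that creates one dimension table and one fact table with Spark SQL. The table and column names (dim_customer, fact_sales, and so on) are illustrative assumptions, and the sketch assumes a Delta-enabled Spark session such as Databricks or Fabric provides.

```python
from pyspark.sql import SparkSession

# Minimal star-schema sketch: one dimension table plus a fact table
# keyed to it. All table and column names are hypothetical.
spark = SparkSession.builder.appName("star-schema-sketch").getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key BIGINT,      -- surrogate key
        customer_id  STRING,      -- natural/business key
        name         STRING,
        region       STRING
    ) USING DELTA
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS fact_sales (
        sale_id      BIGINT,
        customer_key BIGINT,      -- foreign key into dim_customer
        sale_date    DATE,
        amount       DECIMAL(18, 2)
    ) USING DELTA
""")
```

Keeping measures and foreign keys in the narrow fact table, and descriptive attributes in the dimensions, is what makes the joins and aggregations in a star schema predictable.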
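The ETL/ELT item above can be illustrated with a minimal PySpark extract-transform-load sketch. The paths, column names, and file format here are hypothetical placeholders; a real pipeline would take them from configuration, and the Delta write assumes a Delta-enabled session.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw CSV files from a hypothetical landing zone.
raw = spark.read.option("header", True).csv("/landing/orders/")

# Transform: enforce types, drop rows missing key fields, stamp the load date.
clean = (
    raw.withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
       .dropna(subset=["order_id", "amount"])
       .withColumn("load_date", F.current_date())
)

# Load: append the cleaned rows into a curated Delta table.
clean.write.format("delta").mode("append").save("/curated/orders/")
```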
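For the CI/CD item above, a common practice is to factor pipeline logic into plain functions and unit-test them on every commit. The pytest sketch below runs against a local Spark session; add_load_date is a hypothetical helper standing in for real pipeline code.

```python
# test_transforms.py -- a unit test that a CI stage (e.g., GitHub Actions
# or Azure DevOps) could run on every commit. Requires pyspark and pytest.
import pytest
from pyspark.sql import SparkSession, functions as F

def add_load_date(df):
    # Hypothetical pipeline helper under test.
    return df.withColumn("load_date", F.current_date())

@pytest.fixture(scope="session")
def spark():
    # A small local session is enough for transform-level tests.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_add_load_date_adds_column(spark):
    df = spark.createDataFrame([(1, "a")], ["id", "value"])
    result = add_load_date(df)
    assert "load_date" in result.columns
    assert result.count() == 1
```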
Key Skills:
- Proficiency in programming languages such as Python, SQL, and optionally Scala or Java.
- Proficiency in data processing frameworks like Apache Spark and Hadoop is crucial for handling large-scale and real-time data (see the streaming sketch after this list).
- Expertise in ETL/ELT tools such as Azure Data Factory (ADF) and Fabric Data Pipelines is important for creating efficient and scalable data pipelines.
- A solid understanding of database systems, including relational databases like MySQL and PostgreSQL, as well as NoSQL solutions such as MongoDB and Cassandra, is fundamental.
- Experience with cloud platforms, including AWS and Azure, and their data-specific services such as Amazon S3, Amazon Redshift, and Azure Data Factory, is highly valuable.
- Data modeling skills, including designing star or snowflake schemas, and knowledge of modern architectures like Lambda and Data Mesh, are critical for building scalable solutions.
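As one illustration of the real-time processing mentioned above, the following Spark Structured Streaming sketch reads events from Kafka and lands them in a Delta table. The broker address, topic, and paths are placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

# Read a stream of events from a hypothetical Kafka topic.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
)

# Kafka delivers the payload as bytes; cast it to a string before parsing.
parsed = events.select(F.col("value").cast("string").alias("payload"))

# Write the stream to Delta with a checkpoint so the job can recover.
query = (
    parsed.writeStream.format("delta")
          .option("checkpointLocation", "/chk/events/")
          .start("/curated/events/")
)
query.awaitTermination()  # block until the streaming job is stopped
```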
Role and Responsibilities:
- Responsible for designing, developing, and maintaining data pipelines and infrastructure to support our data-driven decision-making processes.
- Design, build, and maintain data pipelines to extract, transform, and load data from various sources into our data warehouse and data lake.
- Proficient in Databricks: creating notebooks, working with catalogs and native SQL, creating clusters, parameterizing notebooks, and administering Databricks workspaces; define security models and assign roles as per requirements (see the parameterization sketch after this list).
- Responsible for creating data flows in Synapse Analytics: integrating external source systems, creating external tables, data flows, and data models, and scheduling the pipelines using jobs and triggers.
- Design and develop data pipelines using Fabric pipelines and Spark notebooks that access multiple data sources; proficient in developing Databricks notebooks and in data optimization.
- Develop and implement data models to ensure data integrity and consistency. Manage and optimize data storage solutions, including databases and data warehouses.
- Develop and implement data quality checks and validation procedures to ensure data accuracy and reliability (see the validation sketch after this list).
- Design and implement data infrastructure components, including data pipelines, data lakes, and data warehouses.
- Collaborate with data scientists, analysts, and other stakeholders to understand business requirements and translate them into technical solutions.
- Monitor Azure and Fabric data pipelines and Spark jobs, and deliver fixes according to request priority.
- Responsible for data monitoring activities; good knowledge of reporting tools such as Power BI and Tableau is required.
- Responsible for understanding client requirements and architecting solutions on both the Azure and AWS cloud platforms.
- Monitor and optimize data pipeline performance and scalability to ensure efficient data processing.
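For the notebook-parameterization responsibility above, a minimal Databricks sketch looks like the following. dbutils, spark, and display() are provided only inside a Databricks notebook, and the widget names, defaults, and paths are illustrative assumptions.

```python
# Widgets expose notebook parameters that a Databricks job, ADF, or
# Fabric pipeline can override at run time (names/defaults are examples).
dbutils.widgets.text("source_path", "/curated/orders/")
dbutils.widgets.text("run_date", "2024-01-01")

source_path = dbutils.widgets.get("source_path")
run_date = dbutils.widgets.get("run_date")

# Drive the read from the parameters so one notebook serves many runs
# and environments.
df = spark.read.format("delta").load(source_path)
df_for_day = df.where(df.load_date == run_date)
display(df_for_day)  # display() is a Databricks notebook built-in
```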
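The data quality responsibility above can be sketched as a set of explicit rule checks that raise on violation, so the orchestrator marks the run as failed and alerts on it. The table path, column names, and rules below are hypothetical examples.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-sketch").getOrCreate()

# Load the curated table to validate (placeholder path).
df = spark.read.format("delta").load("/curated/orders/")

# Rule 1: key columns must not be null.
null_keys = df.filter(F.col("order_id").isNull()).count()

# Rule 2: amounts must be non-negative.
bad_amounts = df.filter(F.col("amount") < 0).count()

# Rule 3: the business key must be unique.
dupes = df.count() - df.dropDuplicates(["order_id"]).count()

failures = {"null_keys": null_keys, "bad_amounts": bad_amounts, "dupes": dupes}
if any(v > 0 for v in failures.values()):
    # Failing loudly lets the scheduler surface the run as failed.
    raise ValueError(f"Data quality checks failed: {failures}")
```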