We are seeking a talented and motivated Data Engineer to join our dynamic team. The ideal candidate will have 1-3 years of experience in data engineering, with a strong background in data manipulation, data modeling, ETL processes, and data pipeline development. This role offers an exciting opportunity to work on diverse projects, collaborating closely with cross-functional teams to design, build, and optimize data infrastructure solutions.
Key Responsibilities:
- Develop and maintain robust ETL processes to extract, transform, and load data from various sources into our data warehouse (a brief sketch of this kind of pipeline follows this list).
- Design, implement, and optimize data pipelines to support the ingestion, processing, and storage of large-scale datasets.
- Collaborate with data scientists and analysts to understand their requirements and ensure data availability and quality for analytical insights and model development.
- Work closely with software engineers to integrate data engineering solutions into our existing systems and applications.
- Monitor and troubleshoot data pipelines, ensuring reliability, performance, and scalability.
- Implement data governance best practices to ensure data integrity, security, and compliance with regulatory requirements.
- Continuously evaluate and adopt new technologies, tools, and techniques to enhance our data engineering capabilities and drive innovation.
- Document technical designs, processes, and procedures to facilitate knowledge sharing and collaboration within the team.
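
For candidates wondering what this looks like day to day, here is a minimal, hypothetical sketch of the kind of pipeline described above, assuming Apache Airflow (one of the orchestration tools listed under Qualifications); the DAG, source data, and target table are invented purely for illustration:

```python
# Illustrative only: a minimal daily ETL pipeline in Apache Airflow's
# TaskFlow API (Airflow 2.4+). The source data, table name, and DAG
# are hypothetical placeholders, not a description of our actual stack.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_orders_etl():
    @task
    def extract() -> list[dict]:
        # In practice: pull from an upstream API, file drop, or database.
        return [{"order_id": 1, "amount_usd": "42.50"},
                {"order_id": None, "amount_usd": "malformed"}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Normalize types and drop malformed records.
        return [
            {"order_id": r["order_id"], "amount_usd": float(r["amount_usd"])}
            for r in rows
            if r.get("order_id") is not None
        ]

    @task
    def load(rows: list[dict]) -> None:
        # In practice: write to the warehouse via a connection hook;
        # here we only log the row count.
        print(f"loading {len(rows)} rows into warehouse.orders")

    load(transform(extract()))


daily_orders_etl()
```

In a production deployment the load step would target the warehouse through a connection hook, and the DAG would carry retries, alerting, and data-quality checks, reflecting the reliability and monitoring responsibilities above.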
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 1-3 years of hands-on experience in data engineering, with proficiency in SQL, Python, or other programming languages for data manipulation and scripting (a short example of this kind of work follows this list).
- Strong understanding of database concepts and experience working with relational databases (e.g., PostgreSQL, MySQL, Microsoft SQL Server) and NoSQL databases (e.g., MongoDB, Cassandra, DynamoDB).
- Experience with data warehousing solutions (e.g., Snowflake, Redshift) and cloud platforms (e.g., AWS, Azure, GCP).
- Familiarity with ETL/ELT tooling and workflow orchestration frameworks (e.g., Apache Airflow, Talend, Informatica, Databricks).
- Knowledge of big data technologies (e.g., Hadoop, Spark) and distributed computing principles is a plus.
- Excellent problem-solving skills, attention to detail, and the ability to work both independently and as part of a team.
- Strong communication and interpersonal skills, with the ability to collaborate effectively with stakeholders across functional areas.
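
As a flavor of the SQL and Python proficiency described above, here is a small, self-contained sketch; it uses Python's standard-library sqlite3 module, and the orders schema and figures are invented for the example rather than drawn from our actual systems:

```python
# Illustrative only: combined SQL + Python data manipulation of the kind
# this role involves. Uses the standard-library sqlite3 module so it runs
# anywhere; the orders schema and values are made up for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, region TEXT, amount_usd REAL);
    INSERT INTO orders VALUES
        (1, 'EMEA', 42.50),
        (2, 'AMER', 17.25),
        (3, 'EMEA', 99.00);
    """
)

# A typical analytical rollup: revenue by region, largest first.
query = """
    SELECT region, SUM(amount_usd) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
"""
for region, revenue in conn.execute(query):
    print(f"{region}: ${revenue:,.2f}")

conn.close()
```

The same pattern, set-based SQL for aggregation with Python for orchestration and formatting, carries over to warehouse engines such as Snowflake or Redshift.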