We are looking for a Senior Data Engineer to join our team and work on data pipeline development. You will work closely with small, highly collaborative Operations and Development teams to architect and deploy high-impact security solutions.
You must be able to work US time zones (UTC-8 to UTC-5 standard time; UTC-7 to UTC-4 during daylight saving time).
Job Responsibilities
- Build data management pipelines from heterogeneous data sources into Data Marts and other repositories for subsequent analysis and data mining
- Code, test, and support Python-based services, implementing ETL/ELT, data cleansing, etc.
- Integrate with internal (on-prem) and external services using approaches such as RESTful APIs, DMS, and distributed event streaming; develop shippable code and documentation, and unit test new features
- Collaborate with Quality, Product, and other Engineering teams
- Provide code reviews, design feedback, demos, technical requirements & documentation
- Scope projects and provide accurate estimates for reliable delivery
- Facilitate discovery sessions to map out business processes.
- Collaborate with data architects and business users to create comprehensive functional specifications and process flow diagrams that bridge the gap between business needs and data engineering implementation
- Independently explore backend systems and database schemas to determine the feasibility of business requirements.
- Trace data lineage to understand how a metric is calculated in the source system versus how the business defines it, resolving discrepancies before they reach the dashboard
Key Applicant Requirements
- Professional development experience using Python, including an understanding of functional programming: 5+ years
- Cloud infrastructure experience with AWS and Docker: 3+ years
- Experience with distributed event streaming, process orchestration, and the modern data stack
- Infrastructure As Code (Terraform) experience: 1+ years
- Strong proficiency in Python and SQL.
- Proficient with relational databases such as Oracle, MySQL, PostgreSQL
- Experience with AWS services such as RDS, Redshift, S3, EMR, MWAA (Airflow), MSK, ECS, SNS, and SQS
- Experience with designing and building Data Marts and ETL/ELT
- Proficient in data profiling and analysis techniques utilizing SQL and/or Python.
- Proficient with Jira, Confluence, and Git for version control
- Hands-on experience with Agile/Scrum/Kanban
- Excellent written & verbal communication skills, working asynchronously and independently in a remote environment.
- Ability to coordinate between teams.
- Knowledge of or experience with data visualization and dashboarding
Pluses
- Design, implementation, and/or maintenance of CI/CD pipelines and Bash scripts
- Experience using AWS EMR Serverless
Interview Process
We will assist you with preparation, including mock interviews and coaching, to help you succeed! The typical steps are:
- Prescreen with recruiters
- Technical interview with the development team
