We are seeking an experienced Hadoop Developer to design, build, and operate large-scale data processing pipelines and analytics platforms on Hadoop and related big-data ecosystems.
Responsibilities
- Design, develop, and operate end-to-end big-data pipelines on Hadoop, ingesting data from a diverse mix of relational, file-based, streaming, and API-driven sources.
- Build robust ETL/ELT workflows using Apache Spark, Hive, Pig, and Sqoop, with strong attention to data quality, idempotency, error handling, and recoverability.
- Develop high-throughput streaming data pipelines using Kafka, Spark Streaming, or Flink, and integrate them with downstream analytical and operational systems.
- Optimize Spark and MapReduce jobs through careful tuning of partitioning, memory, serialization, and skew handling to meet demanding SLAs at minimal cost.
- Design and maintain data models and storage layouts on HDFS, Hive, and HBase, using columnar file formats (Parquet, ORC) and lakehouse table formats (Delta Lake, Iceberg, Hudi) to balance flexibility and performance.
- Implement data lineage and quality controls, and enforce governance policies, in collaboration with the data governance and security teams.
- Define and implement monitoring, alerting, and logging for big-data pipelines, including job-level SLAs and proactive failure detection.
- Partner with data scientists and analysts to deliver curated, reliable, and well-documented datasets that accelerate their work.
- Automate pipeline orchestration using Airflow, Oozie, or similar workflow engines, with clean dependency management and clear ownership boundaries.
- Continuously evaluate and adopt new technologies in the big-data and cloud ecosystem (EMR, Databricks, Snowflake, BigQuery) where they offer meaningful improvements.
- Lead performance and architecture reviews of existing pipelines, proposing concrete refactoring and optimization initiatives.
- Document data architectures, schemas, pipeline behaviors, and operational runbooks in a way that makes the platform supportable as the team scales.
- Mentor junior engineers and contribute to the team’s engineering standards and best practices.
Benefits
- Competitive base salary commensurate with experience, plus benefits.
