Location:
Key Responsibilities
- Build and maintain Python Notebooks to ingest data from third-party APIs
- Design and implement Medallion layer architecture (Bronze, Silver, Gold) for structured data organization and progressive data refinement
- Store and manage data within Microsoft Fabric's Data Lake and Warehouse using Delta Parquet file formats
- Set up data pipelines and sync key datasets to Azure Synapse Analytics
- Develop PySpark-based data transformation processes across Bronze, Silver, and Gold layers
- Collaborate with developers, analysts, and stakeholders to ensure data availability and accuracy
- Monitor, test, and optimize data flows for reliability and performance
- Document processes and contribute to best practices for data ingestion and transformation
Tech Stack
- Python (Notebooks)
- PySpark
- Microsoft Fabric Data Lake & Data Warehouse
- Delta Parquet files
- Azure Synapse Analytics
- Azure Data Factory, Azure DevOps
Requirements
- Strong experience with Python for data ingestion and transformation;
- Proficiency with PySpark for large-scale data processing;
- Proficiency in working with RESTful APIs and handling large datasets;
- Experience with Microsoft Fabric or similar modern data platforms;
- Understanding of Medallion architecture (Bronze, Silver, Gold layers) and data lakehouse concepts;
- Experience working with Delta Lake and Parquet file formats;
- Understanding of data warehousing concepts and performance tuning;
- Familiarity with cloud-based workflows, especially within the Azure ecosystem.
Nice to Have
- Experience with marketing APIs such as Google Ads or Google Analytics 4;
- Familiarity with Azure Synapse and Data Factory pipeline design;
- Understanding of data modeling for analytics and reporting use cases;
- Experience with AI coding tools;
- Experience with Fivetran, Airbyte, and Rivery.
Details
