What You'll Do:
- Build and optimize Sauce's lakehouse architecture using Azure Databricks and Unity Catalog for data governance
- Manage and optimize cluster resources to balance performance and cost
- Transform stage/raw layer data, define and manage data schemas, and add data quality tests
- Create and maintain data quality tests, and improve existing alerting setups
- Develop reports/dashboards about data quality and consistency
- Improve existing CI/CD pipeline(s) using Databricks Asset Bundles and GitHub Actions/Workflows
- Own the data warehouse: connect data sources and maintain the platform and architecture in coordination with the R&D infrastructure and operations teams
What You Bring:
- Must Have:
- 5+ years of experience building data pipelines, data infrastructure, data quality suites, and alerting functionality
- Proficient in SQL and Python
- Proficient in Git
- Experience with Databricks
- Familiar with data governance, including implementing or managing Unity Catalog
- Understanding of CI/CD principles, with experience implementing Databricks Asset Bundles
- Hands-on experience with data quality suites/frameworks/tools
- Experience with document-based databases (MongoDB) as well as relational databases (PostgreSQL)
- Nice to Have:
- Experience with cloud infrastructure (Azure, GCP)
- Experience with web scraping or API integrations (HubSpot, Stripe, etc.)
Required Technology & Equipment:
- Personal computer or laptop with up-to-date software
- High-speed internet connection
- Keyboard, mouse, working webcam, and headset with a microphone
- Primary 24” monitor (with an additional 24” monitor preferred)
What We Offer:
- Strong & Competitive Compensation Package
- Flexible Work Environment
- 10 Paid Personal/Vacation Days
- 5 Paid Sick Days
- Monthly Wellness Stipend
