Core Responsibilities & Essential Job Functions
- Cloud Platform Proficiency: Strong understanding and hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP). This includes knowledge of cloud services like compute, storage, networking, and databases.
- Data Architecture and Modeling: Proficiency in designing and implementing scalable, high-performance data architectures. Strong understanding of data modeling techniques, including relational, dimensional, and NoSQL data models.
- Data Warehousing: Experience in building and managing data warehouses using technologies like Amazon Redshift, Google BigQuery, or Snowflake. Knowledge of optimizing data warehouse performance and cost management.
- Data Integration and ELT: Expertise in building Extract, Load, Transform (ELT) processes for ingesting and transforming data from various sources into a unified format.
- Big Data Technologies: Proficiency in big data technologies such as Hadoop ecosystem (HDFS, Hive, HBase), Apache Kafka, and Apache Flink. Ability to leverage these technologies for large-scale data processing and real-time analytics.
- Programming Skills: Strong programming skills in languages such as Python, Scala, or Java. Ability to write efficient, maintainable code for data processing, analytics, and automation tasks.
- Data Governance and Security: Prior experience in implementing data governance principles, data privacy regulations (e.g., HIPAA, GDPR, CCPA), and best practices for ensuring data security and compliance.
- Machine Learning and AI: Strong knowledge of and hands-on experience in developing machine learning applications. Experience with MLOps, model lifecycle management, and model governance in regulated industries such as banking and healthcare.
- Experience deploying large language models (LLMs) for custom business use cases.
- Databricks Expertise: In-depth knowledge of Databricks, including data processing with Apache Spark, Delta Lake, Unity Catalog, and MLflow, is a plus. Ability to design and implement scalable data pipelines and analytics solutions on the Databricks platform.
- Leadership and Team Management: Effective leadership skills to lead and mentor a team of data engineers, BI analysts, data scientists, data governance analysts, and AI/ML engineers. Ability to set clear goals, provide guidance, and foster collaboration within the team.
- Communication and Stakeholder Management: Strong communication skills to interact with cross-functional teams, senior management, and stakeholders. Ability to translate technical concepts into non-technical terms and influence decision-making.
- Problem-solving and Critical Thinking: Aptitude for identifying complex data engineering challenges and devising innovative solutions. Ability to think critically and make data-driven decisions to optimize processes and systems.
- Continuous Learning and Adaptability: Commitment to staying updated with the latest advancements in data engineering, cloud technologies, and industry trends. Willingness to adapt to evolving technologies and business requirements.
- Develop and implement quality controls and departmental standards to ensure that quality standards, organizational expectations, and regulatory requirements are met.
- Contribute to development and education plans covering data engineering capabilities, systems, standards, and processes.
- Anticipate future demands of initiatives related to people, technology, budget, and business within your department, and design and implement solutions to meet these needs.
- Communicate results and business impacts of insight initiatives to stakeholders within and outside of the company.
Qualifications
Minimum Education, Experience & Training Equivalent to:
- 10 years of experience with modern data engineering projects and practices: designing, building, and deploying scalable data pipelines with 5+ years of experience deploying cloud-based data infrastructure solutions.
- Strong programming skills in Python, Java, or Scala, and their respective standard data processing libraries.
- 3 years of experience building data pipelines for AI/ML models using PySpark or Python.
- Experience building data pipelines with modern tools such as Databricks, Fivetran, and dbt.
- Strong experience establishing and maintaining relational databases, SQL, data warehouses, and ELT pipelines.
- Experience with distributed processing and streaming technologies such as Spark and Kafka.
- Experience integrating data from core platforms such as EHR, CRM, and claims processing systems into a centralized warehouse and data lake.
- Extensive knowledge in statistical modeling, machine learning model development and data science.
- Command of software development best practices, with strong rigor in high-quality code development, automated testing, and related engineering disciplines.
- Extensive knowledge of ELT and Data Warehousing concepts, strategies, and methodologies.
- Experience working with structured and unstructured data.
- Experience establishing real-time data pipelines and processing.
- Familiarity with Azure services such as Azure Functions, Azure Data Lake Storage, Azure Cosmos DB, Azure Databricks, and Azure Data Factory.
- Ability to deliver forward-thinking data and analytics solutions.
- Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently.
- BSc in Computer Science, Engineering, Statistics, Informatics, Information Systems, or another quantitative field.
- Master's degree in Computer Science or an engineering field preferred.
Knowledge & Skills:
- Sensitivity to working with an ethnically, linguistically, culturally, and economically diverse population.
- A commitment to the values of the organization while demonstrating good judgment, flexibility, patience and discretion when dealing with confidential and sensitive matters.
- Proficient in Microsoft Office (Outlook, Word, Excel, etc.), especially Excel and related computer software.
- Consistently demonstrate good judgment and decision-making skills while maintaining the highest level of confidentiality.
- Work in a fast-paced, high-energy environment while effectively multitasking and managing day-to-day responsibilities without supervision.
- Personable; able to work comfortably with individuals at all levels within the organization.
- Excellent verbal and written communication skills; frequent proofreading and checking of documents for accuracy.
- Must be highly detail oriented.
- Strong interpersonal skills.
Physical Requirements:
- Must be able to communicate effectively within the work environment, read and write using the primary language within the workplace.
- Visual and auditory ability to work with clients, staff and others in the workplace continuously.
- Frequent speaking and listening (25-75%) to clients, staff, and others in the workplace.
- Utilize computer and cell phone for effective communication.
Conditions of Employment
- Ability to obtain and maintain criminal record clearance through the Department of Justice (DOJ). The People & Performance Department must analyze DOJ/FBI live scan reports in accordance with applicable Federal, State, and Local laws, as well as fitness for the position.
- Ability to obtain and maintain clearance through the Office of Inspector General.
- Must attend any required training.
Time Type:
Full time
Compensation:
$180,000.00 - $240,000.00 Annually
The statements contained in this job description reflect general details as necessary to describe the principal functions of this job. It should not be considered an all-inclusive listing of work requirements. Individuals may perform other duties as assigned, including work in other functional areas as deemed fit for the organization.
Catalight is an equal opportunity employer.
