- Work on Data Warehouse, Data Lake and BI projects and architectures at Paige.
- Create and implement ETL pipelines that enable the extraction, transformation and transfer of large amounts of structured and unstructured data from various filesystems and databases, destined for the development of computational pathology algorithms.
- Handle the challenges that come with managing terabytes of data.
- Build tools to manage, automate and monitor our data and data processing infrastructure.
- Design and develop software tools and integrate them into existing resources. Be responsible for design, coding, testing, packaging, debugging, documentation and deployment of software systems.
- Work independently to produce required functional, technical, and user documentation (e.g., business requirements, functional and technical specifications, system architecture, data flows, end-user training requirements) on assigned projects.
- Collaborate with data engineers, scientists, engineers, IT operations and medical doctors to build data manipulation tools that support a new generation of artificial intelligence applications for cancer detection and treatment.
- Experience in architecting, implementing and testing data processing pipelines (e.g. Spark, Beam, ...) and data mining / data science algorithms, either on-premises or in a cloud environment.
- Experience in administering and ingesting data into standard data warehouses (e.g. Amazon Redshift, Microsoft SQL Server, Google BigQuery or Snowflake).
- Experience architecting data warehouses and/or data lakes for large amounts of structured and unstructured data.
- Experience with data lakes and expertise in designing and maintaining a BI solution.
- Experience with workflow management tools and platforms, such as Airflow.
- Extensive experience programming in Python or a related language.
- Experience with RDBMS and NoSQL databases (e.g. MongoDB).
- Experience in packaging and deploying applications on-premises and in the cloud (e.g. AWS).
- Familiarity with modern development practices and DevOps.
- Interest in building non-standard medical software applications in collaboration with medical partners. Strong cross-disciplinary and analytical skills.
- Bachelor’s degree in computer science or a related field, or equivalent years of experience.
- 3+ years of industry experience as a data engineer.
About this role
August 16th, 2021
Job posted on: February 24th, 2021
About the company
We're building next-generation computational technology that unlocks insights from each sample for doctors to optimize patient outcomes. Paige is a software company helping pathologists and clinicians...