Cambridge, MA · Job # 8639BK
We are a small but rapidly growing company looking for a data engineer to help us build and scale out our healthcare data pipelines: ETL pipelines that harmonize and de-identify healthcare data, and analytics pipelines that add features to datasets for bulk analysis.
As a data engineer on the Engineering team, you will bring significant development experience, including:
- 3+ years of data engineering experience.
- BS/MS in Computer Science, Engineering, Bioinformatics, or a related field.
- Skilled in Node.js and/or Python. Proficient with Linux. Working JVM knowledge a plus (any or all of Java/Kotlin/Scala).
- Experience building and maintaining scalable batch and streaming data pipelines for terabyte-scale data.
- Experience in database architecture, with professional experience using relational SQL databases (e.g. PostgreSQL, MySQL) and NoSQL stores (e.g. MongoDB, Elasticsearch).
- Experience with large-scale data processing platforms such as Spark and EMR, and/or HPC experience (e.g. Apache Aurora, Slurm).
- Working knowledge of and experience with AWS infrastructure, including DynamoDB, Athena/Glue, Lambda, and EC2.
- Bioinformatics experience, in either the research or clinical space, is a major plus, as is healthcare data experience in general (e.g. HL7, FHIR, LOINC, ICD-10, SNOMED, CPT).
Applicants must be authorized to work in the United States legally.