Boston, MA · Job # 8672BK
Our client is seeking a Principal Software Engineer to help build its next-generation data platform. As a key member of the data platform engineering team, you will help build a platform in AWS that supports both streaming and batch workloads and delivers vital information to our clients.
- Contribute as a leader in the development of a modern data platform in a complex and fast-moving business and technical environment.
- Progressively incorporate data science, machine learning, and other analytical techniques into the data platform, to offer insights and capability both internally and externally.
- Lead the migration of our data platform to the cloud.
- Develop a workflow and model management framework in support of engineering and data science applications.
- Understand the importance of CI/CD, participate in code reviews, and contribute to automated tests.
- Contribute as a full-stack engineer in developing data platform capabilities, including user interfaces, database engineering, APIs, and the metadata layer.
- B.S. in data science, computer science, or an equivalent technical field is required.
- Advanced degree in computer science, mathematics, statistics, data science, or a related technical field preferred.
- At least 5 years of experience working in the field of data science engineering, involving building a data platform that serves a range of data science, data warehousing, visualization and ad hoc analytic needs.
- Demonstrated front-end development in frameworks such as React, Vue, or Angular.
- Deep understanding of relational databases and NoSQL data storage technologies, including elastic cloud data warehousing solutions such as Snowflake (ideal), Redshift, EMR, and BigQuery. Prior experience with Elasticsearch a plus.
- Strong knowledge of distributed frameworks such as Spark, HBase, Presto and Flink.
- Conversant with at least one public cloud platform: Google Cloud Platform, Amazon Web Services, or Microsoft Azure. Significant experience with AWS desired, particularly with data-related services such as Glue, EMR, Lambda, and Kinesis.
- Facility working with data from a variety of sources (databases, JSON, text-based, semi/unstructured data).
- 5+ years of experience building efficient data processing pipelines for analytical systems, progressively incorporating streaming architectures and serverless event-driven ingestion pipelines, along with good knowledge of ELT best practices.
- Solid understanding of, and work experience with, the Python data science ecosystem (e.g. NumPy, pandas, Jupyter, scikit-learn), R, and Spark in cloud environments.
- Experience with data science deployment environments such as TensorFlow and AWS SageMaker.
- Significant past experience with SQL, dimensional data modeling, relational and columnar databases.
- General understanding of, or experience with, microservices and containerization.
- Proficiency with Python; facility with other languages (Java, Node.js, Scala, OS scripting) desirable.
- 5+ years of experience in progressive data architecture, incorporating cloud technologies, data lakes, elastic data warehouses, object stores and serverless architectures.
- 5+ years of professional software development experience.
Applicants must be legally authorized to work in the United States.