At insitro, we are shaping the future of drug discovery by combining human biology and machine learning to discover new therapeutics. At the heart of our strategy is a true partnership across disciplines and areas of expertise to produce large, high-quality data sets that drive machine learning and yield key biological insights. You will be joining a highly collaborative and dynamic research environment where we aim to advance the rate of scientific discovery using purpose-built solutions. You will work closely with a very talented team of distinguished scientists and engineers. You will directly contribute to shaping insitro’s culture, strategic direction, and outcomes, with many opportunities for significant and diverse impact across several functions and disciplines. With long-term stability thanks to significant funding, this is an exciting time to join us in leading the way to better medicines through the integration of machine learning and biology at scale.
Data Engineering plays an essential role in insitro’s approach to rethinking drug development, shaping the foundations of a truly data-driven and integrated drug discovery approach. We are seeking a talented and highly motivated software engineer to join us in designing and developing our critical lab software infrastructure.
As a Software Engineer in the team, you will:
- Work closely with a cross-functional team of scientists and engineers to identify challenges and solutions to improve our lab data applications and integrations
- Design and implement a rigorous, scalable set of applications and data processing pipelines that interact with high-throughput biology automation platforms.
- Design, implement and maintain scalable backends and intuitive frontends for capturing, extracting, integrating and analyzing large volumes of scientific and lab operational data such as high-content microscopy and sequencing data.
- Get exposed to fascinating science and contribute to building state-of-the-art machine learning infrastructure to advance our expertise in biology and disease modeling.
- Evaluate new technologies, practices and vendors that could increase scientific capabilities and/or efficiencies.
- Ensure that our solutions fit appropriately into our information ecosystem and preserve the integrity of our data architecture.
In building our team, we look for people who share the collaborative, rigorous and scientific spirit of our culture. The successful candidate will possess:
- BS, MS, or Ph.D. in computer science, statistics, mathematics, physics, engineering, or equivalent practical experience
- Expertise in one or more general-purpose programming languages (such as Python, Java, Scala, C/C++, or Go)
- Demonstrated ability to write high-quality, production-ready code (readable, well-tested, with well-designed APIs)
- Familiarity with cloud computing services (AWS or GCP)
- Familiarity with relational databases.
- Familiarity with web services and application frameworks (Django, Flask).
- Proficiency in a Linux environment (including shell scripting), experience with database languages (e.g., SQL, NoSQL), and experience with version control practices and tools (Git, Mercurial, etc.)
- Experience leading the development, delivery and maintenance of software products in collaboration with scientific stakeholders.
- Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
- Experience with Agile Scrum software development methodologies
- Experience in all aspects of the software development process, including requirements gathering, architecture, design, implementation, release, and maintenance.
- Experience developing or implementing a Laboratory Information Management System (LIMS) solution is preferred.
- Experience working with large datasets, pipelines, and data warehouses
- Experience architecting infrastructure for high-throughput lab automation is a strong plus.
Nice to Have
- Experience working with lab and scientist stakeholders in a health tech or genomics field.
- Experience with large data sets (100 TB+)
- Demonstrated ability to develop novel data engineering methods that go beyond applying existing frameworks and patterns, and to apply problem-solving skills to complex issues
Benefits at insitro
- Excellent medical, dental, and vision coverage
- Open vacation policy
- Team lunches (catered daily)
- Commuter benefits
- Paid parental leave
insitro is a drug discovery and development company using machine learning and data generation at scale to transform the way that drugs are discovered and delivered to patients. We rely on human genetic cohorts, human-derived cellular disease models, and high-throughput biology and chemistry to identify coherent patient segments, actionable therapeutic targets, and new or existing chemical matter. The goal is to deliver predictive insights to improve the probability of success and reduce the number of costly dead ends along the R&D journey. The company has established enabling collaborations with Gilead in NASH and Bristol Myers Squibb in ALS and is building a pipeline of wholly owned and partnered medicines leveraging its unique insights on patient biomarkers, targets, and molecules. insitro is located in South San Francisco, CA and has raised over $600M from top tech, biotech, and crossover investors since formation in 2018. For more information on insitro, please visit the company’s website at www.insitro.com.