Data Engineer, DevOps

Data Engineering · South San Francisco, California
Department: Data Engineering
Employment Type: Full-Time

The Opportunity

Data Engineering plays a key role in insitro’s approach to rethinking drug development. The Data Engineering DevOps team ensures that the infrastructure powering our biological data factory’s robots, instruments, and machine learning platform is reliable, scalable, and manageable. You will work closely with a cross-functional team of scientists, bioengineers, and data scientists to identify areas where data engineering can make a difference, developing data architectures and systems on cutting-edge, high-throughput platforms that enable our scientists to be maximally productive. You will design, implement, and deploy cloud infrastructure, including managed databases, application servers, data warehouses, and interactive and batch computing environments. As part of the team, you will also rigorously design our data platform, identify key architectural performance improvements, and join an on-call rotation to ensure that insitro’s platform runs at maximum productivity.

You will be joining the founding team of a biotech startup that has long-term stability due to significant funding, yet is still very much in formation. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!

About You

  • 2-3 years of experience provisioning AWS cloud services (experience with GCP or Azure is also relevant)
  • Experience with cloud configuration and resource management tools such as Terraform
  • Experience architecting reliable infrastructure platforms, including monitoring and alerting, load balancing, scalable services, and multi-region deployments
  • Experience with at least one high-end distributed data processing environment (Hadoop, Spark, etc.)
  • Experience with batch computing systems such as AWS Batch, SLURM
  • Experience with container build and deployment systems like Docker, Kubernetes, or ECS
  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
  • Proficiency in a Linux environment (including shell scripting and Python programming), experience with database languages (e.g., SQL, NoSQL), and experience with version control practices and tools (Git, Mercurial, etc.)
  • Passion for making a difference in the world

Nice to Have

  • Experience with biological data 
  • Experience with managing medium-sized data sets (100TB+) in object storage systems like S3
  • Experience defining infrastructure that meets compliance requirements (GDPR, HIPAA, etc.)
  • Experience with data processing pipelines
  • Experience with deploying and monitoring machine learning models in a production environment

Benefits at insitro

  • Excellent medical, dental, and vision coverage
  • Open vacation policy
  • Team lunches (catered daily)
  • Commuter benefits
  • Paid parental leave

About insitro

insitro is a data-driven drug discovery and development company using machine learning and high-throughput biology to transform the way that drugs are discovered and delivered to patients. The company is applying state-of-the-art technologies from bioengineering to create massive data sets that enable the power of modern machine learning methods to be brought to bear on key bottlenecks in pharmaceutical R&D. The resulting predictive models are used to accelerate target selection, to design and develop effective therapeutics, and to inform clinical strategy. insitro was launched in 2018 with a Series A of $100M funded by top investors including a16z, Arch Venture Partners, Foresite Capital, GV, and Third Rock Ventures.

The company has announced collaborations with Gilead Sciences in the area of NASH (2019) and with Bristol Myers Squibb in the area of ALS (2020). In mid-2020, it completed a Series B financing of $143M with participation from current investors and new investors Canada Pension Plan Investment Board (CPP Investments), T. Rowe Price, BlackRock, Casdin Capital, and other leading investors. The company is located in South San Francisco, CA. For more information about insitro, please visit the company’s website at
