(Senior) Genomic Data Scientist - Computational Biologist

Computing · South San Francisco, California
Department Computing
Employment Type Full-Time
Minimum Experience Mid-level

The Opportunity

Key to insitro’s approach to rethinking drug development is the production of high-quality genomic data and its integration with patient genotypes and molecular phenotypes. As a Genomic Data Scientist, you will lead the development and application of cutting-edge approaches and workflows to analyze multi-dimensional, multi-modality genomic data. You will work with high-quality functional genomic data such as bulk and single-cell RNA-seq and ATAC-seq of cells from diverse genetic backgrounds under multiple genetic and chemical perturbations. You will also analyze whole-genome and whole-exome sequencing data to characterize patients and cell lines. You will play a key role in developing and refining novel functional genomics assays and sequencing data pipelines. You will work closely with a cross-functional team of life scientists, bioengineers and machine learning scientists to design and analyze our in-house experiments that collect high-throughput in vitro genomic data. You will also analyze human-level data from clinical trials (including genetics, transcriptomics and pathology) and integrate it with our in-house genomic data to identify therapeutic targets and develop drugs that have high efficacy and low toxicity.

You will be joining the founding team of a biotech startup that has long-term stability due to significant funding, yet is still very much in formation. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!


About You

  • M.S. or Ph.D. in computational biology, genetics or a related discipline, or equivalent practical experience

  • 3-5+ years of demonstrated experience using and developing cutting-edge methods for analyzing NGS datasets, including both variant calling from WGS/WES and extracting signal from functional genomic assays (RNA/DNase/ATAC/ChIP-seq, etc.)

  • Experience modeling sequencing artifacts (e.g. GC content, fragment length bias, overdispersion) and interpreting QC measurements to guide assay development

  • Significant expertise with NGS data processing tools (samtools, GATK, IGV, etc.)

  • Experience working with large numbers of samples and modern workflow management frameworks (Snakemake, Cromwell/CWL/WDL, Nextflow, etc.) running on either HPC systems (e.g. using SLURM) or cloud services (e.g. AWS or GCP)

  • Strong fundamentals in applied multivariate statistics

  • Proficiency in working with large-scale datasets in Linux/Bash and Python; experience with R, C/C++ or other compiled, statically typed languages is a plus

  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions

  • Passion for making a difference in the world

Nice to Have

  • Experience working with genetic data, genotype imputation, and LD expansion

  • Domain knowledge in immunological, neurological or metabolic disorders

  • Experience working with single-cell sequencing analysis

  • Experience with quantitative trait loci (QTL) analysis

  • Experience developing or helping develop novel genomic assays

  • Experience with database languages (e.g. SQL) and with version control practices and tools (e.g. Git)

Benefits at insitro

  • Excellent medical, dental, and vision coverage
  • Open vacation policy
  • Team lunches (catered daily)
  • Commuter benefits
  • Paid parental leave

About insitro
insitro is an exciting startup company that aims to take a new approach to drug development: one with big data and machine learning at its core. We plan to build on the ground-breaking innovations that have occurred in life sciences to develop large data sets that are designed from the start to allow machine learning to address fundamental bottlenecks in the drug development process. Our goal is to cure more people, sooner, and at a much lower cost.
We are fortunate to have strong support from top investors in both biotech and tech: ARCH Ventures, Foresite Capital, A16Z, GV, and Third Rock Ventures. We are building a remarkable team that embodies a new type of culture, one based on a true partnership between scientists, engineers, and data scientists. Together we are working to define the problems, design experiments, analyze the data, and derive the insights that will lead us to new therapeutics. Join us, and help make a difference to patients!


