Machine Learning Intern, Small Molecules

Computing · South San Francisco, California
Department Computing
Employment Type Temporary
Minimum Experience Entry-level

This internship is for Summer 2021

The Opportunity

Machine learning lies at the core of insitro’s approach to rethinking drug development. As a small molecule machine learning intern, you will lead the development of cutting-edge machine learning methods that solve key problems in the drug development process. You will work closely with a cross-functional team of life scientists, bioengineers, and data scientists to identify areas where machine learning can make a difference, to conceptualize and develop chemical datasets using high-throughput platforms, to prototype and build robust data processing pipelines, and to analyze these datasets at scale using the best available machine learning methods. You will need to come up with novel methods that draw on a broad spectrum of machine learning approaches, including the most recent techniques. You will work as part of a team to rigorously analyze our data, pull out key insights, and make accurate predictions that will let us quickly develop highly efficacious, non-toxic drugs for diseases with urgent unmet needs. You will work closely with a very talented team, learn a broad range of skills, receive close mentoring from senior scientists and engineers, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!


About You

  • BS, MS, or Ph.D. in a quantitative subject area such as computer science, chemistry, statistics, mathematics, physics, engineering, or equivalent practical experience
  • Working knowledge of computational chemistry, including classic QSAR modeling, ligand- and structure-based drug discovery, docking, virtual screening, library design, etc.
  • Expertise in one or more general-purpose programming languages (such as Python, C/C++, or Scala)
  • Demonstrated ability to write high-quality, production-ready code (readable, well-tested, with well-designed APIs)
  • Experience with at least one high-end ML development environment (TensorFlow, PyTorch, Caffe, etc.)
  • Experience with at least one of the cheminformatics toolkits (OpenEye, RDKit, Schrödinger suite, etc.)
  • Demonstrated ability to develop novel machine learning methods that go beyond assembling existing code, and to apply problem-solving skills to complex issues
  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
  • Passion for making a difference in the world

Nice to Have

  • Experience with small molecules and DNA-encoded library datasets
  • Experience with building and debugging graph convolutional neural networks
  • Experience with scalable machine learning, including its application to large datasets (100TB+)
  • Proficiency in a Linux environment (including shell scripting), experience with database languages (e.g., SQL, NoSQL), and experience with version control practices and tools (Git, Perforce, etc.)
  • Familiarity with cloud computing services (AWS or GCP)

Benefits at insitro

  • Excellent medical, dental, and vision coverage
  • Open vacation policy
  • Commuter benefits
  • Paid parental leave


About insitro

insitro is a data-driven drug discovery and development company using machine learning and high-throughput biology to transform the way that drugs are discovered and delivered to patients. The company is applying state-of-the-art technologies from bioengineering to create massive data sets that enable the power of modern machine learning methods to be brought to bear on key bottlenecks in pharmaceutical R&D. The resulting predictive models are used to accelerate target selection, to design and develop effective therapeutics, and to inform clinical strategy. insitro was launched in 2018 with a Series A of $100M funded by top investors including a16z, Arch Venture Partners, Foresite Capital, GV, and Third Rock Ventures. In 2019, the company announced a collaboration with Gilead Sciences in the area of NASH and, in mid-2020, announced a Series B financing of $143M that included current investors as well as new investors Canada Pension Plan Investment Board (CPP Investments), T. Rowe Price, BlackRock, Casdin Capital, and other leading investors. The company is located in South San Francisco, CA. For more information about insitro, please visit the company’s website at


