Principal Data Scientist

Wellcome Sanger Institute • Hinxton, Cambridgeshire
30+ days ago
Salary
£45,000.00–£60,000.00 yearly
Job type
  • Full-time
Job description

Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.

Principal Research Data Scientist

We seek a Principal Machine Learning Research Data Scientist to join a collaborative project between the Wellcome Sanger Institute and Open Targets. This project aims to leverage datasets generated internally at the Sanger Institute and publicly available data from human cells to create foundational models for biology, enhancing our understanding of life's rules and improving health for all. You will work within an interdisciplinary team of life scientists and computer/ML scientists, with a shared objective of advancing biological research through these foundational models. This role will sit within the AI/ML Faculty group led by Dr. Mohammad Lotfollahi, and the successful candidates, across different seniority levels (senior and principal), will be responsible for delivering their portfolio of scientific research projects as part of the broader team strategy.

About the Role

Your role will involve designing foundational models leveraging multi-modal readouts. This includes integrating and processing data from various sources to develop robust and versatile AI models. To achieve this, you will work with open-source software, proposing, developing, and maintaining new solutions to analyze and interpret large-scale single-cell datasets. We have access to unique data and are also in a position to generate data to train unique models. Additionally, we have substantial computational power and GPU resources to train large models efficiently.

Our teams are well-positioned to tackle this problem with experience in both generating and analyzing datasets, including millions of cells across multiple tissues and conditions (e.g., disease, healthy). This involves a detailed understanding of the training of large-scale ML models and a track record of undertaking large data-science projects.

You will be responsible for:

  • Independently managing and leading machine learning research projects and writing up outcomes as scientific publications for submission to journals or machine learning conferences (ICLR, ICML, CVPR, etc.).
  • Collaborating with team members to propose, develop, and evaluate new machine learning models that enable understanding of single-cell data and its application in drug discovery.
  • Working with Ph.D. students and postdocs in collaborating teams on developing solutions for interdisciplinary scientific problems in biology, providing supervision and training to junior members of the team.
  • Contributing to writing scientific papers on biotechnology and biology.
  • Distilling your developed solutions into open-source and easy-to-install packages with documentation that facilitates the usage of your solution for downstream users, including biologists and bioinformaticians.
  • Presenting your research and analysis pipelines to internal and external audiences.

About You:

You will be supported in your personal and professional development and have the opportunity to lead peer-reviewed publications around using genetics and genomics approaches to guide drug discovery and present them at national and international conferences.

Essential Skills:

  • Ph.D. or M.Sc. with equivalent research experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Genetics, Bioinformatics, Physics, Engineering, or Applied Statistics/Mathematics)
  • Previous ML work experience in a scientific/academic environment (RA roles and internships are considered work experience)
  • Strong knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch
  • Expertise in machine learning algorithms and frameworks, with experience in designing, training, and deploying ML models
  • Proficiency in handling and processing large datasets, including techniques for data cleaning, feature engineering, and data augmentation
  • Experience with high-performance computing environments, including the use of GPUs for training large-scale machine learning models
  • Experience in natural language processing (NLP) and training models based on transformer architectures, such as BERT and GPT
  • Familiarity with generative models such as diffusion models and flow matching
  • Knowledge of good software development practices and collaboration tools, including git-based version control, Python package management, and code reviews
  • Strong problem-solving skills with the ability to analyze complex data and derive actionable insights
  • Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non-technical stakeholders
  • Evidence of related work experience as a researcher in the area of machine learning
  • Strong publication record, ideally with first-author positions

In addition to the above technical skills, you will also have the following:

  • Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps
  • Ability to work in a frequently changing environment with the capability to interpret management information to amend plans
  • Ability to prioritize, manage workload, and deliver agreed activities consistently on time
  • Good networking, influencing, and relationship-building skills
  • Strategic thinking: the ability to see the ‘bigger picture’
  • Ability to build collaborative working relationships with internal and external stakeholders at all levels
  • Demonstrates inclusivity and respect for all

Relevant publications from the group:

  • Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M. & others. Mapping single-cell data to reference atlases by transfer learning. Nature Biotechnology 1–10.
  • Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nature Methods 16, 715–721.
  • Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A. V. & Theis, F. J. Biologically informed deep learning to query gene programs in single cell atlases. Nature Cell Biology.
