Genetic Foundation Models for Target Gene Discovery

Overview

A target gene is a gene that controls the functions of cells closely tied to a disease. Controlling such disease-specific cells by manipulating their target genes is an important step in modern drug discovery. For example, in tumor immunotherapy, the discovery of the genes encoding proteins such as PD-1 and CTLA-4, which suppress the ability of T lymphocytes to kill tumor cells, paved the way for a class of cancer treatment called immune checkpoint therapy. Several drugs (checkpoint inhibitors) that block these proteins have since been developed and approved.

Today, advanced RNA sequencers measure gene expression at single-cell resolution, and a considerable amount of such single-cell RNA sequencing data has accumulated in the public domain. At the same time, the training of large language models has led to a new and promising research approach: learning a “representation” from huge amounts of data. Such models are also called Foundation Models (FMs), and they have been extended to many areas beyond natural language processing. FMs in genetics, which learn representations of gene expression, are being studied intensively. To contribute to drug discovery, IBM has been developing FMs of gene expression called Bio Medical Foundation Models for Targets (BMFM for Targets).

The main expectation is that a representation of gene expression learned from huge data sets captures essential, condensed information about cell functions. Such a representation should therefore improve on existing baseline gene informatics methods in downstream tasks, even when only rather small samples are available. These tasks include:

  • cell-type annotation,
  • cell-type prediction,
  • gene perturbation prediction,
  • drug response analysis, etc.
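
To make the small-sample idea concrete, here is a minimal sketch of cell-type annotation on top of learned embeddings. It assumes each cell has already been mapped to an embedding vector by some foundation model; the function names `fit_centroids` and `annotate`, the nearest-centroid rule, and the toy numbers are all illustrative assumptions, not part of BMFM for Targets.

```python
import numpy as np

def fit_centroids(embeddings, labels):
    """Average the embeddings of each labeled cell type into one centroid."""
    labels = np.asarray(labels)
    cell_types = sorted(set(labels.tolist()))
    centroids = np.stack(
        [embeddings[labels == t].mean(axis=0) for t in cell_types]
    )
    return cell_types, centroids

def annotate(embeddings, cell_types, centroids):
    """Assign each cell the type of its nearest centroid (Euclidean distance)."""
    # Pairwise distances with shape (n_cells, n_types).
    dists = np.linalg.norm(
        embeddings[:, None, :] - centroids[None, :, :], axis=2
    )
    return [cell_types[i] for i in dists.argmin(axis=1)]

# Toy 2-D "embeddings" standing in for model output (made-up numbers,
# not real data): three labeled reference cells and two unlabeled cells.
reference = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
reference_labels = ["T cell", "T cell", "B cell"]
unlabeled = np.array([[1.9, 0.1], [0.2, 1.5]])

cell_types, centroids = fit_centroids(reference, reference_labels)
predicted = annotate(unlabeled, cell_types, centroids)
```

If the embedding already separates cell types well, even a classifier this simple may need only a few labeled cells per type, which is the kind of benefit the expectation above describes.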

For example, gene perturbation prediction using BMFM for Targets will help find key transcription factors (genes whose products control the transcription of messenger RNAs, and thus the production of proteins) “in silico”, which can greatly speed up existing wet-lab experiments.
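
One common way to frame in-silico perturbation is to zero out a gene's expression, re-encode the cell, and measure how far the embedding moves. The sketch below illustrates only that idea. `toy_encoder` is a hypothetical stand-in (a fixed linear map followed by `tanh`), not the BMFM for Targets model or its API, and the gene names and numbers are placeholders.

```python
import numpy as np

def toy_encoder(expression, weights):
    """Hypothetical stand-in for a foundation model: expression -> embedding."""
    return np.tanh(expression @ weights)

def knockout_scores(expression, weights, encode=toy_encoder):
    """Score each gene by how far zeroing it moves the cell embedding."""
    baseline = encode(expression, weights)
    scores = np.empty(expression.shape[0])
    for g in range(expression.shape[0]):
        perturbed = expression.copy()
        perturbed[g] = 0.0  # simulated knockout of gene g
        scores[g] = np.linalg.norm(encode(perturbed, weights) - baseline)
    return scores

# Placeholder gene names and made-up values, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C"]
expression = np.array([2.0, 1.0, 0.0])
weights = np.array([[0.5, -0.5],
                    [0.1, 0.0],
                    [0.3, 0.3]])

scores = knockout_scores(expression, weights)
# Genes whose knockout shifts the embedding most come first.
ranking = [genes[i] for i in np.argsort(-scores)]
```

Genes at the top of such a ranking would be candidates for follow-up in the wet lab; a gene that is not expressed in the cell (here `GENE_C`) gets a score of zero because knocking it out changes nothing.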
