DNA Chip Analysis / Gene expression data mining

The emergence of DNA chips allows the expression levels of thousands of genes to be assayed in parallel. Projected refinements of this technology may allow whole genomes to be assayed simultaneously, or diseases to be phenotyped as a diagnostic tool. A primary interest of the Functional Genomics and Systems Biology Group is how to make the best use of the vast amounts of data coming from these high-throughput array technologies. For example, the signal-to-noise ratios of current chip technologies are low, so analysis strategies are needed to separate relevant information from unwanted random variation. Another problem is how best to analyze the coordinated behavior of pairs or groups of genes. For example, if Gene A and Gene B are simultaneously up- or down-regulated under a variety of experimental conditions, then one may infer a functional relationship between A and B. However, the exact nature of this relationship remains to be determined.

The Functional Genomics and Systems Biology Group is evaluating a variety of methods to better define the functional relationships between genes. These methods include machine learning algorithms, pattern discovery, pathway analysis, and data integration from multiple sources. We have developed a package, called Genes@Work, to analyze patterns in gene array data. New: hear an online web lecture about gene expression array analysis by Gustavo Stolovitzky (requires registration).
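The co-regulation reasoning above (Gene A and Gene B moving up or down together across experimental conditions) can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure and a cutoff of 0.9; both are illustrative choices, not the method used by Genes@Work. The gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical log-ratio expression values across five conditions.
profiles = {
    "geneA": [1.2, -0.8, 2.1, -1.5, 0.3],
    "geneB": [1.0, -0.9, 1.8, -1.2, 0.4],
    "geneC": [0.1, 1.4, -0.2, -0.1, -1.3],
}
pairs = coexpressed_pairs(profiles)
```

Here only geneA and geneB pass the cutoff. A high correlation of this kind suggests a functional relationship but, as noted above, says nothing about its exact nature, and with low signal-to-noise data a handful of conditions is rarely enough to trust a single pair.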

Gene selection in microarray data: the elephant, the blind men and our algorithms

Gene expression array data provide shadows of intricate cellular processes. Learning how to make the most of the information present in expression arrays has become a discipline in itself. In recent years, there has been an explosion of methods that analyze gene expression arrays to produce long lists of genes that express differentially in distinct cellular states. These lists will have to be organized, and the algorithms that produced them combined, if we wish to piece together the rich cellular structures probed by this high-throughput technology. Researchers will have to understand the benefits and limitations of the many existing methods to produce the combination of algorithms that best suits their gene expression experiments.

For the complete article, see Stolovitzky (2003) in the publications listed below.
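As a rough illustration of what combining the gene lists from several algorithms might look like, the sketch below merges ranked lists by average rank. This is an assumed, deliberately simple aggregation scheme, not the approach proposed in the article; the function name, the penalty rank for missing genes, and the gene identifiers are all hypothetical.

```python
def mean_rank_consensus(ranked_lists):
    """Order genes by their average rank across several ranked lists.

    A gene missing from a list is given a rank one past that list's end,
    which is an illustrative choice rather than a recommendation.
    """
    genes = set().union(*ranked_lists)
    totals = {g: 0.0 for g in genes}
    for ranking in ranked_lists:
        position = {g: i + 1 for i, g in enumerate(ranking)}
        penalty = len(ranking) + 1
        for g in genes:
            totals[g] += position.get(g, penalty)
    n = len(ranked_lists)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(genes, key=lambda g: (totals[g] / n, g))


# Hypothetical top-gene lists from three different selection methods.
method_1 = ["g1", "g2", "g3", "g4"]
method_2 = ["g2", "g1", "g5", "g3"]
method_3 = ["g2", "g3", "g1"]
consensus = mean_rank_consensus([method_1, method_2, method_3])
```

Genes that several methods rank highly rise to the top of the consensus, while genes reported by only one method sink. Averaging ranks treats every method as equally reliable, which is exactly the kind of assumption a researcher would need to revisit after weighing the benefits and limitations of each algorithm.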



Publications from our group in this area:

Stolovitzky GA, Gene selection in microarray data: the elephant, the blind men and our algorithms. Current Opinion in Structural Biology, 13:370–376 (2003). (Pubmed)

Lepre, J., Rice, J.J., Tu, Y., and Stolovitzky, G. Genes@Work: an efficient algorithm for pattern discovery and multivariate feature selection in gene expression data. Bioinformatics. May 1;20(7):1033-44 (2004). (Pubmed)

Klein U, Tu Y, Stolovitzky GA, Mattioli M, Cattoretti G, Husson H, Freedman A, Inghirami G, Cro L, Baldini L, Neri A, Califano A, Dalla-Favera R., Gene expression profiling of B cell chronic lymphocytic leukemia reveals a homogeneous phenotype related to memory B cells. J Exp Med. 2001 Dec 3;194(11):1625-38. (Pubmed)

Pomeroy SL, Tamayo P, Gaasenbeek M, Sturla LM, Angelo M, McLaughlin ME, Kim JY, Goumnerova LC, Black PM, Lau C, Allen JC, Zagzag D, Olson JM, Curran T, Wetmore C, Biegel JA, Poggio T, Mukherjee S, Rifkin R, Califano A, Stolovitzky G, Louis DN, Mesirov JP, Lander ES, Golub TR., Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature. 2002 Jan 24;415(6870):436-42.


R. Hart, A.K. Royyuru, G. Stolovitzky and A. Califano, Systematic and Fully Automated Discovery of Patterns in PROSITE Families, Proceedings 4th Annual ACM International Conference on Computational Molecular Biology (RECOMB 2000), (2000). In extended form in: Journal of Computational Biology, 7(3-4): 585-600 (2000).

A. Califano, G. Stolovitzky and Y. Tu, Analysis of Gene Expression Microarrays for Phenotype Classification, Proceedings of the Annual Intelligent Systems in Molecular Biology (ISMB) 2000; 8:75-85 (2000).


