
DNA Chip Analysis / Gene expression data mining
The emergence of DNA chips allows the expression levels of thousands
of genes to be assayed in parallel. Projected refinements of this technology
may allow whole genomes to be assayed simultaneously, or diseases to be
phenotyped as a diagnostic tool. A primary interest of the Functional Genomics
and Systems Biology Group is how to make the best use of the vast amounts of
data coming from these high-throughput array technologies. For example,
the signal-to-noise ratio of current chip technologies is low, so
analysis strategies are needed to separate relevant information from unwanted
random variation. Another problem is how best to analyze the coordinated
behavior of pairs or groups of genes. For example, if Gene A and Gene B are
simultaneously up- or down-regulated under a variety of experimental conditions,
then one may infer a functional relationship between A and B. However,
the exact nature of this relationship remains to be determined. The Functional
Genomics and Systems Biology Group is evaluating a variety of methods to
better define the functional relationships between genes. These methods
include machine learning algorithms, pattern discovery, pathway analysis,
and data integration from multiple sources. We have developed a package
called Genes@Work to analyze patterns in gene expression array data.
New: hear an online web lecture about gene expression array analysis by
Gustavo Stolovitzky (requires registration).
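The co-regulation idea described above (Gene A and Gene B moving up or down together across conditions) can be illustrated with a simple correlation measure. The following is a minimal sketch, not the group's method: it assumes Pearson correlation as the similarity score, and the gene names and expression values are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical log-ratio expression values across six experimental conditions.
gene_a = [1.2, -0.8, 0.5, 2.1, -1.5, 0.3]
gene_b = [1.0, -0.6, 0.7, 1.8, -1.2, 0.1]   # tracks gene_a closely
gene_c = [-0.2, 0.9, -1.1, 0.4, 0.6, -0.7]  # no clear relation to gene_a

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
print(f"r(A, B) = {r_ab:.3f}")
print(f"r(A, C) = {r_ac:.3f}")
```

A high correlation only suggests that a relationship may exist. As noted above, it does not reveal the nature of that relationship, and with noisy array data a score like this should be checked across many conditions before drawing conclusions.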
Gene selection in microarray data: the elephant, the blind men and our
algorithms
Gene expression array data provide shadows of intricate cellular processes.
Learning how to make the most of the information present in expression
arrays has become a discipline in itself. In recent years, there has been
an explosion of methods that analyze gene expression arrays to produce
long lists of genes that express differentially in distinct cellular states.
These lists will have to be organized, and the algorithms that produced
them combined, if we wish to piece together the rich cellular structures
probed by this high-throughput technology. Researchers will have to understand
the benefits and limitations of the many existing methods to produce the
combination of algorithms that best suits their gene expression experiments.
For the complete article, see Stolovitzky (2003) in the publication list
below.
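As one concrete instance of the many gene selection methods discussed above, the sketch below ranks genes by how differently they are expressed in two cellular states. It assumes Welch's t-statistic as the score, which is only one of many possible choices and is not the Genes@Work algorithm. The gene names and values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(state1, state2):
    """Welch's t-statistic for one gene measured in two cellular states."""
    se = math.sqrt(variance(state1) / len(state1) +
                   variance(state2) / len(state2))
    if se == 0:
        raise ValueError("zero variance in both states: statistic undefined")
    return (mean(state1) - mean(state2)) / se

# Hypothetical expression levels: gene -> (samples in state 1, samples in state 2).
data = {
    "gene_1": ([5.1, 5.3, 4.9, 5.2], [7.9, 8.1, 8.0, 7.8]),
    "gene_2": ([6.0, 6.4, 5.8, 6.1], [6.1, 5.9, 6.3, 6.0]),
    "gene_3": ([9.0, 8.7, 9.2, 8.9], [6.5, 7.5, 6.0, 7.0]),
}

# Rank genes by the magnitude of the statistic, largest first.
scores = {gene: welch_t(s1, s2) for gene, (s1, s2) in data.items()}
ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)

for gene in ranked:
    print(f"{gene}: t = {scores[gene]:.2f}")
```

Different scores (fold change, rank-based tests, multivariate pattern discovery) can produce different gene lists from the same data, which is why the text argues for understanding and combining methods rather than relying on a single one.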
Publications from our group in this area:
Stolovitzky GA, Gene selection in microarray data: the elephant, the blind
men and our algorithms. Current Opinion in Structural Biology, 13:370–376
(2003). (Pubmed)
Lepre, J., Rice, J.J., Tu, Y., and Stolovitzky, G. Genes@Work: an efficient algorithm for pattern discovery and multivariate feature selection in gene expression data. Bioinformatics. May 1;20(7):1033-44 (2004). (Pubmed)
Klein U, Tu Y, Stolovitzky GA, Mattioli M, Cattoretti G, Husson H, Freedman
A, Inghirami G, Cro L, Baldini L, Neri A, Califano A, Dalla-Favera R.,
Gene expression profiling of B cell chronic lymphocytic leukemia reveals
a homogeneous phenotype related to memory B cells. J Exp Med. 2001 Dec
3;194(11):1625-38. (Pubmed)
Pomeroy SL, Tamayo P, Gaasenbeek M, Sturla LM, Angelo M, McLaughlin ME, Kim JY, Goumnerova LC, Black PM, Lau C, Allen JC, Zagzag D, Olson JM, Curran T, Wetmore C, Biegel JA, Poggio T, Mukherjee S, Rifkin R, Califano A, Stolovitzky G, Louis DN, Mesirov JP, Lander ES, Golub TR., Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature. 2002 Jan 24;415(6870):436-42.
R. Hart, A.K. Royyuru, G. Stolovitzky and A. Califano, Systematic and Fully Automated Discovery of Patterns in PROSITE Families, Proceedings 4th Annual ACM International Conference on Computational Molecular Biology (RECOMB 2000), (2000). In extended form in: Journal of Computational Biology, 7(3-4):585-600 (2000).
A. Califano, G. Stolovitzky and Y. Tu, Analysis of Gene Expression Microarrays for Phenotype Classification, Proceedings of the Annual Intelligent Systems in Molecular Biology (ISMB) 2000; 8:75-85 (2000).