Data Management
IBM Research is recognized as a leading innovator in the field of data management. Our history of pioneering work includes E. F. Codd's seminal work on relational algebra, the System R relational database management system prototype (which led to IBM's DB2®), ARIES transaction recovery and logging, Starburst extensible database technology, DB2 parallel database technology, QBIC® for image querying, and QUEST data mining algorithms. Today, we continue to explore new data management technology in such areas as data warehousing, object-relational features, digital libraries, multimedia content management, and federated databases, as well as the emerging areas of e-commerce, Internet, and mobile applications.

Advanced Relational Database Research

In the relational data management area, we are actively exploring several new technologies to improve the scalability, functionality, performance, and usability of our DB2 database systems. In our work on multidimensional clustering and automatic summary tables, we are investigating new clustering, indexing, and query processing paradigms. Our DBCache project is building a DB2 cache that sits closer to IBM's WebSphere® web application server; caching entire tables, fragments of tables, or query results can improve both performance and availability. We are investigating new algorithms for the automatic selection of indexes, materialized views, and other physical structures. We have also initiated the LEO (LEarning Optimizer) project for feedback-directed query planning.

Federated and Distributed Data Management

Managing federated and heterogeneous data sources continues to be an important research topic. Our Garlic project provides a query processing and optimization framework over heterogeneous data sources. We are also investigating new techniques for robust and scalable data replication using persistent messages. XML has emerged as a standard for data on the Internet and is generating new opportunities for research. We are investigating storage, indexing, query, and update techniques for XML data over various data repositories in such projects as XPERANTO, XML Access Server, and XML File System, as well as SQL extensions for native XML support in object-relational databases.

Content Management

In our Content Management work, we deal with text, music, and video, as well as traditional corporate data. Increasingly, these types of data are manipulated as first-class objects by various applications. Hence, they often reside in file systems, object servers, and web sites rather than in databases. The DataLinks project connects this "other" data to databases and provides security, access control, and recovery. We are also investigating modeling, ingestion, indexing, searching, and distribution issues. For instance, the SPIRE project extracts, transforms, and massages features from image and satellite data. We have also begun work in the area of long-term data preservation, with the challenging goal of being able to read today's digital data in the years to come.

Active and Temporal Database Techniques

Database techniques are being applied to the general problem of event management. For example, the AMIT project has created a high-level language and an execution model to correlate events based on their temporal characteristics. It is currently being used in business process management and customer relationship management, and there are many other potential applications in e-brokerage, system management, and real-time monitoring of sensor inputs. The databases used by these applications are often very large and require extensive temporal support with full optimization and parallelization. In the TempDB project, we are enriching SQL with extensive temporal constructs and introducing optimization and execution strategies for temporal queries.

Please contact Paridhi Verma to obtain copies of the Computer Science Brochure.