Query Processing, Optimization, and Algorithms

    We have a strong interest in the functionality and performance
of database engines. For example, the MultiDimensional Clustering project is exploring
new clustering, indexing, and database processing paradigms for data warehouses. We are
also actively pursuing other topics in this general area, such as sorting, aggregation, and
block-oriented operations.

    Sorting enhancements

    Traditionally, database systems have used sorting techniques that minimize disk I/O, since
sorts were expected to spill to disk. The main focus has been on I/O performance, and
less attention has been paid to internal-memory algorithms. In today's systems, large memories
allow much more of a sort to run in memory, so internal-memory sorting deserves attention. We have
explored and implemented a general multiple-datatype sorting algorithm in DB2 using high-performance
techniques such as radix sorting, distribution counting, and binary key preparation. We are
also exploring bucket-based external sorting for better memory utilization and I/O efficiency.

Ref:  SIGMOD 96 Paper
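
To make the three techniques named above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the DB2 implementation: the helper names (int_key, str_key, row_key, radix_sort), the 8-byte string width, and the two-column row layout are all hypothetical. Binary key preparation turns each column into bytes whose byte order matches the column's sort order, so one byte-wise radix sort can handle several datatypes. Each radix pass uses distribution counting to place records stably.

```python
import struct

def int_key(value):
    """Encode a signed 64-bit integer so that byte order equals numeric order."""
    # Adding 2**63 maps the signed range onto the unsigned range,
    # so negative values compare lower than positive ones byte by byte.
    return struct.pack(">Q", value + (1 << 63))

def str_key(value, width):
    """Encode an ASCII string as a fixed-width, zero-padded byte key.
    Assumption: strings longer than `width` are truncated and may tie."""
    return value.encode("ascii")[:width].ljust(width, b"\x00")

def radix_sort(records, key_of, key_len):
    """LSD radix sort over prepared byte keys, one byte per pass."""
    keys = [key_of(r) for r in records]
    order = list(range(len(records)))
    for pos in range(key_len - 1, -1, -1):
        # Distribution counting: count each byte value, then turn the
        # counts into starting offsets with a prefix sum.
        counts = [0] * 257
        for i in order:
            counts[keys[i][pos] + 1] += 1
        for b in range(256):
            counts[b + 1] += counts[b]
        # Stable scatter into the output positions for this byte.
        out = [0] * len(order)
        for i in order:
            b = keys[i][pos]
            out[counts[b]] = i
            counts[b] += 1
        order = out
    return [records[i] for i in order]

def row_key(row):
    """Hypothetical composite key: (department string, salary integer)."""
    return str_key(row[0], 8) + int_key(row[1])

rows = [("sales", 300), ("eng", -5), ("eng", 120), ("sales", -40)]
print(radix_sort(rows, row_key, 16))
```

Because the prepared key is a flat byte string, the sort loop never inspects datatypes. Adding a column only means appending another encoded field to the key.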

    Block-Oriented Aggregation Processing

    We have explored a technique that allocates a block of storage, inserts a group of records into it, and
then applies high-performance block-oriented operations to perform Group By operations
and aggregations. We have demonstrated that this makes very effective use of current
superscalar CPUs and their large L2 caches, and yields large gains on aggregation-intensive
queries.

Ref:  Technical Report: Block_oriented Processing
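
The general shape of block-at-a-time aggregation can be sketched as follows. This is a minimal Python illustration, assuming a SUM and COUNT per group. The function name block_aggregate, the block size, and the two-pass layout are assumptions made for the example and are not taken from the technical report. Python also cannot show the CPU and cache gains. The point is only the control flow: records are buffered into a block, then each operation runs as a tight loop over the whole block rather than once per record.

```python
def block_aggregate(rows, block_size=1024):
    """Compute (sum, count) per group key, processing rows one block at a time."""
    slot_of = {}            # group key -> index into the aggregate arrays
    sums, counts = [], []   # aggregates stored as dense arrays
    block_keys, block_vals = [], []

    def flush():
        # Pass 1 over the block: resolve the group slot for every record.
        slots = []
        for key in block_keys:
            slot = slot_of.get(key)
            if slot is None:
                slot = slot_of[key] = len(sums)
                sums.append(0)
                counts.append(0)
            slots.append(slot)
        # Pass 2 over the block: apply the aggregate updates.
        for slot, value in zip(slots, block_vals):
            sums[slot] += value
            counts[slot] += 1
        block_keys.clear()
        block_vals.clear()

    for key, value in rows:
        block_keys.append(key)
        block_vals.append(value)
        if len(block_keys) == block_size:
            flush()
    if block_keys:
        flush()  # process the final, partially filled block
    return {key: (sums[slot], counts[slot]) for key, slot in slot_of.items()}
```

For example, block_aggregate([("a", 1), ("b", 2), ("a", 3)], block_size=2) processes two blocks and returns the sum and count for groups "a" and "b".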
 

    Parallel Database Processing

    We initiated the parallel database processing effort that led to IBM's DB2 Parallel Edition and
DB2 Universal Database Enterprise-Extended Edition. Within the scope of this effort, we explored
parallelism for almost all aspects of database processing. The shared-nothing architecture and the
function-shipping paradigm form the basis of our work. We have implemented an elegant parallel
query processing architecture and the corresponding query optimization to generate optimized parallel
query plans. We also explored and implemented parallelism for all
utility functions as well as recovery and transaction processing.
Ref:  DB2 Parallel Edition
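
As a rough illustration of shared-nothing processing with function shipping, the Python sketch below simulates nodes as in-memory lists. Everything in it is hypothetical (hash_partition, local_sum_by_region, run_query, and the row layout), and it does not describe how DB2 Parallel Edition is built. Each node owns one hash partition of the data. The coordinator ships the operation to every node, each node runs it on its own partition, and only the small partial results travel back to be merged.

```python
import zlib
from collections import Counter

def hash_partition(rows, num_nodes, key_of):
    """Assign each row to exactly one node by hashing its partitioning key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        digest = zlib.crc32(repr(key_of(row)).encode())
        nodes[digest % num_nodes].append(row)
    return nodes

def local_sum_by_region(partition):
    """The shipped function: aggregate one node's local rows only."""
    partial = Counter()
    for _order_id, region, amount in partition:
        partial[region] += amount
    return partial

def run_query(nodes, shipped_function):
    """Coordinator: run the function at each node, then merge the partials.
    In a real system the per-node calls would execute in parallel."""
    partials = [shipped_function(partition) for partition in nodes]
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds the per-region sums together
    return dict(total)

# Rows are (order_id, region, amount), partitioned on order_id.
orders = [(1, "east", 10), (2, "west", 5), (3, "east", 7), (4, "north", 1)]
nodes = hash_partition(orders, 3, key_of=lambda row: row[0])
print(run_query(nodes, local_sum_by_region))
```

Partitioning on order_id rather than region means a region's rows can land on several nodes, which is why the merge step at the coordinator is needed.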

We are interested in follow-up parallel processing issues such as parallel interfaces to applications,
skew handling, and global indexing techniques.
 
 
 
