Database Research at Watson
|
| Information Integration and Business Intelligence |
| We are exploring mechanisms for combining data from different sources with the goal of providing
a richer set of information for data mining and business intelligence applications. |
 |
Semantic Information Integration
This project focuses on two aspects of information integration: adaptivity and fuzziness. We have developed
a family of algorithms for answering top-k queries over multiple attributes. Unlike previous approaches,
our algorithms adapt to different memory constraints and query costs. They can be implemented on
top of an RDBMS or a native XML store, or they can employ specialized index structures to further shorten
response time.
To capture the semantic uncertainty of data, we are developing efficient methods for query relaxation and knowledge base
access.
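To make the idea of a top-k query over multiple attributes concrete, the following is a minimal sketch of the well-known Threshold Algorithm from the literature. It is not the project's own adaptive family of algorithms, whose details are not given here. The function name `threshold_top_k`, the sample scores, and the choice of a plain sum as the aggregate are all assumptions made for illustration.

```python
import heapq


def threshold_top_k(sorted_lists, k):
    """Return the k objects with the highest total score.

    sorted_lists holds one list per attribute; each list contains
    (object_id, score) pairs sorted by score, highest first.
    Assumption: every object appears in every list and the aggregate
    is a plain sum of the per-attribute scores.
    """
    # Random-access lookup per attribute, used to complete an object's score.
    lookup = [dict(pairs) for pairs in sorted_lists]
    seen = {}  # object_id -> total score
    n = min(len(pairs) for pairs in sorted_lists)

    for depth in range(n):
        # The threshold bounds the total score of any object not yet seen.
        threshold = 0
        for pairs in sorted_lists:
            obj, score = pairs[depth]
            threshold += score
            if obj not in seen:
                seen[obj] = sum(table[obj] for table in lookup)
        best = heapq.nlargest(k, seen.items(), key=lambda item: item[1])
        # Stop early once the k-th best seen object reaches the threshold.
        if len(best) == k and best[-1][1] >= threshold:
            return best
    return heapq.nlargest(k, seen.items(), key=lambda item: item[1])


# Hypothetical per-attribute scores (integers, higher is better).
price_scores = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
rating_scores = [("b", 90), ("c", 70), ("a", 40), ("d", 20)]
top_two = threshold_top_k([price_scores, rating_scores], 2)
```

In this sketch the scan stops as soon as no unseen object can beat the current k-th result, so the sorted lists are usually not read to the end. Adapting to memory limits or access costs, as described above, would need additional logic that is not shown.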
Contributors: Yuan-Chi Chang, Christian Lang, Ioana Stanoi, Ke Yi (Duke University), Kevin Chang (University
of Illinois at Urbana-Champaign)
Publications
|
| |
 |
XML Data Federation
XML Data Mediator (XDM) is a lightweight mediator for bi-directional data conversion between XML and structured data
formats such as relational or LDAP data. XDM externalizes the specification of the mapping between XML and relational
databases. Once the mapping is specified at a schema level, the XDM runtime engine automatically converts data to and
from XML through its Store2XML and XML2Store components. The Store2XML component can collect data from one or more
data stores and assemble it into a coherent XML document conforming to the specified schema. We are currently
investigating adding XML Query capabilities for virtual XML views over a distributed collection of XML and relational
data sources. The XML2Store component extracts specific pieces of an incoming XML document and forwards them to one or
more data stores through data modification commands (insert, update, or delete, as specified by the user). The XML2Store
component allows the user to specify transactions, in order to guarantee the consistency of the database when multiple
update operations are generated. XML Data Mediator is available on
Alphaworks.
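The two conversion directions described above can be illustrated with a small sketch. It does not use XDM or its actual mapping language. The `MAPPING` dictionary, the functions `store_to_xml` and `xml_to_store`, and the `customer` table are hypothetical names, and SQLite with Python's standard library stands in for the data store.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema-level mapping between an XML document and one table.
MAPPING = {
    "root": "customers",         # XML root element
    "row_element": "customer",   # one element per table row
    "table": "customer",         # relational table name
    "columns": {"id": "cust_id", "name": "cust_name"},  # XML child -> column
}


def store_to_xml(conn, mapping):
    """Read rows from the table and assemble them into one XML document."""
    cols = list(mapping["columns"].items())
    sql = "SELECT {} FROM {}".format(
        ", ".join(col for _, col in cols), mapping["table"])
    root = ET.Element(mapping["root"])
    for row in conn.execute(sql):
        elem = ET.SubElement(root, mapping["row_element"])
        for (tag, _), value in zip(cols, row):
            ET.SubElement(elem, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_store(conn, mapping, xml_text):
    """Extract mapped pieces of an XML document and insert them as rows."""
    cols = list(mapping["columns"].items())
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        mapping["table"],
        ", ".join(col for _, col in cols),
        ", ".join("?" for _ in cols))
    root = ET.fromstring(xml_text)
    # The connection context manager commits all inserts together,
    # or rolls them all back if any statement fails.
    with conn:
        for elem in root.iter(mapping["row_element"]):
            conn.execute(sql, [elem.findtext(tag) for tag, _ in cols])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER, cust_name TEXT)")
document = "<customers><customer><id>1</id><name>Ada</name></customer></customers>"
xml_to_store(conn, MAPPING, document)
round_trip = store_to_xml(conn, MAPPING)
```

Keeping the mapping in a data structure separate from the conversion code mirrors the idea of externalizing the mapping specification. This sketch covers only inserts into a single table; updates, deletes, and multiple data stores are left out.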
Contributors: George Mihaila, Joe Zhou (WebAhead), Dikran Meliksetian (WebAhead), Rajesh Bordawekar,
Christian Lang, and Attila Barta (WebAhead)
|
| |
 |
Supporting Efficient Parametric Search of E-Commerce Data
Electronic commerce is emerging as a major application area for database
systems. Many e-commerce sites provide electronic
product catalogs that let users search for products of interest.
Because e-commerce data evolves constantly and is highly sparse,
most commercial e-commerce systems store it in a so-called vertical
schema (one row per product attribute rather than one column per attribute).
However, query processing over a vertical schema
is extremely slow because current RDBMSs, and especially
their cost-based query optimizers, are designed to handle the
traditional horizontal schema efficiently.
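The difference between the two layouts can be seen in a small sketch using SQLite from Python's standard library. The table names `product_h` and `product_v`, the sample products, and the helper `search_vertical` are hypothetical and are not taken from the project. The sketch shows only why a parametric search over a vertical schema needs one self-join per search criterion; it does not measure performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Horizontal schema: one column per attribute, many NULLs for sparse data.
conn.execute("CREATE TABLE product_h "
             "(id INTEGER PRIMARY KEY, color TEXT, size TEXT, voltage TEXT)")
# Vertical schema: one row per (product, attribute, value) triple.
conn.execute("CREATE TABLE product_v (id INTEGER, attr TEXT, val TEXT)")

products = {
    1: {"color": "red", "size": "M"},
    2: {"color": "red", "size": "L", "voltage": "110"},
    3: {"color": "blue", "size": "M"},
}
for pid, attrs in products.items():
    conn.execute("INSERT INTO product_h VALUES (?, ?, ?, ?)",
                 (pid, attrs.get("color"), attrs.get("size"),
                  attrs.get("voltage")))
    conn.executemany("INSERT INTO product_v VALUES (?, ?, ?)",
                     [(pid, attr, val) for attr, val in attrs.items()])

# Horizontal: a parametric search is a single-table selection.
horizontal_hits = [row[0] for row in conn.execute(
    "SELECT id FROM product_h WHERE color = ? AND size = ? ORDER BY id",
    ("red", "M"))]


def search_vertical(conn, criteria):
    """Find product ids matching every attr = value pair in criteria.

    Assumes criteria is non-empty. Each extra criterion adds one
    self-join of product_v on the product id.
    """
    joins, conditions, params = [], [], []
    for i, (attr, val) in enumerate(criteria.items()):
        if i > 0:
            joins.append(f"JOIN product_v v{i} ON v{i}.id = v0.id")
        conditions.append(f"v{i}.attr = ? AND v{i}.val = ?")
        params += [attr, val]
    sql = ("SELECT DISTINCT v0.id FROM product_v v0 " + " ".join(joins)
           + " WHERE " + " AND ".join(conditions) + " ORDER BY v0.id")
    return [row[0] for row in conn.execute(sql, params)]


vertical_hits = search_vertical(conn, {"color": "red", "size": "M"})
```

Both queries return the same products, but the vertical form grows by one join for every additional search parameter, which is the kind of query shape the paragraph above describes as a poor fit for optimizers tuned to horizontal tables.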
Most e-commerce systems would like to offer advanced parametric search
capabilities to their users. Because most searches are expected to run
online, query execution must be very fast.
RDBMSs require new capabilities and enhancements before they
can meet these search performance requirements on a vertical schema.
Such tightly coupled enhancements and additions to a DBMS require a considerable
amount of work and may take a long time to complete.
In this project, we study an alternative approach called SAL, a Search Assistant Layer that can be implemented
outside the database engine to address the urgent need for efficient
parametric search on e-commerce data. Our experimental results show that
SAL dramatically improves the performance of search queries.
Contributors: Min Wang, Yuan-Chi Chang, and Sriram Padmanabhan (IBM Silicon Valley Labs)
Publications
|
|