IBM Research

Computer Science

Innovation Matters


Storage Systems

Fossilization: Compliant reference storage solutions

Fossilization enables organizations to maximize the value of their electronic records and minimize liability by providing practical solutions for storing and managing large volumes of electronic records over extended periods of time.

The fundamental purpose of record-keeping is to establish irrefutable proof and accurate details of events that have occurred. However, critical records, such as business communications, financial statements and medical images, are increasingly stored in electronic form, which makes them relatively easy to destroy or modify clandestinely. The threat of intentional and insider attacks is very real, given the extremely high stakes that could be involved in tampering with the records. With recent corporate misconduct and the ensuing attempts to change history, a growing fraction of records is now subject to regulations (e.g. Sarbanes-Oxley Act, SEC Rule 17a-3/4, HIPAA, DoD 5015.2) on how they should be maintained. A 2003 study by the Enterprise Storage Group predicted that the worldwide volume of compliant records will increase by 64 percent per year to almost 2 petabytes in 2006.


Architecture Overview


Current industry practice and regulatory requirements (e.g. SEC Rule 17a-4) rely on storing records in Write Once Read Many (WORM) storage to preserve them. But this is increasingly inadequate to ensure that the records are trustworthy, i.e. able to provide solid proof and accurate details of past events. For example, given the large volume of records and the short query response times expected today, the records have to be indexed, but traditional indexing methods allow records, even those stored in WORM storage, to be effectively (logically) altered and deleted. Moreover, many records have long retention periods, requiring them to be periodically migrated to new storage systems, which makes them vulnerable. The records may even be susceptible to alteration during transit to the agent conducting an inquiry.

In this project, IBM Research takes a fresh holistic approach to electronic record-keeping. We have developed a process called fossilization to ensure that records are trustworthy end to end, from where records are kept to where they are received (such as by an agent performing an audit, a legal or regulatory discovery or an internal investigation). Fossilization is composed of three parts:
• Fossilization of storage guarantees that all of the records and their associated metadata are not only reliably stored for an extended period of time, but are also securely protected from any modification.
• Fossilization of discovery ensures that every preserved record that is relevant to an inquiry can be readily located and retrieved in a timely fashion.
• Fossilization of delivery warrants that exactly the retrieved records are delivered to the agent and that they are delivered verbatim.
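One way to picture the delivery guarantee is a digest manifest: the sender records a cryptographic hash of each retrieved record, and the receiving agent recomputes the hashes to confirm that exactly those records arrived, unaltered. The sketch below is only an illustration of that idea, not the project's actual mechanism. It uses SHA-256 from Python's standard library; the names `build_manifest` and `verify_delivery` are hypothetical, and a real system would also have to authenticate the manifest itself (for example with a signature), which is omitted here.

```python
import hashlib

def build_manifest(records):
    """Map each record ID to the SHA-256 digest of the record's bytes."""
    return {rid: hashlib.sha256(data).hexdigest() for rid, data in records.items()}

def verify_delivery(manifest, received):
    """Return a list of problems; an empty list means the delivery matches."""
    problems = []
    # Records that were retrieved but never arrived.
    for rid in sorted(manifest.keys() - received.keys()):
        problems.append(f"missing: {rid}")
    # Records that arrived but were not part of the retrieved set.
    for rid in sorted(received.keys() - manifest.keys()):
        problems.append(f"unexpected: {rid}")
    # Records that arrived with different content.
    for rid in sorted(manifest.keys() & received.keys()):
        if hashlib.sha256(received[rid]).hexdigest() != manifest[rid]:
            problems.append(f"altered: {rid}")
    return problems
```

Checking for missing and unexpected record IDs covers "exactly the retrieved records", while the digest comparison covers "delivered verbatim".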

A key challenge in realizing this vision is that various aspects of today's systems are incompatible with the goal of preserving trust throughout a record's lifetime. Therefore, researchers have developed a comprehensive portfolio of trust-centric technologies cutting across traditional disciplines, such as database, operating system, computer architecture, security and packaging. These include Content Immutable Storage (CIS) to securely protect data from being overwritten, fossilized indices to prevent logical modification of records, trust-preserving migration to guard against record alteration during migration and unified auditing to detect potential inconsistencies across the solution stack. Since records typically contain sensitive information, researchers have also devised ways to preserve the privacy of the information throughout the records' lifetime. Moreover, if records are available, they are subject to discovery, and typically at great expense to the owning organization. Thus, IBM Research also created techniques to effectively dispose of records that are no longer useful so that they cannot be recovered, even with the use of data forensics.
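The overwrite protection mentioned above can be illustrated with a toy write-once store. This is a minimal in-memory sketch of write-once semantics only; the class name `WriteOnceStore` and its methods are hypothetical, and the text does not describe how Content Immutable Storage enforces immutability inside the storage system.

```python
class WriteOnceStore:
    """Toy in-memory store in which each address can be written exactly once."""

    def __init__(self):
        self._blocks = {}

    def write(self, address, data):
        # Refuse any attempt to overwrite an address that already holds data.
        if address in self._blocks:
            raise PermissionError(f"address {address!r} is already written")
        self._blocks[address] = bytes(data)  # keep an immutable copy

    def read(self, address):
        return self._blocks[address]
```

A second `write` to the same address raises `PermissionError` instead of replacing the stored bytes.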

For example, there has been a lot of work on indexing techniques, including several methods designed for WORM storage, but these earlier techniques were not designed for trustworthy record keeping. Specifically, they require dynamic adjustments to the index structure, which makes them vulnerable to logical modification of records. In general, any approach that requires the rebalancing of a tree is insecure because it allows an adversary to create new paths to records. Trees that grow in a bottom-up fashion are similarly exposed because an adversary could modify records at will by exploiting the provision for creating new versions of tree nodes. In addition, any method that permits index entries to be relocated is inherently not trustworthy because it opens the door for an adversary to create new versions of any entry. In this project, researchers have developed fossilized indices that are scalable and effective without requiring dynamic adjustments to their structure.
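To make these constraints concrete, here is a minimal sketch of an index that never rebalances, grows only downward, and never relocates an entry. It is an illustration under stated assumptions, not the fossilized index developed in the project: the class `AppendOnlyHashTree`, its `fanout` parameter, and the choice of SHA-256 to pick slots are all hypothetical. The slot a key occupies at each level depends only on the key and the level, so existing entries never need to move when new ones arrive.

```python
import hashlib

class AppendOnlyHashTree:
    """Index whose entries are written once and never moved."""

    def __init__(self, fanout=4):
        self.fanout = fanout
        self.root = self._new_node()

    def _new_node(self):
        # Each slot holds a (key, value) entry or None; children are added lazily.
        return {"slots": [None] * self.fanout, "children": [None] * self.fanout}

    def _slot(self, key, level):
        # The slot depends only on the key and the level, never on other entries.
        digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.fanout

    def insert(self, key, value):
        node, level = self.root, 0
        while True:
            i = self._slot(key, level)
            entry = node["slots"][i]
            if entry is None:
                node["slots"][i] = (key, value)  # fill an empty slot exactly once
                return
            if entry[0] == key:
                raise KeyError(f"{key!r} is already indexed")
            if node["children"][i] is None:
                node["children"][i] = self._new_node()  # grow downward only
            node, level = node["children"][i], level + 1

    def lookup(self, key):
        node, level = self.root, 0
        while node is not None:
            i = self._slot(key, level)
            entry = node["slots"][i]
            if entry is None:
                return None  # the key would have been placed here if present
            if entry[0] == key:
                return entry[1]
            node, level = node["children"][i], level + 1
        return None
```

Because a filled slot is never rewritten and the path to a key is fixed by its hash, every write is an append to previously empty space, which is the property WORM storage can enforce. This sketch rejects duplicate keys; a real index would need a policy for multiple records per key.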

A further challenge in this project is to create a record-keeping solution that delivers end-to-end trust in a practical manner by using standard interfaces and building upon existing infrastructure. Researchers have developed a working prototype to demonstrate the feasibility of such a solution.

Selected Publications

Fossilization: A Process for Establishing Truly Trustworthy Records, Windsor Hsu and Shauchi Ong, IBM Research Report RJ 10331, Oct. 2004.

Content Immutable Storage: Truly Trustworthy and Cost-Effective Storage for Electronic Records, Windsor Hsu, Lan Huang and Shauchi Ong, IBM Research Report RJ 10332, Oct. 2004.

Duplicate Management for Reference Data, Timothy Denehy and Windsor Hsu, IBM Research Report RJ 10305, Oct. 2003.

Innovators Corner
Windsor Hsu
Researcher

What is the most exciting potential future use for the work you're doing?
The solutions we are working on can fundamentally change the way all records are kept. They provide the necessary safeguards to ensure that records remain trustworthy while allowing them to be stored, managed, searched, analyzed and processed conveniently, quickly and cost effectively. This means that we can both keep an accurate account of history and actually put the data collected to good use. Organizations will be able to gain effective control of their records and use them as a vital asset to derive new value.

What is the most interesting part of your research?
Talking to clients, analyzing competitors, understanding our strengths, and thinking hard to come up with effective ways to address client needs and leapfrog the competition. In other words, looking at the real world, figuring out the worthwhile battles, mapping them into technical challenges, solving the challenges, and creating the solution. In the process, we have a lot of fun and at the end of the day, we all feel good because we know we made a real difference. In this particular case, we do not just follow the crowd to provide basic WORM storage. Instead, we leverage our technical breadth and take a holistic approach to enable truly trustworthy electronic record keeping.

What inspired you to go into this field?
This is where the bits are going to be. Computer storage devices have improved so dramatically over the years and we have become so information-driven that it makes sense to keep records of everything electronically. Several studies show that the volume of fixed-content data is growing very rapidly and will soon exceed that of the traditional transactional data.

What is your favorite invention of all time?
Sticky notes. They achieve their function with such elegant simplicity. I once saw somebody carrying a fancy PDA that was papered over with sticky notes.


Team Members
Research Team
Ying Chen
Wayne Hineman
Windsor Hsu
Lan Huang
Xiaonan Ma
Shauchi Ong

Related Links
Discipline: Computer Science
Research Area: Storage Systems
Research Site: Almaden
 
