Fossilization:
Compliant reference storage solutions
Fossilization enables organizations to maximize the
value of their electronic records and minimize liability by providing
practical solutions for storing and managing large volumes of
electronic records over extended periods of time.
The fundamental purpose of record-keeping is
to establish irrefutable proof and accurate details of events
that have occurred. However, critical records, such as business
communications, financial statements and medical images, are increasingly
stored in electronic form, which makes them relatively easy to
destroy or modify clandestinely. The threat of intentional attacks,
including attacks by insiders, is very real, given the extremely high
stakes that could be involved in tampering with the records. With recent corporate
misconduct and the ensuing attempts to change history, a growing
fraction of records is now subject to regulations (e.g. Sarbanes-Oxley
Act, SEC Rule 17a-3/4, HIPAA, DoD 5015.2) on how they should be
maintained. A 2003 study by the Enterprise Storage Group predicted
that the worldwide volume of compliant records would increase by
64 percent per year to almost 2 petabytes in 2006.

Architecture Overview
The current industry practice and regulatory requirements (e.g.
SEC Rule 17a-4) rely on storing records in Write Once Read Many
(WORM) storage to preserve them. But this is increasingly inadequate
to ensure that the records are trustworthy, i.e. able to provide
solid proof and accurate details of past events. For example,
given the large volumes of records and the short query response times
expected today, the records have to be indexed, but traditional
indexing methods allow records, even those stored in WORM storage,
to be effectively (logically) altered and deleted. Moreover, many
records have long retention periods, requiring them to be periodically
migrated to new storage systems, which makes them vulnerable.
The records may even be susceptible to alteration during transit
to the agent conducting an inquiry.
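The indexing weakness described above can be illustrated with a small sketch. Everything in it is hypothetical: WormStore, mutable_index and the record contents are invented for illustration and do not describe any particular product. It only shows that a record can be logically replaced, without ever being overwritten, when the index that points to it is rewritable.

```python
class WormStore:
    """Toy write-once store: records can be appended but never changed."""

    def __init__(self):
        self._records = []

    def append(self, data: bytes) -> int:
        self._records.append(bytes(data))
        return len(self._records) - 1  # address of the new record

    def read(self, address: int) -> bytes:
        return self._records[address]


store = WormStore()
mutable_index = {}  # ordinary rewritable index: key -> record address

original = store.append(b"Q3 revenue: 10M")
mutable_index["q3-report"] = original

# An insider cannot overwrite the stored record, but can append a
# forgery and repoint the index entry at it.
forged = store.append(b"Q3 revenue: 25M")
mutable_index["q3-report"] = forged

looked_up = store.read(mutable_index["q3-report"])  # what a query now returns
still_on_media = store.read(original)               # the untouched original
```

The original bytes are still on the media, yet anyone who searches through the index sees only the forgery.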
In this project, IBM Research takes a fresh, holistic
approach to electronic record-keeping. We have developed a process
called fossilization to ensure that records are trustworthy from
end to end: from where records are kept to where they
are received (such as by an agent performing an audit, a legal
or regulatory discovery or an internal investigation). Fossilization
is composed of three parts:
• Fossilization of storage guarantees that all of the records and their associated metadata are not only reliably stored for an extended period of time, but are also securely protected from any modification.
• Fossilization of discovery ensures that every preserved record that is relevant to an inquiry can be readily located and retrieved in a timely fashion.
• Fossilization of delivery warrants that exactly the retrieved records are delivered to the agent and that they are delivered verbatim.
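The page does not say how delivery is checked, so the following is only a sketch under assumptions: it supposes that a SHA-256 digest of each retrieved record reaches the agent over a channel the adversary cannot alter, and the function names seal and verify_delivery are invented here. Under those assumptions, "exactly the retrieved records, delivered verbatim" reduces to a count check plus a per-record digest comparison.

```python
import hashlib
import hmac
from typing import Sequence


def seal(record: bytes) -> str:
    """Return a hex SHA-256 digest that fingerprints one record."""
    return hashlib.sha256(record).hexdigest()


def verify_delivery(retrieved_digests: Sequence[str],
                    delivered: Sequence[bytes]) -> bool:
    """Check that the delivered records match the retrieved set.

    Assumption: retrieved_digests arrived over a trusted channel.
    The order of records is assumed to be preserved in transit.
    """
    if len(retrieved_digests) != len(delivered):
        return False  # a record was dropped or an extra one was added
    return all(
        hmac.compare_digest(digest, seal(record))
        for digest, record in zip(retrieved_digests, delivered)
    )
```

For example, sealing two records at retrieval time and calling verify_delivery on the same two records returns True, while changing a single byte or omitting a record returns False. If the digests travel with the records, a plain hash is not enough and they would need to be signed.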
A key challenge in realizing this vision is that
various aspects of today's systems are incompatible with the goal
of preserving trust throughout a record's lifetime. Therefore,
researchers have developed a comprehensive portfolio of trust-centric
technologies cutting across traditional disciplines, such as databases,
operating systems, computer architecture, security and packaging.
These include Content Immutable Storage (CIS) to securely protect
data from being overwritten, fossilized indices to prevent logical
modification of records, trust-preserving migration to guard against
record alteration during migration and unified auditing to detect
potential inconsistencies across the solution stack. Since records
typically contain sensitive information, researchers have also
devised ways to preserve the privacy of the information throughout
the records' lifetime. Moreover, if records are available, they
are subject to discovery, and typically at great expense to the
owning organization. Thus, IBM Research also created techniques
to effectively dispose of records that are no longer useful so
that they cannot be recovered, even with the use of data forensics.
For example, there has been a lot of work on
indexing techniques, including several focused on indexing methods
for WORM storage, but the previous techniques were not designed
for trustworthy record keeping. Specifically, they require dynamic
adjustments to the index structure, which makes them vulnerable
to logical modification of records. In general, any approach that
requires the rebalancing of a tree is thus insecure because it
allows an adversary to create new paths to records. Trees that
grow in a bottom-up fashion are similarly exposed because an adversary
could modify records at will by exploiting the provision for creating
new versions of tree nodes. In addition, any method that permits
index entries to be relocated is inherently not trustworthy because
it opens the door for an adversary to create new versions of any
entry. In this project, researchers have developed fossilized
indices that are scalable and effective without requiring dynamic
adjustments to their structure.
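The page does not describe how the fossilized indices are built. As a sketch of the stated properties only (no rebalancing, no relocation of entries, no new versions of nodes), one possible design, assumed here purely for illustration, fixes each key's search path by hashing and treats every slot as write-once. The class and method names are hypothetical.

```python
import hashlib


class Node:
    def __init__(self, width: int):
        self.slots = [None] * width     # each slot is written at most once
        self.children = [None] * width  # children are created, never moved


class FossilIndexSketch:
    """Toy index whose entries never move once written."""

    def __init__(self, width: int = 4):
        self.width = width
        self.root = Node(width)

    def _slot(self, key: str, level: int) -> int:
        # The slot at each level depends only on the key and the level,
        # so the search path for a key can never be redirected.
        digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def insert(self, key: str, address: int) -> None:
        node, level = self.root, 0
        while True:
            i = self._slot(key, level)
            if node.slots[i] is None:
                node.slots[i] = (key, address)  # first and only write
                return
            if node.slots[i][0] == key:
                raise PermissionError(f"{key!r} is already indexed")
            if node.children[i] is None:
                node.children[i] = Node(self.width)
            node, level = node.children[i], level + 1

    def lookup(self, key: str):
        node, level = self.root, 0
        while node is not None:
            i = self._slot(key, level)
            entry = node.slots[i]
            if entry is None:
                return None  # slots are never emptied, so the key is absent
            if entry[0] == key:
                return entry[1]
            node, level = node.children[i], level + 1
        return None
```

The index grows only by filling empty slots and adding child nodes below occupied ones, so nothing is rebalanced or relocated. In this toy version a key maps to a single record address; a real index would also need to handle many records per key and would rely on the storage layer to enforce the write-once slots.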
A further challenge in this project is to create
a record-keeping solution that delivers end-to-end trust in a
practical manner by using standard interfaces and building upon
existing infrastructure. Researchers have developed a working
prototype to demonstrate the feasibility of such a solution.
Fossilization: A Process for Establishing Truly Trustworthy Records, Windsor Hsu and Shauchi Ong, IBM Research Report RJ 10331, Oct. 2004.
Content Immutable Storage: Truly Trustworthy and Cost-Effective Storage for Electronic Records, Windsor Hsu, Lan Huang and Shauchi Ong, IBM Research Report RJ 10332, Oct. 2004.
Duplicate Management for Reference Data, Timothy Denehy and Windsor Hsu, IBM Research Report RJ 10305, Oct. 2003.
What is the most exciting potential
future use for the work you're doing?
The solutions we are working
on can fundamentally change the way all records are kept.
They provide the necessary safeguards to ensure that records
remain trustworthy while allowing them to be stored, managed,
searched, analyzed and processed conveniently, quickly and
cost effectively. This means that we can both keep an accurate
account of history and actually put the data collected to
good use. Organizations will be able to gain effective control
of their records and use them as a vital asset to derive
new value.
What is the most interesting part
of your research?
Talking to clients, analyzing competitors, understanding
our strengths, and thinking hard to come up with effective
ways to address client needs and leapfrog the competition.
In other words, looking at the real world, figuring out
the worthwhile battles, mapping them into technical challenges,
solving the challenges, and creating the solution. In the
process, we have a lot of fun and at the end of the day,
we all feel good because we know we made a real difference.
In this particular case, we do not just follow the crowd
to provide basic WORM storage. Instead, we leverage our
technical breadth and take a holistic approach to enable
truly trustworthy electronic record keeping.
What inspired you to go into this
field?
This is where the bits are going to be. Computer
storage devices have improved so dramatically over the years
and we have become so information-driven that it makes sense
to keep records of everything electronically. Several studies
show that the volume of fixed-content data is growing very
rapidly and will soon exceed that of the traditional transactional
data.
What is your favorite invention
of all time?
Sticky notes. They achieve their function with such elegant
simplicity. I once saw somebody carrying a fancy PDA that was
papered over with sticky notes.
Research Team
Ying Chen
Wayne Hineman
Xiaonan Ma
Shauchi Ong