Overview
The IBM DataHiding™ technology for digital audio is a digital
watermarking technology based on signal processing,
probability theory, and the psycho-acoustic
model. It allows record companies and studios
to embed inaudible, robust, and secure
signals directly into DVD- and CD-quality
digital audio data streams, such as PCM at 96/48/44.1
kHz, 24/16 bits, in 6-channel, stereo, or mono form.
The embedded watermark is detected in the
uncompressed linear PCM domain. The degree
of manipulation for embedding is held to
be sufficiently small to be inaudible to
the listener, but sufficiently large to yield
a very low false positive error ratio when
the Copy Control Information (CCI) is detected,
and to show high ability to survive commonly
employed or well-known processes of transmission,
compression, filtering, and signal conversion.
This is why the technology is based
on probability theory and the psycho-acoustic
model in addition to signal processing.
This technology was certified as excellent and practical for business
purposes at STEP 2000 and STEP 2001, international evaluation projects for digital watermarking technology
for music held in 2000 and 2001.
The technology can realize any predetermined
low false positive ratio at the time of detection
while maintaining high transparency if an
appropriate threshold and time for the detection
window are set. The false positive error
ratio of the submitted detector in detecting
the CCI data is less than 10 for each 15
seconds of the detection window. Deliberate
unauthorized removal or updating of the embedded
data is extremely difficult without serious
degradation of the audio quality.
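The relationship between the detection threshold, the false positive ratio, and the length of the detection window can be sketched numerically. The example below is not IBM's detector: it assumes, purely for illustration, that the normalized detection statistic of an unmarked window follows a standard normal distribution, and that a watermark shifts the statistic by a fixed amount per frame. The function names, the target of 1e-6, and the per-frame shift of 0.5 are all hypothetical.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumed distribution of the statistic for unmarked audio


def detection_threshold(target_fp: float) -> float:
    """Threshold (in standard deviations) that an unmarked window exceeds
    with probability target_fp, under the Gaussian assumption above."""
    # Use the lower tail and flip the sign to stay accurate for tiny probabilities.
    return -STANDARD_NORMAL.inv_cdf(target_fp)


def false_positive_ratio(threshold: float) -> float:
    """Probability that an unmarked window exceeds the threshold."""
    return STANDARD_NORMAL.cdf(-threshold)


def detection_probability(threshold: float, per_frame_shift: float, n_frames: int) -> float:
    """Probability that a marked window exceeds the threshold.

    Assumption: summing n_frames independent frames and normalizing moves
    the mean of the statistic to per_frame_shift * sqrt(n_frames).
    """
    mean = per_frame_shift * math.sqrt(n_frames)
    return 1.0 - NormalDist(mu=mean, sigma=1.0).cdf(threshold)


if __name__ == "__main__":
    threshold = detection_threshold(1e-6)  # illustrative target only
    print(f"threshold: {threshold:.3f} standard deviations")
    for n_frames in (50, 100, 200, 400):
        p = detection_probability(threshold, per_frame_shift=0.5, n_frames=n_frames)
        print(f"{n_frames:4d} frames -> detection probability {p:.4f}")
```

Under these assumptions, fixing the threshold fixes the false positive ratio, and a longer detection window raises the chance of detecting a watermark of the same embedding strength. That is the trade-off between transparency, threshold, and window length described above.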
Basic characteristics of IBM Audio DataHiding
- Flexibility in optimizing the overall trade-off
The data structure of a watermark allows
flexible optimization of the trade-off among
the competing requirements of transparency,
data capacity (payload), detection reliability,
survivability, and security.
- Multiple layers of watermarking
Up to three layers of watermarks can coexist
without interfering with each other. Unrelated
watermarks can be embedded at different times
and locations.
- Low false positive error
- High transparency
High transparency is achieved on the basis
of a tuned psycho-acoustic model.
- High survivability
- Two successive D/A and A/D conversions
The embedded signal survives the processing
sequence D/A, A/D, MiniDisc recording,
MiniDisc playback, D/A, A/D.
- FM radio broadcasting
The watermark can be retrieved after successive
D/A, frequency-modulated radio broadcasting,
reception by a consumer receiver, and A/D
processes.
- Resampling and down conversion of channels
The watermark can be detected at any sampling
rate down to 16 kHz.
- Time compression/expansion, with and without pitch shift
The watermark survives linear speed changes within 10% and pitch-invariant
time scaling within 4%.
- Data reduction coding
MPEG-1 Layer 1/2/3, MPEG-2 AAC, and ATRAC-1.
- Nonlinear amplitude compression
- Additive or multiplicative noise
For example, the embedded signal survives
the addition of white noise at a constant level
30 dB below the long-term averaged
music power.
- Frequency response distortion such as equalization
- Addition of echo
For example, maximum delay: 100 msec, feedback
coefficient: around 0.5.
- Band-pass filtering; for example, 8 kHz low-pass filtering
- Flutter and wow: 0.5% rms, from DC to 250 Hz
- Overdubbing
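Two of the survivability conditions listed above are specific enough to reproduce as test signals: white noise 30 dB below the long-term averaged signal power, and an echo with a 100 msec delay and a feedback coefficient of 0.5. The sketch below only generates the degraded audio and does not perform watermark detection. It assumes mono audio held as a list of floats and models the echo as a simple recursive (feedback) delay line, which is one plausible reading of "feedback coefficient". All function names are hypothetical.

```python
import math
import random


def add_white_noise(samples, snr_db=30.0, seed=0):
    """Add white Gaussian noise whose power is snr_db below the
    long-term average power of the input samples."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


def add_echo(samples, sample_rate, delay_s=0.1, feedback=0.5):
    """Recursive echo: y[n] = x[n] + feedback * y[n - D], with D = delay in samples."""
    delay = int(round(delay_s * sample_rate))
    out = list(samples)
    for n in range(delay, len(out)):
        out[n] += feedback * out[n - delay]
    return out


def measured_snr_db(clean, processed):
    """Ratio of clean-signal power to the power of the added difference, in dB."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((p - c) ** 2 for c, p in zip(clean, processed)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    rate = 44100
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / rate) for n in range(rate)]
    noisy = add_white_noise(tone, snr_db=30.0)
    print(f"measured SNR: {measured_snr_db(tone, noisy):.2f} dB")
    echoed = add_echo(tone, rate, delay_s=0.1, feedback=0.5)
    print(f"peak after echo: {max(abs(x) for x in echoed):.3f}")
```

A robustness test would run a detector on the output of each function and compare the result with detection on the unprocessed audio.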
Applications
- Internet distribution of music (EMMS is one such application)
- Counting of commercial film broadcasts
The number of times a commercial film has been broadcast can be counted
automatically if its soundtrack contains an Audio DataHiding watermark.
- Playback counting
Artists can find out how many times their
music has been played on a particular TV
or radio channel if they embed identification
information into their music in advance.
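The counting applications above can be sketched as a tally over detector output. One practical detail is that a single broadcast usually produces a detection in several consecutive detection windows, so hits close together in time should count as one play. The example below assumes a hypothetical log of `(time_s, channel, content_id)` tuples and a hypothetical `max_gap_s` parameter. Neither comes from the IBM system.

```python
from collections import defaultdict


def count_plays(detections, max_gap_s=30.0):
    """Count plays per (channel, content_id).

    detections: iterable of (time_s, channel, content_id) tuples, one per
    detection window in which the embedded identifier was found.
    Detections of the same identifier on the same channel that are at most
    max_gap_s apart are treated as part of the same play.
    """
    last_seen = {}
    plays = defaultdict(int)
    for time_s, channel, content_id in sorted(detections):
        key = (channel, content_id)
        previous = last_seen.get(key)
        if previous is None or time_s - previous > max_gap_s:
            plays[key] += 1  # a new play starts after a long enough silence
        last_seen[key] = time_s
    return dict(plays)


if __name__ == "__main__":
    log = [
        (0, "FM1", "song-A"), (15, "FM1", "song-A"), (30, "FM1", "song-A"),
        (600, "FM1", "song-A"), (615, "FM1", "song-A"),
        (20, "TV2", "song-A"),
    ]
    for (channel, content_id), n in sorted(count_plays(log).items()):
        print(f"{channel} {content_id}: {n} play(s)")
```

The gap should be chosen longer than the detection window but shorter than the shortest plausible interval between two separate airings of the same item.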
See also the four application models, which apply to various content
businesses.