IBM Journal of Research and Development 
Volume 47, Number 4, 2003
Tape storage systems and technology
DOI: 10.1147/rd.474.0445
  

Innovations in tape storage automation at IBM

by D. J. Hellman, R. Yardy, and P. E. Abbott

In the mid-1980s, tape storage appeared to be heading toward the graveyard. Storing information on large reels of tape that had to be manually mounted was expensive, inefficient, and prone to human error. In addition, competition from hard disk drives (HDDs) was growing. Data access on HDDs was faster because data was stored for rapid random access rather than linearly as on tape, and HDD capacity was rapidly growing. Then, in 1984, IBM introduced the half-inch tape cartridge (3480), which was novel in its small form factor. In 1986, an automatic loading mechanism followed. In 1987 and 1988, competitors introduced the first automated tape libraries to use the half-inch tape cartridge. Suddenly, tape storage became considerably easier to use and more efficient. With automation, it also represented the least expensive way to store data. In 1993, IBM introduced the 3495 tape library, and now offers several different automated libraries ranging from the smaller 3575 MP and 3581/3583 models to very large libraries such as the 3494 and 3584 models. This paper examines the history of IBM tape storage and automation products, including the engineering challenges that were met in response to users' pressing requirements; it also examines the future of automated tape storage.

Background

In May 1952, IBM introduced tape as a data storage medium—the first available alternative to bulky, slow punch cards. Four years later, in 1956, IBM introduced its new invention, the hard disk drive (HDD), giving customers new storage options with a potential to compete with tape storage. However, through the early decades of IBM tape history, tape storage remained the most cost-effective and reliable storage option, and offered significantly higher capacity than any competitive technology.

Tape is viewed as a reliable method of storing data. The stability of the media is such that a customer can typically expect to read a tape that was written and locked away more than ten years ago, whereas HDDs must be activated periodically to confirm that the lubricants remain evenly distributed and that the head is still mobile. In addition, the tape reel can be stored in a completely separate, isolated location, providing customers with additional confidence that their data is secure. Off-site storage is a huge benefit in the event of potential disasters such as earthquakes or fire that can destroy valuable data.

Because of the advantages offered by tape, companies installed large banks of tape drives and produced endless rows of tapes in their computer rooms (Figures 1 and 2). Until recently, film and TV directors relied upon the powerful symbol of spinning tape reels to represent the computer age and cutting-edge technology.

[Figure 1]   [Figure 2]

However, in the early 1980s, breakthroughs in HDD technology led to disks with greater capacity and increased reliability. Disk storage increased in popularity because it offered rapid and random access to data, features that tape could not hope to match. Then, in the late 1980s, optical disk storage was introduced as a competitor to HDD. Optical storage offered higher capacities and potentially lower costs than HDD. In addition, optical storage offered removability, thus providing for off-site storage, the same major advantage tape has over HDD.

Soon, HDD prices began to drop rapidly, and developers began to integrate HDDs into arrays, providing even greater capacity and eliminating the need for workers to manually load and unload storage devices. With tape, accessing stored files or saving data required a worker to find the appropriate reel and manually mount it. Because of the size of many tape storage rooms and the weight of tape reels, this was not always a simple task. Companies needed full-time tape-storage operators for all shifts. With HDDs, also called direct-access storage devices (DASD), files are stored directly on disk and require no human intervention.

At that time, tape seemed destined to suffer the same fate as the punch card. IBM saw the problems with the current methods of storing data on tape and developed the first commercially available automated tape library, the Mass Storage System (MSS), in 1974. In the MSS, the form factor was altered from reels to 1.86-inch-diameter, 3.49-inch-long cylinders, as shown in Figure 3 [1, 2]. The library could hold up to 7000 of these cylinders, each containing 100 MB of uncompressed data. They were stored in a honeycomb-like structure (also shown in Figure 3), and a robot would move the cartridges to and from the storage cells and drives.

[Figure 3]

While the automation process was both novel and elegant, the file architecture was complicated. Ironically, HDD was used to keep track of files and to store data in quickly accessible buffers, referred to as cache. This marriage was intended to provide the functionality of DASD with the expandability and cost advantages of tape. However, if the disks malfunctioned or there were difficulties reading or writing data, the system had to reread all of the tapes in order to rebuild the file index. In addition, the system's heavy reliance on disk drove the system cost higher than most customers were willing to pay; the automation concept was ahead of its time.

In 1984, IBM again attempted to fix the problem of the unwieldy tape reels, with considerably more success. This time IBM introduced the 3480 tape cartridge, with a smaller form factor. It was the first of the rectangular half-inch tape cartridges that are still common today [3]. This cartridge, shown in Figure 4, held 200 MB of uncompressed data, twice the capacity of the MSS cartridges and more than twice the capacity of the reels.

[Figure 4]

The next advance occurred in 1986, when IBM introduced an autoloader to help reduce the need for human intervention in tape storage. This autoloader attached to the front of the drive and could hold up to six tapes. It automatically exchanged the current tape cartridge with a new, queued cartridge from the bottom of the six-cartridge stack loader. The host computer dictated when a tape should be removed or inserted and determined when there was enough data on the tape. Although this autoloader was not a completely automated library, it reduced the time needed to attend each tape drive. However, because an operator still needed to place a new batch of tapes in the autoloader and find a particular tape when a certain file was necessary, the speed and reliability of the system were not optimal.

The rectangular cartridges made a significant impact, leading to the first commercially available fully automated tape libraries, produced by competitors in 1987 and 1988. These fully automated tape libraries made tape storage considerably less expensive than any other form of storage, including HDD. Even though optical disks could be stored in a similar type of library, tape was still less expensive per megabyte of data storage. Just as it appeared that tape storage was going to fade away, interest was reignited, and demand began growing rapidly once again. The automation revolution had begun. Not only was tape being used to store programs and data formerly on punch cards, but it also became an inexpensive and reliable backup tool for both long- and short-term storage. Over time, cartridge capacity grew from 200 MB on the 3480 cartridge to 40 GB uncompressed on the most recent 3590 cartridge and to 200 GB uncompressed on the second-generation 3580 cartridge. With automation, tape became an inexpensive, efficient, and reliable way to back up data stored on hard disk.

IBM contributions

Observing how automation was changing the face of tape storage, IBM introduced two robotic libraries in 1993. The first was the enterprise-class 3495 library, and the second was the mid-range 3494 library.

The 3495 library

The 3495 library design used the enterprise-class 3490 drive in a model with four drives per unit and autoloaders on each drive. The library was a rectangular machine with a track down the middle of the frames (Figure 5). The frames held the tapes, and more frames could be added to the library to expand the customer's storage capacity. Expanding capacity by simply adding frames is considerably less expensive and requires less floor space than adding new libraries. Also, new frames extend the library linearly, so the same robot can simply travel a longer path. This expansion method does not require more robots or pass-through options. The concept of expansion frames is a simple and elegant way to add more space—from 23 frames up to 46 frames for the 3495 library. The library could hold up to 18,900 cartridges and 64 drives.

[Figure 5]

The 900-pound robot in the 3495 library was supplied by another manufacturer to aid IBM in its rapid development of the system. Because the robot was initially intended to serve on an automotive assembly line, it was a stationary robot. IBM designed a custom vehicle to transport the robot through the library on a track. The robot had six degrees of freedom, necessary because tapes were stored vertically but inserted horizontally into drives¹ [4]. The robot moved on its track at 2.5 meters per second (m/s) and accelerated at 2.5 m/s². Although it could mount 120 tapes per hour, it was not fast enough to keep the tape drives working 100% of the time. For the library to work more efficiently, the tape drives required autoloaders to buffer the time needed for the robot to retrieve and load tapes and then return tapes from the autoloader to the storage cells. The robot/autoloader combination provided the most efficient use of drives during high-demand periods and allowed the library to achieve burst-mount rates in excess of 500 mounts per hour.
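As a rough illustration of how the stated speed and acceleration translate into travel time along the track, the following sketch assumes a symmetric trapezoidal velocity profile (deceleration equal to acceleration). The profile shape, the function name, and the example distances are assumptions made for illustration; only the 2.5 m/s, 2.5 m/s², and 120 mounts per hour figures come from the text.

```python
import math

V_MAX = 2.5   # m/s, top track speed stated in the text
ACCEL = 2.5   # m/s^2, acceleration stated in the text


def travel_time(distance_m: float, v_max: float = V_MAX, accel: float = ACCEL) -> float:
    """Seconds needed to move distance_m along the track, rest to rest.

    Assumes a symmetric trapezoidal velocity profile; the braking rate is
    taken equal to the acceleration, which the text does not specify.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    # Distance used up by ramping to v_max and braking back to rest.
    ramp_distance = v_max ** 2 / accel
    if distance_m <= ramp_distance:
        # Short move: triangular profile, top speed is never reached.
        return 2.0 * math.sqrt(distance_m / accel)
    cruise_time = (distance_m - ramp_distance) / v_max
    return 2.0 * v_max / accel + cruise_time


# Average time available per mount at the stated 120 mounts per hour.
SECONDS_PER_MOUNT = 3600 / 120
```

Under these assumptions a 10 m move takes about 5 s, while 120 mounts per hour corresponds to 30 s per mount. Track travel alone therefore does not account for the whole cycle; gripping, reorienting, and inserting cartridges presumably take up much of the remainder.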

The 3494 library

During the development of the 3495 library, IBM also produced the 3494—a mid-range/open-systems library. It also used the 3490 drive, but had only two tape drives and no autoloaders. The 3494 library used large portions of the same microcode as developed for the 3495 library. Consequently, because most of the microcode development was already completed, more resources could be devoted to the robot design. Because developers had determined that tapes could be stored horizontally, the required degrees of freedom were reduced to only four. This allowed IBM to use a simpler, smaller robot modified to improve the gripper performance, cartridge handling, and overall speed (Figure 6). This robot could perform 250 mounts per hour; thus, drives could be kept busy without having to attach autoloaders.

[Figure 6]

The 3494 saw great success, and over the last eight years, several additions to the library have increased its capabilities. It debuted in 1993 with a maximum machine length of two frames. In 1994, IBM increased the maximum machine length to eight frames. The following year, the capability to use the new direct-attach open-system 3590 tape drive was added. Then, in 1996, the library was elevated to the enterprise level with the ability to attach the 3590 tape drive to enterprise systems by including a control unit to communicate with the mainframe host. Also in 1996, the maximum machine length was extended to 16 frames. A year later, IBM introduced dual accessors, increasing the throughput and availability of the machine. Today, the 3494 can hold 160 to 6240 tape cartridges, up to 32 SCSI or Fibre Channel drives, or up to 76 control-unit-attached drives, and it can perform 610 mounts per hour with the dual-accessor option.

In addition, in 1997 a design enhancement based on a novel concept was added to the 3494 tape library. Prior to 1997, enterprise hosts (multiple virtual storage) stored one job per tape, with no minimum quantity of data. One study showed that customers actually used, on average, only 10% of the possible storage capacity per cartridge. Because the host controlled how the data was written, upgrading to higher-capacity cartridges made no sense when customers were unable to fully use their current cartridge capacity. The solution to this problem was the IBM Virtual Tape Server (VTS), introduced in 1997 [5, 6].

The VTS places a large disk cache in front of the 3590 media. The server associated with this cache is then responsible for managing the data and tape media. Virtual tape volumes and drives are created and used like real tapes and drives. This technique keeps the most recent data available for faster data access while making full use of high-capacity tapes, reducing the number of cartridges needed.

Another feature of the VTS is that it simplifies migration from one storage medium to another because its interface with the host can remain the same while actual tape technology evolves. This means that a user can upgrade tape drives (for example, from a 3480 to a 3590), while the VTS interface continues to emulate the 3480 drives. Thus, the user and host software can continue using the same data structure. The host simply sees a larger number of virtual cartridges and drives. The VTS eliminated the rows of 10%-filled tapes and provided customers with the means to easily upgrade to newer, higher-capacity tapes and drives, thus growing their storage capacity without adding more libraries. More details and information about the VTS can be found in [6].
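The idea of a disk cache in front of stacked physical cartridges can be sketched with a toy model. This is not the VTS design: the class names, the first-fit stacking rule, and the least-recently-used cache eviction are all assumptions chosen to keep the example small.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalCartridge:
    capacity_mb: int
    used_mb: int = 0
    volumes: List[str] = field(default_factory=list)

    def free_mb(self) -> int:
        return self.capacity_mb - self.used_mb


class ToyVirtualTapeServer:
    """Hypothetical model: a disk cache in front of stacked cartridges."""

    def __init__(self, cache_mb: int, cartridge_mb: int):
        self.cache_mb = cache_mb
        self.cartridge_mb = cartridge_mb
        self.cache = OrderedDict()   # volume name -> size, oldest first
        self.cartridges: List[PhysicalCartridge] = []

    def write_volume(self, name: str, size_mb: int) -> None:
        """The host writes one virtual volume, as if to a real tape."""
        if size_mb > self.cartridge_mb:
            raise ValueError("volume larger than one physical cartridge")
        self._stack_on_tape(name, size_mb)
        self.cache[name] = size_mb
        self.cache.move_to_end(name)
        # Evict the least recently used volumes until the cache fits.
        while sum(self.cache.values()) > self.cache_mb:
            self.cache.popitem(last=False)

    def _stack_on_tape(self, name: str, size_mb: int) -> None:
        # First-fit (an assumption): use the first cartridge with room.
        for cart in self.cartridges:
            if cart.free_mb() >= size_mb:
                break
        else:
            cart = PhysicalCartridge(self.cartridge_mb)
            self.cartridges.append(cart)
        cart.used_mb += size_mb
        cart.volumes.append(name)

    def read_volume(self, name: str) -> str:
        """Report whether a read is served from disk cache or from tape."""
        if name in self.cache:
            self.cache.move_to_end(name)
            return "cache"
        return "tape"
```

With 40,000 MB cartridges and twenty 4,000 MB jobs (10% of a cartridge each), one job per tape would occupy twenty cartridges, whereas this model stacks them onto two. Recently written volumes are read from the cache; older ones fall back to tape.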

LTO and the 3584 library

In the mid-1990s, IBM formed a consortium with tape-drive manufacturers Hewlett-Packard and Seagate and developed the Linear Tape-Open (LTO) format [7] and hardware based on the new specification. The new linear-format tape was an alternative to, and provided functionality beyond that offered by, existing tape formats, including Digital Linear Tape (DLT). With the development of the LTO format and drive, IBM also developed a new large tape library, the 3584, as well as a smaller library and an autoloader.

The 3584 library, shown in Figure 7, offers customers more options in automated large-capacity storage. The library can handle LTO as well as DLT drives, and many more drives per frame are possible. For example, a 3494 L-frame (the first frame for the library, as opposed to a D-frame, which is an expansion frame) could have two drives, compared with a maximum of 12 drives in a 3584 L-frame. At present, the library can be from one to 16 frames long, hold a maximum of 192 drives, and contain a maximum of 6881 200-GB (uncompressed) second-generation LTO tape cartridges. It is capable of 550 mounts per hour with only one accessor. The 3584 library offers both Fibre Channel and SCSI open-systems connectivity and can be configured with multiple logical libraries so that it can communicate with multiple hosts. In addition to these features, the 3584 is designed to reduce the manufacturing cost and improve the reliability of the machine. In fact, the library is designed to exceed one million mean swaps between failures (MSBF).

[Figure 7]

The next step

Demands in storage are increasing daily, not only for capacity, but for higher speed, lower cost per square foot, and higher continuous reliability and availability. Large quantities of data are being generated at accelerating rates. While large volumes of data have always been generated by credit card companies, banks, brokerage firms, engineering firms, geological surveys, and the like, the rate at which data is generated is increasing daily as more and more individuals and small businesses go online. To continue meeting storage demands, tape automation will have to address key issues such as reliability, cost, performance, partitioning, virtualization, transparent migration to new formats and technologies, and increasingly autonomic functions.

Reliability is an increasingly important issue. With customers depending heavily on their data, downtime—which can result from natural disasters, power outages, computer crashes, a library failure, or even normal maintenance such as loading new microcode onto the library—can have a significant impact on a customer's operations. Although customers are starting to install duplicate or mirror-image systems, any nonworking machine, even for backup data centers, represents a potential loss of system performance or access to data. One way to mitigate this exposure is to design libraries that can stay online when code is updated and when maintenance, upgrades, or migrations occur. Additions and removals of major components such as drives, power supplies, or even expansion frames while the library is still connected and running, known as hot swaps, are now required to reduce or completely eliminate downtime. Firmware and software must also be upgraded much more rapidly and preferably without taking the library offline. Redundant components, such as dual grippers or dual data paths, increase availability. However, it is important to realize that availability also means that all of the data must be reachable all of the time; in dual accessors, if one of the accessors is down, the other accessor must still be able to reach all of the data. Thus, if one accessor is not functioning, the other one must be able to move the defective one out of the way, and this feature is now offered as an option on the 3494.

Storage cost is always a vital customer concern. Cost, often measured in dollars per megabyte or gigabyte, is the main reason that tape is still so popular, particularly now due to tape automation. Before automation, drives required constant manual tending, with its associated employee expense. Robots can be faster than people and considerably less expensive than 24-hour human coverage. The cost of automated tape storage has decreased dramatically since 1993, when the 3494 first became available. A customer would initially pay $71.93 per gigabyte for a two-frame 3494 machine with four drives. In 2002, a 3584 library with a similar configuration averaged out to about 52 cents per gigabyte.²
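The size of this drop can be made concrete with a short calculation. The two prices and years are taken from the text; treating the change as a constant compound annual rate is an assumption made for illustration, since the two figures describe different machines.

```python
COST_1993 = 71.93   # dollars per gigabyte, two-frame 3494 (from the text)
COST_2002 = 0.52    # dollars per gigabyte, comparable 3584 (from the text)


def improvement_factor(old_cost: float, new_cost: float) -> float:
    """How many times cheaper the new cost is than the old cost."""
    return old_cost / new_cost


def compound_annual_change(old_cost: float, new_cost: float, years: int) -> float:
    """Constant yearly fractional change that would link the two costs.

    A smooth compound rate is an illustrative assumption only.
    """
    return (new_cost / old_cost) ** (1.0 / years) - 1.0


factor = improvement_factor(COST_1993, COST_2002)                    # roughly 138
yearly = compound_annual_change(COST_1993, COST_2002, 2002 - 1993)   # roughly -0.42
```

On these numbers the cost per gigabyte fell by a factor of roughly 138 over nine years, equivalent to a decline of about 42% per year if the change had been uniform.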

Because tape is relatively inexpensive, it is very affordable to keep multiple copies in separate locations, thus providing more protection against data loss. Users are more aware and concerned with off-site storage than ever before, particularly given recent world events. Consequently, in the future, more backup systems will have their own backup systems. Also, as the storage capacity of each cartridge increases over time, the required library size will shrink. Companies will invest in keeping multiple smaller libraries in different locations.

Another technique that can affect both storage costs and reliability is remote management of the library through a World Wide Web interface. This ability—available today on the 3584—allows the library to notify customer engineers and any other appropriate person when the library is having problems. In the future, it may even be possible to perform certain repairs without an actual site visit.

Library system performance, which includes data rate and mounts per hour, is also important to customers. Customers backing up data on multiple systems want identical copies of data with, ideally, no time lags. With improved connectivity, the time lag has decreased, and this will continue to advance with technology. One major step toward improving performance was to offer native Fibre Channel connectivity as an alternative to previously popular SCSI for interfacing with the host. The current native Fibre Channel rate is 1 Gb/s, and it will soon be doubled. In addition, the 3584 establishes connectivity through the drives rather than through the library, which makes it considerably easier to upgrade the connectivity. As drive connectivity evolves, the library evolves with it, without anything else having to change (in contrast, when connectivity is through the library, the entire library connectivity must be upgraded each time technology changes).

Performance can also be improved by increasing both the rate at which tapes are moved to and from the drives and the read/write speed of the drives themselves. The IBM 3584 library is currently capable of performing up to 550 mounts per hour. In the future, libraries will be able to increase their own efficiency by looking ahead at the requested moves and then optimizing motions.
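One simple way to picture look-ahead optimization is to reorder a queue of pending cartridge moves so that the accessor travels less. The sketch below uses a greedy nearest-pick rule on a one-dimensional track; the rule, the function names, and the positions are hypothetical and do not describe how any IBM library schedules moves. A real scheduler would also have to respect ordering constraints among requests.

```python
from typing import List, Tuple

# A move is (pick position, place position) along the track, in metres.
Move = Tuple[float, float]


def total_travel(start: float, moves: List[Move]) -> float:
    """Distance the accessor covers when serving moves in the given order."""
    pos = start
    dist = 0.0
    for pick, place in moves:
        dist += abs(pick - pos) + abs(place - pick)
        pos = place
    return dist


def look_ahead_order(start: float, moves: List[Move]) -> List[Move]:
    """Greedy reordering: always serve the move with the nearest pick point.

    This heuristic is not guaranteed to find the shortest overall route.
    """
    pending = list(moves)
    ordered: List[Move] = []
    pos = start
    while pending:
        nxt = min(pending, key=lambda m: abs(m[0] - pos))
        pending.remove(nxt)
        ordered.append(nxt)
        pos = nxt[1]
    return ordered


queue = [(10.0, 0.0), (0.0, 10.0), (10.0, 0.0), (0.0, 10.0)]
fifo_distance = total_travel(0.0, queue)
optimized_distance = total_travel(0.0, look_ahead_order(0.0, queue))
```

For this queue, first-in-first-out service covers 50 m, while the reordered queue covers 40 m, because the accessor never travels empty between moves.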

IBM has recognized these needs and has made numerous improvements to its libraries to meet them. This is most evident in the 3584 library, which was designed to lower manufacturing cost while improving reliability. For example, the gripper, used every time a cartridge is moved, is now fabricated from injection-molded plastic parts, improving its speed and reliability. The basic design has minimized the number of moving parts to decrease the risk of wear and fatigue failures, particularly by eliminating flexing cables. Also, the digital servo loop provides better control over all motions. With this technology, the library knows the exact positions of the accessor and grippers at all times, and the firmware can control the maximum speeds and accelerations. The improved calibration technique on the 3584 also increases the accuracy of the accessor. The new calibration technique examines the top and bottom of every column of cells as well as the location of every drive. Because this calibration gives greater accuracy, the machine can perform faster.

Libraries will also need to easily and constantly manage multiple types of media. This is often accomplished by having different grippers for different media types. Universality will be important in the future. A single, robust, relatively inexpensive gripper that can handle all media types with no loss of reliability will be the next step in gripper technology.

Another important aspect for libraries in the future will be to make them autonomic. One way to accomplish this is to provide the ability for a library to recognize that a certain area is not accessible, and have the accessor completely avoid it—a no-fly zone. Other autonomic attributes could include automatically calling home, retrieving the latest microcode for drives and libraries, and integrating it seamlessly into the hardware; constantly monitoring velocity, power consumption, and acceleration, predicting early failure, and notifying the customer engineer and customer that it is time to change a component [8]; monitoring and regulating the temperature and humidity within the library; and operating in lower-power modes to reduce energy consumption, for example, by powering down drives when they are not in use.
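The monitoring idea can be illustrated with a very small drift check: compare recent readings of some quantity (for example, motor power during a move) with a baseline recorded when the component was healthy. The function name, the three-sigma threshold, and the sample values are assumptions for illustration; the text does not say how such prediction would be implemented.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_alert(baseline: Sequence[float],
                recent: Sequence[float],
                n_sigma: float = 3.0) -> bool:
    """Return True when the recent average departs from the baseline.

    The baseline is assumed to come from a healthy component; the
    n_sigma threshold is an arbitrary illustrative choice.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    if not recent:
        raise ValueError("need at least one recent sample")
    mu = mean(baseline)
    sigma = stdev(baseline)
    deviation = abs(mean(recent) - mu)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as drift.
        return deviation > 0
    return deviation > n_sigma * sigma


healthy = [10.0, 10.2, 9.8, 10.1, 9.9]   # hypothetical power readings
```

With this baseline, readings such as 10.1, 9.9, 10.0 raise no alert, whereas a sustained shift to around 11 does; a library could use a signal of this kind to notify the customer engineer before the component fails.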

Summary

Tape storage has improved dramatically over the last 50 years, but the most significant event in the revitalization of tape was the advent of automation. When IBM introduced the 3480 rectangular half-inch tape cartridge, it enabled automation, thereby launching the revolution. Libraries became visual centerpieces of data centers, with IBM designing windows into its library so that customers could easily see the robots moving. Now, as data is being produced at an incredible rate, inexpensive, dependable storage is more important than ever. Users are demanding more reliable, faster, larger, less expensive storage, and will always consider competitive formats such as HDD. However, cost considerations still drive many users to choose tape over HDD or optical disks. As tape technology improves, virtual tape systems make it possible for data to be migrated easily and invisibly to newer, higher-density media, while upgrade costs and difficulties are reduced.

In addition, tape has two major advantages over HDD. The first is the ability to remove a cartridge to a completely separate location for safekeeping, and the second is the stability of the storage media over time. Tape is considerably more stable over long periods (years) of inactive time, whereas HDD mechanisms must be activated periodically (every few months) to maintain viability. Tape libraries can help keep data secure and readable for years to come. It is anticipated that tape automation will continue to advance and make storage even more inexpensive, more dependable, and simpler for customers.

Acknowledgments

The authors thank R. Bradshaw and B. Slawson for their help in the preparation of this manuscript.

References

Footnotes

¹ At the time, it was believed that the tapes had to be stored this way, but later research determined that horizontal storage did not damage the tape.
² R. Spackman, personal communication, April 2002.
Linear Tape-Open, LTO, and Ultrium are registered trademarks of International Business Machines Corporation, Hewlett-Packard Corporation, and Seagate Corporation in the United States, other countries, or both.

Received May 17, 2002; accepted for publication January 27, 2003; Internet publication June 25, 2003