Storage Systems
The explosive growth of information and the rising cost of managing data make it increasingly complex to decide where to store data and how to access and protect it. IBM Research is working on new projects that aim to revolutionize data storage and management. For example, work is in progress on a next-generation home portal system that will not only change the way people store and access personal information, but also provide innovative services for audio, video, broadcast TV, networking, and Internet connections and interactions. We are also designing enterprise-level storage systems that leverage new, high-speed network protocols to centralize storage management, dramatically reduce storage management costs, make data sharing easier across heterogeneous platforms, provide incremental scalability that can handle petabytes of data, and ensure high system availability and reliability.

iSCSI

IBM Research pioneered the development of a new storage networking technology called iSCSI. IBM is the first major storage system vendor to bring an iSCSI product to market. iSCSI is now an Internet Engineering Task Force (IETF) standard that enables simple and inexpensive TCP/IP Ethernet networking to be used as a storage interconnect. iSCSI gives small and medium-sized enterprises the opportunity to use storage technology that was, until now, available only to large enterprises.

Network Attached Storage

In the area of Network Attached Storage (NAS), we are creating a production-level system that uses mostly open-source components and integrates IBM's General Parallel File System (GPFS). GPFS allows enterprises to add storage to their NAS systems while maintaining a single-system NAS image for their users. GPFS was developed by IBM Research for clusters and Storage Area Networks (SANs). It is based on the shared disk model and implements full parallelism for both data and metadata.
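
To make the idea of data parallelism over shared disks concrete, the following is a minimal sketch, not GPFS code. It assumes simple round-robin striping of fixed-size blocks across in-memory "disks" and covers file data only, not metadata. The names `stripe`, `read_file`, and `BLOCK_SIZE` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is readable


def stripe(data, num_disks):
    """Place block i on disk i % num_disks (assumed round-robin striping).

    Each "disk" is modeled as a dict mapping block number to block contents.
    """
    disks = [{} for _ in range(num_disks)]
    for offset in range(0, len(data), BLOCK_SIZE):
        block_no = offset // BLOCK_SIZE
        disks[block_no % num_disks][block_no] = data[offset:offset + BLOCK_SIZE]
    return disks


def read_file(disks, num_blocks):
    """Fetch all blocks concurrently, then reassemble them in block order."""
    def fetch(block_no):
        return disks[block_no % len(disks)][block_no]

    # pool.map returns results in input order, so the join rebuilds the file.
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        return b"".join(pool.map(fetch, range(num_blocks)))
```

Because consecutive blocks live on different disks, a reader of one large file can keep every disk busy at once, which is the intuition behind high throughput to an individual file.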
GPFS has demonstrated scaling up to 512 nodes (8192 processors) and to individual file systems of more than 120 terabytes, with throughput of several gigabytes per second to an individual file. In addition, using GPFS, we beat the previous world record for sorting by a factor of almost 3.

Storage Tank

Storage Tank is a distributed file system that exploits SAN technology. It provides concurrent data sharing across multiple heterogeneous platforms, with performance comparable to that of native file systems built on bus-attached, high-performance storage. Storage Tank provides high availability, increased scalability, load balancing, and fail-over processing. In addition, it provides industry-leading, integrated, policy-based storage management functions that reduce storage management costs by simplifying policy specifications and reducing the need for highly trained storage administrators.

UFiler

Our UFiler project provides a virtual disk space on the Internet to which all Internet users can connect. With UFiler, users can access their files from anywhere and share files with other UFiler users. Internet service providers, corporations, universities, and other public organizations manage and back up files stored on UFiler. Individuals can also set up their own UFiler servers. The UFiler name space will be as freely federated as that of the web.

TurtleFarm

Our TurtleFarm project focuses on scalable, fault-tolerant storage systems. Applications for such systems include Internet companies providing services that require large amounts of inexpensive storage capacity with quality-of-service guarantees. TurtleFarm centers on developing a self-managing, modular, multi-petabyte storage system that provides automatic load balancing, scalability, high availability, and data security.

Data Sharing Facility (DSF)

As modern processors and networks become faster, centralized (server-based) storage access becomes a performance and reliability bottleneck.
The DSF design aims to provide scalable, noncentralized ("serverless"), high-performance distributed storage access that meets the needs of a small office as well as those of large SANs. Improved performance and reliability are achieved by separating storage management from file management and by making use of cooperative caching, a dynamic distribution policy for improved load balancing, and a recovery mechanism based on metadata shadowing and on atomic transactions at the storage level.

Please contact Paridhi Verma to obtain copies of the Computer Science Brochure.