Cloud Object Store for Fixed Content
Here are the basic characteristics of our cloud object store.
Data model. Cloud storage services are moving away from traditional file systems. The new cloud data model manipulates immutable objects that are written in their entirety and never modified. Objects are organized within containers and are accessed over the web through a simple RESTful HTTP protocol. This protocol is naturally suited to cloud use cases and is easy to extend with new features such as deduplication and end-to-end integrity.
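The data model above can be illustrated with a minimal in-memory sketch. It is not the actual implementation: the class name ImmutableObjectStore and its methods are hypothetical, the HTTP layer is left out, and the sketch assumes that a second write to an existing object name is rejected (a real store might instead replace the whole object).

```python
class ImmutableObjectStore:
    """Toy model: containers hold whole objects that are never modified in place."""

    def __init__(self):
        self._containers = {}  # container name -> {object name -> bytes}

    def create_container(self, container):
        # Creating a container that already exists is a no-op.
        self._containers.setdefault(container, {})

    def put(self, container, name, data):
        objects = self._containers[container]
        if name in objects:
            # Assumption: objects are write-once, so overwrites are refused.
            raise ValueError(f"object {name!r} already exists and is immutable")
        objects[name] = bytes(data)  # store the object in its entirety

    def get(self, container, name):
        return self._containers[container][name]
```

Because there is no partial-update operation, every reader sees either the complete object or no object at all.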
Standards-based. Our object store implements the Cloud Data Management Interface (CDMI), an emerging standard interface for cloud storage defined by the Storage Networking Industry Association (SNIA).
Rich metadata support. Our objects are smart data objects with rich, extensible metadata, both system-defined and user-defined. We support not only storing the metadata but also querying over it. The metadata make it possible to query, group, and describe objects, as well as to define relationships among them.
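As a rough illustration of querying over metadata, the sketch below filters objects by exact-match metadata values. The function name query_by_metadata and the sample attribute names are hypothetical; the query language the object store actually offers is not described here.

```python
def query_by_metadata(objects, **criteria):
    """Return the sorted names of objects whose metadata matches every criterion.

    objects: dict mapping object name -> metadata dict (system- or user-defined).
    criteria: metadata key/value pairs that must all match exactly.
    """
    return sorted(
        name
        for name, metadata in objects.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


# Hypothetical sample data mixing system-defined and user-defined attributes.
sample_objects = {
    "scan-001": {"content_type": "image/png", "patient": "p17", "study": "s3"},
    "scan-002": {"content_type": "image/png", "patient": "p42", "study": "s3"},
    "report-1": {"content_type": "text/plain", "patient": "p17", "study": "s3"},
}
```

Grouping falls out of the same mechanism: querying on a shared attribute such as study returns every object in that group.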
Data placement and replica management. We support data placement that obeys geographic constraints. Objects are replicated by an automated mechanism that is transparent to users and does not disrupt regular I/O traffic.
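One simple way to picture placement under geographic constraints is to restrict replica sites to a set of allowed regions. The sketch below assumes exactly that; the function name place_replicas, the site names, and the region labels are hypothetical, and the real placement policy may weigh other factors such as load or capacity.

```python
def place_replicas(sites, allowed_regions, replica_count):
    """Pick replica sites that all lie inside the allowed regions.

    sites: dict mapping site name -> region label.
    allowed_regions: set of region labels the object may reside in.
    replica_count: number of replicas to place.
    """
    eligible = sorted(
        site for site, region in sites.items() if region in allowed_regions
    )
    if len(eligible) < replica_count:
        raise ValueError("not enough sites satisfy the geographic constraint")
    # Deterministic choice for the sketch: take the first eligible sites by name.
    return eligible[:replica_count]


# Hypothetical topology used for illustration.
example_sites = {"haifa": "IL", "zurich": "EU", "dublin": "EU", "austin": "US"}
```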
Active-active multi-site, multi-datacenter cloud. Our architecture runs over multiple geographically distributed sites and datacenters. We support symmetric replication via an optimistic replication mechanism (also known as "eventual consistency"), so requests can be served from any replica at any time. This enables load balancing and preserves availability during network partitions.
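The following sketch shows the general idea of optimistic replication: each replica accepts reads and writes on its own, and replicas converge when they later exchange state. The Replica class is hypothetical, and the conflict rule used here (the highest version number wins) is an assumption made for illustration; the text above does not say how the object store reconciles replicas.

```python
class Replica:
    """Toy replica that serves requests locally and converges through sync."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def put(self, key, value, version):
        # Assumption: a write is kept only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Pull every entry from the other replica; put() keeps the newest one.
        for key, (version, value) in other.data.items():
            self.put(key, value, version)
```

Until a sync takes place, a read at one site may miss a write accepted at another. That temporary staleness is the price of keeping every site available during a partition.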
Built-in resiliency. We assume that failures occur all the time, and therefore the object store must support both data and process resiliency. When a disk, node, or datacenter fails, our object store automatically detects the failure and uses the built-in redundancy to restore the desired number of replicas.
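The repair step described above can be sketched as a function that drops failed locations and re-replicates onto healthy ones. The name restore_replicas and its arguments are hypothetical, and failure detection itself is assumed to have already produced the set of live sites.

```python
def restore_replicas(placement, live_sites, desired):
    """Return a repaired placement with up to `desired` replicas per object.

    placement: dict mapping object name -> list of sites holding a replica.
    live_sites: iterable of sites currently known to be healthy.
    desired: target number of replicas per object.
    """
    live = set(live_sites)
    repaired = {}
    for obj, sites in placement.items():
        survivors = sorted(site for site in sites if site in live)
        if not survivors:
            # No surviving copy to replicate from; the sketch cannot recover it.
            repaired[obj] = []
            continue
        spares = sorted(live - set(survivors))
        needed = max(0, desired - len(survivors))
        # Copy from a surviving replica onto as many spare sites as needed.
        repaired[obj] = survivors + spares[:needed]
    return repaired
```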
Secure access. We are developing a new authorization model to ensure secure access in cloud environments.
Computational Storage (Storlets). We introduce "stored procedures" for cloud storage, which provide the ability to run computations ("storlets") safely and securely, close to the data in the cloud. Storlets typically run in a sandbox; they are loaded as objects and triggered by events on objects (e.g., put/get) or on their associated metadata attributes.
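To make the event-triggered model concrete, here is a minimal sketch in which functions are registered against put and get events and run next to the stored data. The StorletStore class and its register method are hypothetical. The sketch also assumes that a storlet may transform the data passing through it, and it does not model the sandbox.

```python
class StorletStore:
    """Toy store that runs registered functions ("storlets") on put/get events."""

    def __init__(self):
        self._objects = {}
        self._storlets = {"put": [], "get": []}

    def register(self, event, storlet):
        # storlet is a callable taking (object name, data) and returning data.
        self._storlets[event].append(storlet)

    def put(self, name, data):
        for storlet in self._storlets["put"]:
            data = storlet(name, data)
        self._objects[name] = data

    def get(self, name):
        data = self._objects[name]
        for storlet in self._storlets["get"]:
            data = storlet(name, data)  # computed server-side, near the data
        return data


def first_line_only(name, data):
    """Example storlet: return only the first line instead of the whole object."""
    return data.split(b"\n", 1)[0]
```

Registering first_line_only on the get event means that only the first line crosses the network, while the stored object stays unchanged.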
Client-side over-the-WAN deduplication. Storage deduplication technologies eliminate the inherent redundancy and repetition within data by storing only a single copy of repeated data. Client-side deduplication attempts to identify deduplication opportunities at the client, saving the bandwidth that would otherwise be spent uploading copies of objects that already exist on the server. Our cloud object store implements client-side deduplication of full objects by extending the CDMI protocol and API, thus saving both storage capacity and bandwidth.
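A minimal sketch of full-object client-side deduplication follows. It assumes the client identifies an object by its SHA-256 digest and asks the server whether that digest is already stored before uploading. The DedupServer class and the client_put function are hypothetical stand-ins, not the CDMI extension itself.

```python
import hashlib


class DedupServer:
    """Toy server that stores each distinct object body once, keyed by digest."""

    def __init__(self):
        self.blobs = {}  # digest -> object data (single stored copy)
        self.names = {}  # object name -> digest

    def has(self, digest):
        return digest in self.blobs

    def upload(self, name, digest, data):
        self.blobs[digest] = data
        self.names[name] = digest

    def link(self, name, digest):
        # Point a new name at data the server already holds.
        self.names[name] = digest


def client_put(server, name, data):
    """Store an object and return the payload bytes sent over the simulated WAN."""
    digest = hashlib.sha256(data).hexdigest()
    if server.has(digest):
        server.link(name, digest)  # duplicate: send only the digest
        return 0
    server.upload(name, digest, data)
    return len(data)
```

The first upload of a given object costs its full size. Any later upload of identical content, under any name, sends no payload and adds no storage.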