IBM Research - Haifa

Networking & Systems Architecture

The Networking & Systems Architecture group specializes in operating system and hypervisor R&D, with a strong focus on I/O technologies. We are currently engaged in a number of projects in diverse areas such as I/O performance for virtual machines, nested virtualization in the Linux Kernel-based Virtual Machine (KVM) hypervisor, application-aware memory overcommitment in clouds, and networking in virtualized mega-scale datacenters.

The Linux Kernel-based Virtual Machine (KVM) hypervisor has gained considerable attention in recent years. It is an attractive open source hypervisor that is included in all Linux distributions. Our team is engaged in hypervisor research and development, with KVM as our platform of choice.

One of our focus areas is efficient I/O virtualization. During 2008 we worked with the KVM community on direct assignment of PCI devices to virtual machines. Direct assignment enables an unmodified virtual machine to use its own driver to interact directly with a real device. The main advantages of direct assignment are better performance than other I/O methods that require hypervisor involvement in the I/O path, and the ability to support any unusual device that lacks emulation support in the hypervisor. We contributed direct assignment support to KVM, and it is now included in the Linux kernel. Using a combination of polling schemes and core affinity, we showed that it is possible to achieve less than 5% overhead for I/O-intensive workloads with both device assignment and para-virtualized devices.

Another focus area is nested virtualization. In classical machine virtualization, a hypervisor runs multiple operating systems simultaneously, each on its own virtual machine. In nested virtualization, a hypervisor can run multiple hypervisors with their associated virtual machines. As operating systems gain hypervisor functionality (Microsoft Windows 7 already runs Windows XP in a virtual machine), nested virtualization will become necessary in hypervisors that wish to host them. The IBM Turtles Nested Virtualization project, which is part of the Linux/KVM hypervisor, runs multiple unmodified hypervisors (e.g., KVM and VMware) and operating systems (e.g., Linux and Windows). Although the x86 architecture lacks architectural support for nested virtualization, Turtles achieves performance within 6-8% of single-level (non-nested) virtualization for common workloads, through multi-dimensional paging for MMU virtualization and multi-level device assignment for I/O virtualization.
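This page does not describe how multi-dimensional paging works internally. As a rough illustration of the general idea behind nested address translation, the sketch below folds two translation stages (nested guest to guest, then guest to host) into a single page map that can be consulted directly. All names and page numbers here are hypothetical, and this is a conceptual sketch, not the Turtles implementation.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def compose(inner, outer):
    """Return the page map that applies `inner` first, then `outer`.

    Pages whose intermediate page is not mapped by `outer` are left unmapped.
    """
    return {page: outer[mid] for page, mid in inner.items() if mid in outer}


def translate(page_map, address):
    """Translate an address through a page-number map, keeping the offset."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset


# Hypothetical page-number mappings for the two translation stages.
nested_guest_to_guest = {0: 7, 1: 3, 2: 9}
guest_to_host = {3: 40, 7: 12}

# One collapsed map replaces the two-step lookup on every access.
collapsed = compose(nested_guest_to_guest, guest_to_host)
```

With the collapsed map, a lookup needs one table instead of two; page 2 stays unmapped because its intermediate page has no host mapping.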

Yet another focus area is data-center networking. For the zEnterprise Platform Tunneling Service project, we are designing and prototyping a network extension service allowing a zEnterprise System ensemble to span multiple remote sites. This network extension service is deployed and managed as part of the platform, providing a secure and highly available network extension.

We are also interested in cloud computing. Memory is the most valuable shareable resource, and it limits how many VMs a cloud provider can host on a single physical machine. Almost all commercial hypervisors implement memory over-commit mechanisms such as ballooning, page sharing, host swapping, and compression. However, they lack a policy to manage these mechanisms without noticeable performance degradation.

To solve this problem we are researching and developing Ginkgo, a memory over-commit policy framework to achieve little or no performance degradation while minimizing memory consumption, thus allowing cloud providers to run more virtual machines in a single physical host. Ginkgo monitors the applications, learns the performance under different memory allocations, and decides how much memory every virtual machine needs.
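The monitor, learn, and decide loop described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not Ginkgo's actual policy: it assumes each VM has a set of observed (memory, performance) samples, and it picks the smallest observed allocation whose performance stays within a chosen fraction of the best observed value. The function names, the `target_fraction` parameter, and the sample data are all hypothetical.

```python
def min_memory_for_target(samples, target_fraction=0.95):
    """Pick the smallest observed allocation that meets the performance target.

    `samples` is a list of (memory_mb, performance) observations for one VM.
    The target is `target_fraction` of the best performance observed.
    """
    best = max(perf for _, perf in samples)
    threshold = target_fraction * best
    return min(mem for mem, perf in samples if perf >= threshold)


def plan_allocations(observations, target_fraction=0.95):
    """Return a {vm_name: memory_mb} plan for every monitored VM."""
    return {
        vm: min_memory_for_target(samples, target_fraction)
        for vm, samples in observations.items()
    }


# Hypothetical measurements: performance under different memory allocations.
observations = {
    "web": [(512, 40.0), (1024, 90.0), (2048, 98.0), (4096, 100.0)],
    "db": [(1024, 50.0), (2048, 99.0), (4096, 100.0)],
}

plan = plan_allocations(observations)
```

In this example both VMs could be given 2048 MB instead of 4096 MB while staying within 5% of their best observed performance, which frees memory for additional VMs on the same host.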

Yet another focus area is the intersection of multi-core architectures and I/O stacks. Data storage technology today faces many challenges, including performance inefficiencies, inadequate dependability and integrity guarantees, limited scalability, loss of confidentiality, poor resource sharing, and increased ownership and management costs. Given the importance of both direct-attached and networked storage systems for modern applications, it becomes imperative to address these issues. Multicore CPUs offer the promise of dealing with many of the underlying limitations of today's I/O architectures. However, this requires careful consideration of architectural and systems issues and complex interactions in the I/O stack, all the way from the application to the disk.

The IOLanes EU project aims to analyze and address these challenges throughout the I/O path. Our approach breaks down the I/O stack into four important layers: (a) application and middleware, (b) virtual machine, (c) host operating system, and (d) embedded storage controller. The proposed work analyzes and addresses the inefficiencies associated with these layers on multi-core CPUs by designing an I/O stack that minimizes unnecessary overheads and scales with the number of cores. Since storage systems are perhaps the most critical component of modern computing infrastructures, the proposed work will benefit many I/O-intensive applications that support the activities of businesses, organizations, and individuals alike.

Previous projects we have worked on include TCP/IP stack acceleration, the internal networking architecture of a distributed storage controller, intrusion detection for storage, the development of the iSCSI standard and boot-over-iSCSI functionality (iBoot), the design and development of IBM's BladeCenter Open Fabric Manager, investigating IOMMU support in operating systems and hypervisors, studying IOMMU performance, devising a scalable architecture for I/O virtualization, and building the IP Only Server.

Manager

Benny Rochwerger, Manager, Networking & Systems Architecture, IBM Research - Haifa

Links

SYSTOR 2012