Introducing IBM Haifa Research Lab
May 10, 2009
Organized by TAU/CS and IBM Haifa Research Lab
Abstracts
Activities for healthcare and life sciences
Ohad Greenshpan
The Healthcare and Life Sciences group is unique in its focus on an industry sector. The group was established about 10 years ago, and since then has worked in various directions in the IT space. The more traditional directions deal with mechanisms to archive, store, manage, transfer, and access health-related data. Over time, the research focus has expanded to other directions such as data analytics, web data management and integration, and business process management. In my talk I will give an overview of our group’s activities, discuss the challenges in the healthcare domain, and drill down into several interesting projects. I will conclude with some examples of how research is done, to give a feeling for the role of a researcher in our labs, our relationship with academia, and ways to initiate collaboration.
Machine learning analysis of clinical genomic data -
from HIV positive to hypertensive patients
Michal Rosen-Zvi
In my talk I will illustrate how advanced machine-learning
and data mining techniques available today, along with the
abundance of medical data, can provide a powerful decision
support system and contribute to the emerging area of
information based medicine.
In recent years there has been considerable focus on the
personalized treatment approach, in which the therapy given to
a patient can be optimized based on the individual’s personal
clinical-genomic factors. In the past few years we have
carried out clinical genomic analysis that falls under this
broad approach. This research has been carried out as part
of two different EU-funded projects. In the EuResist
project we learnt to optimize cocktails for HIV patients
based on the virus genome and other factors. The main result
is a freely available decision support system -
http://engine.euresist.org/.
In the Hypergenes project we are learning to correlate SNPs
(single nucleotide polymorphisms) with hypertension.
Constraint satisfaction: from theory to solving complex
industrial problems
Merav Aharoni
Constraint satisfaction involves finding a solution for a
given set of constraints over a set of variables with finite
domains. Typical problems come from the fields of artificial
intelligence and operations research. They vary from
riddles, such as Sudoku, to industrial applications, such as
finding an optimal assignment for thousands of professionals
to open positions.
We have developed two powerful engines for solving general
constraint satisfaction problems at the IBM Haifa Research
Lab. The systematic constraint solver, GEC, is based on the
maintain-arc-consistency (MAC) algorithm that involves
reducing the variable domains by removing values which
cannot partake in any valid solution. The stochastic
constraint solver, Stocs, is based on stochastic
local-search.
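
To make the domain-reduction idea concrete, here is a minimal sketch of
arc-consistency propagation (the textbook AC-3 procedure) wrapped in a small
backtracking search that re-establishes arc consistency after every choice.
It is a generic illustration, not GEC's implementation; the function names
ac3 and mac_solve and the dictionary-based problem encoding are assumptions
made for this example.

```python
from collections import deque


def ac3(domains, constraints):
    """Prune `domains` in place until every arc is consistent.

    domains:     dict variable -> set of candidate values
    constraints: dict (x, y) -> predicate(value_of_x, value_of_y),
                 listed for both directions of each binary constraint
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # A value of x with no supporting value in y cannot be in any solution.
        unsupported = {vx for vx in domains[x]
                       if not any(allowed(vx, vy) for vy in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # x shrank, so arcs pointing at x must be re-checked.
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return True


def mac_solve(domains, constraints):
    """Backtracking search that maintains arc consistency after each choice."""
    domains = {v: set(d) for v, d in domains.items()}  # work on a copy
    if not ac3(domains, constraints):
        return None
    if all(len(d) == 1 for d in domains.values()):
        return {v: next(iter(d)) for v, d in domains.items()}
    # Branch on an undecided variable with the smallest domain.
    var = min((v for v in domains if len(domains[v]) > 1),
              key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        trial = dict(domains)
        trial[var] = {value}
        solution = mac_solve(trial, constraints)
        if solution is not None:
            return solution
    return None


# Example: a < b < c with every variable ranging over {1, 2, 3}.
less = lambda p, q: p < q
greater = lambda p, q: p > q
ORDER = {("a", "b"): less, ("b", "a"): greater,
         ("b", "c"): less, ("c", "b"): greater}
order_domains = {v: {1, 2, 3} for v in "abc"}
ac3(order_domains, ORDER)  # propagation alone fixes a=1, b=2, c=3
```

In the ordering example, propagation alone decides every variable. When it
does not, as with three mutually different variables over two values,
mac_solve has to branch and discovers the dead end only after a choice is
propagated.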
In this talk, we outline these two algorithms with an
emphasis on active areas of research. We will also describe
a few of the applications that use these tools as their
solving engine.
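
For contrast with the systematic approach, the following sketch shows one
well-known form of stochastic local search, the min-conflicts heuristic,
applied to the n-queens puzzle. It is a generic illustration and makes no
claim about how Stocs works internally; the function names and the choice of
puzzle are assumptions made for this example.

```python
import random


def conflicts(cols, row, col):
    """Count queens attacking square (row, col), ignoring the queen in `row`."""
    return sum(
        1
        for r, c in enumerate(cols)
        if r != row and (c == col or abs(c - col) == abs(r - row))
    )


def min_conflicts_queens(n, max_steps=10_000, seed=None):
    """Return a list of column indices (one queen per row), or None on failure."""
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]  # random complete assignment
    for _ in range(max_steps):
        conflicted = [r for r in range(n) if conflicts(cols, r, cols[r]) > 0]
        if not conflicted:
            return cols  # no queen is attacked: a solution
        # Pick a conflicted row at random and move its queen to a column
        # with the fewest conflicts, breaking ties at random.
        row = rng.choice(conflicted)
        scores = [conflicts(cols, row, c) for c in range(n)]
        best = min(scores)
        cols[row] = rng.choice([c for c, s in enumerate(scores) if s == best])
    return None  # step budget exhausted; the search is incomplete by design


def is_solution(cols):
    return all(conflicts(cols, r, c) == 0 for r, c in enumerate(cols))
```

Unlike a systematic solver, this search cannot prove that a problem has no
solution: a None result only means the step budget ran out.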
Static analysis of programs and models
Yishai Feldman
Many large corporations have huge complex systems in
operation, and maintaining them is becoming more and more
difficult. I will describe two directions of research on
the static analysis of legacy systems, aimed at building
tools to assist legacy modernization.
We are building an infrastructure for static analysis of
enterprise systems, based on a language-independent internal
representation. On top of that infrastructure we are
building tools for program understanding and
transformation. As part of this work, we have designed a
family of new program-slicing algorithms, which produce more
accurate slices than state-of-the-art algorithms, especially
for unstructured languages. We are also developing flexible
and reliable code-motion refactoring algorithms. For the
most fundamental refactoring, Extract Method, these do not
require that extracted code be contiguous.
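
As background for readers new to slicing, the sketch below computes a
classical backward slice as reachability over a dependence graph. This is
the standard textbook formulation, not one of the new algorithms mentioned
above; the toy program, its hand-written dependence table DEPS, and the
function name backward_slice are assumptions made for this example.

```python
def backward_slice(deps, criterion):
    """Return every statement that `criterion` transitively depends on.

    deps: dict statement -> set of statements it depends on
          (data and control dependences merged together)
    """
    seen = set()
    stack = [criterion]
    while stack:
        stmt = stack.pop()
        if stmt in seen:
            continue
        seen.add(stmt)
        stack.extend(deps.get(stmt, ()))
    return seen


# Toy program (statement numbers are labels, not line numbers in a file):
#   1: n = read()
#   2: i = 1
#   3: total = 0
#   4: prod = 1
#   5: while i <= n:
#   6:     total = total + i
#   7:     prod = prod * i
#   8:     i = i + 1
#   9: print(total)
#  10: print(prod)
DEPS = {
    5: {1, 2, 8},
    6: {2, 3, 5, 6, 8},
    7: {2, 4, 5, 7, 8},
    8: {2, 5, 8},
    9: {3, 6},
    10: {4, 7},
}

slice_for_total = sorted(backward_slice(DEPS, 9))
```

The slice for statement 9 leaves out statements 4, 7 and 10, which only
affect prod. Note that the slice is not a contiguous block of code.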
On a different level, System Grokker is a tool that extracts
models and analyzes them to increase the level of
abstraction. System Grokker improves system understanding
by allowing the representation of various system
organization views, calculation of metrics, and discovery of
high level abstractions and patterns. It assists system
validation by exposing problematic relationships and
anti-patterns, and supports the software evolution process by
suggesting architectural improvements and simulating
architectural changes.
Cloud computing: automating service elasticity in
RESERVOIR
David Breitgand
In the first part of my talk, I will give a brief overview
of RESERVOIR, a project funded by the EU under the 7th Framework
Programme and led by the IBM Haifa Research Lab. RESERVOIR
explores novel technologies for federated cloud computing,
aiming at sharing IT resources "without
borders".
In the second part of my talk I will present a simple, yet
powerful, methodology for application-independent diagnosis
and remediation of performance hot spots in elastic
multi-tier client/server applications, deployed as
collections of black-box virtual machines (VMs) in an IaaS
cloud such as RESERVOIR. Our out-of-band black-box
performance management system, Network Analysis for
Remediating Performance Bottlenecks (NAP), listens to the
TCP/IP traffic on the virtual network interfaces of the VMs
comprising an application and analyzes statistical
properties of this traffic.
From this analysis, which is application independent and
transparent to the VMs, NAP identifies performance
bottlenecks that might affect application performance and
derives application resizing decisions that are most likely
to alleviate performance degradation.
We prototyped our solution for the Xen hypervisor and
evaluated it using the popular Trade6 benchmark that
simulates a typical e-commerce application. Our results show
that NAP successfully identifies performance bottlenecks in
a complex multi-tier application setting, while incurring
negligible performance overhead.
Muli Ben-Yehuda, David Breitgand, Michael Factor, Hillel
Kolodner, Valentin Kravtsov, Dan Pelleg, "NAP, a
Building Block for Remediating Performance Bottlenecks via
Black Box Network Analysis", to appear in 6th IEEE
International Conference on Autonomic Computing (ICAC'09),
June 15-19, Barcelona, Spain.
Green storage and beyond - new challenges in today's
storage systems
Dalit Naor
The landscape of storage technology is changing - performance is no longer the main objective. Storage systems today need to cope efficiently with the data deluge and to support new cost and business models for storing and preserving data. In this talk I will review some emerging challenges in this area and will concentrate on one particular challenge - energy-efficient storage systems. I will present two distinct approaches to power reduction in storage systems that were studied in our group at IBM Haifa Research Lab. The first approach is based on turning off storage units, while the second reduces power consumption by tuning the I/O workload.
For the first approach we consider large-scale, distributed storage systems (à la cloud) with built-in redundancy mechanisms. We investigate how such systems can reduce their power consumption during low-utilization time intervals by operating in a low-power mode, whereby a subset of the disks or nodes is powered down. We investigate the power savings attainable under various scenarios.
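
One way to picture a low-power mode is as a selection problem: which nodes can be switched off while every data item still has a replica on a powered node? The sketch below uses a simple greedy rule to answer that for a given replica placement. It is an illustration only and is not the method of the paper cited in this abstract; the function name, the placement encoding, and the greedy ordering are all assumptions made for this example.

```python
def choose_nodes_to_power_down(placement, nodes):
    """Greedily pick nodes to switch off while keeping every item available.

    placement: dict item -> set of nodes holding a replica of that item
    nodes:     iterable of node names
    Returns the list of nodes selected for power-down.
    """
    live_replicas = {item: len(holders) for item, holders in placement.items()}
    items_on = {n: [i for i, h in placement.items() if n in h] for n in nodes}
    powered_down = []
    # Consider lightly loaded nodes first; ties are broken by name.
    for node in sorted(items_on, key=lambda n: (len(items_on[n]), n)):
        # A node may go down only if each of its items has another live replica.
        if all(live_replicas[i] > 1 for i in items_on[node]):
            powered_down.append(node)
            for i in items_on[node]:
                live_replicas[i] -= 1
    return powered_down


# Four nodes, four items, two replicas each, arranged in a ring.
PLACEMENT = {
    "a": {"n1", "n2"},
    "b": {"n2", "n3"},
    "c": {"n3", "n4"},
    "d": {"n4", "n1"},
}
down = choose_nodes_to_power_down(PLACEMENT, ["n1", "n2", "n3", "n4"])
```

In this ring layout half of the nodes can be switched off. How much can be saved in general depends on the placement scheme and the redundancy level, and a greedy rule is not guaranteed to find the largest possible set.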
For the second approach, we must deal with the fact that real power measurements in storage are hard to come by. We developed a scalable power modeling method that estimates the power consumption of storage workloads. The modeling concept is based on identifying the major workload contributors to the power consumed by the disk arrays. Our power estimation results are highly accurate compared to real measurements conducted in a lab setting.
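
To illustrate what a workload-based power model can look like, the sketch below fits a linear model (a constant idle term plus one coefficient per workload feature) by ordinary least squares. The linear form, the two features (read and write operations per second), and the sample numbers are assumptions made for this example; they are synthetic and are not taken from the work described here.

```python
def fit_linear_power_model(samples):
    """Least-squares fit of watts = w0 + w1*f1 + w2*f2 + ...

    samples: list of (features, watts) pairs, where features is a tuple
    Returns the weights [w0, w1, ...]. Assumes the samples are not degenerate
    (otherwise the elimination below divides by zero).
    """
    rows = [[1.0, *features] for features, _ in samples]
    targets = [watts for _, watts in samples]
    k = len(rows[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def estimate_power(weights, features):
    """Predict power for a workload described by `features`."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Synthetic samples: ((reads per second, writes per second), measured watts).
SAMPLES = [
    ((100, 0), 52.0),
    ((0, 100), 55.0),
    ((200, 100), 59.0),
    ((100, 300), 67.0),
]
weights = fit_linear_power_model(SAMPLES)
predicted = estimate_power(weights, (150, 50))
```

Once fitted on a handful of measured workloads, such a model can estimate the power drawn by workloads for which no measurement exists, which is the appeal when real measurements are scarce.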
My talk will be based on: [1] Low Power Mode in Cloud Storage Systems, by Danny Harnik, Dalit Naor and Itai Segall, to appear in SMTPS 2009, and [2] Storage Modeling for Power Estimation, by Miriam Allalouf, Yuriy Arbitman, Michael Factor, Ronen Kat, Kalman Meth and Dalit Naor, to appear in SYSTOR 2009.