Autonomic computing and IBM System z10 active resource monitoring
by T. B. Mathias
and P. J. Callaghan
Among the essential components of the IBM System z10™
platform are the hardware management console (HMC) and the
IBM System z™ support element (SE). Both the SE and the HMC
are closed, fixed-function computer systems that include an
operating system, many open-source middleware packages, and
millions of lines of C, C++, and Java™ application code developed
by IBM. The code on the SE and HMC is required to remain
operational without a restart or reboot over long periods of time. As
a first step toward the autonomic computing goal of continuous
operation, an automatic software resource monitoring
program has been integrated into the SE and HMC
to look for resource, performance, and operational problems and,
when appropriate, initiate recovery actions. This paper describes
the embedded resource monitoring program in detail. Included are
the types of resources being monitored, the algorithms and
frequency used for the monitoring, the information that is collected
when a resource problem is detected, and the actions executed as a
result. It also covers the types of problems the resource monitoring
program has detected so far and improvements that have been
made on the basis of empirical evidence.