IBM Journal of Research and Development

IBM BladeCenter Systems   Volume 49, Number 6, 2005
DOI: 10.1147/rd.496.0873

BladeCenter T system for the telecommunications industry

by S. L. Vanderlinden,
B. O. Anthony,
G. D. Batalden,
B. K. Gorti,
J. Lloyd,
J. Macon, Jr.,
G. Pruett,
and B. A. Smith

This paper describes the IBM eServer™ BladeCenter® T system, an extension of the BladeCenter platform designed for the specific and rigorous requirements of the telecommunications industry, such as compliance with Telcordia Technologies Network Equipment – Building Standards (NEBS™) specifications. The Telcordia NEBS documents and the analogous documents written for the European marketplace by the European Telecommunications Standards Institute define the range of environmental and electrical parameters presented by this market segment. A key characteristic of the telecommunications industry is its focus on the availability and redundancy of the equipment used to provide service to its customers. These requirements imposed significant design changes on the BladeCenter platform and software, while maintaining a solution compatible with the original BladeCenter architecture. This paper provides details of the design changes that were made in the areas of hardware, software, systems management, and integration, and concludes with examples and a discussion of customer solutions with the BladeCenter T system.

Introduction

In North America, requirements for telecommunications products are published in a set of design documents originally written by Bellcore (Bell Communications Research) and later transferred to Telcordia Technologies, Inc., collectively called NEBS** (Network Equipment–Building Standards) [1]. These documents, embodied in numerous generic requirements (GRs), special reports (SRs), and technical references (TRs), represent decades of experience in delivering reliable, safe, quality communications to millions of users who expect a dial tone when they pick up their telephone. The primary documents that make up the NEBS publications deal with the spatial, power, and cooling requirements of the equipment and the performance of the equipment both during normal operation (e.g., radiated electromagnetic emissions and immunity, conducted emissions and immunity, normal temperature and humidity ranges) and during abnormal conditions (e.g., extreme temperature or humidity conditions, seismic activity, airborne particles or chemicals, fire, electrostatic discharge, power transients, or abuse).

A product being introduced into the telecommunications marketplace cannot claim to meet these requirements on its vendor's good word. In fact, because most service providers require that an independent third-party company perform testing, hardware developers employ independent test laboratories (ITLs) to test their products. ITLs must study the hundreds of pages of GRs, SRs, and TRs and understand the accompanying test procedures and carrier checklists. They invest millions of dollars in test equipment and provide the engineering expertise to perform the tests required to ensure that products developed by vendors meet the requirements of the carriers and service providers, who in turn strive to meet the requirements for reliable communications in a highly competitive marketplace.

Telecommunication equipment can be found in underground bunkers, in outdoor shelters, in small concrete buildings on dirt roads, in central offices (COs) just off the freeway, and in multistory buildings in thousands of large cities around the world. The NEBS documents—and the analogous documents written for the European marketplace by the European Telecommunications Standards Institute (ETSI) [2]—define the range of environmental and electrical parameters found in these locations, which literally extend to the ends of the earth.

To complement the system hardware, a variety of software components are required to integrate a product into the existing telecommunications infrastructure, including systems management and operational software. Systems management integration requires some alterations to the standard, enterprise-oriented, IBM eServer* BladeCenter* components to align with existing applications in a manner similar to the adaptations made for the hardware to support NEBS specifications. The support for operating software associated with the BladeCenter T system exceeds that for typical IBM xSeries* servers, since no one company delivers and supports a complete suite of telecommunications-oriented software. Investments in architecture and participation in standards organizations have enabled IBM to provide the level of software support needed for the BladeCenter T system to be a rich, stable platform upon which telecom customers can build their solutions.

Adapting hardware to the telecom environment

The BladeCenter T system (BCT) is a product of the partnership between IBM and the Intel Corporation to develop the BladeCenter platform into a NEBS-compliant product. Leveraging extensive experience with telecom server design, the Intel team used BladeCenter design data to develop the BCT chassis and circuit boards within the parameters defined by IBM.

Chassis physical design considerations

When the original BladeCenter system was launched, it represented not only a significant investment in the product platform itself, but also a commitment to future investment, by IBM and its future partners, in a standard form factor for processor blades and switch modules. This commitment extended to the migration of the firmware from 1U (1.75-in.) rack servers to processor blades and from the remote supervisory adapter service processor family to the BladeCenter chassis management module. Therefore, in deciding to convert the BladeCenter platform to meet the telecom requirements embodied in the NEBS specifications and other telecom carrier checklists, the basic tenets were to preserve the existing and future investment in the blade and switch module form factors and to build upon the rapidly evolving firmware library. At the same time, the challenge put before the team was to ensure that certain telecom-specific design criteria were met:

  • Conform to the Telcordia NEBS Level 3 and ETSI standards:
    • Operational environment range of +5°C to +40°C, with short-term excursions from −5°C to +55°C.
    • Ability to withstand seismic vibrations as defined by Zone 4 levels.
    • Expanded range on electromagnetic interference emissions and electrostatic discharge (ESD) immunity levels.
  • Confine the overall depth, including all mechanical features, to 600 mm, and allow a minimum cable bend radius of 70 mm in the rear of the chassis.
  • Confine the width to 17.5 in. for ease of mounting in 19-in. or 23-in. two-post racks.
  • Maintain a ratio of one processor blade per rack unit (RU, or 1.75 in.) in height.
  • Allow either 220 VAC worldwide or −48 VDC operation.
  • Augment the base BladeCenter firmware library to include telecom-specific features.

Modifications to the base BladeCenter design

The BladeCenter architecture is thoroughly described in [3] and the other BladeCenter papers referenced therein. While the chassis underwent radical modifications to meet the telecom dimensional requirements given above, the architecture required little, if any, modification to meet telecom industry redundancy requirements. The original system already possessed redundant power and cooling, redundant systems management, redundant communications links, and redundant power inputs.

Hence, BCT design decisions were reduced to determining the number of processor blades that could be accommodated in the same number of RUs, and how to provide adequate power and airflow given the remaining three-dimensional space in the resulting envelope. Finally, consideration had to be given to the fact that the trace lengths for the SerDes (serializer/deserializer) links between the blades and switch modules could not significantly exceed those of the BladeCenter system; otherwise, the signal integrity of the solution could be called into question.

Figure 1 depicts the BCT front and rear views. The most radical BCT change from a chassis perspective was to accommodate the same blades and switches as the BladeCenter system within the required 600-mm front-to-back dimension. Given that the blades themselves are nearly 450 mm long and that 70 mm had to be allocated to cable bend radii, this did not leave sufficient space to maintain the BladeCenter packaging approach with all of the switches, blowers, and power supplies in the rear of the unit. Several key changes were made to enable BCT to fit as required. The blowers and power supplies were repositioned to either side of the blades, rather than behind them. Fans were placed inside the power-supply enclosures with dedicated exhaust channels to allow for independent cooling of the supply electronics. Two blowers with individual speed control and monitoring capabilities were added to allow the management module to more aggressively manage the airflow within the chassis, particularly in harsh conditions.

Figure 1: BCT front and rear views.

The management module input/output (I/O) connections [4] were decoupled from the modules and distributed across the local area network (LAN) and keyboard/video/mouse (KVM) modules in the back of the chassis. In addition, multiplexers were added to the backplane to link the active management module with the LAN and KVM modules. A Z-shaped flex circuit was added between the switch modules and the backplane to allow the switch modules to be recessed into the chassis. This was a critical step in reducing the overall depth. As an added benefit, the switch modules receive ambient air directly, rather than air that has been preheated by other electronics; this is key to BCT supporting the NEBS short-term high-temperature limits. Careful layout, extensive simulation, and validation of the flex circuit enabled the chassis to support the BladeCenter switches and processor blades without modification.

Finally, as detailed in a later section, the processor blade handles were redesigned so that the distance they protrude from the front of the chassis was reduced from 31 mm to 10 mm.

Chassis identification

Although the processor blades and switch modules are interchangeable between the BladeCenter and BCT systems, in some instances the processor blades and switch modules must be able to detect the type of chassis in which they are installed, preferably via a method that does not require the user to provide that information. Switch modules in particular must be able to detect the chassis type in order to ensure that the images presented by the setup-and-control software match what the system administrator sees, i.e., the correct number and orientation of the processor blades.

Every time a processor blade or switch module is installed in a BladeCenter system, the management module writes a string of information into a circular history log in the vital product data (VPD) area of the installed module. This information contains the chassis universal unique identifier (UUID) and an identifier of the chassis type. By interrogating the chassis type, the installed module can modify any required configuration information, graphical user interfaces or text menus, and operational parameters, such as thermal thresholds.
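The chassis-detection flow described above can be sketched as follows. This is a minimal illustration, not the actual VPD layout: the record fields, the type strings "BladeCenter" and "BCT", the log capacity, and the names `ChassisRecord`, `VpdHistory`, and `select_profile` are all assumptions. The profile values (slot counts, blade orientation, maximum ambient temperature) are taken from figures quoted elsewhere in this paper.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass(frozen=True)
class ChassisRecord:
    """One entry written by the management module at install time."""
    chassis_uuid: str
    chassis_type: str  # hypothetical encoding: "BladeCenter" or "BCT"


class VpdHistory:
    """Circular history log: the oldest entry is dropped when the log is full."""

    def __init__(self, capacity: int = 4) -> None:  # capacity is assumed
        self._log: Deque[ChassisRecord] = deque(maxlen=capacity)

    def record_install(self, record: ChassisRecord) -> None:
        self._log.append(record)

    def current(self) -> Optional[ChassisRecord]:
        """Most recent entry, i.e. the chassis the module is installed in now."""
        return self._log[-1] if self._log else None

    def __len__(self) -> int:
        return len(self._log)


# Per-chassis settings a module might switch between (values from this paper).
PROFILES: Dict[str, Dict[str, object]] = {
    "BladeCenter": {"blade_slots": 14, "orientation": "vertical", "max_ambient_c": 35},
    "BCT": {"blade_slots": 8, "orientation": "horizontal", "max_ambient_c": 40},
}


def select_profile(history: VpdHistory) -> Optional[Dict[str, object]]:
    """Pick the configuration profile for the chassis recorded most recently."""
    record = history.current()
    if record is None:
        return None
    return PROFILES.get(record.chassis_type)
```

Because the module reads only the newest log entry, moving a blade or switch from a BladeCenter chassis to a BCT chassis changes its menus and thresholds without any input from the user.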

Environmental considerations

The BCT chassis, blades, and switch modules are designed to withstand exposure to a wide range of environmental conditions, defined by the NEBS specifications, that potentially could be encountered in a CO or other telecom environments. The primary concern, of course, is the temperature range; temporary outages of air conditioning equipment must be tolerated, both for high temperatures (up to 50°C ambient) and temperatures below freezing (−5°C).

This temperature range, compared with the 10°C to 35°C range specified for an enterprise, requires particular attention to board layout designs and the design requirements for component heat sinks. Additional heat sinks may have to be added or existing heat sinks may have to be redesigned in order to take increased heat dissipation into account. To allow for temperature extremes during processor blade development projects, the thermal modeling and unit-level test efforts have been expanded to incorporate the NEBS specifications.

Additional consideration must be given to the increased range for radiated emissions (the upper limit increases from 10³ MHz to 10⁴ MHz) as well as immunity to higher electrostatic discharge levels (e.g., an increase from 4 kV to 8 kV for direct contact discharge). Radiated emissions are covered during BCT component qualification by including “sniff testing” over the extended range to gain confidence prior to the formal NEBS testing. To reliably pass the NEBS ESD immunity tests, some modifications to the blade bezel and associated grounding points have been required.

Firmware design considerations

Although the majority of the management module firmware in the BCT system has been reused from the BladeCenter system, some modifications were necessary to support the hardware variations and telecom environment. The modifications occurred in the areas of alarm management, blower management, fouled filter detection, power management, and blade ordering.

For BCT to support GR-474-CORE requirements [5], the event processing in BladeCenter product firmware had to be modified. The original firmware classified events as system error, warning, and informational. The BCT firmware must classify events as critical alarm, major alarm, minor alarm, and informational. Alarms are defined in GR-474-CORE as follows:

  • Critical alarms are used to indicate that a severe, service-affecting condition has occurred and that immediate corrective action is imperative, regardless of the time of day or day of the week.
  • Major alarms are used for hardware or software conditions that indicate a serious disruption of service or the malfunctioning or failure of important circuits. These troubles require the immediate attention and response of a craftsperson to restore or maintain system capability. The urgency is less than in critical situations because of a lesser immediate or impending effect on service or system performance.
  • Minor alarms are used for troubles that do not have a serious effect on service to customers or for problems in circuits that are not essential to network element operation.

The firmware component telecom alarm manager (TAM) extends the event-processing function found in the BladeCenter design to provide the event management required by BCT. TAM maps events into three alarm-severity classifications: minor, major, and critical. These alarm severities correspond directly to the three light-emitting diodes (LEDs) and relays on the BCT chassis telecom alarm panel. Events that do not map to an alarm are considered informational.

Mapping of events to alarms is done by a combination of three methods:

  1. User configuration, which takes precedence over the following two methods.
  2. A table built into the firmware directly maps events to alarm severity.
  3. Several tables in BladeCenter firmware are interpreted to map BladeCenter errors into either critical or major BCT alarms and to map BladeCenter warnings into minor BCT alarms.
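
The three mapping methods listed above can be sketched as a lookup chain. This is an illustration under stated assumptions, not the TAM implementation: the event names, the table contents, and the rule for splitting legacy errors into critical and major alarms are hypothetical. The paper states only that user configuration takes precedence; checking the built-in table before the legacy tables is also an assumption.

```python
from enum import Enum


class Alarm(Enum):
    INFORMATIONAL = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


# All identifiers and table contents below are hypothetical examples.
USER_OVERRIDES = {"blower_2_failed": Alarm.CRITICAL}          # method 1
BUILT_IN_TABLE = {"blower_2_failed": Alarm.MAJOR,             # method 2
                  "filter_clogged": Alarm.MINOR}
# Method 3: legacy BladeCenter classes ("error" or "warning").
LEGACY_CLASS = {"blade_3_over_temp": "error",
                "blade_5_voltage_low": "warning"}
# Assumed subset of legacy errors treated as service-affecting.
LEGACY_CRITICAL = {"blade_3_over_temp"}


def classify(event: str) -> Alarm:
    """Map an event to an alarm severity; unmapped events are informational."""
    if event in USER_OVERRIDES:            # user configuration wins
        return USER_OVERRIDES[event]
    if event in BUILT_IN_TABLE:            # direct firmware table
        return BUILT_IN_TABLE[event]
    legacy = LEGACY_CLASS.get(event)       # interpreted BladeCenter tables
    if legacy == "error":
        return Alarm.CRITICAL if event in LEGACY_CRITICAL else Alarm.MAJOR
    if legacy == "warning":
        return Alarm.MINOR
    return Alarm.INFORMATIONAL
```

In this sketch `classify("blower_2_failed")` returns the user-selected severity even though the built-in table assigns a different one.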

In addition to alarms being represented on the telecom alarm panel LEDs and relays, users are notified of alarms through interfaces such as IBM Director, Simple Network Management Protocol (SNMP), HyperText Markup Language (HTML), and the command-line interface. To meet worldwide requirements, the active color of the critical and major alarm LEDs (either amber or red) is also selectable through these interfaces.

BladeCenter blower management firmware was also modified to support the BCT chassis and meet telecom industry requirements. BladeCenter firmware reacts almost instantaneously to data from processor temperature sensors to set blower speeds. The BCT firmware, in contrast, uses averages and trends in blade processor temperature to determine a blower speed that maintains the operating temperature within set limits while minimizing blower noise and optimizing blower life. The BCT algorithm controls four blowers, as opposed to the two blowers in the BladeCenter system. The individual speed controls are used to operate the blowers in pairs, with the top two forming one pair and the bottom two forming another. The failure of one blower in a pair causes the other to operate at its maximum rated revolutions per minute until the fault is removed.
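
A control loop of this kind might look like the sketch below. It illustrates the idea only: the set point, window length, step size, and trend threshold are assumed values, and `PairController` and `pair_speeds` are hypothetical names, not the firmware's.

```python
from collections import deque
from typing import Deque, List

MAX_SPEED_PCT = 100
MIN_SPEED_PCT = 30      # assumed idle speed
TARGET_TEMP_C = 70.0    # assumed processor temperature set point
WINDOW = 8              # assumed number of samples in the moving average
STEP_PCT = 5            # assumed speed change per update


class PairController:
    """Sets one speed for a blower pair from averaged temperature and trend."""

    def __init__(self) -> None:
        self.samples: Deque[float] = deque(maxlen=WINDOW)
        self.speed = MIN_SPEED_PCT

    def update(self, temp_c: float) -> int:
        self.samples.append(temp_c)
        average = sum(self.samples) / len(self.samples)
        trend = self.samples[-1] - self.samples[0]  # rise across the window
        if average > TARGET_TEMP_C or trend > 2.0:
            # Too warm, or warming quickly: speed up by one small step.
            self.speed = min(MAX_SPEED_PCT, self.speed + STEP_PCT)
        elif average < TARGET_TEMP_C - 5.0 and trend <= 0.0:
            # Comfortably cool and not warming: slow down to cut noise and wear.
            self.speed = max(MIN_SPEED_PCT, self.speed - STEP_PCT)
        return self.speed


def pair_speeds(commanded: int, healthy: List[bool]) -> List[int]:
    """Speeds for the two blowers of a pair; a lone survivor runs at maximum."""
    if all(healthy):
        return [commanded, commanded]
    return [MAX_SPEED_PCT if ok else 0 for ok in healthy]
```

Stepping the speed gradually from a windowed average, rather than jumping on every sensor reading, is what keeps a single temperature spike from producing an audible burst of blower noise.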

In April 2002, new NEBS requirements were added that included the performance of new tests related to airborne contaminants. The level of contamination to be tested was significantly higher than that called for in the BladeCenter specifications. To meet this requirement, an air filter was mounted on the front of the BCT chassis. The NEBS specification not only requires air filtration, but also recommends that there be a mechanism to detect the performance of the filter. To support both the requirement and the recommendation, BCT firmware monitors the flow of air through the filter to determine whether the filter has become clogged. To accomplish this, BCT firmware measures the time required to cool a blade processor at a given ambient temperature. If the time is greater than a predetermined limit, it is assumed that a clogged filter is restricting the flow of cool air into the chassis. The firmware then generates an alarm indicating that the filter must be replaced.
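
The filter check described above reduces to timing a cool-down and comparing it with an ambient-dependent limit. The sketch below assumes temperature samples arrive as (seconds, °C) pairs. The start and end temperatures and the linear limit formula are invented for illustration, since the paper does not give the actual limits.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (time in seconds, processor temperature in °C)


def cooling_time_s(samples: List[Sample], start_c: float, end_c: float) -> Optional[float]:
    """Seconds for the temperature to fall from start_c to end_c, if observed."""
    t_start = None
    for t, temp in samples:
        if t_start is None and temp <= start_c:
            t_start = t                      # cool-down window begins
        if t_start is not None and temp <= end_c:
            return t - t_start               # cool-down window ends
    return None


def limit_s(ambient_c: float) -> float:
    """Assumed limit: warmer intake air cools more slowly, so allow more time."""
    return 30.0 + 1.5 * max(0.0, ambient_c - 25.0)


def filter_clogged(samples: List[Sample], ambient_c: float,
                   start_c: float = 80.0, end_c: float = 70.0) -> bool:
    """True when the measured cool-down exceeds the limit for this ambient."""
    elapsed = cooling_time_s(samples, start_c, end_c)
    if elapsed is None:
        return False  # no complete cool-down seen, so no verdict
    return elapsed > limit_s(ambient_c)
```

With these assumed numbers, a cool-down from 80°C to 70°C that takes 55 s at 25°C ambient would raise the filter alarm, while the same 55 s at 45°C ambient would not.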

Additional firmware modifications were required to account for the basic construction of the BCT chassis. The BCT chassis framework was designed primarily around the cage holding eight processor blade slots in a horizontal orientation, which allows the major components on the processor blade to face up for cooling and to meet long-term vibration considerations. The 90-degree rotation and associated blade numbering differences are shown in Figure 2. Careful consideration was given to the top-to-bottom rather than bottom-to-top numbering schemes, though no clear usability benefit or customer preference for either was identified. However, this rather benign change did ripple through the management module and higher-level software stacks, which were all based on the assumption that multislot blades would utilize slots n, n + 1, etc. The final required firmware change handles the differences between the BladeCenter and BCT power management. In the BCT system, blades 1 through 4 are in power domain A, as opposed to blades 1 through 6 in the BladeCenter system. Also, switch module [6, 7] placement within power domains varies from the BladeCenter design, as illustrated in Table 1. The firmware must take these changes into account when determining whether there is enough available power to allow a blade to be fully enabled.

Figure 2: The 90-degree rotation and associated blade numbering differences between the BladeCenter and BCT systems.


Table 1 Power-supply differences in BladeCenter and BCT systems.

Power domain   BladeCenter                   BCT
A              Management modules 1 and 2    Management modules 1 and 2
               Media tray                    Media tray
               Blade slots 1–6               Blade slots 1–4
               Switch modules 1–4            Switch modules 1 and 2
B              Blade slots 7–14              Blade slots 5–8
                                             Switch modules 3 and 4
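
The blade-slot rows of Table 1 translate directly into a lookup that a power-budget check can use. The sketch below takes the slot ranges from the table. The wattage figures in the usage note and the names `domain_for_slot` and `can_power_on` are hypothetical, and a real budget would also count the switch modules, management modules, and media tray in each domain.

```python
from typing import Dict

# Blade-slot membership per power domain, from Table 1.
POWER_DOMAINS: Dict[str, Dict[str, range]] = {
    "BladeCenter": {"A": range(1, 7), "B": range(7, 15)},
    "BCT": {"A": range(1, 5), "B": range(5, 9)},
}


def domain_for_slot(chassis: str, slot: int) -> str:
    """Return the power domain ("A" or "B") feeding a blade slot."""
    for name, slots in POWER_DOMAINS[chassis].items():
        if slot in slots:
            return name
    raise ValueError(f"slot {slot} does not exist in a {chassis} chassis")


def can_power_on(chassis: str, slot: int, blade_watts: float,
                 capacity_w: Dict[str, float],
                 allocated_w: Dict[str, float]) -> bool:
    """True if the blade's domain has enough unallocated power for the blade."""
    domain = domain_for_slot(chassis, slot)
    return allocated_w[domain] + blade_watts <= capacity_w[domain]
```

For example, slot 5 sits in domain A of a BladeCenter chassis but in domain B of a BCT chassis, so the same blade in the same slot number draws on a different budget. With an assumed 1300 W per domain and 900 W already allocated in domain A, `can_power_on("BCT", 2, 300, ...)` succeeds and a 500 W request is refused.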

Blade and switch module designs for BCT

Blades and switch modules installed in the BCT for telecommunications deployment must obviously weather the same conditions as the chassis. Along with the environmental, electromagnetic, and seismic requirements of NEBS, the modules were reviewed for the following additional requirements:

  • NEBS precompliance testing: IBM test processes do not cover all NEBS environmental and physical stress conditions. IBM test plans had to be expanded, and extra test equipment, human resources, and time had to be budgeted for the additional tests needed to ensure that the NEBS compliance testing at the independent NEBS test laboratory did not result in a failure.
  • Low-profile handles for processor blades: The normal extraction handles on the front of BladeCenter blades require a depth of 31 mm. Because of the reduced depth of the BCT chassis, the extraction handles were redesigned to a depth of only 10 mm. This 21-mm reduction translated directly to an increase in the bend radius of cables in the rear of the chassis from 50 mm to 71 mm.
  • Increase in the thermal warning setting on switch modules: It was discovered that most switch modules had a thermal warning that was tuned to 35°C, which is the maximum temperature supported by the normal BladeCenter environment. This warning limit had to be increased to match the maximum of 40°C ambient required to conform to the NEBS specification for BCT.
  • Increase in the processor throttling temperature for processor blades: Blades are expected to operate at 100% performance up to the 40°C ambient temperature of the NEBS environment, rather than the maximum of 35°C for the enterprise environment.
  • Modification of graphical user interfaces on which chassis images are displayed: Particularly in the case of switch modules, the setup-and-control software graphics images depict the active links to the internal blade connections. These were modified to dynamically represent the horizontally oriented blades in the BCT system and the vertically oriented blades in the BladeCenter system.
  • Extended lifecycle management: Deployment in a telecommunications environment requires that the supplier have a manufacturing plan in place that calls for the blade or switch to be in production for five to seven years. This calls for a more watchful eye on end-of-life components, resulting in either last-time buys of critical components or designing in and testing part substitutions.
  • Telcordia CLEI** codes [8]: The telecom industry typically requires the assignment of a common language equipment identifier, or CLEI code, to the module. This code is entered into a database maintained by Telcordia and is used for inventory purposes and to indicate the capabilities of the registered module. The code is manifested by a barcode label on the module and is included in the VPD (if available) for querying by systems-management applications.

Despite this lengthy list, taking the processor blades and switch modules from the enterprise environment to the telecom environment did not require the board modifications that many designers had anticipated at the outset. We felt that this attested to the rigor of the IBM design and test community, particularly in the power, packaging, and cooling disciplines, aided by electromagnetic compatibility engineers with many years of design and test experience.

Adapting software to the telecom environment

For decades, telecom hardware and its associated software have been designed from scratch. Each equipment supplier has had its own unique set of proprietary systems and software, with little if any commonality. Over the years, service providers have adopted some portions of those proprietary designs and standardized on certain interfaces to allow for efficient monitoring and control of network equipment. The desire for standardized interfaces on the part of both service and equipment providers has increased in recent years, driven in part by economic conditions that required workforce reductions. The equipment supply chain can no longer afford to start with a blank sheet of paper. Common software building blocks are now recognized as key to the rapid deployment of new, revenue-generating services. With this new wave of standardization come requirements that the telecom supply chain support open-standards-based application programming interfaces (APIs) and maintain certain legacy APIs.

These requirements have been addressed in the BCT systems management software stack by relying on the IBM eServer xSeries systems management portfolio, with emphasis on the telecom-specific features incorporated in the IBM Director BladeCenter enterprise systems management, Cluster Systems Management (CSM), and management via SNMP.

The IBM xSeries and BladeCenter systems management strategy is one that encompasses many different hardware platforms and many different customer environments. Thus, not many changes were required to accommodate the systems management features related to BCT, since the portfolio of existing systems management products provides significant functionality that can be leveraged by BCT customers. However, some elements of the systems management stack are of particular interest in a telecom application.

There are fundamentally two systems management interfaces to consider: in-band management and out-of-band management. The in-band management interface is provided by agents that are operating system (OS)-specific applications that execute on the processor blade when it is operating. Out-of-band management uses a path that is external to the blade OS [in the case of BCT, via the management module and baseboard management controller (BMC) [4] interface, both of which can be accessed whether or not the blade is operational]. Figure 3 illustrates in-band and out-of-band management interfaces in a BCT environment.

Figure 3: In-band and out-of-band management interfaces in a BCT environment.

The in-band management interface is critical for telecom solutions because it supports high-availability middleware that executes on the blade and makes decisions related to task deployment. The decisions are based on the availability or status of blade components. Out-of-band management is important for several reasons. First, in BCT it is the only architected mechanism that provides some of the blade hardware management functions, such as power up, power down, and throttling controls. Second, the out-of-band path provides access to the blade health and asset information irrespective of the OS or hardware operating condition, thus providing a more fail-proof mechanism to manage the hardware. Architecturally, the BCT in-band and out-of-band management interfaces are compatible with the BladeCenter interfaces; hence, the existing systems management capabilities can be leveraged with only minor modifications related to specific BCT features.

IBM Director systems management

Asset and configuration management

Operations support services (OSSs) are used in the telecommunications industry to manage tasks related to billing and inventory management. It is vitally important to know what equipment is available and fully operational. IBM Director [9] provides a robust inventory task that stores all hardware information in a Structured Query Language (SQL) database on the management server. The OSS applications can mine the SQL database to retrieve inventory data from the IBM Director database and consolidate it with their other inventory information.
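
The kind of query an OSS application might run against such an inventory database is sketched below. The table name, columns, and rows are entirely hypothetical placeholders, and an in-memory SQLite database stands in for the management server database. The actual IBM Director schema is not described in this paper and will differ.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    """Create a stand-in inventory table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        "chassis TEXT, slot INTEGER, component TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?, ?)",
        [
            ("bct-01", 1, "blade", "ok"),
            ("bct-01", 2, "blade", "failed"),
            ("bct-01", 3, "blade", "ok"),
            ("bct-01", 1, "switch", "ok"),
            ("bct-02", 1, "blade", "ok"),
        ],
    )
    return conn


def operational_blades(conn: sqlite3.Connection, chassis: str) -> List[int]:
    """Slots in one chassis that hold a blade reported as operational."""
    rows = conn.execute(
        "SELECT slot FROM inventory "
        "WHERE chassis = ? AND component = 'blade' AND status = 'ok' "
        "ORDER BY slot",
        (chassis,),
    )
    return [slot for (slot,) in rows]
```

The result of such a query is what an OSS application would merge with its other inventory records to answer the question of which equipment is available and fully operational.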

Upward integration

Customers in the telecom industry have historically deployed enterprise-level systems management applications such as the HP OpenView** line of products [10], IBM Tivoli* Enterprise* Management [11], or Computer Associates Unicenter** [12] in their environments, with OpenView being the market leader. The expectation would be for IBM Director to be used as a complementary management application to collect detailed information on BCT and installed components and relay it to OpenView. The upward integration support feature of IBM Director is designed specifically to allow the integration of BCT inventory, alerts, and configuration with enterprise management applications used by the telecom customers.

Problem and fault management

Accurate and timely problem determination of hardware faults down to a failing field-replaceable unit (FRU) is critical to meet the rigorous telecom industry availability goal of 99.999% uptime (referred to as “five 9s”). This equates to a maximum downtime of about 5.3 minutes per year, including problem determination and FRU replacement.
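
The downtime figure follows directly from the availability target, as the short calculation below shows. The function name and the use of a 365.25-day year are choices made for this sketch.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime allowed by an availability target given in percent."""
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


downtime = max_downtime_minutes(99.999)  # roughly 5.26 minutes per year
```

Because the budget covers problem determination as well as FRU replacement, a single slow diagnosis can consume the whole year's allowance.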

IBM Director and BCT provide highly relevant tools to quickly focus on problem FRUs. IBM systems are among the industry leaders in the number of potential predictive failure analysis (PFA) alerts that can be generated and can give telecom operators advanced warning of impending failures. A variety of components, when matched with IBM firmware, are capable of generating PFA alerts. These include Small Computer System Interface (SCSI) drives, memory modules, central processing units, fans, voltage regulator modules, and power supplies. For example, if a blower is likely to fail within 16 hours, IBM Director can respond to a PFA notification by generating an alert that is sent to the telecom network administrator via an e-mail, a page, or another preconfigured mechanism. There are other notifications that IBM Director monitors that may lead to alerts being generated that contain the FRU numbers. These are particularly valuable to telecom service personnel by identifying the failing component in a system that may be in a remotely located CO. Additionally, IBM Director contains a rich set of action handlers that can provide the administrator with a choice of actions that can be taken when a critical alert occurs, for example, sending an e-mail or an alphanumeric page, running a program on the failing server, or running a script on the management server.

Real-time diagnostics (RTD) are tasks within IBM Director that provide the capability to run diagnostics against the BCT components while the blades are running live applications. RTD can detect problems with the various buses in BCT, such as the Inter-Integrated Circuit (I2C) and RS-485 buses, which are used for communications among the management modules, switch modules, and processor blades. The diagnostics can indicate which components have the light-path diagnostics LED on without requiring a technician to be present at the BCT chassis. The hardware status is presented by IBM Director via a traditional color-coded system-health user interface enabling administrators to quickly ascertain the health of critical components. RTD and the IBM Director dashboard are key enablers to the ability of BCT to satisfy the remote management requirements of the telecom industry.

Cluster Systems Management

As the telecom industry shifts toward commercial-off-the-shelf (COTS) components, Carrier Grade Linux** has been selected as the preferred operating system. Given the nature of COTS hardware and software, the favored implementation has been to use clusters of low-cost systems to meet the desired availability goals, such as five 9s. To fully leverage the advantages of clusters for Voice over Internet Protocol (VoIP), softswitch, and other telecom applications, the capabilities of Cluster Systems Management (CSM) become key enabling technologies. IBM has deployed a variety of Linux cluster-based solutions and gained a thorough understanding of the special requirements presented by those configurations; this experience is directly applied to telecom applications.

Linux as an OS has some unique systems-management requirements that are fundamentally different from those of Microsoft Windows**: for example, the bulk of the system and application configuration is performed using files stored in the /etc subdirectory. In a Linux cluster environment, it is therefore important to provide Linux operating-systems-management support across groups of nodes or across the entire cluster. For example, CSM provides the capability to synchronize files across the cluster or selectively update files on a per-machine basis from the management server under the direction of the configuration file manager.

Remote policy-based systems management via command-line tools is a key requirement of telecom network administrators. CSM satisfies this by surfacing 100% of its features and functions via the command-line interface, with cluster-specific enhancements as part of the distributed command execution manager (DCEM). DCEM provides the capability to run commands on multiple machines in the cluster simultaneously with real-time execution status based on fixed or transient system groupings.
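
The pattern of running one command on a group of nodes at once and collecting per-node status can be sketched as follows. This is not the DCEM implementation or its command syntax: `run_on_group`, the `Runner` callback type, and the stub `fake_runner` are hypothetical, and a real runner would invoke a remote shell instead of returning canned results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple

# A runner executes `command` on `node` and returns (exit code, output).
Runner = Callable[[str, str], Tuple[int, str]]


def run_on_group(nodes: Iterable[str], command: str, runner: Runner,
                 max_workers: int = 16) -> Dict[str, Tuple[int, str]]:
    """Run one command on every node concurrently; return status per node."""
    node_list = list(nodes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {node: pool.submit(runner, node, command) for node in node_list}
        return {node: future.result() for node, future in futures.items()}


def fake_runner(node: str, command: str) -> Tuple[int, str]:
    """Stand-in for a remote shell, used only to make the sketch runnable."""
    if node == "blade07":
        return (1, "unreachable")
    return (0, f"{node}: ran {command!r}")


results = run_on_group(["blade01", "blade02", "blade07"], "uptime", fake_runner)
failed = sorted(node for node, (code, _) in results.items() if code != 0)
```

Here `failed` ends up containing only `blade07`, which is the kind of per-node execution status an administrator would act on. The node list passed in plays the role of a fixed or transient system grouping.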

Hundreds of systems can be managed effectively by using CSM tools to ensure that the high-availability requirements of the network are met. An additional level of high availability is achieved by utilizing core technologies that sit on top of the Tivoli System Automation Manager. In the event that a management server fails or requires special maintenance, a backup management server takes over monitoring the cluster until the primary server is returned to service.

Management via SNMP

For decades, the telecom industry has depended on SNMP to manage disparate proprietary systems provided by the various network equipment manufacturers. Newer management interfaces and protocols, such as the Common Information Model (CIM) [13] and the Service Availability** Forum Hardware Platform Interface (HPI) [14], have begun to emerge. However, these methodologies are not expected to gain wide acceptance in the telecom industry for several years. Until a new standard is embraced, SNMP support will be required to effectively manage BCT platforms.

In the context of in-band management, using SNMP is accomplished by deploying an IBM Director agent that transforms the native CIM-based events into SNMP traps that can be sent upstream to any SNMP-oriented systems-management application. When using the out-of-band management approach, the management module supplies the SNMP agent, which performs a similar function by mapping internal events to SNMP traps that are sent upstream to an appropriate SNMP application.
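The mapping of an internal event to an SNMP trap can be sketched as a translation into a list of variable bindings. The two leading object identifiers (sysUpTime.0 and snmpTrapOID.0) are the standard ones that open every SNMPv2 notification. Everything else here is assumed for illustration: the event dictionary layout, the severity names, and the placeholder enterprise subtree, which is not taken from the IBM Director or management module MIBs.

```python
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"         # sysUpTime.0
SNMP_TRAP_OID = "1.3.6.1.6.3.1.1.4.1.0"      # snmpTrapOID.0
EXAMPLE_ENTERPRISE = "1.3.6.1.4.1.99999"     # placeholder, not a real MIB

# Hypothetical notification types, one per severity.
TRAP_OIDS = {
    "critical": EXAMPLE_ENTERPRISE + ".0.1",
    "warning": EXAMPLE_ENTERPRISE + ".0.2",
    "info": EXAMPLE_ENTERPRISE + ".0.3",
}


def event_to_trap(event: dict, uptime_ticks: int) -> list:
    """Translate an internal event into SNMPv2 trap variable bindings."""
    severity = event.get("severity", "info")
    trap_oid = TRAP_OIDS.get(severity, TRAP_OIDS["info"])
    return [
        (SYS_UPTIME_OID, uptime_ticks),
        (SNMP_TRAP_OID, trap_oid),
        (EXAMPLE_ENTERPRISE + ".1.1", event["source"]),
        (EXAMPLE_ENTERPRISE + ".1.2", event["message"]),
    ]
```

The receiving SNMP manager needs the matching MIB to turn the trailing object identifiers back into meaningful names, which is the role of the MIB files in an SNMP manager application.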

In both cases, detailed management information bases (MIBs) are available for use in the SNMP manager application to interpret the contents of the traps. The IBM Director MIB is located on the blade after installation, whereas the management module MIB can be downloaded from IBM over the Web. The BCT management module MIB not only accounts for the chassis differences with respect to the BladeCenter system, but it also makes available telecom-specific features, such as the telecom alarm panel.

Management via IPMI

Another legacy management interface required by some telecom customers is the Intel** Intelligent Platform Management Interface (IPMI) [15], which defines standardized interfaces used primarily to monitor and control server system health. The key characteristic of IPMI is that the functions are available independently of the main processor. Access to information such as temperatures, voltages, fan speeds, and power supply status is provided via the IPMI sensor model.
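A threshold-based sensor of the kind exposed through the IPMI sensor model can be pictured as a reading compared against noncritical and critical limits. The sketch below is a simplified illustration rather than an IPMI implementation: the class, its field names, the two-level classification, and the example limits are assumptions, and real sensor data records carry more thresholds and encoding detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdSensor:
    """Simplified threshold sensor; a limit of None is not monitored."""

    name: str
    unit: str
    lower_critical: Optional[float] = None
    lower_noncritical: Optional[float] = None
    upper_noncritical: Optional[float] = None
    upper_critical: Optional[float] = None

    def classify(self, reading: float) -> str:
        """Return 'critical', 'warning', or 'ok' for a reading."""
        if self.upper_critical is not None and reading >= self.upper_critical:
            return "critical"
        if self.lower_critical is not None and reading <= self.lower_critical:
            return "critical"
        if self.upper_noncritical is not None and reading >= self.upper_noncritical:
            return "warning"
        if self.lower_noncritical is not None and reading <= self.lower_noncritical:
            return "warning"
        return "ok"
```

Because the service processor evaluates such limits itself, an event can be raised even when the main processor or the operating system is not running.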

The Advanced Telecom Computing Architecture (ATCA**) [16] specification, developed for building modular computing platforms by the PCI Industrial Computers Manufacturers Group (PICMG), is similar in function to the BladeCenter system but oriented more toward telecom input/output interfaces. ATCA extends the functionality of IPMI using PICMG-specified extensions. These additional commands primarily adapt IPMI for modular systems in support of discovery and management of hot-swappable FRUs.

The BCT management module firmware provides an IPMI management feature, hereafter referred to as IPMI Northbound (IPMI-N). IPMI-N enables out-of-band platform management of various BCT components via the management module external Ethernet port. To a reasonable extent, IPMI-N makes BCT appear like an ATCA shelf to any industry-standard IPMI-based system-management application.

IPMI-N implements an ATCA abstraction layer that uses existing BladeCenter internal mechanisms to manage system health via the management module and blade service processors. The ATCA abstraction layer consists of the shelf manager, logical intelligent platform management bus (IPMB), and the blade intelligent platform management controller (IPMC) instances. Among the important functions provided by IPMI-N are the following:

  • Single point of IPMI-over-LAN systems-manager interface.
  • Ability to transparently manage processor blades that contain either an IPMI BMC or a legacy H8 service processor.
  • Sensor data record repository for chassis sensors and blade controllers.
  • Centralized system event log to capture chassis and blade sensor events.
  • FRU inventory.
  • Chassis-level sensor monitoring and event generation.
  • LAN messaging using the primary Remote Management Control Protocol, User Datagram Protocol (RMCP UDP) port.
  • Bridging support to allow addressing of commands to blades.
  • Platform event filtering and alert policies.
  • LED controls (includes telecom alarm panel).
  • Active and backup management module instances for redundancy.
  • Support for FRU hot-swap sensors on behalf of blades.

The systems manager uses a logical IPMB address to identify a processor blade in the IPMI SEND MESSAGE command that it exchanges with the shelf manager. There is a fixed mapping of the IPMB address to the slot number defined by ATCA. This mapping is used by IPMI-N to assign the IPMB addresses to represent each blade IPMC. The actual processor blade IPMI command is encapsulated within the SEND MESSAGE command. These commands are handled either internally by IPMI-N or translated and passed on to BMC blades over the internal RS-485 bus.
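The addressing and encapsulation described above can be sketched in a few lines. The Send Message command (network function App, command 34h), the Get Device ID command (01h), the IPMB request layout, and the two's-complement checksums follow the IPMI specification. The slot-to-address formula is an assumption based on the common ATCA convention (hardware address 41h for logical slot 1, with the IPMB address equal to the hardware address shifted left by one bit); the paper does not give the mapping, and the function names are hypothetical.

```python
NETFN_APP = 0x06
CMD_GET_DEVICE_ID = 0x01
CMD_SEND_MESSAGE = 0x34
SHELF_MANAGER_ADDR = 0x20


def ipmb_address_for_slot(slot: int) -> int:
    """Logical IPMB address for a slot (assumed ATCA convention)."""
    if not 1 <= slot <= 16:
        raise ValueError("slot must be between 1 and 16")
    return (0x40 + slot) << 1


def checksum(data: bytes) -> int:
    """Two's-complement checksum: data plus checksum sums to 0 mod 256."""
    return (-sum(data)) & 0xFF


def ipmb_request(rs_addr, netfn, cmd, data=b"",
                 rq_addr=SHELF_MANAGER_ADDR, rq_seq=0):
    """Build an IPMB request addressed to rs_addr (LUN 0 on both sides)."""
    header = bytes([rs_addr, (netfn << 2) & 0xFF])
    body = bytes([rq_addr, (rq_seq << 2) & 0xFF, cmd]) + data
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


def send_message_payload(channel: int, inner: bytes) -> bytes:
    """Data field of a Send Message request: channel, then inner message."""
    return bytes([channel & 0x0F]) + inner


if __name__ == "__main__":
    blade = ipmb_address_for_slot(3)
    inner = ipmb_request(blade, NETFN_APP, CMD_GET_DEVICE_ID)
    print(send_message_payload(0, inner).hex(" "))
```

A management application would send this payload to the shelf manager over the LAN session; the shelf manager then either answers on behalf of the blade or bridges the inner request to it.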

Integrated platform for telecommunications

Telecommunications service providers are struggling with the transformation of their revenue model. Traditionally, the largest percentage of their revenue came from hardware and voice communications. Today, indicators from telephone and cable bills portend that the largest slice of future revenue will be derived from communications-related services. Hence, the service providers must look for ways to reduce their expenses in delivering these new services to the market.

Network equipment providers make systems that are highly available (5–9s) for the telecom industry. This level of availability cannot be achieved through reliable hardware alone; software that is able to fail over to a backup application in a short period of time (0.0001 seconds or less) is necessary. To do this, the software must be developed with detailed knowledge of the hardware. The base platform necessary to accomplish this from IBM is the eServer Integrated Platform for Telecommunications (IP–T), based on open standards.

IP–T integrates into a single platform four open-standards-based product categories that require knowledge of the system topology: hardware, high-availability services, systems management, and OS. This enables telecom equipment manufacturers to concentrate on advancing their applications, developing new value-added applications, and penetrating new market opportunities (e.g., Internet Protocol-based communications in the enterprise), by insulating them from the churn of platform advances.

With IP–T, IBM has specifically targeted the following five key factors that challenge today's telecommunication equipment manufacturers:

  • Lower cost: By combining the larger volumes and highly competitive nature of the enterprise marketplace with our IP–T hardware and systems management products, we can lower product cost. Linux is the lowest-cost highly reliable OS. Our high-availability software, developed to the open standards of the telecommunications marketplace, will be offered to the industry on other platforms, spreading the development cost.
  • Decreased complexity: IP–T is an integrated product, which means that it offers high assurance that all components work together. The interfaces with these products are industry-standard, so IP–T can remain current with evolving industry standards and technology updates. By building new applications on the IP–T platform level instead of hardware-dependent interfaces, telecom equipment manufacturers can be shielded from the complexities of technology advancement and be able to focus their energy on the new potential of Internet Protocol communications, such as VoIP, messaging, conferencing, and location-based services.
  • Increased revenue: With platforms that support IP communication systems, telecom equipment manufacturers will be able to offer new products and services that complement and extend their current voice products. Because the IP–T spans both the enterprise and telecom marketplaces, telecom equipment manufacturers may offer their new products and services in both.
  • Faster time to market: By focusing on the applications rather than a proprietary platform, telecom equipment manufacturers will be able to bring their applications to market faster. They will also be able to incorporate new IP–T technologies (such as faster processors and new Linux distributions) more rapidly and more comprehensively, since they do not have to sponsor these development activities.
  • Partnering: In today's global economy, partnering for solutions is essential to remaining competitive. IBM is a recognized leader in hardware, software, and technical services.

Although the primary target for IP–T is the telecom network products sector, it is also applicable to products in the enterprise, government, and financial sectors. The wireless and wireline telecom companies will benefit by having an open platform as the base for developing innovative Internet Protocol-based communication services.

The following sections provide an overview of each of the four key features: high-availability services, systems management, OS, and integration.

High-availability services

The term high-availability services indicates the set of services that provide the framework, application program interfaces, and failover services upon which highly available applications may be written. The Service Availability Forum is the group whose function it is to provide the standards for this set of software. As of mid-2005, the specification level is B.01.01, with work ongoing to incorporate more facets of the high-availability environment. The current specification includes the availability management framework and availability services.

The availability management framework is the software entity that provides service availability by coordinating redundant resources within a cluster to deliver a system with no single point of failure. Availability services are the core application programming interfaces and services on which the availability management framework and high-availability applications are implemented. They include the following:

  • Cluster membership service: Provides applications with cluster membership.
  • Checkpoint service: Provides for data replication between copies of an application and process recovery on a failure.
  • Event service: Provides a publish-and-subscribe, multipoint-to-multipoint event service.
  • Message service: Provides a reliable, fault-tolerant interprocess message delivery and queuing mechanism.
  • Lock service: Provides a cluster-wide locking service for shared resources.

Systems management

It is in the nature of the telecom industry to install solutions in geographically and physically dispersed areas, which presents interesting logistical problems if failures occur. The optimum solution is to have a way to deploy, control, diagnose, update, and provide interfaces to higher-level management software in these locations without having to be physically present. To fulfill this need [17], the collection of systems-management technologies described in the previous sections of this paper has been made an integral part of IP–T. Because these tools must apply to both the core network and enterprise solutions, the BladeCenter and BCT management technologies provide a seamless method to cover both.

IP–T systems management takes advantage of the following tools to accomplish the tasks listed above:

  • Deploy: IBM Remote Deployment Manager and/or CSM.
  • Control: IBM Director and its extensions.
  • Diagnose: IBM Real Time Diagnostics.
  • Update: IBM UpdateXpress and Tivoli NetView* Distribution Manager.
  • Interface with higher-level management: Upward integration modules for IBM Director (IBM Tivoli, HP OpenView, etc.), SNMP, and IPMI-N.

Operating system

Because of its widespread acceptance in the telecom marketplace, the operating system of choice for IP–T is Linux. The Open Source Development Labs** Carrier Grade Linux working group has documented two specifications of requirements necessary for enhancements to Linux to make it robust enough for the telecom marketplace and is working on the next set. The code supporting these specifications has been accepted by the leading companies that provide Linux distributions, and it is being rolled into their new Linux releases.

The Carrier Grade Linux 2.0 Priority 1 requirements are split into three sections: general systems, clustering, and security. These requirements were driven by a variety of sources, including legacy features (e.g., functions available in Sun Microsystems Solaris** or proprietary operating systems) and those needed to meet future network needs, such as Internet Protocol Version 6 (IPv6). The Priority 1 recommendations related to general systems, clustering, and security are documented at the Open Source Development Labs Web site [18]. A number of these recommendations require the Linux 2.6 kernel.

Integration

The IP–T pulls together these key components into an integrated platform package. This package greatly improves the ability of telecom equipment providers and independent software vendors to innovate and compete in their core strengths and business. High-availability applications can be written and tested on one platform and deployed in either a telecom or enterprise environment, or both.

IBM can augment the IP–T product by providing consultation services, application preload and configuration services, customer installation, IBM middleware components, third-party applications, and long-term support to meet customer requirements.

Customer solutions with BCT

CIRPACK is an early BCT adopter for next-generation networking infrastructure. Flexibility is key to their providing solutions to their customers [19]. CIRPACK has designed a special-purpose telecom-oriented blade, the Public Telephony Gateway (PTG), that serves as the media gateway foundation for a broad array of applications. Controlling the PTG blades are the CIRPACK High Velocity SoftSwitch blades, a redundant pair of HS20 or HS40 blades, and blade storage expansion units preloaded with an OS and application software. The PTG is the first special-purpose blade developed under the auspices of the BladeCenter Alliance program. The PTG blade connects to the BCT Gigabit Ethernet backplane, supporting up to 2,048 VoIP channels and up to 63 E1 interfaces or 1,024 VoIP channels and 1,024 AAL2 VoATM (voice over asynchronous transfer mode) channels via optical connections along the front panel of each blade. This design allows the CIRPACK Carrier-Class SoftSwitch to be integrated into time-division multiplexing (TDM), ATM, or VoIP networks, providing a bridge between the various technologies. This is key to CIRPACK's ability to support legacy, current, and future network infrastructure with a wide selection of Class 4 and Class 5 features. By utilizing BCT with the Intel-processor-based HS20 and HS40 processor blades running Linux as the building blocks for its softswitch, CIRPACK was able to leverage software from its partners to rapidly deliver complete customer solutions for a variety of services: wireless, wireline, digital subscriber line, cable, and traditional plain old telephone service (POTS). The solutions available include a variety of Class 4 switch capabilities, such as local number portability, 800/900 number call routing, detailed billing generation, and many others. The Class 5 subscriber services include features such as conference calling, follow me, call forwarding, Centrex, and even interactive wakeup calls. These capabilities and more were available for the CIRPACK customer field trials in a little more than a year from project inception, due in large part to the flexibility and industry attributes of BCT.

By using the combination of BCT and the BladeCenter PCI I/O expansion unit (PEU) [20], current rack-server-based applications can be ported ideally to a BladeCenter system to take advantage of the integrated features, the single point of control for management, and its rugged design. Using current independent voice recognition and DSP-based intelligent peripheral component interconnect (PCI) adapters plugged into a PEU, a telephone-switch-in-a-box solution can be offered to customers, including small or medium-sized businesses. This type of solution can support connections into current T1/E1 interfaces, provide echo-cancellation services and dialed or spoken-digit recognition, and offer basic voice prompts; in short, it is a complete, interactive voice-response application. Deploying the solution in a BCT enables the use of the same technologies available to the service-provider industry, ensuring that the critical connections between businesses and their customers are available around the clock. Additional solutions enabled by the PEU and BCT, listed below, are made available through the IBM ServerProven* program, which brings together key PCI adapter suppliers and IBM to jointly validate hardware and software configurations.

  • Secure communications with PCI-X-based IPSec (Internet Protocol security specification) accelerators from Interphase** supporting 3DES encryption, Advanced Encryption Standard (AES), secure hash algorithm (SHA), and standard APIs, such as FreeS/WAN for Linux.
  • Signaling System 7 (SS7) end points using PCI adapters from Ulticom** or Interphase, including MTP3 and MTP2 software stacks, respectively.
  • Protocol conversion via packet processing adapters, such as the Radisys** ENP-2611, which includes an Intel IXP2400 network processor.
  • Legacy communications interconnects from the Interphase T1/E1/J1 adapter.

BCT will also be utilized in a variety of customer solutions which require a more robust overall design that continues to leverage the economies of scale afforded by industry-standard components (for instance, medical imaging, in which state-of-the-art scanning technologies generate very-high-resolution images which are complex representations of internal organic structures that must be correctly and quickly interpreted). Processing these images requires numerous steps that tend to be handled in both sequential and parallel ways, depending on the step in the process. Granular and substantial processing power, high-speed interdevice communication, and the ability to connect into existing scanning equipment are all attributes associated with a blade-based system. Add to those basic requirements the deployment of the solutions in a portable or semitransportable vehicle to deliver the services at the locations at which they are needed—and attributes such as volumetric size, operation from nonstandard power sources, and the ability to function in harsh environments become critically important. This is clearly an application in which BCT will deliver the key ingredients at a cost that will help ensure that advanced medical technologies are available to those in need.

BCT usefulness extends beyond the telecom industry. It is a platform that customers can use to rapidly develop and go to market with solutions that meet a variety of demanding applications. It is a high-performance industry-standard server with integrated networking and advanced systems management that is being delivered in a package that has been designed for and tested to the rigorous NEBS criteria.

Conclusion

The BCT platform is an extension to IBM BladeCenter technology, developed as a platform upon which solutions can be built by telecom end-users and hardware and software suppliers. With it, server blades can be brought into the rapidly expanding segments of IP-based telecommunications. New telecom applications, such as voice over IP and call control, and high-bandwidth media applications are increasing the need for high-performance processing at the core of the telecom network. The scalability and serviceability of blade technology, supported by the telecom-ready chassis, is a combination that provides telecommunications service providers with reliable, stable IP platforms.

Acknowledgments

The concept of a NEBS-compliant BladeCenter product was an integral part of the early discussions for the platform, and the blade format was determined on the basis of telecom requirements for a 20-in.-deep chassis. The original development team was led by Ben DeLuca of IBM Austin, who assembled a multisite team from Austin, Texas; Rochester, Minnesota; and Research Triangle Park, North Carolina, to address the need for a telecom product based on xSeries processor blade technology. The design requirements were guided largely by the efforts of Bruce Anthony, IBM Distinguished Engineer, and Jim Pertzborn, Vice President, Telecommunications Industry for Systems and Technology Group.

Implementation of the product requirements became an integral part of the collaboration between IBM and Intel on the BladeCenter product family. Therefore, it fell largely to the Intel mechanical and hardware design team in Columbia, South Carolina, to take the BladeCenter “DNA” and create from it the BCT. IBM would like to acknowledge the professional dedication and personal sacrifice made by the Intel development team headed by Juha Salenius, managed by Jim Bringley (a former IBM employee), and consisting of the following designers: David Redys, Brooker Strom, Jim Kluttz, Daniel Wong, Billy Taylor, Brandy Walters, Jim Waring, and Jonathan Boyce. Special thanks go to Steve Hackett of Intel for his expertise in the area of NEBS compliance.

*Trademark or registered trademark of International Business Machines Corporation.
**Trademark or registered trademark of Telcordia Technologies, Inc., Hewlett-Packard Development Company, L.P., Computer Associates International, Inc., Linus Torvalds, Microsoft Corporation, Service Availability Forum, Intel Corporation, PCI Industrial Computer Manufacturers Group, Open Source Development Labs, Inc., Sun Microsystems, Inc., IEEE, Interphase Corporation, Ulticom Inc., and RadiSys Corporation in the United States, other countries, or both.

References

Received December 16, 2004; accepted for publication February 21, 2005; published online October 7, 2005.

