IBM Research
  Deep Thunder

Precision Local Modelling for Weather-Sensitive Business Operations

For many applications, expected local weather conditions during the next day or two are critical factors in planning operations and making effective decisions. Typically, whatever optimization is applied to these processes to enable proactive efforts uses either historical weather data as a predictor of trends or much broader-scale weather models. Hence, there is a need to improve the accuracy of local forecasts. One solution to this problem is the introduction of additional tools and data focused on the region of interest. Consider numerical weather prediction models that are typically run at relatively low resolution over a large geographic region, that is, at a synoptic to meso-beta scale. Such models are used, for example, by the National Weather Service (NWS) as provided by the National Centers for Environmental Prediction (NCEP) in the United States, and by the European Centre for Medium-Range Weather Forecasts (ECMWF) for various agencies in Europe.

Meteorologists employ this class of model output, their knowledge of the region in question and local conditions derived from in situ and remotely sensed observations to arrive at a final forecast. Unfortunately, the resolution at which these models usually operate is often too coarse to capture local phenomena such as thunderstorms, wind shear and land-sea breezes. Since these types of weather phenomena are often critical for weather-sensitive decision making, there can be a mismatch between the temporal and/or spatial scale of such models and that of many business operations. Schematically, this can be illustrated by the image to the lower left, which shows a low-resolution grid overlaid on the United States.


However, forecasts can be substantially improved with the application of regional and local numerical modeling techniques that provide predicted information at a meso-gamma scale, with greater emphasis on cloud microphysics and land surface processes. Such models do not replace the utilization of broader-scale simulations, but supplement them by using their results to establish boundary conditions for the cloud-scale model, which computes predictions at greater resolution. Schematically, this idea is shown with the image to the right, which illustrates an original mesh from a synoptic-scale model over Georgia with finer meshes embedded corresponding to the grid(s) that a mesoscale model would employ. Operation at higher resolution provides the capability to capture predictions at a finer scale.
The application of these models implies the generation of larger volumes of data, for which improved facilities, including visualization, are required for timely assessment and utilization (e.g., the image to the left, which you can click to view an animation at higher resolution). Unlike efforts in atmospheric science research environments, advanced methods of three-dimensional visualization have not been widely used in mission-critical, operational settings.

IBM Research has focused on providing an operational meso-gamma-scale numerical weather prediction solution for many applications, complementary to, yet dependent on, the capabilities available through agencies like NCEP, to demonstrate both the business and meteorological value at a practical cost. This service or system has been dubbed "Deep Thunder". These capabilities are a significant extension of some of the technology first proven during four months of operations in support of the 1996 Summer Olympic Games in Atlanta for the NWS office in Peachtree City, GA, where it achieved a high level of accuracy in its weather forecasts. More importantly, the current work has focused on the business applications and the enabling of customized services. As background, there is a discussion of the science, technology and results of the Atlanta Olympics project published in a journal paper (John S. Snook, Peter A. Stamus, James Edwards, Zaphiris Christidis, John A. McGinley. Local-Domain Mesoscale Analysis and Forecast Model Support for the 1996 Centennial Olympic Games. Weather and Forecasting, 13, no. 1, pp. 138-150, January 1998). An overview of the entire forecasting effort for the Olympics is also available for your reference (Lans P. Rothfusz, Melvin R. McLaughlin, Stephen K. Rinard. An Overview of NWS Weather Support for the XXVI Olympiad. Bulletin of the American Meteorological Society, 79, no. 5, pp. 845-860, May 1998).

There is a growing recognition within the meteorological community of the maturation of mesoscale models for operational use in many applications. A paper (Ying-Hwa Kuo, Clifford F. Mass. Regional Real-Time Numerical Weather Prediction: Current Status and Future Potential. Bulletin of the American Meteorological Society, 79, no. 2, pp. 253-263, February 1998) summarizes these ideas from a perspective independent of this project. There is also a promising future for applications of these ideas, as discussed by Robert Gall and Melvyn Shapiro (The Influence of Carl-Gustaf Rossby on Mesoscale Weather Prediction and an Outlook for the Future. Bulletin of the American Meteorological Society, 81, no. 7, pp. 1507-1523, July 2000) and many others (e.g., Storm-in-a-Box Forecasting. Science, 304, pp. 946-948, May 14, 2004). The American Meteorological Society has developed a policy statement on Weather Analysis and Forecasting. It outlines the different scales of forecasts, including the type discussed herein, as well as the issues and limitations concerning such capabilities.

This service utilizes a system that consists of several hardware and software components in an integrated environment, as illustrated in the figure to the left: a high-performance computer system (IBM pSeries), a forecasting model (e.g., RAMS, MM5 or WRF), a data assimilation package (e.g., LAPS, WRF 3DVAR), visualization software (Data Explorer), and associated peripherals. The effort supporting the Olympics represented the first operational use of such an architecture, in a more limited implementation. As the system evolved, it was demonstrated producing actual forecasts at several American Meteorological Society and Supercomputing conferences as well as a number of other venues. Example results are available for you to examine. A very early version was also adapted for use at the China Meteorological Administration in Beijing, where it was installed in May 1997. An important aspect of this architecture is the understanding that the users of the forecasts "sit" at the right-hand side of this figure. Their requirements enable working backward to determine how the information needs to be disseminated and visualized, which leads to a specification of the model configuration to adequately capture relevant geography and local physics, and indicates how the forecasts should be generated.

IBM has developed a number of high-performance computer systems. Initially, the IBM RS/6000* Scalable Power Parallel (SP*) was used. It was a distributed-memory MIMD parallel computer consisting of a cluster of two to 512 RS/6000 processor nodes, which communicated via a multi-stage interconnect (the SP Switch). Each node was an SMP configuration and was packaged in one or more frames (racks). A node had two, four, eight or 16 Power3 processors at 375 or 450 MHz. The two- or four-way nodes were "thin" or "wide". Four-, eight- or 16-processor Power3 nodes were available as "high" nodes. Wide nodes had additional expansion slots compared to thin nodes. High nodes had larger cache, increased memory bandwidth and additional slots compared to wide nodes. A frame had up to 16 thin nodes, 8 wide nodes or 4 high nodes in various combinations. These nodes communicated with each other by sending and receiving packets through the SP Switch. There were three types of switches, which provided a low-latency, high-bandwidth fabric for bi-directional data transfer. The original SP Switch provided a peak transfer rate of 300 MB/second between thin or wide node pairs. Its immediate replacement, the SP Switch2, was capable of a 1 GB/second transfer rate between high node pairs.

The RS/6000 SP was replaced with an architecturally similar system called a pSeries Cluster 1600*. For this newer system, the nodes can include the aforementioned SP nodes. More importantly, they can utilize the newer and more powerful Power4-based SMP nodes (1 to 1.7 GHz) and Power5-based SMP nodes (1.9 to 2.2 GHz). Power4 and Power5 nodes are available in several SMP configurations ranging from a single CPU to 64 CPUs. Power5 nodes may consist of single- or dual-core processors. They utilize the latest version of IBM's switch technology, called the High Performance Switch (HPS), which can provide up to 4 GB/second performance. Efficient use of the SP or Cluster 1600 enables timely execution of the other (software) components of the system. Depending on the size and resolution of the domain over which forecasts are being produced and the configuration of the SP, the continual generation of new 24-hour forecasts on an update cycle every few hours can easily be supported. Many organizations (e.g., national meteorological service agencies worldwide) use this same type of computing system for weather modelling.

More information about the IBM SP server portion of this work, as originally implemented for the 1996 Olympics, is available for you to read. Although this configuration used hardware that is now obsolete, the approach to designing a high-performance and high-reliability weather forecasting server still applies. Information about the current version is available. The level of performance achieved by such a system can now be met by a much smaller configuration at significantly lower cost using more modern hardware technology. For example, in November 1999 using ten 200 MHz thin 2-way Power3 nodes, a triply nested domain at 16, 4 and 1 km resolution with 48, 12 and 3 second time steps, respectively, and 31 vertical levels generated 5 to 6 hours of forecast time in an hour of compute time. Regular forecasts using a similar nested configuration for the New York City area with full microphysics were originally generated on seven 375 MHz 4-way Power3 nodes, typically at less than 12 hours of forecast time in an hour of compute time. Later migration to eleven nodes (42 CPUs) reduced that time by about 25%. Using five 4-way Power4-based nodes (the next generation technology) reduced the time of a 24-hour forecast with this configuration to about 50 minutes. The current Power5-based nodes should improve the performance by another factor of two. Additional information about the current operational implementation for New York City is available for you to read. For model runs currently being done for other metropolitan areas (e.g., Chicago) using 66x66x31 grids at 32, 8 and 2 km resolution (100, 25 and 6.25 second time steps), a 24-hour forecast with cloud microphysics is completed in about 50 minutes using 42 CPUs. With twenty 375 MHz Power3 CPUs, the model run requires about 75 minutes, while with twenty 1.7 GHz Power4 CPUs, only about 35 minutes is required.
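The nested configurations quoted above both shrink the grid spacing and the time step by a factor of four from one nest to the next. The following is a minimal Python sketch of that arithmetic, together with the ratio of forecast time to compute time. The function names are hypothetical, and the 4:1 ratio is inferred from the numbers in the text rather than taken from the model's actual configuration files.

```python
def nested_grids(outer_dx_km, outer_dt_s, ratio=4, levels=3):
    """Return a list of (grid spacing in km, time step in s), outermost nest first.

    Assumption: each nest divides both spacing and time step by `ratio`.
    """
    return [(outer_dx_km / ratio**i, outer_dt_s / ratio**i) for i in range(levels)]


def forecast_to_compute_ratio(forecast_hours, wall_clock_minutes):
    """Hours of forecast produced per hour of compute time."""
    return forecast_hours * 60.0 / wall_clock_minutes


# 16/4/1 km configuration with 48/12/3 second time steps
fine = nested_grids(16.0, 48.0)

# 32/8/2 km configuration with 100/25/6.25 second time steps
coarse = nested_grids(32.0, 100.0)

# A 24-hour forecast completed in about 50 minutes
ratio_50_min = forecast_to_compute_ratio(24, 50)

for dx, dt in fine + coarse:
    print(f"dx = {dx:5.2f} km, dt = {dt:6.2f} s")
print(f"forecast hours per compute hour: {ratio_50_min:.1f}")
```

Scaling the time step with the grid spacing keeps the ratio of the two constant across nests, which is the usual way to keep an explicit time-stepping scheme stable as resolution increases.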

The model that has been used for this effort is a highly modified version of the Regional Atmospheric Modeling System or RAMS (Pielke et al., 1992), and is also derived from the aforementioned work supporting the 1996 Centennial Olympic Games. This customized modelling system fully exploits the parallel processing power of the SP, and has a number of other custom enhancements. It is suitable for regional, mesoscale and cloud-scale atmospheric simulations, using a series of interconnected modules that allow simulation and prediction of atmospheric phenomena ranging from global to cumulus cloud scales. Only those options required for a specific purpose need be employed. Horizontal grid sizes can range from two meters (simulation of flows around buildings) to greater than 100 km (global circulation modeling). It has detailed planetary boundary layer representations and two-way multiple nested interactive grids in both horizontal and vertical dimensions (up to 40 layers). The model can be run either hydrostatically or non-hydrostatically, and can employ uniform or variable land use, topography, roughness, soil moisture and water temperature.
 

There are selectable options for turbulence schemes, finite differencing schemes, geographic coordinate systems, and upper and lateral boundary conditions. The initial conditions are derived from a pre-processing, assimilation step. The model generates basic atmospheric state variables, such as wind, temperature, pressure and moisture, at each model grid point and time step. From these predicted variables, a wide variety of diagnostic variables and parameters can be obtained. These parameters include turbulence, vorticity, stability indices, sound propagation, visibility, air density, refractive indices, cloud liquid water, precipitation rate, etc. In recent years, there has been a significant effort to develop more comprehensive community models as a framework to incorporate advances in both physics and software, and to leverage the best features of these and other earlier models. Of particular note is the Weather Research and Forecasting or WRF model.
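To illustrate how diagnostic quantities can be derived from predicted state variables, here is a minimal Python sketch for two of the parameters named above, air density and vorticity. It is not the model's own code: it assumes dry air and the ideal gas law for density, a uniform horizontal grid with centred finite differences for vorticity, and the function names are hypothetical.

```python
R_DRY_AIR = 287.05  # specific gas constant for dry air, J/(kg K)


def air_density(pressure_pa, temperature_k):
    """Dry-air density in kg/m^3 from the ideal gas law, rho = p / (R_d * T).

    Assumption: moisture is ignored.
    """
    return pressure_pa / (R_DRY_AIR * temperature_k)


def relative_vorticity(u, v, dx_m, dy_m):
    """Vertical relative vorticity dv/dx - du/dy at interior grid points.

    u and v are 2-D lists indexed [j][i], with j increasing northward and
    i increasing eastward. Boundary points are left at 0.0.
    """
    ny, nx = len(u), len(u[0])
    zeta = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dvdx = (v[j][i + 1] - v[j][i - 1]) / (2.0 * dx_m)
            dudy = (u[j + 1][i] - u[j - 1][i]) / (2.0 * dy_m)
            zeta[j][i] = dvdx - dudy
    return zeta


# Sea-level standard conditions (illustrative values)
rho = air_density(101325.0, 288.15)

# Solid-body rotation u = -omega*y, v = omega*x on a 3x3 grid with 2 km spacing;
# its vorticity should be 2*omega everywhere.
omega, dx, dy = 1.0e-4, 2000.0, 2000.0
u = [[-omega * j * dy for i in range(3)] for j in range(3)]
v = [[omega * i * dx for i in range(3)] for j in range(3)]
zeta_centre = relative_vorticity(u, v, dx, dy)[1][1]
```

The solid-body rotation field is used only as a check, since its vorticity is known analytically.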


The modeling component utilizes current observations processed through the data assimilation component as initial conditions and results from a synoptic-scale model for boundary conditions, and computes a forecast typically from six to 48 hours from that point in time. The results are the most accurate in the first portion of the prediction, while, as expected, the confidence in the model accuracy decreases over time. Since the model is parallelized on the pSeries Cluster 1600 and can execute relatively quickly, and updated observations and lower-resolution model data are available on a regular basis, one can maximize the accuracy of the numerical predictions by running the model on a regular, operational basis. This is illustrated in the figure to the left as an example, where a 24-hour run is always completed every three hours (i.e., eight times per day). The initial and boundary conditions are refreshed at each execution with the latest available data. There is overlap in time from each run to the next, so that essentially one is constantly sliding a six-hour window of the most accurate forecasts through time.
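The update cycle described above can be sketched in a few lines of Python: runs start every three hours, each covers 24 hours, and for any valid time the most recently initialized run that covers it has the shortest lead time. This is only an illustration. The date is arbitrary, the function names are hypothetical, and the time needed to compute and deliver each run is ignored.

```python
from datetime import datetime, timedelta


def run_start_times(day_start, cycle_hours=3, runs_per_day=8):
    """Initialization times of the runs started on one day."""
    return [day_start + timedelta(hours=cycle_hours * i) for i in range(runs_per_day)]


def freshest_run(valid_time, run_starts, forecast_hours=24):
    """Most recent run whose forecast period contains valid_time, or None."""
    length = timedelta(hours=forecast_hours)
    covering = [s for s in run_starts if s <= valid_time <= s + length]
    return max(covering) if covering else None


day = datetime(2006, 9, 25)           # arbitrary example date
starts = run_start_times(day)         # 00, 03, 06, ..., 21 hours
valid = datetime(2006, 9, 25, 14, 30)

best = freshest_run(valid, starts)
lead = valid - best

print(f"runs covering {valid:%H:%M}: use the {best:%H:%M} run, lead time {lead}")
```

Because a new run is available every three hours, the lead time of the freshest forecast for any valid time stays short, which is where the model results are most accurate.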

This new capability of producing high-resolution, numerical model data requires a change in how operational meteorologists, decision makers in weather-sensitive operations, etc. utilize the results so that they might quickly assess, analyze and disseminate the data for the formulation or utilization of a forecast. Since large volumes of complex data for each run are quickly produced, the use of traditional graphical representations of data for forecasters and especially for non-meteorologists can be burdensome. Instead of static or simple flip-book animations of two-dimensional techniques like contour maps, novel three-dimensional visualization strategies are employed. An example of such a visualization is at the beginning of this page.

Visualization Data Explorer* (DX) is used to create new visualization methods to help support forecasting and decision making from the model results. DX is a general-purpose software package for scientific data visualization and analysis. It employs a client-server architecture with an extended data-flow execution model and is available on Unix workstations (e.g., Sun, Silicon Graphics, Hewlett-Packard, IBM, DEC and Data General), Intel- or AMD-based personal computers running Windows or Linux, and PowerPC-based systems running MacOS, Linux or AIX. These methods are developed from a perspective of understanding how the weather forecasts are to be used, in order to create task-specific designs. In many cases, a "natural" coordinate system is used to provide a context for three-dimensional analysis, viewing and interaction. The visualizations provide representations of the state of the atmosphere, derived from the model output, registered with relevant terrain and political boundary maps. You can view examples of these techniques below. Visualizations such as these also facilitate the dissemination of the computed weather forecasts to the public via the World Wide Web as well as broadcast and print media. More information about the visualization portion of this work is available for you to read.

Samples of several applications of this regional weather forecasting system are available for you to examine.  Many of them are from some of the early experimental efforts in this project.  The map below shows the domains that have been used for these efforts. You can click on a domain of interest, all of which are marked in red.


 

  • 2001 American Meteorological Society Conference (Albuquerque)

  • [There is a paper on the visualization work presented at this conference for you to read.]
     
  • Operational Forecasts for New York City
  • This capability was extended to provide similar forecasts operationally for the Atlanta, Chicago, Kansas City, Baltimore and Washington metropolitan areas at 2 km resolution, the San Diego area at 1 km resolution, and the Miami-Fort Lauderdale area at 1.5 km resolution. The image below, which shows a map of the eastern two-thirds of the continental United States, places all but one of these forecasts in a geographic context. On the map are three regions associated with six of the seven aforementioned metropolitan areas. They correspond to the triply nested, multiple-resolution forecasting domains used to produce each high-resolution weather forecast. The outer nests are in gray, the intermediate nests are in magenta and the inner nests are in white. The areas within the gray borders are covered at 32 km resolution for Kansas City, Chicago, Atlanta and Baltimore/Washington, 24 km for Miami-Fort Lauderdale and 16 km for New York. The areas within the magenta borders are covered at 8 km resolution for Kansas City, Chicago, Atlanta and Baltimore/Washington, 6 km for Miami-Fort Lauderdale and 4 km for New York. The areas within the white borders are covered at 2 km resolution for Kansas City, Chicago, Atlanta and Baltimore/Washington, 1.5 km for Miami-Fort Lauderdale and 1 km for New York.

    If you are interested in seeing the current results for any of these forecasts, please contact us.



Additional material is available for you to learn more about the Deep Thunder project. Some information is below.

  
The early experimental use of this system and its methodologies focused on forecasting by meteorological agencies. The more recent, operational use has focused on applications for weather-sensitive decision making in transportation, emergency management, agriculture, broadcast, energy, insurance, pollution monitoring, and fire control and management. The modular approach to its implementation, namely 1) data collection; 2) data assimilation; 3) forecast modeling; and 4) forecast dissemination, provides the flexibility to tailor it to specific business needs. The "off-the-shelf" components provide the base technology, while the integration allows the system to be used effectively in operational environments. Further, the modularity enables components such as the forecast model to be replaced, should that be appropriate to leverage advances in the science or technology. The system takes advantage of the available weather data infrastructure in a particular region. It does not replace such facilities but enables their further utilization for additional or complementary, local applications. Available observations are used for analysis and to initialize the local model. The results from synoptic-scale forecasts are used to establish boundary conditions for the mesoscale predictions. The implementations to date have indicated that each of the aforementioned components is among the best available. But their integration into one complete and coherent system is unique, and enables a service that lets decision makers focus on their business rather than the system infrastructure.
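The four-stage modular approach described above can be pictured as a chain of stages with fixed interfaces, so that one stage, such as the forecast model, can be swapped without touching the others. The Python sketch below is purely illustrative: every function, value and data shape in it is hypothetical, and the "models" are toy stand-ins rather than anything resembling RAMS or WRF.

```python
from typing import Callable, Dict, List

Observation = Dict[str, float]
State = Dict[str, float]


def collect() -> List[Observation]:
    # Stage 1, data collection: stand-in for ingesting observations (made-up values).
    return [{"temp_c": 21.0}, {"temp_c": 23.0}]


def assimilate(obs: List[Observation]) -> State:
    # Stage 2, data assimilation: toy "analysis" that averages the observations.
    return {"temp_c": sum(o["temp_c"] for o in obs) / len(obs)}


def persistence_model(initial: State) -> State:
    # Stage 3, forecast modeling (option A): the forecast equals the initial state.
    return dict(initial)


def trend_model(initial: State) -> State:
    # Stage 3 (option B): a drop-in replacement with the same interface.
    return {"temp_c": initial["temp_c"] + 1.5}


def disseminate(forecast: State) -> str:
    # Stage 4, forecast dissemination: format the result for a decision maker.
    return f"Forecast temperature: {forecast['temp_c']:.1f} C"


def run_pipeline(model: Callable[[State], State]) -> str:
    """Run all four stages, with the forecast model passed in as a parameter."""
    return disseminate(model(assimilate(collect())))


message_a = run_pipeline(persistence_model)
message_b = run_pipeline(trend_model)
print(message_a)
print(message_b)
```

Passing the model in as a parameter is one simple way to express the replaceability described in the text; the other stages could be parameterized in the same way.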

For further information on this system beyond what is discussed on this site, please contact either Tony Praino or Lloyd Treinish at IBM Thomas J. Watson Research Center or Don Stremme of IBM Sales and Distribution.


lloydt@us.ibm.com
Last updated September 25, 2006

  
 
  
