lloydt@watson.ibm.com
The heating of the ocean off the Peruvian coast during El Nino periods is part of a larger scale warming of the eastern equatorial Pacific Ocean by several degrees C that creates large anomalies in oceanic and atmospheric circulation. These have, for example, led to the loss of much marine life. The El Nino of 1972 virtually destroyed the Peruvian anchovy fishing industry, which at that time represented a significant percentage of the world's protein supply with a catch of about 12 million tons per year [Quinn et al, 1978].
The 1982-1983 El Nino has received wide attention for its severity [Philander, 1983]. In Peru alone, it was responsible for much loss of life, damage affecting over 80% of the highway system, railroad washouts, and material loss estimated in the billions of dollars. Such destruction emphasizes the need to better understand the meteorological forces unleashed by this powerful ocean-air interaction.
Goldberg et al [1987] have investigated the mesoscale structure of severe rainfall events during the 1982-1983 period by examining daily data from 66 rainfall stations in the Chiura-Piura region of northwestern Peru. Figure 1 shows the location of this region, which was selected because it was most severely affected by the 1982-1983 El Nino and because the data were highly reliable and complete.

Figure 1. Location of Peruvian Rainfall Stations.
These data support the study of rainfall characteristics over this localized region during El Nino and non-El Nino periods, as a function of elevation, geographic location, and time of year. The stations are enumerated in the Appendix.
Figure 2 is representative of a straightforward discrete realization of such data as a scatter plot to show the spatial distribution. Figure 3 illustrates the temporal distribution for a single station.


There is a long history of mathematical methods for creating meshes from scattered data points. Every such method changes the data, and its artifacts must be understood because they carry through to the actual visualization. This discussion is only meant as a very brief introduction to the topic. Nielson [1993] summarizes many of the methods in use today and their relative advantages and disadvantages.
The simplest and quickest approach is to create a regular grid from the point data by nearest neighbor meshing: find the nearest point to each cell in the resultant grid and assign that cell the point's value, as illustrated in Figure 4. Such a technique is valuable because it preserves the original data values, and it preserves the distribution of a grid whose points may have undergone a coordinate transformation. Although computationally inexpensive, the results may not be very suitable for qualitative display because the discrete spatial structure is preserved.

Figure 4. Nearest Neighbor Gridding.
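The nearest neighbor procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation used for the figures: the function name, the (x, y, value) tuple layout and the sample values are assumptions made for the example, and the search is a brute-force scan rather than an accelerated one.

```python
import math

def nearest_neighbor_grid(points, x_coords, y_coords):
    """Assign each grid cell the value of its nearest scattered point.

    points is a list of (x, y, value) tuples (an assumed layout).
    The result is indexed grid[row][column], following y_coords and x_coords.
    """
    grid = []
    for y in y_coords:
        row = []
        for x in x_coords:
            # Brute-force search: the point with the smallest distance wins.
            nearest = min(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))
            row.append(nearest[2])
        grid.append(row)
    return grid

# Made-up sample values, not measurements from the rainfall stations.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
grid = nearest_neighbor_grid(stations, [0.0, 1.5, 2.0], [0.0, 1.5, 2.0])
```

Every cell of the result carries one of the original values unchanged, which is why the output keeps the blocky, discrete look noted above.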
An alternate approach that preserves the original data values involves imposing an unstructured grid dependent on the distribution of the scattered points. In two dimensions, this would be a method for triangulating a set of scattered points in a plane [Agishtein and Migdal, 1991]. This technique first requires the Voronoi tessellation of the plane, with a polygonal tile surrounding each of the scattered points. These tiles are such that all locations within a particular tile are closer to the scattered point associated with that tile than they are to any other point in the set.
Figure 5. Delaunay Triangulation of Rainfall Stations.
A triangulation can then be constructed which is the dual of the Voronoi tessellation (i.e., a line connects every pair of points whose tiles share an edge). This is known as Delaunay triangulation and is illustrated in Figure 5 as applied to the rainfall stations.
Figure 6. Pseudo-Colored Rainfall Distribution from Delaunay Triangulation of Stations.
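A triangulation that is dual to the Voronoi tessellation has an equivalent characterization: the circle through the three vertices of each triangle contains no other point of the set. The sketch below uses that property directly. It is a brute-force illustration for small point sets, not the algorithm used to produce Figures 5 and 6. The function names are hypothetical, the cost grows as the fourth power of the number of points, and the points are assumed to be in general position (no four on a common circle).

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2

def delaunay_brute_force(points):
    """Keep every triangle whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue  # degenerate (collinear) triple
        ux, uy, r2 = circle
        is_empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
            for m, (px, py) in enumerate(points)
            if m not in (i, j, k)
        )
        if is_empty:
            triangles.append((i, j, k))
    return triangles

# Made-up points: the fourth lies inside the triangle formed by the others.
sample = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0), (1.0, 0.5)]
triangles = delaunay_brute_force(sample)
```

For these four points the outer triangle is rejected because its circumcircle contains the interior point, leaving three triangles that each use the interior point as a vertex.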
A potentially more appropriate method, and certainly one that is more accurate than nearest neighbor meshing, uses weighted averaging as illustrated in Figure 7. For any given cell in a grid, the weighted average of the n values in the original data distribution that are spatially nearest to that cell is chosen. A weighting factor, w_i = f(d_i), where d_i is the distance between the cell and the ith (i = 1, ..., n) point in the original distribution, is applied to each of the n values.

Figure 7. Weighted Average Gridding.
Figure 7 illustrates the case where n = 3. A common weight is w = d^-2, the inverse square of the distance, so that nearer points contribute more. Such schemes are variants of Shepard's method [Shepard, 1968]. For example, Renka [1988] modified this approach with local adaptive surface fitting. Collectively, these methods are typically O(n log n) in cost. Using linear instead of weighted averaging would be intermediate in quality and computational expense.
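The weighted average scheme can be sketched as follows, assuming an inverse-distance weight with an adjustable exponent and an optional radius of influence. The function name, parameter names and sample values are assumptions for illustration only. The sketch sorts all points for every cell, which is simple but slower than the spatial search structures a production implementation would likely use.

```python
import math

def weighted_average_grid(points, x_coords, y_coords,
                          n=3, exponent=2.0, radius=None):
    """Inverse-distance weighted average of the n nearest points per cell.

    points is a list of (x, y, value) tuples (an assumed layout).
    Cells with no point inside the radius of influence are set to None.
    """
    grid = []
    for y in y_coords:
        row = []
        for x in x_coords:
            candidates = sorted(
                (math.hypot(px - x, py - y), value) for px, py, value in points
            )
            if radius is not None:
                candidates = [(d, v) for d, v in candidates if d <= radius]
            nearest = candidates[:n]
            if not nearest:
                row.append(None)            # nothing within reach of this cell
            elif nearest[0][0] == 0.0:
                row.append(nearest[0][1])   # cell coincides with a data point
            else:
                weights = [d ** -exponent for d, _ in nearest]
                total = sum(w * v for w, (_, v) in zip(weights, nearest))
                row.append(total / sum(weights))
        grid.append(row)
    return grid

# Made-up sample values, not measurements from the rainfall stations.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
row = weighted_average_grid(samples, [0.0, 0.5, 1.0], [0.0])[0]
```

With an exponent of 2, the cell at x = 0.5 is pulled strongly toward the nearer value, while the cell midway between the two points receives their plain average. Setting n = 1 reduces the scheme to nearest neighbor gridding.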

Effective gridding of the observations is critical for analysis. The weighted average method described earlier is used to create a pseudo-colored mesh independently for each day over the eight months of daily data being examined (November 1, 1982 through June 30, 1983). As a further aid to the study of spatial and temporal variations, the mesh is deformed by the altitude at each node, which is determined from the same gridding process applied to the altitude of each station as shown in the Appendix. The result is a simple elevation model, which gives a reasonable approximation of the topography in northwestern Peru, especially given the paucity of high-resolution elevation data for this region. This surface with pseudo-colored rainfall is used in Figures 11 and 12.
Figure 11. Rainfall Distribution in Northwestern Peru for January 24-27, 1983.
Figure 12. Rainfall Distribution in Northwestern Peru for May 19-22, 1983.
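The deformation of the mesh by altitude amounts to pairing two grids that share the same nodes: the gridded altitude supplies the height of each node and the gridded rainfall supplies its color. A minimal sketch follows, assuming both grids have already been produced by the same gridding step. The function name, the vertical_scale parameter and the sample numbers are hypothetical.

```python
def deform_mesh(x_coords, y_coords, altitude_grid, rainfall_grid,
                vertical_scale=1.0):
    """Return one (x, y, z, rainfall) node per grid cell.

    Both grids are indexed [row][column], following y_coords and x_coords.
    z is the gridded altitude times an assumed vertical exaggeration factor.
    """
    nodes = []
    for j, y in enumerate(y_coords):
        for i, x in enumerate(x_coords):
            z = altitude_grid[j][i] * vertical_scale
            nodes.append((x, y, z, rainfall_grid[j][i]))
    return nodes

# Made-up grids for illustration only.
altitude = [[100.0, 200.0], [300.0, 400.0]]
rainfall = [[0.0, 5.0], [10.0, 15.0]]
nodes = deform_mesh([80.0, 80.5], [5.0, 5.5], altitude, rainfall,
                    vertical_scale=0.001)
```

Because the altitude grid does not change from day to day, it only needs to be computed once, while the rainfall grid is recomputed for each day of the animation.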
Applications to Other Data
To illustrate the generality of the distance weighted average approach to gridding of scattered data, consider its application to other data sets. Figure 13 shows a visualization of a collection of yearly averages of weather data for 1960 from 1702 stations scattered around the earth. Temperature and precipitation data are independently gridded to an irregular mesh of quads, which is a digitized representation of the earth's land masses at one-degree resolution in latitude and longitude. The precipitation data are shown as a pseudo-color map in a Mollweide projection, while the temperature data are shown as a pseudo-color contour overlay every five degrees C. The gridded representation illustrates, for example, the correlation of a lack of precipitation with very high temperatures and of high precipitation with moderately high temperatures. These relationships would be difficult to see with scattered realization methods.
Figure 13. Pseudo-Colored Precipitation and Pseudo-Color Temperature Contours from Gridded Global Station Data.
Figure 14. Weighted Average Gridding Applied to Atmospheric Profile Data.
Implementation
The techniques described herein have been developed with IBM Visualization Data Explorer, a general-purpose software package for scientific data visualization and analysis. It employs a client-server architecture with an extended data-flow execution model and is available on Unix workstations (e.g., Sun, Silicon Graphics, Hewlett-Packard, IBM, DEC and Data General) and Intel-based personal computers running Windows NT [Abram and Treinish, 1995].
Data Explorer provides tools for operating on both scattered and gridded data. The Data Explorer Connect module performs the Delaunay triangulation used in Figures 5 and 6, while the Regrid module performs the weighted average interpolation used in Figures 8, 9, 10, 11, 12, 13 and 14. The Regrid module provides independent control of the exponent of the weighting factor, the size of n and the radius of influence from each node of the grid within which to consider data points. In this case, a radius of 0.36 degree (of latitude and longitude) was used.
It should be noted that not all stations have a rainfall measurement for each day of data. A missing measurement is not the same as a station reporting no rain. Data Explorer supports a notion of data invalidity. Hence, for any given day, only those stations having a measurement are considered by the Connect and Regrid modules in creating gridded versions of the data.
The choice of modules that support continuous realization is independent of the use of Connect or Regrid, even though they result in different mesh structures, because these Data Explorer operations are polymorphic and appear typeless to the user. This polymorphism is a consequence of Data Explorer being built on a foundation of a unified data model, which describes and provides consistent access services for any data that are to be studied, independent of shape, rank, type, mesh structure, dependency or aggregation.
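The distinction between a missing measurement and a report of no rain can be illustrated outside Data Explorer with a small filtering step applied before gridding. This sketch does not use the Data Explorer API. The function name and the use of None as the invalid-data marker are assumptions, and the rainfall values are made up. The coordinates are those listed in the Appendix for CHULUCANAS, MORROPON and PAITA.

```python
def valid_observations(stations, day_values, missing=None):
    """Pair each station with its value for one day, skipping missing ones.

    stations is a list of (longitude_w, latitude_s) tuples and day_values
    holds one entry per station. A value of 0.0 is a real observation of
    no rain and is kept; only the missing marker is dropped.
    """
    return [
        (lon, lat, value)
        for (lon, lat), value in zip(stations, day_values)
        if value is not missing
    ]

# Coordinates from the Appendix; rainfall values are illustrative only.
stations = [(80.17, 5.10), (79.98, 5.18), (81.13, 5.08)]
day_values = [12.5, None, 0.0]
observations = valid_observations(stations, day_values)
```

Only the two stations with a measurement reach the gridding step, and the station that reported zero rainfall still contributes its value to the interpolation.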
Figure 15. Data Explorer Visual Program Illustrating Weighted Average Gridding of Stations.
Import - reads rainfall data from disk.
Sequencer - provides a frame counter for controlling daily animation.
Select - chooses which day from the time series to process.
Include - flags which stations failed to report a measurement.
Construct - defines the aforementioned 0.36-degree-resolution grid (36 x 46) for interpolation.
Regrid - interpolates the point data to a grid, which is highlighted. The configuration for this module is also shown, where the number of nearest neighbors, radius of influence and weighting factor are defined.
Colormap - provides interactive construction of a custom color map, which is shown.
Color - applies the color map to the gridded rainfall data.
Image - renders an image and provides interactive tools for its manipulation.
Figure 16. User Interface of a Data Explorer-based Application for Studying Peruvian Rainfall Data.
Conclusions and Future Work
The characteristic topography near regions such as Chulucanas (roughly in the center, cf. Figures 3, 11, 12 and 16 and the Appendix), where such storms were observed to occur on a frequent basis, is ideal for the aforementioned interaction between the rainbands and the Andean foothills. The origin of the east-west rainbands near the north Peruvian coast is less clear, but they may be caused by low-altitude wind surges, which are driven northward along the coast of Peru by a large and quasi-permanent high in the southeastern Pacific.
Figure 17. Conditions in the Pacific Ocean on January 24, 1983.
Careful methods of gridding scattered (measured) data are critical for effective visualization, especially when used for continuous realization to yield qualitative information. The techniques described herein appear to be suitable for earth sciences applications other than meteorology (e.g., hydrological samplings, petroleum or mining well logs) as well as independent disciplines as diverse as medicine (e.g., measurements distributed on a patient's skin) or aerospace engineering (e.g., pressure along an airfoil or temperature inside a jet engine). Enhancement of the current study or extensions to other domains will require investigation of the applicability of other methods of gridding scattered data (e.g., [Nielson, 1993], [Gmelig-Meyling and Pfluger, 1990], [Smith and Wessel, 1990], [Yue-sheng and Lu-tai, 1990]).
Acknowledgments
The data are available courtesy of NASA/Goddard Space Flight Center, Greenbelt, Maryland.
References
1. E. M. Rasmusson. "El Nino and Variations in Climate". American Scientist, 73, 168, 1985.
Appendix
Name Number Latitude (Degrees S) Longitude (Degrees W) Altitude (m)
ALTAMIZA 19 5.07 79.73 2600
ANIA 17 4.85 79.48 2450
ARANZA 16 4.85 79.58 1300
ARDILLA 40 4.52 80.43 150
ARENALES 8 4.92 79.85 3010
ARRENDAMIENTOS 57 4.83 79.90 3010
AUL 42 4.55 79.70 640
AYABACA 2 4.63 79.72 2700
BARRIOS 33 5.28 79.70 310
BERNAL 36 5.47 80.73 32
BIGOTE 34 5.33 79.78 200
CANCHAQUE 35 5.37 79.60 1200
CHALACO 25 5.03 79.80 2250
CHIGNIA 60 5.60 79.70 360
CHILACO 3 4.70 80.50 90
CHULUCANAS 9 5.10 80.17 95
CHUSIS 37 5.52 80.82 12
CIRUELO 64 4.30 80.15 202
CORPAC 15 5.20 80.62 49
ESPINDOLA 49 4.63 79.50 2300
FRIAS 20 4.93 79.93 1700
HUANCABAMBA 68 5.23 79.43 1052
HUAR HUAR 62 5.08 79.47 3150
HUARA DE VERAS 47 4.58 79.57 1680
HUARMACA 14 5.57 79.52 2100
JILILI 46 4.58 79.80 1330
LA ESPERANZA 7 4.92 81.07 12
LA TINA 1 4.40 79.95 427
LAGARTERA 54 4.73 80.07 307
LAGUNA RAMON 59 5.55 80.67 9
LANCONES 45 4.57 80.47 110
LAS LOMAS 66 4.65 80.25 265
LOS ALISOS 21 4.97 79.53 2150
MALLARES 6 4.85 80.73 45
MIRAFLORES 11 5.17 80.62 30
MONTEGRANDE 13 5.35 80.72 27
MONTERO 48 4.63 79.83 1070
MORROPON 10 5.18 79.98 140
NANGAY DE MATALACAS 18 4.87 79.77 2100
OLLEROS 53 4.70 79.65 1360
PACAYPAMPA 23 5.00 79.67 1960
PAITA 67 5.08 81.13 6
PALOBLANCO 28 5.05 79.63 2800
PALTASHACO 30 5.12 79.87 900
PANANGA 43 4.55 80.88 450
PARAJE GRANDE 65 4.63 79.92 1500
PASAPAMPA 31 5.12 79.60 2410
PICO DEL ORO 41 4.53 79.87 1325
PIRGA 61 5.67 79.62 1510
PUENTE INTERNACIONAL 63 4.38 79.95 408
SAN JOAQUIN 32 5.13 80.35 100
SAN MIGUEL 12 5.23 80.68 29
SAN PEDRO 27 5.08 80.03 254
SANTO DOMINGO 24 5.03 79.87 1475
SAPILLICA 56 4.78 79.98 1446
SAUSAL DE CULUCAN 4 4.75 79.77 980
SICCHEZ 44 4.57 79.77 1435
SUYO 39 4.50 80.00 250
TACALPO 50 4.65 79.60 2010
TALANEO 26 5.05 79.55 3430
TAPAL 55 4.77 79.55 1890
TEJEDORES 5 4.75 80.25 230
TIPULCO 52 4.70 79.57 2600
VADO GRANDE 38 4.45 79.60 900
VIRREY 58 5.53 79.98 230