

Introduction

There is growing interest in identifying environmental factors that contribute to diseases and other public health risks, including hantavirus, dengue fever, cholera, Lyme disease, and air pollution. Most existing epidemiological techniques rely on either passive or active surveillance of clinical data from health care facilities to identify outbreaks. In passive surveillance, health care providers, hospitals, and sometimes laboratories send reports to the health department as prescribed by a set of rules or regulations. In active surveillance, by contrast, health department staff call or visit health care providers on a regular basis (e.g., weekly) to solicit case reports. An outbreak is then determined from the difference between the actual and expected number of new cases of a specific disease.
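As a rough illustration of that last step, the sketch below flags a possible outbreak when the observed case count exceeds the expected count by more than a margin. The function name `flag_outbreak`, the square-root margin (which assumes roughly Poisson-distributed counts), the sensitivity parameter `k`, and the sample numbers are all assumptions made for illustration; they are not taken from any particular surveillance system.

```python
import math


def flag_outbreak(observed, expected, k=2.0):
    """Return True when observed new cases exceed expected by a margin.

    Assumption: counts are roughly Poisson, so sqrt(expected) approximates
    the standard deviation. k is a hypothetical sensitivity parameter.
    """
    if expected < 0:
        raise ValueError("expected must be non-negative")
    excess = observed - expected
    return excess > k * math.sqrt(expected)


# Hypothetical weekly baseline of 4 expected cases.
weekly_expected = 4.0
print(flag_outbreak(5, weekly_expected))   # excess of 1 case
print(flag_outbreak(12, weekly_expected))  # excess of 8 cases
```

A larger `k` makes the rule more conservative, trading fewer false alarms for slower detection.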

A number of statistical techniques are applied to model the evolution of epidemic diseases. A typical model [10] includes the following three variables:

  • S(t): the number of uninfected people at time t that are susceptible to the disease,
  • I(t): the number of infected people at time t, and
  • R(t): the number of people at time t who have either recovered from, and are thus immune to, the illness, or are deceased, and have been removed from the susceptible population.

These variables are related to each other in the model through the following differential equations:

$$S'(t) = -\beta\, S(t)\, I(t)$$

$$I'(t) = \beta\, S(t)\, I(t) - \gamma\, I(t)$$

$$R'(t) = \gamma\, I(t)$$

In the first equation, the rate of change of the susceptible population, S'(t), is proportional to the product of the susceptible population and the infected population, with proportionality constant $\beta$. The rationale is that the disease is spread when a susceptible person comes into contact with an infected person. The rate of change is negative because once a person is infected, the person is taken out of the susceptible population and is added to the infected population. The rate of change of the infected population, I'(t), is the difference between the rate of infection and the rate of recovery, R'(t), which is proportional to the total infected population with a factor $\gamma$.
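The behavior of this model can be explored numerically. The following is a minimal sketch that integrates the three equations with a forward Euler step. The names `beta` (infection constant) and `gamma` (recovery factor), the step size `dt`, and the initial population values are assumptions chosen for illustration, and forward Euler is used only for simplicity.

```python
def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    """Integrate the S, I, R equations with forward Euler.

    Returns a list of (S, I, R) tuples, one per time step,
    starting with the initial state.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        # Infection term: proportional to the product S * I.
        new_infections = beta * s * i * dt
        # Recovery/removal term: proportional to I.
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical population of 1000 with 10 initial infections.
trajectory = simulate_sir(990, 10, 0, beta=0.0005, gamma=0.1, dt=0.1, steps=1000)
peak_infected = max(state[1] for state in trajectory)
print(f"peak infected: {peak_infected:.1f}")
print("final state (S, I, R):", tuple(round(x, 1) for x in trajectory[-1]))
```

Because every person leaving one compartment enters another, the sum S + I + R stays constant throughout the run, which is a quick sanity check on any implementation.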

There is a recent movement towards analyzing environmental factors such as weather, landscape, and topography that influence the spread of epidemic diseases. Some of this information, such as weather and landscape data, is captured in remotely sensed data. The data is gathered on a continuous and wide-scale basis, and thus provides better coverage for predicting impending outbreaks than current passive surveillance techniques.

In this chapter, we propose a novel model-based data mining technique for constructing models to predict the localized risk of diseases that are influenced by environmental conditions. In the proposed technique, the environment is decomposed into interconnected cellular micro-environments. Each of these micro-environments is then characterized by a set of input and output variables. The input variables include environmental factors such as weather, topographical data, and land cover/land use category; demographic data such as age or racial distribution, average annual income, and population density; and the disease-state variables, such as S(t), I(t), and R(t), that characterize the particular disease under observation. The model is hypothesized using an object-oriented approach, in which an environmental model is constructed using decomposition and substitution rules. The process builds a set of objects, where each object is specified at a potentially different level of abstraction and detail. The objects are then connected through a set of spatial, temporal, and Boolean relationship operators. The significance of each object in the model is evaluated and revised through iterative refinement based on nonlinear multidimensional scaling.
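To make the cellular decomposition concrete, the sketch below represents each micro-environment as a small object holding its input variables and disease-state variables, and links each cell to its adjacent cells. Everything here is hypothetical: the class name `MicroEnvironment`, the field names, the rectangular grid, and the four-neighbor connectivity are assumptions for illustration only and are not the chapter's actual framework.

```python
from dataclasses import dataclass, field


@dataclass
class MicroEnvironment:
    """One cell of the decomposed environment (hypothetical layout)."""
    row: int
    col: int
    # Environmental and demographic inputs, e.g. {"rainfall_mm": 12.0}.
    inputs: dict = field(default_factory=dict)
    # Disease-state variables for this cell, e.g. {"S": 95, "I": 5, "R": 0}.
    disease_state: dict = field(default_factory=dict)
    # Grid coordinates of the cells this cell is connected to.
    neighbors: list = field(default_factory=list)


def build_grid(n_rows, n_cols):
    """Create an n_rows x n_cols grid with four-neighbor connectivity."""
    cells = {
        (r, c): MicroEnvironment(r, c)
        for r in range(n_rows)
        for c in range(n_cols)
    }
    for (r, c), cell in cells.items():
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            key = (r + dr, c + dc)
            if key in cells:
                cell.neighbors.append(key)
    return cells


grid = build_grid(3, 3)
grid[(1, 1)].inputs["rainfall_mm"] = 12.0  # hypothetical input variable
grid[(1, 1)].disease_state.update({"S": 95, "I": 5, "R": 0})
print(len(grid), "cells; center cell neighbors:", grid[(1, 1)].neighbors)
```

Keeping the neighbor links as coordinates rather than object references keeps each cell easy to print and serialize, at the cost of one dictionary lookup per traversal.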

The rest of this chapter is organized as follows. Section 2 presents a study of hantavirus to demonstrate the environmental issues related to disease outbreaks. A basic framework for environmental modeling is outlined in Section 3. Techniques for model generation and validation are investigated in Sections 4 and 5, respectively. Model revision through iterative refinement is described in Section 6. The chapter is briefly summarized in Section 7.

