Next:Preliminary
Up: Model-Based Mining of Previous: Model-Based Mining of
Introduction
There is a growing interest in identifying environmental factors that contribute to diseases and other public health risks, including hantavirus, dengue fever, cholera, Lyme disease, and air pollution. Most existing epidemiological techniques rely on either passive or active surveillance of clinical data from health care facilities to identify outbreaks. In passive surveillance, health care providers, hospitals, and sometimes laboratories send reports to the health department as prescribed by a set of rules or regulations. In active surveillance, by contrast, health department staff call or visit health care providers on a regular basis (e.g., weekly) to solicit case reports. An outbreak
is then determined from the difference between the actual and expected number of new cases for a specific disease.
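As a rough illustration of the surveillance rule just described, the sketch below compares an observed case count with an expected count. The text does not say how the expected count or the decision threshold is obtained, so two assumptions are made here: the expected count is the mean of past counts for comparable periods, and an outbreak is flagged when the excess is larger than `k` sample standard deviations. The function names `excess_cases` and `is_outbreak` are hypothetical.

```python
from statistics import mean, stdev


def excess_cases(observed, history):
    """Actual minus expected new cases.

    Assumption: the expected count is the mean of the historical counts.
    """
    return observed - mean(history)


def is_outbreak(observed, history, k=2.0):
    """Flag an outbreak when the excess is larger than k standard deviations.

    The k-sigma threshold is an illustrative assumption, not a rule
    taken from the text. `history` needs at least two values.
    """
    return excess_cases(observed, history) > k * stdev(history)


if __name__ == "__main__":
    weekly_history = [4, 5, 6, 5, 4, 6]  # made-up counts for illustration only
    print(is_outbreak(12, weekly_history))
    print(is_outbreak(6, weekly_history))
```

A health department would normally use a seasonally adjusted baseline in place of a plain mean; the simple version only shows the structure of the comparison.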
A number of statistical techniques are applied to model the evolution of epidemic diseases. A typical model [10] includes the following three variables:
- S(t): the number of uninfected people at time t who are susceptible to the disease,
- I(t): the number of infected people at time t, and
- R(t): the number of people at time t who have either recovered from the illness and are thus immune to it, or are deceased, and who have therefore been removed from the susceptible population.
These variables are related to each other in the model through the following differential equations:
\[ S'(t) = -\beta\, S(t)\, I(t) \]
\[ I'(t) = \beta\, S(t)\, I(t) - \gamma\, I(t) \]
\[ R'(t) = \gamma\, I(t) \]
where \beta and \gamma are positive proportionality constants. In the first equation, the rate of change of the susceptible population, S'(t), is proportional to the product of the susceptible population and the infected population. The rationale is that the disease spreads when a susceptible person comes into contact with an infected person. The rate of change is negative because once a person is infected, that person is removed from the susceptible population and added to the infected population. The rate of change of the infected population, I'(t), is the difference between the rate of infection and the rate of recovery, R'(t), which is proportional to the total infected population with a factor \gamma.
There is a recent movement towards analyzing environmental factors such as weather, landscape, and topography that influence the spread of epidemic diseases. Some of this information, such as weather and landscape data, is captured in remotely sensed data. These data are gathered on a continuous and wide-scale basis, and thus provide better coverage for predicting impending outbreaks than current passive surveillance techniques.
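To make the three-variable epidemic model described earlier more concrete, the following sketch integrates it numerically with a forward Euler scheme. It assumes the standard form S' = -beta*S*I, I' = beta*S*I - gamma*I, R' = gamma*I; the constant names `beta` and `gamma`, the function name `simulate_sir`, and all parameter values are assumptions chosen for illustration, not values from the chapter.

```python
def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    """Forward-Euler integration of the assumed S/I/R equations.

    beta  : infection constant (assumed name), multiplies S(t) * I(t)
    gamma : recovery/removal constant (assumed name), multiplies I(t)
    Returns a list of (S, I, R) tuples, one per time step including t = 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt  # people leaving S and entering I
        new_removals = gamma * i * dt       # people leaving I and entering R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        trajectory.append((s, i, r))
    return trajectory


if __name__ == "__main__":
    # Illustrative parameters only.
    result = simulate_sir(s0=990, i0=10, r0=0,
                          beta=0.0005, gamma=0.1, dt=0.1, steps=1000)
    peak_infected = max(i for _, i, _ in result)
    print(f"peak infected: {peak_infected:.1f}")
```

Because every person who leaves one compartment enters another, S(t) + I(t) + R(t) stays constant throughout the run, which is a convenient sanity check on any implementation.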
In this chapter, we propose a novel model-based data mining technique for constructing models to predict the localized risk of diseases that are influenced by environmental conditions. In the proposed technique, the
environment is decomposed into interconnected cellular micro-environments. Each of these micro-environments is then characterized by a set of input and output variables. The input
variables include environmental factors such as weather, topographical data, land cover/land use category, demographic data (such as age or racial distribution, average annual income, population
density), and the disease-state variables (such as S(t), I(t) and R(t)) that characterize the particular
disease under observation. A novel technique is proposed to hypothesize the model using an object-oriented approach. In this approach, an environmental model is constructed using decomposition
and substitution rules. The process builds a set of objects, where each object is specified at potentially different abstraction levels and details. The objects are then connected through a set of spatial,
temporal, and Boolean relationship operators. The significance of each object in the model is evaluated and revised through iterative refinement based on nonlinear multidimensional scaling.
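One possible way to picture the decomposition into interconnected cellular micro-environments is a grid of objects, each carrying its own input and output variables. The sketch below is only an illustration under stated assumptions: the class name `MicroEnvironment`, the helper names, the rectangular grid, and the four-neighbour connectivity are all hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass, field


@dataclass
class MicroEnvironment:
    """One cell of the decomposed environment (hypothetical structure)."""
    row: int
    col: int
    # Input variables: environmental factors, demographic data, and
    # disease-state variables such as S(t), I(t), R(t).
    inputs: dict = field(default_factory=dict)
    # Output variables, e.g. a predicted localized risk.
    outputs: dict = field(default_factory=dict)


def build_grid(n_rows, n_cols):
    """Create an n_rows x n_cols grid of empty cells keyed by (row, col)."""
    return {(r, c): MicroEnvironment(r, c)
            for r in range(n_rows) for c in range(n_cols)}


def neighbors(grid, row, col):
    """Cells sharing an edge with (row, col).

    Four-neighbour connectivity is an assumption made for this sketch.
    """
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [grid[(row + dr, col + dc)]
            for dr, dc in offsets if (row + dr, col + dc) in grid]


if __name__ == "__main__":
    grid = build_grid(3, 3)
    grid[(1, 1)].inputs.update({"land_cover": "grassland", "S": 120, "I": 3, "R": 0})
    print(len(neighbors(grid, 1, 1)), len(neighbors(grid, 0, 0)))
```

The spatial, temporal, and Boolean relationship operators mentioned in the text would act on such objects; they are left out here because the introduction does not specify them.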
The rest of this chapter is organized as follows: Section 2 presents a study of hantavirus in order to demonstrate environmental issues related to disease outbreak. A basic framework for environmental
modeling is outlined in Section 3. Techniques for model generation and validation are investigated in Sections 4 and 5, respectively. Model revision through iterative refinement is described in Section 6. The
chapter is briefly summarized in Section 7.