TALK ABOUT FISHING for information.
Enter the word "fishing" in a popular
Internet search engine, and you'll get
back 3.5 million hits. Any page that
mentions the word "fishing" in any
context makes the list -- more the
equivalent of dredging the ocean than trying to land a
particular fish. And the problem is not necessarily one that
can be solved by forming a more specific query. You might
want general information on "fishing." But the number one
page returned on the 3.5 million hit list is "Fish Finder
Charter Fishing Trips in Naples Florida" -- the kind of
answer you'd expect for a much more specific query, like
"Fishing in Naples, Florida."
In an attempt to lessen this problem, some search sites hire
teams of categorization experts to cull through the mass of
available information and determine the most useful and
authoritative pages on a given subject by forming
"relevance judgments," eliminating many extraneous
pages. But there is a limit to what even a team of experts
can do, and the amount of information available on the
Web is growing faster than it can be absorbed and
categorized using this model.
Is it possible to automate the process and provide more
useful information? IBM researchers have done something
approaching just that with a new search algorithm called
CLEVER, a Deep Computing application that takes
advantage of relationships between Web pages in addition
to text and context analysis to determine the relevance and
authority of a given Web page.
SO MUCH OF the supercomputing story focuses on how
fast the computer can process information, an important
quality, to be sure. But Deep Computing seeks to separate
the issue of raw processing speed from the time actually
required to solve a problem. Or to put it differently, instead
of always trying to make something faster, how can we
take computer speed and improve the quality of a result
given a bit more time? What can we do about problems
where the bottleneck is not computational power?
In so doing, researchers seek to imitate some of the
problem-solving abilities that characterize the human mind,
which, although not fast in the same way we may think of
computers being fast, is still able to discover or create
novel solutions to previously unsolved problems. It is the
cognitive modeling area of Deep Computing, then, that
most resembles A.I., the field of artificial intelligence -- with
one very important difference. Cognitive modeling is not
about trying to build an artificial brain, or duplicate or even
simulate an intelligent human. It is about mimicking the
problem-solving approaches and abilities of the mind,
sometimes in very novel ways.
Internet search technology is an excellent example of this
approach. Faster searching by itself is not better. The 3.5
million hit list for "fishing" referred to earlier took less than
a second to return, but is almost entirely useless. IBM
researchers felt that with a little more analysis and
computing time spent on the results of Internet crawling,
better information could be returned.
To accomplish this, they used one of the oldest tricks of
the human mind: get help solving the problem. In this case,
they decided to tap into the group behavior of millions of
people building Web pages. Rather than attempt to model
the cognitive process of a single individual, researchers
sought to exploit the consistency that emerges from a
seemingly chaotic process: on the Web, consensus
invariably builds around various topics of intellectual
interest to certain communities.
Researchers employed an algorithm called
Hypertext-Induced Topic Search (HITS) that finds
authoritative sources of information (called "authorities")
and sites (called "hubs") featuring compilations of such
authorities. CLEVER works roughly as follows:
- Using a standard text search engine, it gathers a
"root set" of pages matching a query subject.
- It adds to this pool all pages pointing to or pointed
to by the root set.
- Using only the links between these pages, it distills
the best authorities and hubs.
- Beyond the link structure, CLEVER also uses the
content and context of the Web pages (text and other
properties of a page).
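
The link-only distillation step can be sketched in a few
lines. What follows is a minimal illustration of the generic
HITS iteration, not IBM's CLEVER code: a page's authority
score is the sum of the hub scores of the pages pointing to
it, a page's hub score is the sum of the authority scores of
the pages it points to, and both are renormalized after each
round. The function names, the toy link graph, and the fixed
iteration count are assumptions made for the example, and
CLEVER's use of page content and context is left out.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a link graph.

    links maps each page to the pages it links to. In a real run this
    graph would be the root set plus its neighbors; here it is supplied
    directly (an assumption of this sketch).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority: sum of hub scores of the pages that point to it.
        new_auth = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += hub[source]

        # Hub: sum of authority scores of the pages it points to.
        new_hub = {
            page: sum(new_auth[t] for t in links.get(page, ()))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical toy graph: three link-list pages and three content pages.
toy_links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b", "c"],
    "hub3": ["a"],
}
toy_hub, toy_auth = hits(toy_links)
print(max(toy_auth, key=toy_auth.get))  # best authority in the toy graph
print(max(toy_hub, key=toy_hub.get))    # best hub in the toy graph
```

In this toy graph, page "a" is linked from all three list
pages and comes out as the top authority, while "hub2",
which links to every content page, comes out as the top hub.
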
The results? Enter "fishing" on the CLEVER search engine,
and a list of about a hundred pages sorted neatly into
"authorities" and "hubs" is returned. A quick scan of the
list shows every entry to be germane to the topic -- broad
enough to meet the query criteria but focused enough to
hook even the pickiest user.
FUTURE APPLICATIONS: Continued forays into cognitive
modeling will allow us to answer some very interesting
questions: How would a major reorganization affect a
company's profit? Its potential market value? Its ability to
retain talent? How might various potential news
developments affect financial markets?
Cognitive modeling is probably the broadest application
area of Deep Computing, covering diverse areas where all
of its advantages -- processing power, algorithms,
heuristics -- are brought to bear to solve problems that we
may not now even know how to approach. When we speak
of building more intelligence into the future of computing,
cognitive modeling is in many senses what we are
describing. And as computing power increases, and with it
the amount of information available from diverse sources
(both human and device-created), mimicking our own
innate problem-solving ability will enable solutions that
are difficult even to imagine today, perhaps to problems
that we don't yet know exist.