IBM TAKMI (Text Analysis and Knowledge Mining)
This page introduces a text mining system called IBM TAKMI (Text Analysis and Knowledge Mining).
Although TAKMI was originally created for analyzing call center logs, it is applicable to large text collections in general. In particular, we have offered a medical version of the TAKMI system (called MedTAKMI) for analyzing medical publications.
The following figure shows a graphical image of the TAKMI system.

TAKMI
TAKMI provides the following mining/analysis views:
- 2D Map ... shows the associations between concepts in the specified two categories.
- Trend Analysis ... shows how the frequency of documents that include concepts in the specified category changes over time.
- Chronological Analysis ... shows the frequency of documents.
- Topic Extraction ... shows salient events during the specified period.
Analysis Examples
We have evaluated TAKMI at IBM PC Help Centers in Japan and in the USA. In the Help Centers, call takers write a report of each call by typing in customer information such as name and phone number, selecting call categories such as "technical QA", and typing in brief descriptions of the customer's questions or messages and of the answers given and/or actions taken. Two typical analysis examples are as follows:
Trend Analysis
The following figure shows that Windows 98 was the word whose frequency increased the most in the "software" category from mid-June to early July 1998. (This figure is a snapshot of the Japanese version.)
For the July 1998 data, a list of "software...question" pairs that mention Windows 98 is shown. It contains the following messages.
- Is it possible to install Windows 98?
- Is it good for Windows 98?
- Can I use Windows 98?
- Can I upgrade?
This list tells us that most of these customers were asking whether they could install Windows 98 on their machines. In this case, the company prepared an answer to this question and published it on its WWW home page, both to reduce the number of calls from customers and to lighten the workload of call takers.

Trend of the 'Windows 98' topic

Relative Frequency
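The ranking behind this kind of trend view can be sketched roughly as follows. This is only a minimal illustration, not TAKMI's actual algorithm: it assumes each call report has already been reduced to a period label and a list of extracted concepts, and it ranks concepts by the change in relative frequency (the share of documents in a period that mention the concept). The function names, the record layout, and the sample data are all hypothetical.

```python
from collections import Counter


def relative_frequency(docs, period):
    """Share of documents in `period` that mention each concept."""
    in_period = [d for d in docs if d["period"] == period]
    counts = Counter()
    for d in in_period:
        # Count each concept at most once per document.
        counts.update(set(d["concepts"]))
    total = len(in_period)
    return {c: n / total for c, n in counts.items()} if total else {}


def fastest_rising(docs, earlier, later):
    """Concepts seen in `later`, sorted by increase in relative frequency."""
    before = relative_frequency(docs, earlier)
    after = relative_frequency(docs, later)
    increase = {c: after[c] - before.get(c, 0.0) for c in after}
    return sorted(increase.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample data, invented for illustration only.
docs = [
    {"period": "1998-06", "concepts": ["Windows 95", "printer driver"]},
    {"period": "1998-06", "concepts": ["Windows 95"]},
    {"period": "1998-06", "concepts": ["Windows 98"]},
    {"period": "1998-06", "concepts": ["printer driver"]},
    {"period": "1998-07", "concepts": ["Windows 98"]},
    {"period": "1998-07", "concepts": ["Windows 98", "printer driver"]},
    {"period": "1998-07", "concepts": ["Windows 98", "Windows 95"]},
    {"period": "1998-07", "concepts": ["Windows 95"]},
]

ranking = fastest_rising(docs, "1998-06", "1998-07")
print(ranking[0])  # ('Windows 98', 0.5)
```

Using relative rather than absolute frequency keeps a concept from looking like a rising topic merely because the total number of calls grew in the later period.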
2D Map
This is an example of the 2D table for the PC Help Center data. It shows how items in the PROBLEM category are distributed across items in the LIQUID category.

2D Map
From this table, you can see that "soda" in LIQUID is strongly associated with "sticking" in PROBLEM. The numbers in that cell, "12 (12.63%)", mean that 12 calls came from customers who mentioned both "sticking" and "soda," and that those 12 calls account for 12.63% of the calls mentioning "soda." Because this percentage is much higher than the percentages for the other items in LIQUID, the cell is automatically highlighted.
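The arithmetic behind such a cell can be illustrated with a short sketch. It assumes each call has already been tagged with the concepts found in each category. The call counts are hypothetical: only the "12 (12.63%)" cell comes from the text above, and the total of 95 "soda" calls is chosen so that 12/95 reproduces that percentage. The highlighting rule (a share at least `factor` times the mean share of the other columns in the same row) is also an assumption, since the page does not describe the rule TAKMI applies.

```python
from collections import Counter


def two_d_map(calls, row_category, col_category):
    """Return {(row, col): (count, percent of calls mentioning col)}."""
    pair_counts = Counter()
    col_totals = Counter()
    for call in calls:
        rows = set(call.get(row_category, []))
        cols = set(call.get(col_category, []))
        for col in cols:
            col_totals[col] += 1
            for row in rows:
                pair_counts[(row, col)] += 1
    return {
        (row, col): (n, 100.0 * n / col_totals[col])
        for (row, col), n in pair_counts.items()
    }


def highlight(cells, row, cols, factor=2.0):
    """Columns whose share in `row` is >= factor x the mean of the others."""
    shares = {col: cells.get((row, col), (0, 0.0))[1] for col in cols}
    picked = []
    for col, share in shares.items():
        others = [s for c, s in shares.items() if c != col]
        if others and share >= factor * sum(others) / len(others):
            picked.append(col)
    return picked


# Hypothetical calls: 95 mention "soda" (12 with "sticking"),
# 100 mention "coffee" (3 with "sticking").
calls = (
    [{"PROBLEM": ["sticking"], "LIQUID": ["soda"]}] * 12
    + [{"PROBLEM": [], "LIQUID": ["soda"]}] * 83
    + [{"PROBLEM": ["sticking"], "LIQUID": ["coffee"]}] * 3
    + [{"PROBLEM": [], "LIQUID": ["coffee"]}] * 97
)

cells = two_d_map(calls, "PROBLEM", "LIQUID")
count, share = cells[("sticking", "soda")]
print(f"{count} ({share:.2f}%)")  # 12 (12.63%)
print(highlight(cells, "sticking", ["soda", "coffee"]))  # ['soda']
```

Note that the percentage is taken relative to the column item ("soda"), matching the description above, rather than relative to the row item or to all calls.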

2Dmap requirement and products
