Humans summarize a text by first understanding it through deep semantic
processing that draws on extensive domain and common-sense knowledge. Current
computers cannot yet simulate this process. Therefore, most automatic summarization
programs analyze a text statistically and linguistically, identify its
important sentences, and generate a summary from those sentences.
- Important sentence extraction ... The importance of a sentence is determined from
surface clues such as the number of important keywords it contains, the type of sentence (fact, conjecture, opinion, etc.),
its rhetorical relations to the surrounding context, and its location in the document. A summary is created by selecting the important sentences. However, a text made up of only these sentences often lacks cohesion. To keep the summary natural to read, we therefore need a method that also selects the related sentences whenever a selected sentence depends on its context.
- Adaptation by document types ... The summarization method should vary with the type of text. For instance, we would use different summarization strategies for ordinary news articles and for newspaper editorials. This corresponds to changing
the weight (or importance) of each surface feature when calculating sentence importance. We developed a method that determines the optimal set of feature weights from experimental results.
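
The two ideas above (weighted surface features with context-aware selection, and per-document-type weights) can be illustrated with a minimal Python sketch. It is not the method used in the products below. The feature names, the `depends_on` field, and the grid search in `tune_weights` are all assumptions made for illustration; the text says only that weights are determined from experimental results.

```python
from itertools import product

# Hypothetical surface features, each assumed to be scaled to [0, 1].
FEATURES = ("keywords", "sentence_type", "rhetorical", "position")

def score(features, weights):
    """Weighted sum of one sentence's surface-feature values."""
    return sum(weights[name] * features[name] for name in FEATURES)

def summarize(sentences, weights, k):
    """Pick the k highest-scoring sentences, then add the sentences they depend on.

    Each sentence is a dict with 'features' (name -> value) and 'depends_on'
    (indices of the sentences it needs for context). Both fields are assumed.
    """
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]["features"], weights),
                    reverse=True)
    chosen = set(ranked[:k])
    # Pull in context sentences (transitively) so the extract stays cohesive.
    stack = list(chosen)
    while stack:
        i = stack.pop()
        for j in sentences[i]["depends_on"]:
            if j not in chosen:
                chosen.add(j)
                stack.append(j)
    return sorted(chosen)  # restore document order

def tune_weights(documents, k, grid=(0.0, 0.5, 1.0)):
    """Grid-search the weights that best reproduce reference extracts.

    documents: list of (sentences, reference_indices) pairs for one document
    type. Grid search is an assumed stand-in for the actual tuning method.
    """
    best_weights, best_overlap = None, -1
    for values in product(grid, repeat=len(FEATURES)):
        weights = dict(zip(FEATURES, values))
        overlap = sum(len(set(summarize(s, weights, k)) & set(ref))
                      for s, ref in documents)
        if overlap > best_overlap:
            best_weights, best_overlap = weights, overlap
    return best_weights
```

Running `tune_weights` separately on ordinary articles and on editorials would yield one weight set per document type, which `summarize` could then apply to new documents of that type. The overlap count used here is a deliberately crude measure of agreement; a real evaluation would likely also penalize overly long extracts.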
Systems / Products
Internet King of Translation (in Japanese) ... Web translation software released by IBM Japan. It incorporates this summarization technology as a function for summarizing English Web pages.
Lotus Word Pro Japanese Version ... Lotus Word Pro Japanese Version incorporates this technology to summarize Japanese documents.
TRL in the media
- ``Text Summarization Method by Computer,'' Asahi Newspaper Evening Edition,
September 10, 1997.
- Nakano, ``The meaning of reading questioned by summarization softwares,''
Monthly GENGO Magazine, pp. 54--59, Vol. 27, No.2, 1998.