The project focuses on technology for discovering previously unknown knowledge and relationships between entities, such as genes and proteins, in huge volumes of biomedical documents by using natural language processing (NLP). MedTAKMI ("IBM TAKMI for biomedical documents") extends IBM TAKMI to biomedical documents. IBM TAKMI is a text mining system for call center analysis that has been successfully introduced at the call centers of several customers.
Biomedical literature poses challenges for NLP technologies, such as the huge size of the dictionary, the many synonyms for a single keyword, and the numbers and symbols that appear in the text. It also poses challenges for data mining technologies, such as the huge volume of data, thesauri with deep hierarchies, and the need to respond to queries interactively.
We have successfully deployed the MedTAKMI system at a customer site, where it manages over 12 million MEDLINE documents.