AI that understands

Building the first AI for human-like understanding of complex documents

Our mission

The AI that Understands (AITU) team is building the first AI for human-like understanding of complex documents, with the aim of helping people make well-informed decisions.

Our mission is to build artificial intelligence algorithms that understand complex objects such as documents. We aim to push the current limits of state-of-the-art research in knowledge representation and reasoning. Our algorithms make the information in documents, from user manuals to scientific papers, accessible.

AI and deep-learning systems are touching every aspect of our lives due to great successes in computer vision (self-driving cars), speech processing, natural language processing and many other applications.

Despite these successes, current AI systems still lack real, deep understanding. For instance, object recognition systems are easily fooled by objects in unusual orientations or environments. They make errors that humans avoid, because they only analyse statistical properties of pixels, sounds or characters, but have no actual understanding of the content.

Our challenge

Our team at IBM Research – Australia is composed of researchers with comprehensive expertise in natural language understanding, image processing and knowledge representation. The team aims to build an AI that can understand documents such as user manuals, insurance policies or scientific papers and answer complex questions such as, “How do I change the batteries?” when given the manual for an alarm clock.

Understanding documents is challenging because they combine text, titles, pictures, graphs, tables and other types of information that are interlinked in complex ways. An AI system needs to extract these pieces of knowledge and represent them in a way that a computer can reason about. This is the key challenge faced by the AI that Understands team.

Publications

Peter Zhong, Ella Shafiei, Antonio Jose Jimeno Yepes,
“Image-based table recognition: data, model, and evaluation,”
ECCV 2020, submitted.

Ella Shafiei, Antonio Jose Jimeno Yepes, Peter Zhong, David Martinez,
“Global Locality in Event Extraction,”
ACL/BioNLP 2020.

Ying Xu, Peter Zhong, Antonio Jose Jimeno Yepes, Jey Han Lau,
“Forget Me Not: Reducing Catastrophic Forgetting for Domain Adaptation in Reading Comprehension,”
IJCNN, 2020.

Peter Zhong, Jianbin Tang, Antonio Jose Jimeno Yepes,
“PubLayNet: largest dataset ever for document layout analysis,”
ICDAR 2019. Best paper award.

Ella Shafiei, Antonio Jose Jimeno Yepes,
“Neural Relation Extraction from Biomedical Literature,”
American Medical Informatics Association, 2019.

Our team

Antonio Jose Jimeno Yepes

Technical team leader

David Martinez Iraola

IBM Research scientist

Ella Shafiei

Post-doctoral researcher

Peter Zhong

IBM Research scientist

Stefan Maetschke

Machine learning researcher

Ying Xu

Post-doctoral researcher