IBM Research - Israel

Artificial Intelligence

We are innovating and developing core technologies that advance the state of the art in areas such as natural language processing and generation, computer vision, speech technologies, optimization, and AI trust. Our teams create technologies to solve business problems in areas such as customer care, business analytics, process automation, and asset management. The data we handle includes unstructured data, such as text, images, and speech; semi-structured data; and traditional structured data.

 

News and Blogs

What is retrieval-augmented generation?

Retrieval-augmented generation (RAG) involves grounding large language models in external data for better results. ISRL researchers are working on this solution, described in this IBM Research blog post.

IBM and ING Optimize Data Usage Across Clusters

ING and IBM Research have partnered to add automatic IT optimization to Fybrik. In this post, ISRL researchers explain how Fybrik now addresses both governance requirements and IT infrastructure preferences.

Multi-agent and energy-efficient AI models are the next big thing

Project Debater is already capable of reasoning, yet even more is possible with cutting-edge language models, as researcher Noam Slonim explains in this podcast.

What is synthetic data?

According to researcher Ateret Anaby-Tavor, the push of a button can yield thousands of sentences to fill a chatbot's knowledge gaps -- you just need to evaluate and filter them. Learn more in this blog post.

Research Projects

Document Understanding

Business documents are central to many corporate processes and lie at the heart of digital transformation. Such documents include contracts, loan applications, invoices, purchase orders, financial statements, and many more. The information in these documents is presented in natural language and is often unstructured. Understanding these documents is challenging because of complex layouts and content such as tables, charts, and infographics. It is often even more challenging because of poor-quality or noisy scans and inaccurate OCR. Document Understanding is the ability to read these business documents, either programmatically or by OCR, and interpret their content so that it can be used in downstream automated business processes. We treat this as a multidisciplinary challenge that spans computer vision, natural language understanding, information representation, and model optimization, and in doing so we advance the state of the art in document understanding. In 2021, we participated in a workshop and shared publications about this research. Here is a link to all of the papers accepted at that workshop, several of which are from IBM.

Learn more about Deep Document Understanding on the IBM Research Blog

Conversational Text to Speech

The voice channel is a crucial element in customer-care scenarios, especially over the phone, and text-to-speech (TTS) systems play a fundamental role in establishing and maintaining a positive customer experience.
We are developing a conversational end-to-end text-to-speech system for use in conversational voice agents for customer care. By designing and recording a speech corpus with conversational content, expressive speaking styles, and interjections, and by employing innovative deep learning and data augmentation techniques, we enable our conversational TTS system to produce human-sounding, expressive spoken responses in a variety of voices.

Advanced Speech Classification

Human speech is a rich signal that carries a vast amount of information. In addition to words, the speech signal encodes information about the speaker’s identity, language, accent, emotions, and physical state, which can be particularly useful when analyzing customer speech to improve service and the customer experience.
We are developing advanced speech classification technology based on state-of-the-art self-supervised speech representations. The technology enables a computer to accurately identify elements such as the customer’s language or emotion in customer-care calls with either a human agent or a voice bot.

Customer Care

Our team is advancing the research behind Watson Assistant, the AI-powered virtual agent that provides customers with fast, consistent, and accurate answers across multiple messaging platforms, applications, devices, and channels. We're helping Watson Assistant learn how to provide even better answers to common questions through the website, social media, chatbots, or with customer support agents.

Read more

AI-powered Business Automation

Automation improves business performance by making all information-centric jobs more productive; AI accelerates and further scales automation. We discover, generate, and improve business processes with AI, making automation trustworthy for employees and enterprises alike. We automate every enterprise, one at a time, with a focus on asset management, facility management, and supply chain processes. Our innovations reach the market via research-led pilots, IBM’s Automation products, and IBM Sustainability Software’s products.

Publications

IBM researchers in Israel publish a wide variety of work every year as part of the lab's research projects, in collaboration with other researchers and scientists in IBM, and together with academic and industrial partners from around the world.

Researchers in our group publish works at conferences and in scientific journals such as the AAAI conference, Nature, the ICASSP conference, NeurIPS, and others.

Tools & Code

Label Sleuth

An open-source, no-code system for annotating text and building text classifiers.

View project

Project Debater's Early Access Program

We offer free access to these services as Cloud APIs for non-commercial academic use. The early access website is available at early-access-program.

View project

Low-Resource Text Classification Framework

A framework for experimenting with text classification tasks, focusing on low-resource scenarios, and examining how active learning (AL) can be used in combination with classification models, from the Ein-dor et al. (2020) paper.

View project
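
As a rough illustration of the active learning idea mentioned above, the sketch below shows uncertainty sampling, one common AL selection strategy for binary classification. It is not the framework's API: the function name `select_most_uncertain` and the example probabilities are hypothetical, and the framework itself may use different strategies.

```python
from typing import List, Sequence


def select_most_uncertain(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k unlabeled examples to send for labeling.

    probs[i] is a classifier's predicted probability of the positive class
    for unlabeled example i. Examples closest to 0.5 are the ones the
    classifier is least sure about, so they are ranked first.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


# Hypothetical predictions over a pool of five unlabeled texts.
pool_probs = [0.97, 0.52, 0.10, 0.45, 0.80]
to_label = select_most_uncertain(pool_probs, k=2)
print(to_label)  # indices of the two least certain examples
```

In a low-resource setting, the selected examples would be labeled by a person, added to the training set, and the classifier retrained before the next selection round.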

Intermediate Training using Clustering

Intermediate training of BERT in an unsupervised manner improves topical classification when labeled data is scarce. Code from the ACL paper by Shnarch et al. (2022).

View project

AI Privacy and Compliance Toolkit

A toolkit of tools and techniques related to the privacy and compliance of AI models. The anonymization module contains methods for anonymizing ML model training data, so that when a model is retrained on the anonymized data, the model itself will also be considered anonymous. The minimization module contains methods to help adhere to the data minimization principle in GDPR for ML models. It reduces the amount of personal data needed to perform predictions with a machine learning model, while still enabling the model to make accurate predictions. This is done by removing or generalizing some of the input features.

View project
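
To make the idea of generalizing input features more concrete, here is a minimal sketch that does not use the toolkit itself. It replaces an exact age with the midpoint of a ten-year range and checks how often a model's predictions stay the same. The names `generalize_age`, `toy_model`, and `agreement`, the records, and the decision rule are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


def generalize_age(record: Record, bucket_size: int = 10) -> Record:
    """Return a copy of the record with the exact age replaced by its bucket midpoint."""
    generalized = dict(record)
    lower = (int(record["age"]) // bucket_size) * bucket_size
    generalized["age"] = lower + bucket_size / 2
    return generalized


def toy_model(record: Record) -> int:
    """Hypothetical stand-in for a trained model: 1 if age >= 30 and income >= 50,000."""
    return int(record["age"] >= 30 and record["income"] >= 50_000)


def agreement(model: Callable[[Record], int],
              records: List[Record],
              transform: Callable[[Record], Record]) -> float:
    """Fraction of records whose prediction is unchanged after the transform."""
    same = sum(model(r) == model(transform(r)) for r in records)
    return same / len(records)


records = [
    {"age": 34, "income": 62_000},
    {"age": 27, "income": 80_000},
    {"age": 51, "income": 41_000},
    {"age": 45, "income": 90_000},
]
print(agreement(toy_model, records, generalize_age))  # 1.0
```

Here the predictions are unchanged because the toy model's age threshold falls on a bucket boundary. With a real model, the agreement score indicates how much accuracy is traded for collecting less precise personal data.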

Academic Collaboration

Collaborate with our researchers on a wide range of NLP (Natural Language Processing) topics ranging from conversational agents and neural information retrieval to computational argumentation.

Let's talk

We're always happy to talk. Feel free to get in touch.

