Computer Vision

Modern computer vision systems achieve superhuman accuracy in image recognition and analysis, but they don’t really understand what they see. At IBM Research, we’re designing AI systems that can see the world as we do.

Our work

Introducing the IBM Granite 4.1 family of models
Release
Mike Murphy
29 Apr 2026
An artist’s tribute to modern AI
Q & A
Kim Martineau
27 Oct 2025
IBM Granite Vision tops the chart for small models in document understanding
News
Mike Murphy
26 Jun 2025
IBM Granite now has eyes
Research
Kim Martineau
26 Feb 2025
IBM’s new benchmark changes monthly to avoid teaching to the test
Research
Kim Martineau
17 Feb 2025
Environmental analysis made easier with IBM’s Geospatial Studio
News
Kim Martineau
21 May 2024

Publications

Identify, Locate, Link: End-to-End Key-Value Extraction from Document Images
- Said Gürbüz, Ahmed Nassar, et al.
- 2026
- ICDAR 2026
- Conference paper
SatWellMCQ: A Vision–Language Satellite Dataset for MCQ‑Based Image Grounding of Oil Wells
- Ahmed Emam, Sultan Alrowili, et al.
- 2026
- IGARSS 2026
- Conference paper
Towards Effective Waste Segmentation for Automated Waste Recycling in Cluttered Background
- Mamoona Javaid, Mubashir Noman, et al.
- 2026
- ICML 2026
- Conference paper
Moving Beyond Sparse Grounding with Complete Screen Parsing Supervision
- Said Gürbüz, Sunghwan Hong, et al.
- 2026
- ICML 2026
- Conference paper
SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs
- Niccolo Avogaro, Nayanika Debnath, et al.
- 2026
- ICML 2026
- Conference paper
AraVQA: Building a New Arabic Factoid Visual Question Answering Dataset from Wikipedia
- Sultan Alrowili, Younes Samih, et al.
- 2026
- ACL 2026
- Conference paper