Organizing Committee
- David Carmel, IBM Research Lab in Haifa, carmel@il.ibm.com
- Ian Soboroff, NIST, ian.soboroff@nist.gov
- Elad Yom-Tov, IBM Research Lab in Haifa, yomtov@il.ibm.com

Program Committee
- Gianni Amati, Fondazione Ugo Bordoni
- Steve Cronen-Townsend, University of Massachusetts
- Kui-Lam Kwok, City University of New York
- Iadh Ounis, University of Glasgow
- Ellen Voorhees, NIST

ACM SIGIR 2005 Workshop

Predicting Query Difficulty - Methods and Applications
Background
Estimation of query difficulty is an attempt to quantify the quality of the results a search system returns for a query. Ideally, a system that can predict difficult queries can adapt its parameters or change algorithms to suit the query. More simply, such a system could give feedback to the user, for example by reporting confidence scores for results, or alert the system administrator to topics that are of increasing interest to users but are not answered well by the system.
The prediction of query difficulty has recently been recognized by the IR community as an important capability for IR systems. The Reliable Information Access (RIA) workshop, organized by NIST, investigated the reasons for performance variance across systems by performing failure analysis of several state-of-the-art IR systems. One conclusion of that workshop was that if a system can recognize the problems associated with a given topic, then current IR techniques are able to improve results significantly. This suggests that systems can improve performance by discovering which techniques to apply to which topics.
In the Robust track of TREC 2004, systems were asked to rank the topics by predicted difficulty, with the goal of eventually being able to use such predictions to do topic-specific processing. System performance was measured by comparing the ranking of topics based on their actual precision to the ranking of topics based on their predicted precision. Prediction methods suggested by the participants included measuring query difficulty from the system's scores for the top results, analyzing the ambiguity of the query terms, and learning a predictor using TREC topics and their associated relevance sets as training data. The track results clearly demonstrate that measuring query difficulty is still intrinsically difficult.
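As an illustration, one of the simplest prediction methods mentioned above treats the retrieval scores of the top-ranked results as a difficulty signal, and the Robust-track evaluation compares the predicted ordering of topics with the ordering by actual precision using a rank correlation. The following is a minimal sketch of that idea; the topic names, retrieval scores, and average-precision values are invented for illustration, not taken from any track run.

```python
# Sketch of score-based query-difficulty prediction and its rank-correlation
# evaluation. All data below is hypothetical.

def score_based_signal(top_scores):
    """Predict query 'ease' as the mean retrieval score of the top results.
    The assumption: higher top scores suggest the query will be answered well."""
    return sum(top_scores) / len(top_scores)

def kendall_tau(xs, ys):
    """Kendall's tau-a between two lists of values, comparing how similarly
    they rank the same items (as the Robust track compared predicted and
    actual topic orderings)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical topics: top-5 retrieval scores and actual average precision.
topics = {
    "t1": ([0.91, 0.88, 0.85, 0.80, 0.78], 0.52),
    "t2": ([0.40, 0.38, 0.33, 0.30, 0.29], 0.08),
    "t3": ([0.75, 0.60, 0.55, 0.50, 0.45], 0.31),
}
predicted = [score_based_signal(scores) for scores, _ in topics.values()]
actual = [ap for _, ap in topics.values()]
print(kendall_tau(predicted, actual))  # 1.0 here: prediction ranks topics perfectly
```

A tau of 1.0 means the predictor ordered the topics exactly as their actual precision did; in practice, as the track results showed, such simple signals correlate only weakly with true topic difficulty.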
Goals
In this workshop, we will explore techniques for the prediction of and adaptation to query difficulty. Specifically, we plan to focus on:
- Identifying the reasons a specific query is difficult for a given system
- Prediction methods for query difficulty
- Classification of queries and failure modes, with an eye toward predicting difficulty and suggesting solutions
- Evaluation methodology for query prediction
- Potential applications for query prediction
- Tools and techniques for analysis of retrieval results and failure modes
Planned Activities
- Presentation sessions on query prediction (accepted submissions from candidate participants)
- Invited talks
- Panel discussion
Submission Information
Submissions should be sent by email to David Carmel (carmel@il.ibm.com) or Ian Soboroff (ian.soboroff@nist.gov).
For presentation:
A short vita and a position paper of no more than 2000 words (PDF or Postscript format). Final versions should be submitted in PDF or Postscript for the printed workshop materials.
For participation only:
A statement of interest, not to exceed 500 words.
Working notes for the workshop, containing all the research and position papers, will be distributed to participants at the workshop. If there is interest among attendees, we may publish an edited volume after the workshop.
Important Dates
| Submission: | May 15, 2005 |
| Notification: | July 01, 2005 |
| Final version: | July 22, 2005 |
| SIGIR technical conference: | August 15-18, 2005 |
| SIGIR workshop: | August 19, 2005 |
Program
| 09:00 - 09:15 | Welcome |
| 09:15 - 10:15 | Keynote: Towards Query-Specific Customization of IR Systems, Donna Harman (NIST) |
| 10:15 - 10:45 | Coffee break |
| 10:45 - 12:30 | First session - Prediction methods |
| 10:45 - 11:15 | An Attempt to Identify Weakest and Strongest Queries, K.L. Kwok |
| 11:15 - 11:45 | Linguistic Features to Predict Query Difficulty, Josiane Mothe, Ludovic Tanguy |
| 11:45 - 12:15 | Automatic Classification of Queries by Expected Retrieval Performance, Grivolla, Jourlin, de Mori |
| 12:15 - 12:30 | Open discussion |
| 12:30 - 13:30 | Lunch |
| 13:30 - 15:00 | Second session - Prediction applications |
| 13:30 - 14:00 | Predicting Query Performance in Intranet Search, Craig Macdonald, Ben He, Iadh Ounis |
| 14:00 - 14:30 | Predicting Performance for Gene Queries, Aditya Sehgal, Padmini Srinivasan |
| 14:30 - 15:00 | Metasearch and Federation using Query Difficulty Prediction, Elad Yom-Tov, Shai Fine, David Carmel, Adam Darlow |
| 15:00 - 15:30 | Coffee break |
| 15:30 - 16:30 | Panel discussion - TBD |
Further Information
Additional information is available at the Query Prediction Workshop Web page, and questions may be sent directly to any of the organizers listed above. Information about the workshop venue and local arrangements (hotel reservations, etc.), as well as the sponsoring conference, can be found at the SIGIR Conference main Web page.