IBM Haifa researcher Maxim Gurevich and co-author Ziv Bar-Yossef of the Technion - Israel Institute of Technology recently received the Best Paper Award at the 15th International World Wide Web Conference (WWW2006) in Edinburgh. The WWW conference is the premier conference on Web Search Engine Technology and one of the most selective conferences around, with an acceptance ratio of approximately 11%. This year, only 75 papers were accepted from among 667 submissions. The Program Committee selected Gurevich and Bar-Yossef's paper, titled "Random Sampling from a Search Engine's Index", as the best of the 75 accepted papers.
Organized by the International World Wide Web Conference Committee (IW3C2), the annual WWW conference has played a fundamental role in gathering trailblazers from the international community to discuss, debate and explore how to shape and develop the future direction of the World Wide Web. The paper was submitted to the conference's Search Track, which, along with the Web Mining Track (105 submissions), traditionally draws the most submissions (120 this year) and the largest session audiences.
The paper itself presents a novel technique for estimating the parameters used to measure search engine performance, including the index size (how well the search engine covers the Web), the percentage of spam, the freshness of the index, and so forth. Although techniques exist for determining these parameters, the resulting measurements are usually biased and do not produce objective or accurate results. For example, a content-rich page has a greater impact on these measurements than a short page, and therefore induces bias. The new technique estimates the bias, calculates how much it skews the measurements, and then neutralizes its effect by means of a counterbalancing bias.
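As a rough illustration of how a known sampling bias can be counterbalanced, the Python sketch below uses importance weighting on a toy corpus. It is not the algorithm from the paper: the corpus, the assumption that a page is sampled with probability proportional to its length, the assumption that spam pages tend to be longer, and all function names are hypothetical and chosen only for the example.

```python
import random


def make_corpus(n, seed=0):
    """Build a hypothetical corpus of (length, is_spam) pairs."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(n):
        is_spam = rng.random() < 0.2
        # Assumption for illustration only: spam pages tend to be longer.
        length = rng.randint(500, 2000) if is_spam else rng.randint(50, 1000)
        corpus.append((length, is_spam))
    return corpus


def biased_sample(corpus, k, rng):
    """Draw k pages with probability proportional to page length.

    This mimics the bias described above: content-rich pages are
    more likely to be picked than short ones.
    """
    weights = [length for length, _ in corpus]
    return rng.choices(corpus, weights=weights, k=k)


def naive_estimate(sample):
    """Spam fraction computed directly from the biased sample."""
    return sum(is_spam for _, is_spam in sample) / len(sample)


def corrected_estimate(sample):
    """Spam fraction with each draw weighted by 1/length.

    Weighting by the inverse of the (known) sampling bias cancels it
    out; dividing by the total weight normalizes the estimate.
    """
    inverse = [1.0 / length for length, _ in sample]
    spam_weight = sum(w for w, (_, is_spam) in zip(inverse, sample) if is_spam)
    return spam_weight / sum(inverse)


if __name__ == "__main__":
    rng = random.Random(1)
    corpus = make_corpus(20000)
    sample = biased_sample(corpus, 50000, rng)
    truth = sum(is_spam for _, is_spam in corpus) / len(corpus)
    print("true spam fraction:      ", round(truth, 3))
    print("naive (biased) estimate: ", round(naive_estimate(sample), 3))
    print("bias-corrected estimate: ", round(corrected_estimate(sample), 3))
```

In this toy setting the naive estimate overstates the spam fraction because long pages are over-represented, while the weighted estimate lands close to the true value. A real search engine does not expose the sampling bias directly, which is why it has to be estimated first.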
Gurevich and Bar-Yossef appear to be the first to use these methods for measuring search engine performance. The researchers analyzed the accuracy of their technique rigorously and proved that the calculated estimates are guaranteed to be close to the true values. Experiments on a corpus of 2.4 million documents substantiate their analytical findings and show that their measurements do not have significant bias.
"The results of Maxim and Ziv's work will have a significant influence on future work in this area," stated Ronny Lempel, Manager of Information Retrieval at the IBM Haifa Labs. "Being able to determine the size of the search space and more accurately measure the performance of search engines is of major importance. This award will motivate others to delve further into the topic and advance research and development in this area."