A new statistical strategy for pooling: ELI
Abstract
Making exhaustive relevance judgments is one of the most challenging tasks in constructing an IR test collection, especially when the collection contains millions of documents. Pooling (or system pooling), a method for selecting which documents to assess, is a solution to this challenge. In this paper, we introduce a new rank-based document selection criterion for forming such an assessment pool, called the expected level of importance (ELI). Experiments on TREC 5, 6, 7, and 8 data showed that, with a pool in which documents are sorted in decreasing order of their ELI scores, relevance judgments can be made efficiently with minimal human effort while maintaining the size and effectiveness of the resulting test collection. The proposed criterion can be adapted directly to the traditional TREC pooling practice to improve efficiency at no additional cost. © 2013 Elsevier B.V. All rights reserved.