The main findings from our experiments are:

1. Although the differences are very small, rw * log r is at least as effective as rw * r in (A)-(D), and the difference is statistically significant in (B). Moreover, rw * r^(1/2) outperforms rw * r on average for Japanese IR ((C) and (D)). Thus it seems worthwhile to consider discounting r appropriately for the given data. Recent Japanese IR experiments at NTCIR-3 confirm this as well.

2. As in previous work, the advantage of incorporating tf into the term selection criterion is not clear: while (A) appears to suggest a small positive effect, rw * rtf is significantly worse than rw * r for Japanese IR. The advantage of calibration (i.e. using ctf) for Okapi is not clear either.

3. As in previous work, relative criteria generally do at least as well as the absolute ones.

Combining different expansion runs, as done in previous work, did not give any improvements in our experiments. Since there is no single outstanding criterion for PRF, the best strategy available at present is to carefully select a good criterion from several candidates during the training phase. As for future work, we would like to tackle Subproblem (a) mentioned in Section 2, along with other flexible PRF problems, in order to make PRF as close to true relevance feedback as possible.
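
To make the comparison in finding 1 concrete, the following is a minimal sketch of how the three criteria rw * r, rw * log r and rw * r^(1/2) could rank candidate expansion terms. It assumes that rw is the Robertson/Sparck Jones relevance weight with 0.5 smoothing, that r is the number of pseudo-relevant documents containing the term, and that r >= 1 for every candidate. The function names, the dictionary layout and the toy statistics are hypothetical and are not taken from our experimental system.

```python
import math

def rsj_weight(r, R, n, N):
    """Robertson/Sparck Jones relevance weight (assumed definition of rw).

    r: pseudo-relevant documents containing the term
    R: number of pseudo-relevant documents
    n: documents in the collection containing the term
    N: number of documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

# Term selection criteria; each maps (rw, r) to a selection value.
# All three assume r >= 1 (log r is undefined for r = 0).
CRITERIA = {
    "rw*r": lambda rw, r: rw * r,
    "rw*log r": lambda rw, r: rw * math.log(r),
    "rw*r^(1/2)": lambda rw, r: rw * math.sqrt(r),
}

def rank_terms(stats, R, N, criterion, k=None):
    """Rank candidate terms by the named criterion, best first.

    stats: hypothetical mapping term -> (r, n)
    """
    score = CRITERIA[criterion]
    scored = [(score(rsj_weight(r, R, n, N), r), term)
              for term, (r, n) in stats.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [term for _, term in scored[:k]]

if __name__ == "__main__":
    # Toy statistics: "a" is frequent everywhere, "b" is rare but focused.
    toy = {"a": (9, 300), "b": (3, 3)}
    for name in CRITERIA:
        print(name, rank_terms(toy, R=10, N=1000, criterion=name))
```

In this toy setting, discounting r lets the rarer but more focused term overtake the frequent one, which is the kind of effect the discounted criteria are meant to capture. Which criterion works best still has to be decided on training data, as stated above.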