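Abstract: This paper studies query performance prediction for the task of search result diversification. It analyzes the shortcomings of existing performance prediction algorithms, considers different ways of measuring the diversity of the final result list, and on that basis proposes three methods that assess both the relevance and the diversity of query results. The experimental platform is built on the TREC ClueWeb09B dataset, the query sets of the Web Track task, and the open-source Indri search engine. Evaluation based on the Spearman, Pearson, and Kendall correlation coefficients shows that the three proposed methods are better suited than traditional methods to predicting the performance of diversified retrieval results, and that their performance is stable under different conditions.
Keywords: information retrieval; query performance prediction; search result diversification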
CLC number: TP391.3
Document code: A
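Funding: This work was supported by the Natural Science Foundation of Jiangsu Province (BK20171303: Federated search engine technology supporting search result diversification in a big-data environment).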
Query Performance Prediction for Search Result Diversification in Information Retrieval
ZHANG Zhongmin,WU Shengli
(Jiangsu University, Zhenjiang 212013, China)
Abstract: Query performance prediction in support of the search result diversification task is studied. This paper analyzes the shortcomings of existing performance prediction algorithms, considers different ways of measuring the diversity of the final result list, and proposes three methods that examine the relevance and the diversity of query results simultaneously. The TREC ClueWeb09B dataset, the query sets of the Web Track task, and the open-source search engine Indri are used to build the experimental platform and carry out the experiments. Measured by the Spearman, Pearson, and Kendall correlation coefficients, the evaluation results show that the three proposed methods are better suited than traditional methods to predicting the performance of diversified retrieval results, and that their performance is stable under different conditions.
Keywords: information retrieval; query performance prediction; search result diversification