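Abstract: Choosing an appropriate similarity measure for inferring a co-citation network is important for improving the relevance and authenticity of the network. However, how sample size affects the choice of similarity measure is not yet known. Focusing on sensitivity to sample size, this study infers co-citation networks with two commonly used similarity measures, the Phi correlation coefficient (Phi) and the Ochiai coefficient (Och), and evaluates the quality of the inferred networks through node attributes and topological structure. The results show that, compared with Phi, the co-citation network inferred by Och is strongly robust to sample size. As the sample size changes, the network inferred by Och consistently exhibits the small-world property, whereas the network inferred by Phi does not. The conclusions can be extended to other networks inferred from transaction data that exhibit the small-world property, and the study also adds to the basic theory of network technology research.
Keywords: similarity measure; sample size; co-citation network; Ochiai coefficient; Phi correlation coefficient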
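CLC number: TP393.0
Document code: A
Funding: National Natural Science Foundation of China (61801264); Shandong Provincial Social Science Foundation (19BJCJ47); Liaocheng University General Project in Humanities and Social Sciences (321021948)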
The Impact of Sample Size on the Selection of Similarity Measure in Co-citation Network
MA Zhen¹, JIA Baoxian²
(1. Library, Liaocheng University, Liaocheng 252000, China; 2. School of Computing, Liaocheng University, Liaocheng 252000, China)
mazhen@lcu.edu.cn; jiabaoxian@lcu.edu.cn
Abstract: Choosing an appropriate similarity measure to infer a co-citation network is of great significance for improving the relevance and authenticity of the network. However, the impact of sample size on the selection of a similarity measure is unknown. Focusing on sensitivity to sample size, two commonly used similarity measures, the Phi correlation coefficient (referred to as Phi) and the Ochiai coefficient (referred to as Och), are used to infer the co-citation network. The quality of the inferred networks is evaluated through node attributes and topological structure. The results show that, compared with Phi, the co-citation network inferred by Och is strongly robust to sample size. As the sample size changes, the co-citation network inferred by Och consistently exhibits the small-world property, while the network inferred by Phi does not. The conclusions can be extended to other networks inferred from transaction data that exhibit the small-world property. The study also enriches the basic theory of network technology research.
Keywords: similarity measure; sample size; co-citation network; Ochiai coefficient; Phi correlation coefficient
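
For readers unfamiliar with the two measures, the following is a minimal sketch of their standard textbook definitions on a 2x2 contingency table, assuming each cited document is represented as a binary vector over the citing papers in the sample. The function names, the vector representation, and the toy data are illustrative assumptions and are not taken from the study itself. With a = papers citing both documents, b and c = papers citing only one, and d = papers citing neither, Och = a / sqrt((a+b)(a+c)) and Phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)).

```python
import math


def contingency(x, y):
    """Return (a, b, c, d) for two equal-length binary citation vectors.

    a: citing papers that cite both documents
    b: cite only the first, c: cite only the second
    d: cite neither
    """
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = len(x) - a - b - c
    return a, b, c, d


def ochiai(x, y):
    """Ochiai coefficient; does not use d (joint non-citations)."""
    a, b, c, _ = contingency(x, y)
    denom = math.sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


def phi(x, y):
    """Phi correlation coefficient; uses all four cells, including d."""
    a, b, c, d = contingency(x, y)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


# Toy data (hypothetical): 4 citing papers, then the same two documents
# with 96 additional citing papers that cite neither of them.
x_small, y_small = [1, 1, 0, 0], [1, 0, 0, 0]
x_large, y_large = x_small + [0] * 96, y_small + [0] * 96

print(ochiai(x_small, y_small), ochiai(x_large, y_large))
print(phi(x_small, y_small), phi(x_large, y_large))
```

In this toy case the Ochiai value is identical for both sample sizes because the formula ignores d, while the Phi value changes as papers citing neither document are added. This shows only one way that sample size can enter the two formulas; it is not a reproduction of the study's experiments.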