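Abstract: User profiling is mainly used for precision marketing, user credit assessment, and personalized recommendation, and the technology has been widely applied in telecommunications, e-commerce, social networks, and other fields. In many fields, however, data islands mean that data empowerment has not yet truly been achieved. Postgraduate education evaluation is one example: on the one hand, education authorities mostly evaluate postgraduate education through sampling and questionnaire surveys; on the other hand, massive amounts of research data are scattered across the Internet. Sites such as Zhicub (智立方), Google Scholar, Baidu Academic, DBLP, and Web of Science index the publications of large numbers of researchers, while records of education and work experience are scattered across recruitment websites. Accurate evaluation of postgraduate education quality requires integrating these research data and breaking down the data islands. Therefore, taking Internet data as the basis, this paper designs and develops a researcher profiling system built on data fusion to assist researchers, research institutions, and education authorities in intelligent decision-making.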
Keywords: researcher profiling; data island; data cleaning; data fusion
CLC number: TP391
Document code: A
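Funding: National Key Research and Development Program of China (2016YFB1000905); National Natural Science Foundation of China (61402177, 61502236, 61472321); Shanghai Agriculture Promotion Project (T20170303).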
Design and Implementation of the Scientific Researchers Profiling System
JIN Gangzeng, LI Na, ZHENG Jianbing, GAO Ming
(School of Data Science and Engineering, East China Normal University, Shanghai 200062, China)
Abstract: User profiling is mainly used for precision marketing, credit investigation, and personalized recommendation, and the technology has been widely adopted in telecommunications, e-commerce, social networking, and other fields. In many fields, however, data islands have kept data empowerment from being realized. Take postgraduate education evaluation as an example. On the one hand, most education authorities evaluate postgraduate education through sampling and questionnaires. On the other hand, large volumes of research data are scattered across heterogeneous web platforms: Zhicub, Google Scholar, Baidu Academic, DBLP, and Web of Science index the papers of many researchers, while researchers' education backgrounds and work experience are spread over various recruitment websites. Accurate assessment of postgraduate education quality therefore requires integrating these types of research data and breaking down the data islands. Based on Internet data, this paper designs and implements a researcher profiling system built on data fusion to assist researchers, research institutes, and education authorities in making intelligent decisions.
Keywords: researcher profiling; data island; data cleaning; data fusion