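Abstract: Big data analysis and its applications have attracted attention across industries. People try to discover the patterns and regularities hidden in large volumes of data and thereby create more value, and data filtering, a common technique in the data analysis process, plays an irreplaceable role. To meet the practical need of letting users quickly filter data and find differential data subsets [1], the relationships among data items must be analyzed and mined and the data filtering rules must be modeled, so that users can quickly locate differential data subsets. This paper proposes a novel filtering rule modeling method for finding differential data subsets. The method addresses how to apply data filtering rules in data analysis to build an analytical filtering model, then use the model to analyze and filter the data to obtain differential data subsets, and finally use the model to visualize the result set automatically. A data analysis system built with this modeling method can quickly find differential data subsets in real data sets and automatically visualize the resulting subsets, which demonstrates the practicality and efficiency of the modeling method.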
Keywords: data analysis; differential data; filtering model
CLC number: TP18
Document code: A

Funding: National Key R&D Program of China (2018YFB1004404); National Natural Science Foundation of China (61732004).

A Filtering Rule Modeling Method for Finding Subset of Differential Data
ZHOU Pengcheng,HE Zhenying,JING Yinan,WANG Xiaoyang1,2

(1. Software School, Fudan University, Shanghai 201203, China; 2. Computer Science and Technology School, Fudan University, Shanghai 201203, China)

Abstract: The analysis and application of big data have attracted the attention of various industries. People try to find the patterns and rules contained in large amounts of data so as to generate more value. Data filtering, a common approach in data analysis, plays an irreplaceable role. To help users quickly filter data and find differential data subsets, it is necessary to analyze and mine the connections between data items and to model data filtering rules so that users can quickly locate those subsets. This paper proposes a novel filtering rule modeling method for finding differential data subsets. The method addresses how to apply data filtering rules in data analysis to establish an analytical filtering model, then use the model to analyze and filter the data into differential data subsets, and finally use the model to visualize the result sets automatically. A data analysis system built with this modeling method can quickly find differential data subsets in real data sets and automatically visualize the result subsets, which shows the practicability and efficiency of the modeling method.
Keywords: data analysis; differential data; filtering model