The Design of the Web Log Analysis System Based on Hadoop

HE Xuan, MA Jialin

(Software Institute, Shenyang Normal University, Shenyang 110000, China)

Abstract: In the era of big data, data has become a driving force for the development of many industries. Effective data analysis has a large impact on social and economic outcomes and a lasting influence on how governments and enterprises are managed. The central design question is therefore how to extract useful information from Web logs efficiently and quickly, and how to turn it into a basis for analysis. This paper uses Hadoop as the open-source framework, HDFS for data storage, and Hive as the open-source data warehouse tool to design and implement a Web log analysis system. It describes the system's structure, design approach, and implementation method.

Keywords: Hadoop; Web; Hive

CLC number: TP399

Document code: A