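Abstract: The locality scheduling algorithm in the Hadoop cluster environment is designed to improve data locality. By raising data locality, it reduces data transfer time and cluster network I/O and increases resource utilization. Because the scheduling algorithm serves jobs in FIFO order, a current job with a large data volume delays the response of other, more urgent jobs and degrades system performance. This paper proposes a new scheduling strategy that integrates static-priority preemption while preserving the data locality of the original algorithm. Experimental results show that, on the same data set, under the integrated static-priority preemption strategy, higher-priority jobs have shorter response times than lower-priority jobs.
Keywords: data locality; static priority preemption; job response time
CLC number: TP316.4
Document code: A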
Funding: Supported by the Liaoning "Hundred, Thousand, Ten Thousand Talents Project" training fund (Project No. 2012921041)
Improvement of Local Scheduling Algorithm in the Hadoop Cluster Environment
WANG Yuefeng, CHEN Fuhong1,2
(1. Shenyang University of Technology, Shenyang 110023, China; 2. Liaoning University, Shenyang 110136, China)
Abstract: The local scheduling algorithm improves data locality in the Hadoop cluster environment. In essence, it raises data locality in order to reduce data transmission time, reduce the network I/O of the cluster, and increase resource utilization. Because the scheduling algorithm adopts the FIFO mode, a current job with a large amount of data delays the response of other, more urgent jobs, which degrades system performance. This paper proposes a new scheduling strategy that preserves the data locality of the original algorithm and integrates static priority preemption. The experimental results show that, on the same data set, with the integrated static priority preemption scheduling strategy, the response time of a higher-priority job is shorter than that of a lower-priority job.
Keywords: data locality; static priority preemption; job response time
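As an informal illustration of the idea summarized in the abstract, the following minimal Python sketch shows one way a scheduler might combine static job priority with a data-locality preference when a node offers a free slot. It is not the paper's implementation: the names `Task`, `Job`, and `pick_task` are hypothetical, "larger number means more urgent" is an assumed convention, and preemption is modeled only at the queue level (an urgent job is served before earlier-submitted, lower-priority jobs), which may differ from the mechanism used in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    block_hosts: set  # nodes holding a replica of this task's input block


@dataclass
class Job:
    name: str
    priority: int      # static priority; larger = more urgent (assumption)
    submit_order: int  # FIFO tie-breaker among equal priorities
    pending: list = field(default_factory=list)


def pick_task(jobs, node):
    """Return (job, task, is_local) for a free slot on `node`, or None."""
    runnable = [j for j in jobs if j.pending]
    if not runnable:
        return None
    # Static priority first; FIFO order among jobs of equal priority.
    job = min(runnable, key=lambda j: (-j.priority, j.submit_order))
    # Within the chosen job, prefer a task whose input block is on this node.
    for task in job.pending:
        if node in task.block_hosts:
            job.pending.remove(task)
            return job, task, True
    # No local task available: fall back to the job's first pending task.
    return job, job.pending.pop(0), False


if __name__ == "__main__":
    jobs = [
        Job("big_batch", priority=1, submit_order=0,
            pending=[Task("b0", {"n1"}), Task("b1", {"n2"})]),
        Job("urgent", priority=5, submit_order=1,
            pending=[Task("u0", {"n2"}), Task("u1", {"n1"})]),
    ]
    job, task, local = pick_task(jobs, "n1")
    print(job.name, task.task_id, local)  # urgent u1 True
```

In this sketch the later-submitted but higher-priority job is served first, and among its tasks the one whose input block resides on the requesting node is chosen, so the locality preference is kept inside the priority ordering.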