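Abstract: As the demand for real-time data processing grows, distributed stream processing systems have emerged. At the scale of a distributed cluster, however, failures caused by various software and hardware problems are hard to avoid. Existing benchmarks focus mainly on the processing performance of distributed stream processing systems and rarely evaluate their fault-tolerant performance when handling failures, which makes system selection especially difficult for critical applications. Targeting the fault-tolerant performance of distributed stream processing systems, this paper designs and implements a flexible benchmarking framework. Finally, this paper benchmarks the fault-tolerant performance of the open-source stream processing systems Apache Storm and Apache Flink, verifying the correctness and effectiveness of the defined benchmark; the experimental results also show that Flink's fault-tolerant performance is comparatively good.
Keywords: distributed system; stream processing; fault-tolerant performance; benchmarking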
CLC number: TP302.8
Document code: A
Funding: Major Special Project of the Ministry of Science and Technology (2018YFB1003402); National Natural Science Foundation of China (61432006).
Benchmarking for Fault-tolerant Performance in Distributed Stream Processing Systems
JIANG Cheng, WANG Xiaotong, ZHANG Rong
(School of Data Science and Engineering, East China Normal University, Shanghai 200062, China)
Abstract: With the increasing real-time requirements for data processing, distributed stream processing systems have emerged. However, at the scale of a distributed cluster, failures caused by various hardware and software problems are inevitable. Existing benchmarks mainly focus on the performance of distributed stream processing systems during failure-free operation and rarely evaluate how well these systems tolerate faults. As a result, it is particularly difficult to select a system for mission-critical applications. This paper designs and implements a flexible benchmarking framework tailored for fault-tolerant performance. The framework is then used to benchmark the fault-tolerant performance of Apache Storm and Apache Flink, which verifies the correctness and effectiveness of the benchmark defined in this paper. Experimental results show that the fault-tolerant performance of Flink is better than that of Storm.
Keywords: distributed system; stream processing; fault-tolerant performance; benchmarking