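Abstract: New classes of network traffic keep emerging, which creates a need for both open set recognition and model updating. To meet this need, this paper proposes an open set network traffic classification method based on incremental learning. For open set recognition, a cascade of a support vector machine and the K-means clustering algorithm continuously identifies new and known classes. For model updating, "sample replay" based on candidate support vector selection and "parameter replay" based on weighted fusion of the old and new models effectively mitigate catastrophic forgetting in class-incremental learning. Compared with the ISK and DACS methods, the proposed method shows clear advantages in open set traffic recognition and classification tasks: its F1 score is 1 to 8 percentage points higher, and its classification speed also exceeds that of the existing methods.
Keywords: network traffic classification; open set recognition; incremental learning; support vector machine
CLC number: TP391
Document code: A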
Research on Network Traffic Classification Based on Incremental Learning in Open Set Environment
CUI Mengyang, DONG Yuning, QIU Xiaohui, TIAN Wei
(School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
cuimy_work@163.com; 19900011@njupt.edu.cn; qiuxh@njupt.edu.cn; tianw@njupt.edu.cn
Abstract: To address the continual emergence of new categories of network traffic and the resulting need for open set recognition and model updates, this paper proposes an incremental learning-based method for open set network traffic classification. For open set recognition, a cascade structure of a support vector machine and the K-means clustering algorithm is employed to continuously identify new and known classes. For model updates, a "sample replay" method based on candidate support vector selection and a "parameter replay" method based on weighted fusion of the old and new models effectively address catastrophic forgetting in class-incremental learning. Compared with the ISK and DACS methods, this approach demonstrates significant advantages in open set traffic recognition and classification tasks, with an F1 score improvement of 1 to 8 percentage points and a classification speed superior to that of existing methods.
Keywords: network traffic classification; open set recognition; incremental learning; support vector machine
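
As a rough illustration of the "parameter replay" idea mentioned in the abstract, the following minimal sketch blends the parameters of an old and a new model by weighted fusion. The abstract does not give the fusion rule, so everything here is an assumption made for illustration: the models are taken to be linear with one weight vector per class, the fusion is a convex combination controlled by a hypothetical coefficient `alpha`, and the function name `fuse_linear_models` and the class labels are invented.

```python
def fuse_linear_models(old_model, new_model, alpha=0.5):
    """Blend per-class weight vectors of two linear models (illustrative only).

    old_model, new_model: dict mapping class label -> list of floats.
    alpha: share given to the old model for classes known to both models
           (hypothetical parameter, not taken from the paper).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    fused = {}
    for label, w_new in new_model.items():
        w_old = old_model.get(label)
        if w_old is None:
            # Class seen only by the new model: keep its weights unchanged.
            fused[label] = list(w_new)
            continue
        if len(w_old) != len(w_new):
            raise ValueError(f"weight length mismatch for class {label!r}")
        # Class known to both models: convex combination of the weights.
        fused[label] = [alpha * o + (1.0 - alpha) * n
                        for o, n in zip(w_old, w_new)]

    # Class present only in the old model: carry its weights over unchanged.
    for label, w_old in old_model.items():
        fused.setdefault(label, list(w_old))
    return fused


# Toy example: the new model adds a "gaming" class to two known classes.
old = {"video": [1.0, 0.0], "voip": [0.0, 1.0]}
new = {"video": [0.0, 0.0], "voip": [0.0, 0.0], "gaming": [2.0, 2.0]}
fused = fuse_linear_models(old, new, alpha=0.5)
```

In this sketch a larger `alpha` keeps the fused model closer to the old one for known classes, which is one simple way to trade retention of old knowledge against adaptation to new data.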