Cite this article: LI Jinhui, LIU Ji. Text Sentiment Analysis Based on Knowledge Distillation Model[J]. Software Engineering, 2024, 27(4): 27-32.
CLC number: TP391.1    Document code: A
Funding: National Natural Science Foundation of China project "Intelligent Governance of Online Public Opinion in the Context of Big Data: Community Construction, Collaborative Evolution, and Guidance Mechanisms" (Grant No. 72164034)
Text Sentiment Analysis Based on Knowledge Distillation Model
LI Jinhui1, LIU Ji1,2
(1.School of Statistics & Data Science, Xinjiang University of Finance & Economics, Urumqi 830012, China;
2.Xinjiang Social & Economic Statistics & Big Data Application Research Center, Xinjiang University of Finance & Economics, Urumqi 830012, China)
1187357069@qq.com; Liuji5000@126.com
Abstract: To address the long training time, large parameter count, and deployment difficulty of pre-trained language models, as well as the comparatively poor classification performance of non-pre-trained language models, this paper proposes a text sentiment analysis method based on a knowledge distillation model. The method uses the pre-trained deep learning model Bidirectional Encoder Representations from Transformers (BERT) as the teacher and a Bidirectional Long Short-Term Memory (BiLSTM) network as the student. During knowledge distillation, the output of the teacher model's Softmax layer is distilled into the student model as "knowledge", and the distilled model is applied to sentiment analysis of online public opinion texts about public events. Experimental results show that the proposed model has only 1/13 of the parameters of the BERT model, improves the accuracy of the BiLSTM model by 2.2 percentage points, outperforms other lightweight models of the same class, and improves the efficiency of text sentiment analysis.
Keywords: knowledge distillation; online public opinion; BERT model; BiLSTM model
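The core mechanism described in the abstract is soft-label distillation: the BERT teacher's Softmax outputs serve as the "knowledge" used to train the BiLSTM student alongside the ground-truth labels. The sketch below is only a generic illustration of that step, not the authors' implementation; the function name, temperature T, and weight alpha are assumptions, and the paper's exact loss formulation may differ.

```python
# Minimal sketch (assumed, not the paper's code) of a soft-label distillation loss:
# the teacher's temperature-softened Softmax output guides the student,
# combined with the usual hard-label cross-entropy term.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: teacher probabilities softened by temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        soft_targets,
        reduction="batchmean",
    ) * (T * T)  # T^2 scaling keeps gradient magnitudes comparable across temperatures
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Illustrative usage: a batch of 4 sentences with 3 sentiment classes.
teacher_logits = torch.randn(4, 3)   # e.g. from a frozen BERT teacher
student_logits = torch.randn(4, 3)   # e.g. from the BiLSTM student
labels = torch.tensor([0, 2, 1, 1])
loss = distillation_loss(student_logits, teacher_logits, labels)
```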

