Abstract: To address the problems that pre-trained language models require long training time, contain a large number of parameters, and are difficult to deploy, while non-pre-trained language models deliver relatively poor classification performance, this paper proposes a text sentiment analysis method based on a knowledge distillation model. The pre-trained deep learning model Bidirectional Encoder Representations from Transformers (BERT) serves as the teacher model, and the Bidirectional Long Short-Term Memory network (BiLSTM) is selected as the student model. During knowledge distillation, the output of the teacher model's Softmax layer is distilled into the student model as "knowledge", and the distilled model is applied to sentiment analysis of online public opinion texts on public events. Experimental results show that the proposed model has only 1/13 of the parameters of the BERT model, improves the accuracy of the BiLSTM model by 2.2 percentage points, outperforms other lightweight models of the same category, and improves the efficiency of text sentiment analysis.
Keywords: knowledge distillation; online public opinions; BERT model; BiLSTM model
CLC number: TP391.1
Document code: A
Fund project: Supported by the National Natural Science Foundation of China (Intelligent Governance of Online Public Opinions in the Big Data Context: Community Building, Collaborative Evolution, and Guidance Mechanisms; Grant No. 72164034)
Text Sentiment Analysis Based on Knowledge Distillation Model
LI Jinhui1, LIU Ji1,2
(1.School of Statistics & Data Science, Xinjiang University of Finance & Economics, Urumqi 830012, China; 2.Xinjiang Social & Economic Statistics & Big Data Application Research Center, Xinjiang University of Finance & Economics, Urumqi 830012, China)
1187357069@qq.com; Liuji5000@126.com
Abstract: To address the issues of lengthy training time, large number of parameters, and deployment challenges of pre-trained language models, as well as the comparatively poor performance of non-pre-trained language models in sentiment analysis, this paper proposes a text sentiment analysis method based on a knowledge distillation model. This model utilizes the pre-trained deep learning model, Bidirectional Encoder Representations from Transformers (BERT), as the teacher model and the Bidirectional Long Short-Term Memory (BiLSTM) network as the student model. In the process of knowledge distillation, the output of the Softmax layer of the teacher model is distilled as "knowledge" into the student model, which is then applied to text sentiment analysis of online public opinions on public events. Experimental results demonstrate that the proposed model, with only 1/13 of the parameters of the BERT model, improves the accuracy of the BiLSTM model by 2.2 percentage points; it outperforms other similar lightweight models and enhances the efficiency of text sentiment analysis.
Keywords: knowledge distillation; online public opinions; BERT model; BiLSTM model
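To make the distillation step described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a temperature-scaled soft-target objective in PyTorch, in which the teacher's Softmax outputs supervise the student alongside the hard-label cross entropy; the temperature T and the weight alpha are illustrative placeholders, as the paper's exact settings are not given here.

# Minimal sketch of a knowledge distillation loss: the student (e.g., BiLSTM)
# learns from the teacher's (e.g., BERT) softened Softmax outputs plus the
# ground-truth labels. T and alpha are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: standard cross entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss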