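Abstract: Cyberbullying can inflict severe harm on individuals or groups within a short time. To address the lack of a Chinese corpus of cyberbullying comments, the difficulty of accurately capturing emotional features in text, and the challenge of classifying extremely violent emotional tendencies, this paper constructs a Chinese cyberbullying corpus and, on that basis, proposes a cyberbullying sentiment analysis method that integrates multi-level features. Comment texts are first passed through word embedding to obtain the original semantic features; a Long Short-Term Memory (LSTM) network then captures global contextual information, and a Text Convolutional Neural Network (TextCNN) captures local key information. Finally, the three features are fused and a fully connected layer outputs a three-class result. Comparison and ablation experiments verify the effectiveness of the model: the method reaches a macro-averaged F1 of 80.88%, significantly better than the other baseline models.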
Keywords: cyberbullying; sentiment analysis; Chinese corpus; long short-term memory network; text convolutional neural network
CLC number: TP391
Document code: A
Funding: Shaanxi Province Key Industry Innovation Chain (Cluster) Project, Industrial Field (2022ZDLGY06-04)
Sentiment Analysis of Cyberbullying Text by Integrating Multi-Level Features
ZHANG Xinsheng, HOU Yijun
(School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China)
zhangxs@xauat.edu.cn; 770685540@qq.com
Abstract: Cyberbullying can inflict significant harm on individuals or groups within a short timeframe. To address the lack of a Chinese cyberbullying comment corpus, the difficulty of accurately capturing textual sentiment features, and the challenge of classifying extremely violent sentiment tendencies, this paper constructs a Chinese cyberbullying corpus and proposes a sentiment analysis method that integrates multi-level features. After the original semantic features are obtained through word embedding of the comment texts, the method uses a Long Short-Term Memory (LSTM) network to capture global contextual information and a Text Convolutional Neural Network (TextCNN) to extract local key information. The three features (original semantic, global contextual, and local key features) are then fused and passed through a fully connected layer to output a three-class classification result. Comparative and ablation experiments confirm the model's effectiveness: it achieves a macro-averaged F1-score of 80.88%, significantly outperforming the other baseline models.
Keywords: cyberbullying; sentiment analysis; Chinese corpus; Long Short-Term Memory (LSTM); Text Convolutional Neural Network (TextCNN)
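
The abstract reports a macro-averaged F1-score over three classes. As a reminder of what that metric measures, the following minimal sketch computes it as the unweighted mean of the per-class F1 scores. The function name `macro_f1`, the integer labels 0, 1, 2, and the toy predictions are illustrative assumptions and are not taken from the paper's corpus or code.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Hypothetical three-class toy data (labels are placeholders, not the paper's classes)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(y_true, y_pred, labels=[0, 1, 2]), 4))
```

Because every class contributes equally to the mean, a macro-averaged score does not let a frequent class mask weak performance on a rare one, which is presumably relevant when the extreme class is underrepresented.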