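Abstract: To address the ambiguous sentiment orientation of financial texts, this paper designs a financial text sentiment analysis model based on BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory network). Word vectors are built with the BERT model using whole word masking, which lets them express semantic information better. To build a financial text dataset, a topic crawler based on a deep learning model is proposed: BERT+Bi-GRU (Bidirectional Gated Recurrent Unit) judges the topic relevance of the text within a web page, and the page's topic relevance is computed from the text classification results. Experimental results show that the designed model achieves 87.1% accuracy on the sentiment analysis task and can effectively analyze the sentiment orientation of text.
Keywords: sentiment analysis; topic crawler; long short-term memory network; pre-trained language model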
CLC number: TP391
Document code: A
Financial Text Sentiment Analysis and Application Based on BERT
JI Yuwen1, CHEN Zhe2
(1. School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; 2. School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China)
yuwen.ji.yan@foxmail.com; 18758099691@163.com
Abstract: To address the vague sentiment orientation of financial texts, this paper designs a financial text sentiment analysis model based on BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory network). The BERT model is used to construct word vectors, and whole word masking is employed so that they better express semantic information. To construct a financial text dataset, a topic crawler based on a deep learning model is proposed; it uses BERT + Bi-GRU (Bidirectional Gated Recurrent Unit) to determine the topic relevance of the text within a web page and calculates the page's topic relevance from the text classification results. Experimental results show that the proposed sentiment analysis model achieves an accuracy of 87.1% on the sentiment analysis task and can effectively analyze text sentiment orientation.
Keywords: sentiment analysis; topic crawler; long short-term memory network; pre-trained language model
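The abstract states that a page's topic relevance is calculated from the text classification results but does not give the formula. The sketch below shows one plausible way to do this, under stated assumptions: each text block on a page has already been scored by a classifier (such as the BERT + Bi-GRU model), and the page score is the length-weighted share of blocks judged on-topic. The names `BlockPrediction`, `page_relevance`, and `should_crawl`, as well as both threshold values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BlockPrediction:
    """One text block from a web page plus its classifier score (hypothetical type)."""
    text: str
    on_topic_prob: float  # assumed: probability that the block is finance-related


def page_relevance(predictions: Sequence[BlockPrediction],
                   threshold: float = 0.5) -> float:
    """Length-weighted fraction of text classified as on-topic.

    This weighting scheme is an assumption for illustration; the paper
    may define page relevance differently.
    """
    total_chars = sum(len(p.text) for p in predictions)
    if total_chars == 0:
        return 0.0  # a page with no text carries no topic signal
    on_topic_chars = sum(
        len(p.text) for p in predictions if p.on_topic_prob >= threshold
    )
    return on_topic_chars / total_chars


def should_crawl(predictions: Sequence[BlockPrediction],
                 min_relevance: float = 0.6) -> bool:
    """Keep the page (and follow its links) only if it is relevant enough."""
    return page_relevance(predictions) >= min_relevance
```

A crawler built this way would call `should_crawl` on each fetched page and discard pages below the cutoff, so that only finance-related text reaches the sentiment dataset. Weighting by block length keeps short navigation snippets from dominating the score; an unweighted count of on-topic blocks would be an equally reasonable choice.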