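Abstract: To improve the accuracy of identifying and extracting data items in contracts, this paper proposes CNN-RECR (Real Estate Transaction Contract Information Detection and Recognition Method Based on Improved Convolutional Neural Network), a model that combines and optimizes a Convolutional Neural Network (CNN) with Residual Building Units (RBU), and applies it to the recognition and extraction of contract data items on a real estate transaction platform. First, to address problems such as the weak representational power of extracted features, a Contract Data Text Detection Network (CDTD-Net) is designed to extract features of handwritten contract text at different scales. Second, separate text recognition and digit recognition models are designed in combination with residual building units. Finally, experiments on real cases show that the CNN-RECR model reaches a recognition accuracy of 97.62%, which demonstrates that the method effectively improves recognition performance and lays a foundation for low-cost operation.
Keywords: convolutional neural network; residual building unit; contract data; recognition and extraction
CLC number: TP391.1
Document code: A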
Contract Data Recognition and Extraction Based on Convolutional Neural Networks and Residual Building Units
ZHANG Chun1, LIU Congjun1,2
(1. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212000, China; 2. Jiangsu KeDa Huifeng Technology Co., Ltd., Zhenjiang 212000, China)
18505242212@163.com; liu_cj@163.com
Abstract: To improve the accuracy of recognizing and extracting data items in contracts, this paper proposes the CNN-RECR model (Real Estate Transaction Contract Information Detection and Recognition Method Based on Improved Convolutional Neural Network), which combines a Convolutional Neural Network (CNN) with Residual Building Units (RBU). The model is applied to recognizing and extracting contract data items on a real estate transaction platform. First, to address the weak representational power of the extracted features, the Contract Data Text Detection Network (CDTD-Net) is designed to extract multi-scale features of handwritten text in contracts. Second, models for recognizing text and digits are designed in combination with RBUs. Finally, experiments on real cases show that the CNN-RECR model achieves a recognition accuracy of 97.62%, demonstrating that the method effectively improves recognition performance and lays a foundation for low-cost operation.
Keywords: CNN; RBU; contract data; recognition and extraction
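
For readers unfamiliar with residual building units, the following is a minimal sketch of the general idea: the input is added back to the output of a small convolutional branch through an identity shortcut. It is not the CNN-RECR architecture itself. The function names (`conv1d_same`, `relu`, `residual_unit`), the use of 1-D signals instead of 2-D image features, and the two-convolution branch are all assumptions made for illustration.

```python
from typing import List


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """Slide an odd-length kernel over x with zero padding (output length == input length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def relu(x: List[float]) -> List[float]:
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def residual_unit(x: List[float], k1: List[float], k2: List[float]) -> List[float]:
    """Hypothetical residual unit: y = ReLU(F(x) + x), with F = conv -> ReLU -> conv."""
    fx = conv1d_same(relu(conv1d_same(x, k1)), k2)
    # Identity shortcut: add the unchanged input back before the final activation.
    return relu([f + s for f, s in zip(fx, x)])


if __name__ == "__main__":
    signal = [1.0, -2.0, 3.0]
    zero_kernel = [0.0, 0.0, 0.0]
    # With an all-zero branch, only the shortcut contributes to the output.
    print(residual_unit(signal, zero_kernel, zero_kernel))
```

Because of the shortcut, the unit still passes the input through when the convolutional branch contributes nothing, which is the property that makes deeper stacks of such units easier to train.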