Software Engineering

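Cite this article:

ZHANG Hanshuo, JIANG Ming, ZHANG Min. A Text Detection Model with Enhanced Updates Based on Deep Learning[J]. Software Engineering, 2025, 28(8): 5-8.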




CLC number: TP391    Document code: A

Funding: Science and Technology Program of Zhejiang Province (2024C01181)

A Text Detection Model with Enhanced Updates Based on Deep Learning

ZHANG Hanshuo, JIANG Ming, ZHANG Min

(School of Computer, Hangzhou Dianzi University, Hangzhou 310018, China)
zhs1316168044@163.com; jmzju@163.com; hz_andy@163.com

Abstract: Query updates in scene text detection typically rely on implicit updating. To address this limitation, this paper proposes a deep learning-based text detection model with enhanced updates. The model first initializes its queries by modeling the control points of bounding boxes. During decoding, it uses not only the decoder's attention mechanism but also prediction information from the current and subsequent decoder layers to guide more precise, enhanced query updates. In addition, a prediction aggregation module merges predictions with similar control points, which improves detection robustness. Experiments on the Total-Text dataset show a 0.7% improvement in recall and a 0.3% improvement in F-measure, confirming the effectiveness of the proposed method.

Keywords: text detection; enhanced updates; deep learning; prediction aggregation
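
The abstract does not specify how the prediction aggregation module measures similarity or merges predictions. As a rough illustration only, the following sketch assumes that two predictions are "similar" when the mean distance between their corresponding control points is below a threshold, and that each group is merged by score-weighted averaging. The names `mean_point_distance`, `aggregate_predictions`, and `max_dist` are hypothetical and do not come from the paper.

```python
from math import hypot

# Hypothetical data layout: a prediction is (confidence score, control points).
Point = tuple[float, float]
Prediction = tuple[float, list[Point]]


def mean_point_distance(a: list[Point], b: list[Point]) -> float:
    """Average Euclidean distance between corresponding control points."""
    return sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b)) / len(a)


def aggregate_predictions(preds: list[Prediction], max_dist: float = 5.0) -> list[Prediction]:
    """Greedily group similar predictions, then merge each group.

    Assumptions (not from the paper): all predictions have the same number
    of control points, and all scores are positive.
    """
    groups: list[list[Prediction]] = []
    # Visit predictions from most to least confident; the first member of
    # each group acts as its reference for the similarity check.
    for pred in sorted(preds, key=lambda p: p[0], reverse=True):
        for group in groups:
            if mean_point_distance(group[0][1], pred[1]) <= max_dist:
                group.append(pred)
                break
        else:
            groups.append([pred])

    merged: list[Prediction] = []
    for group in groups:
        total = sum(score for score, _ in group)
        n_points = len(group[0][1])
        # Score-weighted average of each control point across the group.
        points = [
            (
                sum(s * pts[i][0] for s, pts in group) / total,
                sum(s * pts[i][1] for s, pts in group) / total,
            )
            for i in range(n_points)
        ]
        merged.append((max(s for s, _ in group), points))
    return merged
```

With this sketch, two nearly coincident predictions collapse into a single averaged one, while a distant prediction is left untouched. The actual module in the paper may use a different similarity criterion or merge rule.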
