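Cite this article: DAI Xiajing, XU Yicheng, WANG Xinya, TONG Deyu. A Zero-Watermark Algorithm for Chinese Text Based on Word2Vec[J]. Software Engineering, 2023, 26(1): 19-23.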
A Zero-Watermark Algorithm for Chinese Text Based on Word2Vec
DAI Xiajing, XU Yicheng, WANG Xinya, TONG Deyu
(School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China)
2415700426@qq.com; YichengXu421@163.com; 1520369099@qq.com; tdyforweb@163.com
Abstract: Classic robust text watermarking modifies the content or format of a text, which reduces its fidelity and usability. This paper proposes a Word2Vec-based zero-watermark algorithm for Chinese text that generates and detects the watermark without modifying the text. First, the text is segmented into words, word frequencies are counted, feature words are extracted, and Word2Vec is used to generate the corresponding feature word vectors. Then, SVD (Singular Value Decomposition) is applied to reduce their dimensionality, and AES (Advanced Encryption Standard) encryption is used to produce the final zero-watermark. During watermark detection, copyright ownership is determined by comparing the eigenvalues and eigenvectors produced by the SVD. Theoretical analysis and experimental results indicate that the proposed algorithm requires no modification of the original text and resists, to a certain extent, attacks such as insertion and deletion, sentence pattern conversion, and synonym substitution. It is therefore robust to a degree and offers an effective solution to the problem of text copyright protection.
Keywords: Word2Vec; SVD; zero-watermark; Chinese text; word vector
CLC number: TP309.2    Document code: A
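The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the text is taken to be already segmented into tokens, the hypothetical `TOY_VECTORS` table stands in for a trained Word2Vec model, singular values are obtained by power iteration on the Gram matrix rather than by a library SVD routine, and the AES encryption step is omitted because the Python standard library provides no AES. The `similarity` score is likewise an assumed comparison rule, since the abstract does not state the actual decision criterion.

```python
import math
from collections import Counter

# Hypothetical toy embeddings standing in for a trained Word2Vec model.
TOY_VECTORS = {
    "水印": [0.9, 0.1, 0.3],
    "文本": [0.2, 0.8, 0.5],
    "算法": [0.4, 0.4, 0.9],
    "版权": [0.7, 0.2, 0.1],
}


def feature_words(tokens, k):
    """Return the k most frequent tokens that have an embedding."""
    counts = Counter(t for t in tokens if t in TOY_VECTORS)
    # Descending frequency, then lexicographic order for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def singular_values(matrix, iters=500):
    """Approximate singular values by power iteration with deflation."""
    rows, cols = len(matrix), len(matrix[0])
    # Gram matrix G = A^T A; its eigenvalues are the squared singular values.
    gram = [[sum(matrix[r][i] * matrix[r][j] for r in range(rows))
             for j in range(cols)] for i in range(cols)]
    values = []
    for _component in range(min(rows, cols)):
        vec = [1.0 / (i + 1) for i in range(cols)]  # fixed starting vector
        scale = math.sqrt(sum(x * x for x in vec))
        vec = [x / scale for x in vec]
        for _step in range(iters):
            nxt = [sum(gram[i][j] * vec[j] for j in range(cols))
                   for i in range(cols)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # remaining spectrum is numerically zero
                break
            vec = [x / norm for x in nxt]
        g_vec = [sum(gram[i][j] * vec[j] for j in range(cols))
                 for i in range(cols)]
        lam = sum(vec[i] * g_vec[i] for i in range(cols))  # Rayleigh quotient
        values.append(math.sqrt(max(lam, 0.0)))
        # Deflate: remove the component just found from the Gram matrix.
        for i in range(cols):
            for j in range(cols):
                gram[i][j] -= lam * vec[i] * vec[j]
    return sorted(values, reverse=True)


def signature(tokens, k=3):
    """Unencrypted feature signature: singular values of the word-vector matrix."""
    words = feature_words(tokens, k)
    if not words:
        return []
    return singular_values([TOY_VECTORS[w] for w in words])


def similarity(sig_a, sig_b):
    """Cosine similarity between two signatures (an assumed comparison rule)."""
    n = max(len(sig_a), len(sig_b))
    a = sig_a + [0.0] * (n - len(sig_a))
    b = sig_b + [0.0] * (n - len(sig_b))
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


original = ["水印"] * 4 + ["文本"] * 3 + ["算法"] * 2 + ["版权"]
edited = original[1:] + ["保护"]  # one deletion and one insertion
print(similarity(signature(original), signature(edited)))
```

Because the signature is derived from the text rather than embedded in it, the text itself is never changed. In this toy setup a small edit that leaves the set of top feature words unchanged also leaves the signature essentially unchanged, which is the intuition behind the robustness described in the abstract. A cosine score over singular values alone is a weak discriminator between unrelated texts, so a practical detector would also need the vector comparison mentioned in the abstract.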