Software Engineering

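Cite this article:

XU Haonan, LIN Lilan, CAI Xia. Multi-Agent Anti-Eavesdropping Beamforming Based on Deep Reinforcement Learning[J]. Software Engineering, 2024, 27(11): 1-5.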


Multi-Agent Anti-Eavesdropping Beamforming Based on Deep Reinforcement Learning

XU Haonan, LIN Lilan, CAI Xia

(School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China)
1196378307@qq.com; 1062751361@qq.com; cxdaisy@zstu.edu.cn

Abstract: When multiple agents communicate with a target receiver over a wireless sensor network, the transmitted information may be intercepted by eavesdroppers. To address this problem, a multi-agent beamforming method is proposed that dynamically adjusts the spatial distribution of the agents and the state of their transmitted signals, so that the receiver obtains a high-quality signal while the risk of interception by potential eavesdroppers is minimized. First, the joint optimization problem is formulated as a Partially Observable Markov Decision Process (POMDP). It is then solved with a deep reinforcement learning algorithm. Under a centralized-training, distributed-execution framework, each agent makes cooperative decisions from its local observations, and together these decisions adjust the global communication state. To evaluate the proposed method, a simulation environment is built on the Multi-Agent Particle Environment (MPE), and training and testing are carried out in multiple scenarios. The experimental results support the effectiveness of the method.

Keywords: multi-agent system; beamforming; anti-eavesdropping communication; deep reinforcement learning

CLC number: TP301.6 Document code: A
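
The abstract does not state the signal model or the reward used for training. As a rough illustration only, the sketch below assumes unit-power agents in a 2D plane, free-space phase delay with no path loss, and a secrecy-rate style reward (receiver rate minus eavesdropper rate). All names here (`received_amplitude`, `align_phases`, `secrecy_reward`, `WAVELENGTH`) are hypothetical and are not taken from the paper.

```python
import cmath
import math

WAVELENGTH = 1.0  # assumed carrier wavelength, arbitrary units


def received_amplitude(agent_positions, phases, target):
    """Magnitude of the coherent sum of unit-power signals at `target`.

    Assumption: only the free-space phase delay is modeled; path loss,
    fading, and noise are ignored.
    """
    k = 2 * math.pi / WAVELENGTH  # wavenumber
    total = 0j
    for (x, y), phase in zip(agent_positions, phases):
        distance = math.hypot(target[0] - x, target[1] - y)
        total += cmath.exp(1j * (phase - k * distance))
    return abs(total)


def align_phases(agent_positions, receiver):
    """Transmit phases that make every signal arrive in phase at the receiver."""
    k = 2 * math.pi / WAVELENGTH
    return [
        k * math.hypot(receiver[0] - x, receiver[1] - y)
        for x, y in agent_positions
    ]


def secrecy_reward(agent_positions, phases, receiver, eavesdropper,
                   noise_power=1.0):
    """Hypothetical reward: receiver rate minus eavesdropper rate, floored at 0."""
    snr_r = received_amplitude(agent_positions, phases, receiver) ** 2 / noise_power
    snr_e = received_amplitude(agent_positions, phases, eavesdropper) ** 2 / noise_power
    return max(0.0, math.log2(1 + snr_r) - math.log2(1 + snr_e))


if __name__ == "__main__":
    agents = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.3)]
    receiver, eavesdropper = (10.0, 0.0), (0.0, 10.0)
    phases = align_phases(agents, receiver)
    print("amplitude at receiver:", received_amplitude(agents, phases, receiver))
    print("amplitude at eavesdropper:", received_amplitude(agents, phases, eavesdropper))
    print("reward:", secrecy_reward(agents, phases, receiver, eavesdropper))
```

In a setting like the one the abstract describes, a learning agent would choose its own position and phase from local observations rather than from the closed-form alignment in `align_phases`. The closed form is useful here only as a reference point: with phases aligned at the receiver, the amplitude there reaches its maximum, the number of agents.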
