Abstract

With the improvement in the intelligence level of UAVs and the development of cluster control technology, intelligent decision-making for UAV cluster confrontation will become a key technology for future UAV combat. The UAV cluster confrontation learning environment exhibits complex characteristics such as high dimensionality, nonlinearity, incomplete information, and a continuous action space. Recently, artificial intelligence techniques represented by deep learning and reinforcement learning have achieved major breakthroughs, and deep reinforcement learning has demonstrated a strong ability to solve intelligent decision-making problems in complex environments. In this paper, inspired by the multi-agent centralized-training, distributed-execution framework and the principle of maximum policy entropy, we propose a multi-agent soft actor-critic (MASAC) deep reinforcement learning method for settings with incomplete information. A game model for UAV cluster confrontation based on multi-agent deep reinforcement learning is established, and a continuous-space multi-UAV combat environment is constructed. Simulation experiments are performed on the asymmetric confrontation of red and blue UAV cluster teams. The experimental results show that MASAC outperforms existing popular multi-agent deep reinforcement learning methods, driving the game players to converge to an equilibrium point of the game with higher returns. Moreover, the convergence of MASAC is investigated and analyzed extensively; the results show that MASAC converges well and is stable, supporting the practicability of MASAC in intelligent decision-making for UAV cluster confrontation.
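For context on the maximum policy entropy idea the abstract invokes: the standard soft actor-critic formulation augments the expected return with a policy-entropy bonus. The objective below is the generic SAC objective from the reinforcement learning literature, not an equation taken from this paper; the symbols (temperature $\alpha$, state-action marginal $\rho_\pi$) follow the common convention.

```latex
% Maximum-entropy RL objective (standard SAC formulation, shown for background):
% the policy maximizes reward plus an entropy bonus weighted by temperature alpha.
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \Big[ r(s_t, a_t)
               + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \big[ \log \pi(a \mid s_t) \big]
```

The entropy term encourages exploration and keeps the stochastic policy from collapsing prematurely, which is the property the MASAC method described here builds on in the multi-agent setting.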

Full text