A human-swarm coordination method based on hierarchical multi-agent reinforcement learning

A three-layer hierarchical multi-agent reinforcement learning architecture addresses the decision-making difficulties of multi-agent systems in large-scale complex environments and enables efficient, interpretable drone swarm control under human-machine collaboration, improving policy adaptability and cooperation efficiency.

CN122242650A · Pending · Publication Date: 2026-06-19 · UNIV OF SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
UNIV OF SCI & TECH OF CHINA
Filing Date
2026-03-11
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing multi-agent reinforcement learning methods suffer from poor convergence, low collaboration efficiency, and insufficient policy generalization in large-scale, dynamically changing, complex environments, particularly when a human collaborates with the agents and adjusts preferences dynamically.

Method used

A three-layer hierarchical multi-agent reinforcement learning architecture is adopted, comprising a top-level human-cluster interaction module, a middle-level target selection module, and a bottom-level policy collaboration execution module. Centralized training with distributed execution, combined with attention mechanisms, integrates the human commander's intent into agent decision-making and allows the learned policy to scale.
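One way attention can support policy scalability is by pooling over however many entities an agent currently observes, so the same learned weights apply to small and large swarms. The sketch below illustrates that idea only. The summary does not say which attention form the patent uses; scaled dot-product attention, the function names, and the feature sizes are all assumptions made for illustration.

```python
import math
import random


def matvec(vec, mat):
    """Multiply a row vector (length m) by an m x d matrix (list of rows)."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]


def attention_pool(own_feat, entity_feats, w_q, w_k, w_v):
    """Assumed scaled dot-product attention of one agent over n observed entities.

    Returns a fixed-size pooled vector and the attention weights,
    regardless of how many entities are passed in.
    """
    q = matvec(own_feat, w_q)
    keys = [matvec(e, w_k) for e in entity_feats]
    values = [matvec(e, w_v) for e in entity_feats]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(len(q))]
    return pooled, weights


# Hypothetical sizes: 6 raw features per entity, 4-dimensional embedding.
rng = random.Random(0)
FEAT_DIM, EMBED_DIM = 6, 4


def rand_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


w_q, w_k, w_v = (rand_matrix(FEAT_DIM, EMBED_DIM) for _ in range(3))
own = [rng.gauss(0, 1) for _ in range(FEAT_DIM)]

# The same weights handle a 3-entity scene and a 30-entity scene.
small_out, small_w = attention_pool(own, rand_matrix(3, FEAT_DIM), w_q, w_k, w_v)
large_out, large_w = attention_pool(own, rand_matrix(30, FEAT_DIM), w_q, w_k, w_v)
```

Because the pooled vector has the same length in both calls, a downstream policy network would not need to change when the number of observed entities grows.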

Benefits of technology

It improves the decision-making transparency and adaptability of UAV swarms in complex adversarial environments, enables efficient and interpretable swarm control, and allows for real-time response to tactical changes while maintaining tactical advantage.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of human-machine collaboration technology and discloses a human-cluster collaboration method based on hierarchical multi-agent reinforcement learning. By constructing a hierarchical decision-making architecture and introducing a human-cluster interaction mechanism, the method integrates the preferences of the human commander into the decision-making process of the agent cluster. The architecture includes a top-level human-cluster interaction module, a middle-level target selection module, and a bottom-level policy collaboration execution module. The top-level module partitions the targets into groups and receives the preference value that the human commander assigns to each target group. The middle-level module assigns targets to each agent based on human preference rewards and task rewards. The bottom-level module, constrained by the target allocation results, carries out the specific task through multi-agent collaborative control. Policies are trained in small-scale scenarios and transferred to large-scale scenarios; combined with human-machine collaboration, this effectively alleviates the performance degradation that arises under out-of-distribution conditions.
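The flow from the top-level to the middle-level module can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract says only that target assignment is based on human preference rewards and task rewards, so the additive weighting, the `preference_weight` parameter, the greedy ranking that stands in for the learned middle-level policy, and all names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    group: str         # group label produced by the top-level module
    task_value: float  # hypothetical task reward for engaging this target


def mid_level_reward(target, preferences, preference_weight=1.0):
    """Assumed additive mix of task reward and the commander's group preference."""
    return target.task_value + preference_weight * preferences.get(target.group, 0.0)


def assign_targets(agents, targets, preferences, preference_weight=1.0):
    """Greedy stand-in for the learned middle-level policy.

    Ranks targets by combined reward and hands them out to agents in order,
    wrapping around if there are more agents than targets.
    """
    ranked = sorted(
        targets,
        key=lambda t: mid_level_reward(t, preferences, preference_weight),
        reverse=True,
    )
    return {agent: ranked[i % len(ranked)].name for i, agent in enumerate(agents)}


targets = [
    Target("t1", "north", 1.0),
    Target("t2", "north", 0.5),
    Target("t3", "south", 0.8),
]
agents = ["uav0", "uav1"]

# Without preferences, assignment follows task reward alone.
baseline = assign_targets(agents, targets, preferences={})

# The commander prioritizes the "south" group, which shifts the assignment.
assignment = assign_targets(agents, targets, preferences={"north": 0.0, "south": 1.0})
```

In this toy setup the commander's preference for the "south" group moves `t3` to the top of the ranking, so `uav0` is redirected to it. The resulting assignment would then act as the constraint under which the bottom-level module executes.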