The invention relates to a resource allocation and offloading decision method based on reinforcement learning with a multi-agent architecture, and belongs to the technical field of mobile communication. The method takes incentive constraints, energy constraints and network resource constraints into account, jointly optimizes wireless resource allocation, computing resource allocation and offloading decisions, and establishes a stochastic optimization model that maximizes the total user QoE of the system; this model is then converted into a Markov decision process (MDP) problem. Next, the original MDP problem is factorized and a Markov game model is established. The method then provides a centralized training and distributed execution mechanism based on an actor-critic algorithm. During centralized training, the agents cooperate to obtain global information and optimize their resource allocation and task offloading policies; after training is finished, each agent independently performs resource allocation and task offloading according to the current system state and its learned policy. The invention can effectively improve user QoE and reduce delay and energy consumption.
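The centralized training and distributed execution mechanism described above can be illustrated with a minimal sketch. It is not the claimed implementation: the two-agent setup, the tabular policies, the toy reward standing in for QoE, and every name in the code (Actor, toy_reward, train) are assumptions made for illustration only. Each agent decides whether to run its task locally or hand it to a shared edge server. During training, one shared value estimator sees the joint observation of all agents; at execution time each agent uses only its own local observation.

```python
import math
import random

N_AGENTS = 2   # hypothetical number of user devices (agents)
N_OBS = 2      # local observation: 0 = small task, 1 = large task
N_ACTIONS = 2  # 0 = compute locally, 1 = hand the task to the edge server


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(obs, actions):
    """Shared toy reward standing in for QoE (assumed, not the patent's model)."""
    reward = 0.0
    n_remote = sum(actions)
    for o, a in zip(obs, actions):
        if a == 1:
            # Remote execution helps large tasks, but the shared server congests.
            reward += (1.0 if o == 1 else 0.2) - 0.3 * (n_remote - 1)
        else:
            # Local execution suits small tasks and is slow for large ones.
            reward += 0.6 if o == 0 else 0.1
    return reward


class Actor:
    """Per-agent policy that sees only the agent's own local observation."""

    def __init__(self):
        self.logits = [[0.0] * N_ACTIONS for _ in range(N_OBS)]

    def probs(self, obs):
        return softmax(self.logits[obs])

    def act(self, obs, rng):
        return rng.choices(range(N_ACTIONS), weights=self.probs(obs))[0]

    def greedy(self, obs):
        p = self.probs(obs)
        return p.index(max(p))

    def update(self, obs, action, advantage, lr):
        # Policy-gradient step for a softmax policy: d log pi / d logit.
        p = self.probs(obs)
        for a in range(N_ACTIONS):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.logits[obs][a] += lr * advantage * grad


def train(steps=20000, gamma=0.9, lr_actor=0.05, lr_critic=0.1, seed=0):
    rng = random.Random(seed)
    actors = [Actor() for _ in range(N_AGENTS)]
    critic = {}  # centralized critic: value of the joint (global) observation
    obs = tuple(rng.randrange(N_OBS) for _ in range(N_AGENTS))
    for _ in range(steps):
        actions = [actor.act(o, rng) for actor, o in zip(actors, obs)]
        reward = toy_reward(obs, actions)
        next_obs = tuple(rng.randrange(N_OBS) for _ in range(N_AGENTS))
        # The critic's TD error serves as the advantage for every actor.
        td = reward + gamma * critic.get(next_obs, 0.0) - critic.get(obs, 0.0)
        critic[obs] = critic.get(obs, 0.0) + lr_critic * td
        for actor, o, a in zip(actors, obs, actions):
            actor.update(o, a, td, lr_actor)
        obs = next_obs
    return actors, critic


if __name__ == "__main__":
    trained_actors, _ = train()
    # Distributed execution: each agent decides from its own observation only.
    local_obs = (1, 0)
    print([actor.greedy(o) for actor, o in zip(trained_actors, local_obs)])
```

The critic is used only inside train; after training it can be discarded, which mirrors the idea that global information is needed for learning but not for each agent's independent decisions. A practical system would replace the tables with neural networks and the toy reward with the QoE model and constraints of the method.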