The invention relates to a resource allocation method for heterogeneous cloud radio access networks (H-CRAN) based on deep reinforcement learning (DRL), and belongs to the technical field of mobile communication. The method comprises the following steps: 1) with queue stability as a constraint, jointly considering congestion control, user association, subcarrier allocation and power allocation, and establishing a stochastic optimization model that maximizes the total throughput of the network; 2) because the scheduling problem is complex and the state space and action space of the system are high-dimensional, using a neural network as a nonlinear function approximator in the DRL algorithm to overcome the curse of dimensionality; and 3) to cope with the complexity and dynamic variability of the wireless network environment, introducing a transfer learning algorithm and exploiting its ability to learn from few samples, so that the DRL algorithm obtains an optimal resource allocation strategy from only a small number of samples. The method maximizes the total throughput of the whole network while satisfying the service queue stability requirement, and is therefore of practical value in mobile communication systems.
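The three steps can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not give the queue model, the network architecture, or the transfer mechanism, so everything below is an assumption. Specifically, `queue_update` uses the standard recursion Q(t+1) = max(Q(t) - served, 0) + arrived, `QNetwork` is a hypothetical one-hidden-layer approximator over a discrete action set, and `transfer_weights` shows one common form of transfer learning (reusing the hidden layer of a network trained in a source environment).

```python
import numpy as np


def queue_update(backlog, served, arrived):
    """Assumed queue recursion: Q(t+1) = max(Q(t) - served, 0) + arrived."""
    return np.maximum(backlog - served, 0.0) + arrived


class QNetwork:
    """Hypothetical one-hidden-layer network approximating Q(s, a)."""

    def __init__(self, state_dim, n_actions, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def _forward(self, state):
        hidden = np.maximum(0.0, state @ self.W1 + self.b1)  # ReLU layer
        return hidden @ self.W2 + self.b2, hidden

    def q_values(self, state):
        """Return one Q-value per discrete action for a 1-D state vector."""
        return self._forward(state)[0]

    def td_update(self, state, action, reward, next_state, gamma=0.9, lr=0.01):
        """One semi-gradient Q-learning step; returns the TD error."""
        q, hidden = self._forward(state)
        target = reward + gamma * np.max(self.q_values(next_state))
        err = q[action] - target
        # Back-propagate 0.5 * err**2 through the chosen action's output only.
        d_hidden = err * self.W2[:, action] * (hidden > 0)
        self.W2[:, action] -= lr * err * hidden
        self.b2[action] -= lr * err
        self.W1 -= lr * np.outer(state, d_hidden)
        self.b1 -= lr * d_hidden
        return err


def transfer_weights(source, target):
    """Assumed transfer scheme: copy the source hidden layer into the target.

    The output layer of the target is left untouched so that it can be
    fine-tuned on the (few) samples from the new environment.
    """
    target.W1 = source.W1.copy()
    target.b1 = source.b1.copy()


if __name__ == "__main__":
    # State here is simply a vector of queue backlogs (an assumption).
    backlog = queue_update(np.array([5.0, 1.0]), np.array([2.0, 3.0]),
                           np.array([1.0, 1.0]))
    source_net = QNetwork(state_dim=2, n_actions=3, seed=1)
    source_net.td_update(backlog, action=0, reward=1.0, next_state=backlog)
    target_net = QNetwork(state_dim=2, n_actions=3, seed=2)
    transfer_weights(source_net, target_net)
    print(target_net.q_values(backlog))
```

In this sketch the reward would typically combine throughput with a queue-backlog penalty so that the learned policy respects the stability constraint; the exact reward design is left open because the abstract does not state it.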