The invention provides a multi-energy park scheduling method and system based on double-layer reinforcement learning. The method comprises the steps of: obtaining the controllable scheduling objects in an integrated energy system, namely a source-side unit, a load-side unit, an energy conversion unit and a storage unit; constructing a double-layer optimization decision model comprising an upper-layer reinforcement learning sub-model and a lower-layer mixed integer linear programming sub-model; enabling the upper-layer reinforcement learning sub-model to determine the action variable information of the storage unit from the state variable information at the current moment and to transmit that action variable information to the lower-layer mixed integer linear programming sub-model; enabling the lower-layer mixed integer linear programming sub-model to obtain the corresponding reward variable and the state variable information of the storage unit at the next moment, and to feed both back to the upper-layer reinforcement learning sub-model; and iteratively executing the above steps until the scheduling is finished. According to the embodiment of the invention, the data-driven reinforcement learning method makes each decision from the current state alone, without predicting future information, so decisions are timely and effective and real-time optimization decisions can be realized.
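
The interaction between the two layers can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not name a reinforcement learning algorithm, so tabular Q-learning is assumed here for the upper layer, and the lower-layer mixed integer linear program is replaced by a brute-force enumeration over one binary on/off decision for a conversion unit. All names and numbers (`LOAD`, `PRICE`, `GEN_POWER`, `GEN_COST`, `CAPACITY`, `lower_layer`, `train`, `run_greedy`) are hypothetical.

```python
import random

# Hypothetical toy data: six time steps, arbitrary units.
ACTIONS = (-1.0, 0.0, 1.0)             # storage action: discharge, idle, charge
CAPACITY = 4.0                         # storage capacity
LOAD = [2.0, 2.0, 3.0, 4.0, 4.0, 3.0]  # load-side demand per step
PRICE = [0.2, 0.2, 0.5, 0.9, 0.9, 0.5] # source-side purchase price per step
GEN_POWER, GEN_COST = 2.0, 0.8         # on/off conversion unit: output and cost


def lower_layer(t, soc, action):
    """Stand-in for the lower-layer MILP (assumption: brute-force enumeration).

    Takes the storage action from the upper layer, picks the cheapest binary
    on/off setting of the conversion unit, and returns the reward together
    with the storage state at the next moment.
    """
    new_soc = min(max(soc + action, 0.0), CAPACITY)  # clip to storage limits
    demand = LOAD[t] + (new_soc - soc)               # charging adds demand
    best_cost = float("inf")
    for gen_on in (0, 1):                            # the integer variable
        grid = max(demand - gen_on * GEN_POWER, 0.0) # remainder bought
        cost = gen_on * GEN_COST + grid * PRICE[t]
        best_cost = min(best_cost, cost)
    return -best_cost, new_soc                       # reward = negative cost


def train(episodes=2000, alpha=0.1, gamma=1.0, eps=0.1, seed=0):
    """Upper layer: tabular Q-learning over (time step, storage state)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        soc = 0.0
        for t in range(len(LOAD)):
            values = q.setdefault((t, soc), [0.0] * len(ACTIONS))
            if rng.random() < eps:                   # explore
                a = rng.randrange(len(ACTIONS))
            else:                                    # exploit
                a = max(range(len(ACTIONS)), key=values.__getitem__)
            # Pass the action down; receive reward and next state back.
            reward, next_soc = lower_layer(t, soc, ACTIONS[a])
            next_values = q.get((t + 1, next_soc), [0.0] * len(ACTIONS))
            future = gamma * max(next_values) if t + 1 < len(LOAD) else 0.0
            values[a] += alpha * (reward + future - values[a])
            soc = next_soc
    return q


def run_greedy(q):
    """Schedule one horizon using only the current state at each step."""
    soc, total_cost, trajectory = 0.0, 0.0, [0.0]
    for t in range(len(LOAD)):
        values = q.get((t, soc), [0.0] * len(ACTIONS))
        a = max(range(len(ACTIONS)), key=values.__getitem__)
        reward, soc = lower_layer(t, soc, ACTIONS[a])
        total_cost -= reward
        trajectory.append(soc)
    return total_cost, trajectory


if __name__ == "__main__":
    cost, trajectory = run_greedy(train())
    print("total cost:", round(cost, 2))
    print("storage trajectory:", trajectory)
```

In this sketch the upper layer only ever reads the current time step and storage state, which mirrors the claim that no forecast of future information is needed at decision time. A practical system would presumably replace the enumeration with a real MILP solver and the Q-table with a function approximator.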