The invention discloses a method, system, device, and storage medium for efficient memory swapping between GPU devices. The scheme has two aspects. First, while effectively relieving the memory limitation, the data exchange operations (offloading and retrieval) run in parallel with model training, so that no computation overhead is introduced and the transfer time can be hidden. Second, inactive data on a GPU device under high memory load is offloaded to other GPU devices and retrieved when needed, so that the free memory of the devices in the system is fully utilized; multiple direct high-speed links between the GPUs are aggregated to obtain a higher communication bandwidth, so that memory is freed more quickly and data is retrieved in a more timely manner. Combining these two aspects greatly reduces the performance overhead introduced by memory reduction and effectively relieves the limitation that memory places on model training, thereby improving model training efficiency.
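As an illustration only, the following minimal sketch shows one way a block of inactive data could be divided among peer GPUs so that several direct links are used at once. It assumes that each peer's share is proportional to the bandwidth of its link and capped by its free memory. The names `Peer`, `plan_offload`, `free_bytes`, and `link_bw` are hypothetical and do not come from the disclosure, and the actual asynchronous transfers that would overlap with training are omitted.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical description of a peer GPU reachable over a direct link."""
    device_id: int
    free_bytes: int   # free memory currently available on the peer
    link_bw: float    # relative bandwidth of the direct link to the peer


def plan_offload(size_bytes, peers):
    """Split size_bytes across peers, proportional to link bandwidth and
    capped by each peer's free memory. Returns {device_id: bytes}."""
    usable = [p for p in peers if p.free_bytes > 0 and p.link_bw > 0]
    if sum(p.free_bytes for p in usable) < size_bytes:
        raise MemoryError("peers do not have enough free memory")

    plan = {p.device_id: 0 for p in usable}
    remaining = size_bytes
    while remaining > 0:
        # Only peers that still have room take part in this round.
        active = [p for p in usable if plan[p.device_id] < p.free_bytes]
        total_bw = sum(p.link_bw for p in active)
        assigned = 0
        for p in active:
            room = p.free_bytes - plan[p.device_id]
            share = int(remaining * p.link_bw / total_bw)
            take = min(room, share)
            plan[p.device_id] += take
            assigned += take
        if assigned == 0:
            # Integer rounding left a few bytes: place them on the
            # fastest peers that still have room.
            for p in sorted(active, key=lambda q: -q.link_bw):
                take = min(p.free_bytes - plan[p.device_id], remaining - assigned)
                plan[p.device_id] += take
                assigned += take
        remaining -= assigned
    return plan


peers = [Peer(1, 100, 50.0), Peer(2, 1000, 50.0), Peer(3, 1000, 100.0)]
plan = plan_offload(800, peers)
```

In this example peer 1 fills up first, and the bytes it cannot hold are redistributed to peers 2 and 3 in proportion to their link bandwidth. A real system would then issue one asynchronous copy per peer so that the links are driven concurrently.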