The invention discloses a smart speaker control method, apparatus, device, and storage medium, and relates to the technical field of voice. In a specific implementation, a terminal device segments a text to be played according to a preset text length and sends the segmented texts to a server in sequence, so that the server converts each received segment into audio and sends the audio to the smart speaker for playback. Because the text to be played is segmented, each segment can be played as soon as it has been converted into audio, and the smart speaker does not need to wait until the entire text has been converted before playback begins; this improves response speed and makes it convenient to stop the text-to-audio conversion, and to stop playback, at any time. The text-to-audio conversion is performed by the server, which reduces the system resource consumption and power consumption of the terminal device. Moreover, because the smart speaker plays the audio of the segmented texts, no audio channel of the terminal device is occupied, and the playback of other audio by the terminal device is not affected.
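
The terminal-side steps described above (segmenting by a preset text length, then sending the segments in sequence) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the abstract does not say how the text is split, so fixed-length character slicing is assumed here, and the names `segment_text`, `send_segments`, `send`, and `should_stop` are hypothetical. The `send` callback stands in for whatever transport delivers a segment to the server.

```python
from typing import Callable, List


def segment_text(text: str, max_len: int) -> List[str]:
    """Split `text` into consecutive pieces of at most `max_len` characters.

    Assumption: plain character-count slicing; the source only says
    "a preset text length" and does not specify the splitting rule.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]


def send_segments(
    text: str,
    max_len: int,
    send: Callable[[str], None],
    should_stop: Callable[[], bool] = lambda: False,
) -> int:
    """Send segments in order, checking `should_stop` before each one.

    `send` is a hypothetical stand-in for the request to the server.
    Returns the number of segments actually sent.
    """
    sent = 0
    for segment in segment_text(text, max_len):
        if should_stop():
            # Stopping between segments means the remaining text is
            # never submitted for conversion.
            break
        send(segment)
        sent += 1
    return sent
```

For example, `send_segments(article, 200, send=post_to_server)` would submit an article in 200-character pieces, where `post_to_server` is a caller-supplied function. Checking `should_stop` before each segment reflects the stated benefit that conversion and playback can be halted at any time without waiting for the whole text.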