The invention discloses a video translation method, system, device, and storage medium. The method comprises the following steps: acquiring video data; performing voice segmentation on the video data to obtain a voice segment and the video clip corresponding to that voice segment; performing voice recognition on the voice segment to obtain a first text, and translating the first text to obtain a second text; generating synthetic voice from the second text, and adjusting the synthetic voice and its corresponding video clip so that the two match; and detecting and adjusting the lip shape in the video clip so that the lip shape is synchronized with the synthetic voice. The method translates video automatically: it generates audio in the target language and produces a video in which the lip shape matches the sound. It thereby addresses the communication barrier between different languages, requires no manual dubbing, and reduces translation cost, and it can be widely applied in the field of video processing.
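The abstract does not state how the synthetic voice and the video clip are adjusted to match. As a minimal sketch, assuming that "match" means equal duration, the step could be read as follows: the speech rate is changed first, within a tolerance, and any remaining mismatch is absorbed by retiming the video clip. The names `Alignment`, `align_durations`, and `max_audio_change`, and the 20% default tolerance, are hypothetical and are not taken from the invention.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    audio_rate: float  # playback-rate multiplier for the synthetic voice (>1 = faster)
    video_rate: float  # playback-rate multiplier for the video clip (>1 = faster)
    duration: float    # common duration in seconds after both adjustments


def align_durations(speech_s: float, clip_s: float,
                    max_audio_change: float = 0.2) -> Alignment:
    """Split a duration mismatch between the synthetic voice and the video clip.

    Assumption: the voice may be sped up or slowed down by at most
    `max_audio_change` (as a fraction); the clip absorbs whatever remains.
    """
    if speech_s <= 0 or clip_s <= 0:
        raise ValueError("durations must be positive")
    if not 0 <= max_audio_change < 1:
        raise ValueError("max_audio_change must be in [0, 1)")

    # Rate that would be needed if only the voice were adjusted.
    needed = speech_s / clip_s

    # Clamp the voice rate to the allowed range.
    lowest, highest = 1.0 - max_audio_change, 1.0 + max_audio_change
    audio_rate = min(max(needed, lowest), highest)

    # Duration of the voice after its rate change.
    adjusted_speech_s = speech_s / audio_rate

    # Retime the clip so that it lasts exactly as long as the adjusted voice.
    video_rate = clip_s / adjusted_speech_s

    return Alignment(audio_rate, video_rate, adjusted_speech_s)


if __name__ == "__main__":
    # 8 s of synthetic voice for a 4 s clip: the voice is capped at the
    # tolerance and the clip is slowed down to cover the rest.
    print(align_durations(8.0, 4.0))
```

In this sketch a small mismatch is handled by the voice alone, leaving the clip at its original speed, while a large mismatch is shared between the two so that neither is distorted beyond the chosen tolerance.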