The invention provides a video copy detection method based on multimodal features and tensor decomposition, which includes the following steps: (1) video preprocessing: video clips are normalized by spatio-temporal sampling; (2) video tensor modeling and tensor decomposition: the global, local and temporal features of the video are extracted separately, a tensor model is built from them, and the Tucker model is used to decompose the tensor into a core tensor and a low-order tensor; (3) video fingerprint matching: the core tensor is used for coarse matching, and the video fingerprint is used for fine matching within the coarsely selected candidates. Compared with the prior art, the method achieves a genuinely complementary fusion of the multimodal features of a video. It overcomes the poor robustness of video fingerprints constructed from single-modality features, captures the temporal co-occurrence among features of different modalities, and improves the accuracy and efficiency of video copy detection.
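As an illustration only, steps (2) and (3) might be sketched as follows. The abstract does not specify the tensor layout, the decomposition algorithm, the fingerprint construction or the distance measures, so everything concrete here is an assumption: each video is a random third-order array standing in for a (frames × feature type × feature dimension) tensor, the Tucker model is computed by truncated HOSVD, the coarse signature is the element-wise magnitude of the core tensor, and the hypothetical fine "fingerprint" is the flattened low-rank reconstruction. The function names `hosvd`, `signature` and `match` are likewise illustrative.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_dot(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor, ranks):
    """Truncated HOSVD, one common way to fit a Tucker model (assumed here)."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])          # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)     # project onto each factor basis
    return core, factors

def reconstruct(core, factors):
    """Rebuild the (approximate) tensor from the core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_dot(out, u, mode)
    return out

def signature(video, ranks):
    """Return (coarse, fine) descriptors; both definitions are assumptions."""
    core, factors = hosvd(video, ranks)
    coarse = np.abs(core).ravel()                 # insensitive to SVD sign flips
    fine = reconstruct(core, factors).ravel()     # hypothetical fingerprint
    return coarse, fine

def match(query, database, ranks, top_k=3):
    """Coarse matching on the core tensor, then fine matching on candidates."""
    q_coarse, q_fine = signature(query, ranks)
    sigs = [signature(video, ranks) for video in database]
    coarse_dist = [np.linalg.norm(q_coarse - c) for c, _ in sigs]
    candidates = np.argsort(coarse_dist)[:top_k]  # coarse selection
    fine_dist = [np.linalg.norm(q_fine - sigs[i][1]) for i in candidates]
    return int(candidates[int(np.argmin(fine_dist))])

# Toy data: random tensors stand in for real multimodal video features.
rng = np.random.default_rng(0)
database = [rng.standard_normal((8, 3, 16)) for _ in range(5)]
query = database[2] + 1e-3 * rng.standard_normal((8, 3, 16))  # lightly distorted copy
best = match(query, database, ranks=(4, 3, 6), top_k=3)
```

The two-stage design keeps the expensive comparison small: the compact core tensor prunes the database to `top_k` candidates, and only those are compared with the larger fingerprint. A real system would replace the random tensors with extracted features and would likely use a distance tuned to the expected distortions.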