Video texture migration method and device, electronic equipment and storage medium

By extracting and fusing features from video frames, the problem of inter-frame flickering in video texture transfer was solved, generating high-quality texture transfer videos.

CN116362956B · Active · Publication Date: 2026-06-23 · BEIJING ZITIAO NETWORK TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2021-12-24
Publication Date: 2026-06-23

Application Information

Patent Timeline

24 Dec 2021

Application

23 Jun 2026

Publication

CN116362956B

IPC: G06T3/04; G06V10/40; G06V10/80

AI Technical Summary

Technical Problem

During the video texture transfer process, there is inter-frame flickering between the video frames of the generated texture transfer video, which affects the display effect of the video.

Method used

By extracting features from the target frame in the original video, first feature information is generated, and then feature fusion is performed with the feature information of the reference frame to remove random noise and generate second feature information. Finally, it is fused with texture feature information to generate texture transfer video.

Benefits of technology

It effectively eliminates the inter-frame flickering problem between video frames and improves the display effect of texture-transferred videos.

✦ Generated by Eureka AI based on patent content.

Abstract

Embodiments of the present disclosure provide a video texture migration method and device, electronic equipment and storage medium, which obtain an original video, perform feature extraction on a target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after an Nth video frame in the original video, and the first feature information represents an image structure contour of the target frame; perform feature fusion on the first feature information of the target frame and reference feature information to obtain second feature information of the target frame, wherein the reference feature information is used to represent an image structure contour of a reference frame, the reference frame is a video frame before the target frame, and the second feature information is the first feature information without random noise; and generate a texture migration video according to the second feature information of the target frame and corresponding texture feature information, wherein the texture feature information represents image texture details of a reference image, thereby avoiding frame flickering and improving the display effect of the texture migration video.

Description

Technical Field

[0001] This disclosure relates to the field of video image processing technology, and in particular to a video texture transfer method, apparatus, electronic device, and storage medium.

Background Technology

[0002] Image style transfer is a popular image processing technique. A common approach is to transfer features from a reference image, resulting in a transferred image that shares the reference image's tone or imaging style. However, this style transfer can distort the transferred image, affecting its visual appeal.

[0003] To further improve the performance of image style transfer, a texture transfer technique that performs feature transfer on image texture has been proposed. This technique can transfer the image texture of an image so that the generated transferred image has the image texture of the reference image, thereby further improving the realism of the image.

[0004] However, in scenarios involving video texture transfer, there is an issue of inter-frame flickering between the video frames of the generated texture-transferred video, affecting the video display quality.

Summary of the Invention

[0005] This disclosure provides a video texture transfer method, apparatus, electronic device, and storage medium to overcome the problem of inter-frame flickering between video frames in texture-transferred videos, which affects the video display effect.

[0006] In a first aspect, embodiments of this disclosure provide a video texture transfer method, including:

[0007] Acquire the original video; extract features from the target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information represents the image structure contour of the target frame; fuse the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information is used to represent the image structure contour of a reference frame, the reference frame is a video frame before the target frame, and the second feature information is first feature information after removing random noise; generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information represents the image texture details of the reference image.

[0008] In a second aspect, embodiments of this disclosure provide a video texture transfer apparatus, comprising:

[0009] The acquisition module is used to acquire the original video.

[0010] The feature extraction module is used to extract features from the target frame in the original video and generate the first feature information of the target frame, wherein the target frame is the video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame.

[0011] The feature fusion module is used to fuse the first feature information of the target frame with the reference feature information to obtain the second feature information of the target frame. The reference feature information is used to characterize the image structure contour of the reference frame, and the reference frame is a video frame preceding the target frame. The second feature information is the first feature information after removing random noise.

[0012] The texture transfer module is used to generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information characterizes the image texture details of the reference image.

[0013] In a third aspect, embodiments of this disclosure provide an electronic device, including:

[0014] A processor, and a memory communicatively connected to the processor;

[0015] The memory stores computer-executed instructions;

[0016] The processor executes computer execution instructions stored in the memory to implement the video texture transfer method as described in the first aspect and various possible designs of the first aspect.

[0017] In a fourth aspect, embodiments of this disclosure provide a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the video texture transfer method described in the first aspect and various possible designs of the first aspect.

[0018] In a fifth aspect, embodiments of this disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the video texture transfer method as described in the first aspect and various possible designs of the first aspect.

[0019] The embodiments of this disclosure provide a video texture transfer method, apparatus, electronic device, and storage medium. The method involves acquiring an original video; extracting features from a target frame in the original video to generate first feature information for the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame; fusing the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information characterizes the image structure contour of a reference frame, the reference frame is a video frame before the target frame, and the second feature information is the first feature information after removing random noise; and generating a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information characterizes the image texture details of the reference image. Since the first feature information of the target frame is extracted from the original video and then fused with the first feature information of the corresponding reference frame, the random noise in the generated second feature information is reduced or eliminated. As a result, after the second feature information of the target frame is fused with the corresponding texture feature information, the generated texture transfer video does not exhibit inter-frame flickering caused by random noise between target frames, thus improving the display effect of the texture transfer video.

Attached Figure Description

[0020] To more clearly illustrate the technical solutions in the embodiments of this disclosure or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0021] Figure 1 is an application scenario diagram of the video texture transfer method provided in the embodiments of this disclosure;

[0022] Figure 2 is a schematic diagram of a process for generating texture-transferred videos in the prior art;

[0023] Figure 3 is flowchart 1 of the video texture transfer method provided in an embodiment of this disclosure;

[0024] Figure 4 is a schematic diagram of a target frame and a reference frame provided in an embodiment of this disclosure;

[0025] Figure 5 is a schematic diagram of a process for sequentially determining the second feature information of each target frame, as provided in an embodiment of this disclosure;

[0026] Figure 6 is a schematic diagram of a process for generating a texture transfer video according to an embodiment of this disclosure;

[0027] Figure 7 is flowchart 2 of the video texture transfer method provided in an embodiment of this disclosure;

[0028] Figure 8 is a schematic diagram of the implementation steps of step S203 in the embodiment shown in Figure 7;

[0029] Figure 9 is a schematic diagram of a weight coefficient sequence corresponding to a target frame provided in an embodiment of this disclosure;

[0030] Figure 10 is a schematic diagram of the generation of second feature information provided in an embodiment of this disclosure;

[0031] Figure 11 is a schematic diagram of the implementation steps of step S206 in the embodiment shown in Figure 7;

[0032] Figure 12 is a structural block diagram of the video texture transfer apparatus provided in the embodiments of this disclosure;

[0033] Figure 13 is a schematic diagram of the structure of an electronic device provided in an embodiment of this disclosure;

[0034] Figure 14 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of this disclosure.

Detailed Implementation

[0035] To make the objectives, technical solutions, and advantages of the embodiments of this disclosure clearer, the technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this disclosure, and not all embodiments. Based on the embodiments of this disclosure, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this disclosure.

[0036] The application scenarios of the embodiments of this disclosure are explained below:

[0037] Figure 1 is an application scenario diagram of the video texture transfer method provided in the embodiments of this disclosure. The method can be applied to video synthesis scenarios. Specifically, as shown in Figure 1, the method can be applied to a server that is communicatively connected to a terminal device. After receiving a video synthesis request sent by the terminal device, the server uses the video texture transfer method provided in this disclosure to perform texture transfer on the original video and reference image included or indicated in the request. The texture features of the reference image are transferred to the original video, so that the generated texture-transferred video has both the image structure features of the original video and the texture features of the reference image. Subsequently, based on a request from the terminal device or a preset sending rule, the server sends the generated texture-transferred video to the terminal device, so that the terminal device obtains the synthesized texture-transferred video.

[0038] Currently, texture transfer techniques have been proposed that perform feature transfer on image texture. These techniques can transfer the texture of an image, giving the generated transferred image the texture of the reference image and thus improving the realism of the image. However, in video texture transfer scenarios, inter-frame flickering occurs between the video frames of the generated texture-transferred video. Figure 2 is a schematic diagram of a process for generating texture-transferred videos in the prior art. This process can be implemented using a pre-trained image transfer model. As shown in Figure 2, the pre-trained image transfer model extracts features from each video frame in the original video sequentially, generating corresponding structural and texture features. Likewise, extracting features from the reference image generates its structural and texture features. Then, the global texture features of the reference image are fused with the structural features of the target frame to generate the texture transfer image corresponding to the target frame. Finally, the texture transfer images are combined to form a texture transfer video.

[0039] However, in the above process, the structural feature information obtained by feature extraction of the target frame through a pre-trained image transfer model has a certain degree of randomness, i.e., random noise. Therefore, after feature fusion, random noise also exists in the corresponding texture transfer image, causing inter-frame flickering when playing texture transfer images continuously. Therefore, there is an urgent need for a method to eliminate random noise in the structural feature information of video frames during the generation of texture transfer videos, thereby eliminating the problem of inter-frame flickering in texture transfer videos.

[0040] Figure 3 is flowchart 1 of the video texture transfer method provided in an embodiment of this disclosure. The method of this embodiment can be applied to electronic devices. In one possible implementation, the method can be applied to the server or the terminal device in the application scenario shown in Figure 1; in this embodiment, the server is used as the execution entity for explanation. The video texture transfer method includes:

[0041] Step S101: Obtain the original video.

[0042] For example, the original video is the video to be texture transferred, which the server can obtain by reading the corresponding video file. Depending on the specific implementation and requirements, this process may also require steps such as video decoding, which will not be detailed here.

[0043] The original video comprises multiple video frames. These frames have a temporal relationship, and the playback process of the original video is achieved by displaying each frame sequentially using a preset playback timestamp. Furthermore, the video texture transfer method provided in this embodiment can be implemented using a pre-trained video texture transfer model. Therefore, the step of obtaining the original video in this embodiment is equivalent to inputting the original video into the video texture transfer model used to implement the method provided in this embodiment. After the video texture transfer model obtains the input original video, it processes the original video based on subsequent embodiment steps to obtain the corresponding texture-transferred video.

[0044] Step S102: Extract features from the target frame in the original video to generate the first feature information of the target frame, wherein the target frame is the video frame after the Nth video frame in the original video, and the first feature information represents the image structure contour of the target frame.

[0045] For example, the video texture transfer model used to implement the method provided in this embodiment includes a functional unit for extracting features from the video frames of the original video. Specifically, this feature extraction can be achieved based on an Encoder-Decoder model framework. Encoder-Decoder is a commonly used model framework in deep learning; for example, the unsupervised autoencoder is designed and trained based on the Encoder-Decoder structure. Another example is image captioning in the prior art, which is based on a Convolutional Neural Network (CNN)-Recurrent Neural Network (RNN) Encoder-Decoder framework. With appropriately chosen processing models, the Encoder and Decoder can process text, speech, image, and other data to achieve the corresponding technical objectives. In this embodiment, the Encoder is used to extract features from the video frames of the original video, and the Decoder is used in subsequent steps to perform feature fusion and generate the texture-transferred video. For example, the Encoder and Decoder are an encoder and a decoder based on a pre-trained Generative Adversarial Network (GAN) model.

[0046] For example, after extracting features from video frames, the video texture transfer model generates first feature information and first texture information corresponding to the video frames. The first feature information represents the image structural features, more specifically, the image structural contour; the first texture information represents the image texture features, more specifically, the image texture details. For example, the first feature information and first texture information can be implemented in the form of a pixel matrix. The specific implementation method of extracting features from video frames in the original video using an encoder to generate the corresponding first feature information and first texture information is existing technology and will not be elaborated here.

[0047] Further, for example, the video frames in the original video can be divided into target frames and non-target frames. Target frames are the video frames after the Nth video frame in the original video, where N is a positive integer. In subsequent steps, the target frames are processed to eliminate inter-frame flicker. The specific implementation method will be described in detail in the following embodiments. The step of sequentially extracting features from the target frames in the original video is part of the Encoder-based feature extraction process for the video frames described above, and will not be repeated here.

[0048] Step S103: Perform feature fusion between the first feature information of the target frame and the reference feature information to obtain the second feature information of the target frame. The reference feature information is used to characterize the image structure contour of the reference frame, which is a video frame preceding the target frame. The second feature information is the first feature information after removing random noise.

[0049] Figure 4 is a schematic diagram of a target frame and a reference frame provided in an embodiment of this disclosure. As shown in Figure 4, among the video frames of the original video, the video frames after the Nth frame are the target frames, while the first N video frames are non-target frames. After the video texture transfer model begins processing the original video, it sequentially extracts features from each video frame in the original video, thereby obtaining the first feature information and the first texture information corresponding to each video frame. When processing reaches a target frame, the reference frames corresponding to the target frame are determined, that is, the M video frames before the target frame, where M is a positive integer less than or equal to N. More specifically, as shown in the figure, when M = N = 3, the video frames from the 4th video frame onwards are the target frames, and the 3 video frames before a target frame are its reference frames: the 3 video frames before target frame A are the reference frames of target frame A, and the 3 video frames before target frame B are the reference frames of target frame B.
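The selection of reference frames described above can be sketched in a few lines. This is only an illustration under stated assumptions: frames are numbered from 1 in playback order, and the function name `reference_indices` and its parameters are hypothetical, not taken from the patent.

```python
def reference_indices(target_index, n, m):
    """Return the 1-based indices of the M frames preceding a target frame.

    Assumption: frames are numbered from 1 in playback order, frames 1..n are
    non-target frames, and every frame after the n-th is a target frame.
    """
    if not 1 <= m <= n:
        raise ValueError("m must be a positive integer no greater than n")
    if target_index <= n:
        raise ValueError("frame is not a target frame")
    return list(range(target_index - m, target_index))


# With M = N = 3, the first target frame is frame 4.
print(reference_indices(4, n=3, m=3))  # [1, 2, 3]
print(reference_indices(5, n=3, m=3))  # [2, 3, 4]
```

Requiring M to be no greater than N guarantees that even the first target frame has M preceding frames available.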

[0050] Furthermore, by fusing the first feature information of the target frame with the reference feature information of the corresponding reference frames, second feature information with random noise removed can be obtained. The reference feature information is equivalent to the first feature information of the reference frame. The first feature information represents the image structure features of an image, and the image structure of several adjacent frames in a video is basically stable. Therefore, fusing the features of the target frame and its preceding frames (i.e., the reference frames) strengthens the structural features they share while weakening the random noise that differs between them. There are several methods for feature fusion between the target frame and the reference frames. For example, a weighted average of the target frame and each reference frame yields the second feature information with random noise removed. Alternatively, a correlation calculation can be performed on the target frame and the reference frames, and the second feature information can be generated based on the result. The specific implementation steps of weighted averaging and correlation calculation are existing technologies and will not be elaborated here. It should be noted that the second feature information referred to in this embodiment is the result of denoising the first feature information of the target frame. It has less random noise than the first feature information, but depending on the image samples it may still contain some random noise. Therefore, the second feature information referred to in this embodiment is not limited by the specific noise level it contains.
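A short worked equation may help show why averaging weakens the noise. It rests on a simplified model that the patent does not state: each frame's first feature information is a shared structure S plus independent zero-mean noise of variance σ², and the weights w_k are non-negative and sum to 1.

```latex
% Assumed model (not from the patent): F_{t-k} = S + \varepsilon_{t-k},
% with independent noise, E[\varepsilon] = 0, Var[\varepsilon] = \sigma^2.
\begin{aligned}
\tilde{F}_t &= \sum_{k=0}^{M} w_k F_{t-k}
             = S + \sum_{k=0}^{M} w_k \varepsilon_{t-k},
             \qquad \sum_{k=0}^{M} w_k = 1, \\
\operatorname{Var}\bigl[\tilde{F}_t\bigr] &= \sigma^2 \sum_{k=0}^{M} w_k^2, \\
w_k = \tfrac{1}{M+1} \;&\Longrightarrow\;
\operatorname{Var}\bigl[\tilde{F}_t\bigr] = \frac{\sigma^2}{M+1}.
\end{aligned}
```

Under these assumptions the shared structure passes through unchanged while the noise variance shrinks. With M = 3 reference frames and equal weights, it falls to a quarter of its original value.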

[0051] Furthermore, in the process of real-time video texture transfer, feature fusion needs to be performed sequentially on each target frame to generate the second feature information corresponding to each target frame. The following describes in more detail the process of sequentially fusing the first feature information of each target frame with the corresponding reference feature information. Figure 5 is a schematic diagram of a process for sequentially determining the second feature information of each target frame, as provided in an embodiment of this disclosure. As shown in Figure 5, during the sequential processing of each target frame (C, B, A), the reference frame corresponding to the current target frame is the target frame preceding the current target frame. That is, when the video texture transfer model of this embodiment generates the second feature information corresponding to each target frame, it processes the target frames sequentially in their playback order. The previously processed target frame is used as the reference frame for the current target frame, and the second feature information of the previously processed target frame is used as the reference feature information of that reference frame. For the first target frame located at the boundary (and several target frames thereafter), the boundary problem can be handled by treating the preceding non-target frames as target frames and treating the first feature information of those non-target frames as their second feature information. In this sequential processing method, the reference feature information for every target frame except the first is generated from the second feature information of previous target frames. The second feature information of previously processed target frames contains little or no random noise. Therefore, through iterative averaging during the sequential processing of target frames, the random noise carried in the reference frames decreases. Fusing these denoised reference frames with the current target frame then achieves a better denoising effect, reducing the random noise in the second feature information of the target frame while avoiding the introduction of new random noise.
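The sequential scheme above can be sketched as a loop in which each fused result becomes a reference for later frames. This is a minimal illustration under stated assumptions: feature information is a plain 2-D list, equal weights are used, and the names `fuse` and `sequential_second_features` are hypothetical rather than taken from the patent.

```python
def fuse(matrices, weights):
    """Weighted element-wise sum of equally sized 2-D matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(matrices, weights)) for c in range(cols)]
        for r in range(rows)
    ]


def sequential_second_features(first_features, n, m):
    """Compute second feature information frame by frame in playback order.

    first_features: one 2-D matrix per frame (list index 0 is frame 1).
    Boundary handling: the first n (non-target) frames keep their first
    feature information, which stands in for their second feature information.
    Assumes 1 <= m <= n and equal weights of 1 / (m + 1).
    """
    second = list(first_features[:n])
    weight = 1.0 / (m + 1)
    for t in range(n, len(first_features)):
        references = second[t - m:t]  # previously processed (denoised) frames
        second.append(fuse([first_features[t]] + references, [weight] * (m + 1)))
    return second


# Toy 1x1 "features": a stable value of 10 with a noisy spike at frame 4.
frames = [[[10.0]], [[10.0]], [[10.0]], [[14.0]], [[10.0]]]
result = sequential_second_features(frames, n=3, m=3)
print(result[3], result[4])  # [[11.0]] [[10.25]]
```

In the toy run, the spike at frame 4 is pulled toward the stable value, and frame 5 uses the already smoothed frame 4 as one of its references rather than the raw spike.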

[0052] Step S104: Generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information represents the image texture features of the reference image.

[0053] For example, after determining the second feature information of the target frame, the Decoder in the video texture transfer model fuses it with the texture feature information that characterizes the image texture details. This achieves the purpose of filling in texture details within the image contour represented by the second feature information, thereby generating a texture-transferred image of the target frame.

[0054] Furthermore, by combining the texture transfer images corresponding to each target frame according to their playback sequence, a texture transfer video can be generated. The specific implementation method for fusing information representing image structural features with information representing image texture features is prior art known in the art, and will not be elaborated upon here.

[0055] The texture feature information represents the image texture details of the reference image. Different target frames can correspond to the same reference image or to different reference images. If the target frames of the original video correspond to two or more reference images, the reference image corresponding to each target frame can be determined based on a preset mapping relationship, and then the texture feature information corresponding to the second feature information of that target frame can be determined.

[0056] Figure 6 is a schematic diagram of a process for generating a texture transfer video according to an embodiment of this disclosure. The video texture transfer method provided in this embodiment is further described below with reference to Figure 6, which shows the processing of a single target frame from an original video. After acquiring the original video, a video texture transfer model is used to extract features from the target frame, obtaining the first feature information of the target frame. Then, the first feature information of the target frame is fused with the reference feature information of the corresponding reference frame to generate the second feature information. In parallel, features are extracted from the reference image corresponding to each target frame (in this embodiment, each target frame corresponds to the same reference image) to obtain texture feature information. Then, the second feature information of the target frame is fused with the texture feature information to generate the corresponding texture transfer image. Finally, the texture transfer images corresponding to each target frame are combined according to their playback sequence to generate a texture transfer video.
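The data flow described for Figure 6 can be outlined as follows. This is a sketch of the ordering of steps only. The four callables are hypothetical placeholders for the model's Encoder, fusion step, and Decoder, whose internals the patent treats as existing technology, and the function name is an assumption.

```python
def texture_transfer_video(frames, reference_image, n, m,
                           encode_structure, encode_texture, fuse, decode):
    """Data-flow sketch; every callable is a hypothetical placeholder.

    encode_structure(frame) -> first feature information
    encode_texture(image)   -> texture feature information
    fuse(target, refs)      -> second feature information
    decode(structure, tex)  -> texture transfer image
    """
    texture = encode_texture(reference_image)  # same reference image for all frames
    first = [encode_structure(f) for f in frames]
    output = []
    for t in range(n, len(frames)):            # target frames, in playback order
        second = fuse(first[t], first[t - m:t])
        output.append(decode(second, texture))
    return output


# Toy stand-ins: scalar "features", mean fusion, and a decoder that pairs values.
video = texture_transfer_video(
    frames=[1.0, 2.0, 3.0, 8.0],
    reference_image="texture",
    n=3, m=3,
    encode_structure=lambda frame: frame,
    encode_texture=lambda image: image,
    fuse=lambda target, refs: (target + sum(refs)) / (len(refs) + 1),
    decode=lambda structure, tex: (structure, tex),
)
print(video)  # [(3.5, 'texture')]
```

Passing the stages in as parameters keeps the sketch independent of any particular model. The reference feature information here is the first feature information of the reference frames, as in paragraph [0050].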

[0057] In this embodiment, the process involves acquiring the original video; extracting features from the target frame in the original video to generate first feature information for the target frame, where the target frame is the video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame; fusing the first feature information of the target frame with reference feature information to obtain second feature information for the target frame, where the reference feature information characterizes the image structure contour of the reference frame, where the reference frame is the video frame before the target frame, and the second feature information is the first feature information after removing random noise; and generating a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, where the texture feature information characterizes the image texture details of the reference image. Because the first feature information of the target frame in the original video is extracted and then fused with the first feature information of the corresponding reference frame, the random noise in the generated second feature information is reduced or eliminated. Consequently, after fusing the second feature information of the target frame with the corresponding texture feature information, the generated texture transfer video does not exhibit inter-frame flickering caused by random noise between target frames, thus improving the display effect of the texture transfer video.

[0058] Figure 7 is a second flowchart of the video texture transfer method provided in this embodiment. Building on the embodiment shown in Figure 3, this embodiment further refines step S103. The video texture transfer method includes:

[0059] Step S201: Obtain the original video.

[0060] Step S202: Extract features from the target frame in the original video to generate first feature information and first texture information of the target frame. The first feature information represents the image structure outline, and the first texture information represents the image texture details.

[0061] Step S203: Sequentially obtain the weight coefficient sequence corresponding to each target frame. The weight coefficient sequence includes the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to each reference frame.

[0062] For example, the weight coefficient sequence includes at least one weight coefficient. In one possible implementation, the weight coefficient sequence includes only one weight coefficient, in which case the first weight coefficient corresponding to the target frame is equal to the second weight coefficient corresponding to each reference frame. The specific value of the weight coefficient can be determined by the number of reference frames: the weight coefficient is the reciprocal of the number of reference frames plus 1, that is, 1 / (number of reference frames + 1). For instance, if the target frame corresponds to 3 reference frames and the weight coefficient sequence includes only one weight coefficient, the weight coefficient is 0.25, meaning that the target frame and the 3 reference frames all have the same weight coefficient of 0.25. When the second feature information of the target frame is subsequently calculated from this weight coefficient sequence, the calculation is equivalent to averaging the first feature information of the target frame with the reference feature information of each reference frame. More specifically, if the first feature information is implemented as, for example, a pixel matrix, the pixel matrix corresponding to the target frame is averaged with the pixel matrices corresponding to the 3 reference frames to generate an average matrix, which is the second feature information.
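The equal-weight case can be illustrated with a short sketch. It assumes the first feature information is a pixel matrix held as nested Python lists; the function names `uniform_weight` and `average_features` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def uniform_weight(num_reference_frames: int) -> float:
    """Single shared coefficient: 1 / (number of reference frames + 1)."""
    return 1.0 / (num_reference_frames + 1)


def average_features(target: Matrix, references: Sequence[Matrix]) -> Matrix:
    """Element-wise mean of the target matrix and its reference matrices."""
    w = uniform_weight(len(references))
    rows, cols = len(target), len(target[0])
    return [
        [w * (target[r][c] + sum(ref[r][c] for ref in references)) for c in range(cols)]
        for r in range(rows)
    ]
```

With 3 reference frames, `uniform_weight(3)` gives 0.25, matching the example in the text.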

[0063] In another possible implementation, the number of weight coefficients in the weight coefficient sequence equals the sum of the number of reference frames and the number of target frames; that is, the target frame and each reference frame have their own corresponding weight coefficients. The target frame corresponds to the first weight coefficient, and each reference frame corresponds to a second weight coefficient. The second weight coefficients of the reference frames can be the same or different.

[0064] For example, as shown in Figure 8, step S203 includes two specific implementation steps, S2031 and S2032.

[0065] Step S2031: Determine the second weight coefficient corresponding to each reference frame based on the distance between that reference frame and the target frame.

[0066] Step S2032: Generate a weight coefficient sequence corresponding to the target frame based on the second weight coefficients corresponding to each reference frame.

[0067] Figure 9 is a schematic diagram of a weight coefficient sequence corresponding to a target frame provided in an embodiment of the present disclosure. As shown in Figure 9, the target frame corresponds to three reference frames: reference frame A, reference frame B, and reference frame C. The distance between reference frame A and the target frame is 1 frame, the distance between reference frame B and the target frame is 2 frames, and the distance between reference frame C and the target frame is 3 frames. Based on these distances, the second weight coefficients are determined as follows: 1 for reference frame A, 0.8 for reference frame B, and 0.6 for reference frame C. The first weight coefficient can be preset to 1. The weight coefficient sequence is therefore [0.6, 0.8, 1, 1]. The mapping between the distance from a reference frame to the target frame and the second weight coefficient can be determined from a preset mapping table.
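The Figure 9 example can be reproduced with a small lookup table. This is a sketch only: the table `DISTANCE_TO_WEIGHT` and the function `weight_sequence` are hypothetical names, the table holds just the three values quoted above, and the ordering of the sequence (farthest reference frame first, target frame last) is an assumption inferred from the example values.

```python
from typing import Dict, List

# Hypothetical preset mapping table: frame distance -> second weight coefficient.
# The values 1, 0.8, and 0.6 are the ones quoted for Figure 9.
DISTANCE_TO_WEIGHT: Dict[int, float] = {1: 1.0, 2: 0.8, 3: 0.6}


def weight_sequence(distances: List[int], first_weight: float = 1.0) -> List[float]:
    """Return reference weights ordered farthest-to-nearest, then the target weight."""
    ordered = sorted(distances, reverse=True)  # assumed ordering: farthest first
    return [DISTANCE_TO_WEIGHT[d] for d in ordered] + [first_weight]


print(weight_sequence([1, 2, 3]))  # [0.6, 0.8, 1.0, 1.0]
```

A distance that is missing from the table raises `KeyError`, which makes gaps in the preset mapping visible instead of silently substituting a default.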

[0068] In the subsequent calculation of the second feature information, the weight coefficient sequence provided in this embodiment is used to set a larger weight value for the reference frame that is closer to the target frame and a smaller weight value for the reference frame that is farther from the target frame. This can reduce the image blurring problem caused by the fusion of the distant reference frame and the target frame, improve the accuracy of the second feature information of the target frame, thereby improving the fineness of the subsequently synthesized texture transfer image and texture transfer video and improving the image quality.

[0069] Step S204: Based on the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to the reference frame, perform a weighted average of the first feature information of the target frame and the reference feature information of the corresponding reference frame to obtain the second feature information of each target frame.

[0070] For example, based on the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to each reference frame, a weighted average of the first feature information of the target frame and the reference feature information of the reference frames is computed to generate the second feature information corresponding to the target frame. Figure 10 is a schematic diagram of generating second feature information according to an embodiment of this disclosure. For example, the first feature information can be implemented as a pixel matrix, in which case the fused second feature information is also a pixel matrix. As shown in Figure 10, when a pre-trained GAN model extracts features from any target frame to generate the first feature information, the first feature information carries random noise. After the first feature information of the target frame and the first feature information of the corresponding reference frames are weighted and averaged using the weight coefficient sequence corresponding to the target frame (which includes the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to each reference frame), the effective information in each piece of first feature information is retained while the random noise is attenuated by the averaging, thereby reducing or eliminating the random noise originally present in the first feature information of the target frame.
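Step S204 can be sketched as an element-wise weighted average. The code below assumes that feature information is a pixel matrix held as nested lists and that the result is normalized by the sum of the weight coefficients; the text does not state the normalization, so that choice, like the name `fuse_features`, is an assumption made for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def fuse_features(
    target: Matrix,
    references: Sequence[Matrix],
    first_weight: float,
    second_weights: Sequence[float],
) -> Matrix:
    """Weighted average of the target matrix and its reference matrices."""
    if len(references) != len(second_weights):
        raise ValueError("one second weight coefficient is needed per reference frame")
    # Assumption: divide by the sum of weights so the result stays in the input range.
    total = first_weight + sum(second_weights)
    if total <= 0:
        raise ValueError("weight coefficients must sum to a positive value")
    fused: Matrix = []
    for r, row in enumerate(target):
        fused_row = []
        for c, value in enumerate(row):
            acc = first_weight * value
            for ref, w in zip(references, second_weights):
                acc += w * ref[r][c]
            fused_row.append(acc / total)
        fused.append(fused_row)
    return fused
```

Structure that is shared by the target frame and its reference frames survives the average largely unchanged, while noise that varies independently from frame to frame tends to cancel, which is the effect the paragraph above describes.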

[0071] Step S205: Extract features from the reference image corresponding to the target frame to generate second texture information.

[0072] Step S206: Generate texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information.

[0073] Optionally, as shown in Figure 11, step S206 includes two specific implementation steps, S2061 and S2062.

[0074] Step S2061: Obtain preset migration weight coefficients. The migration weight coefficients are used to characterize the salience of the image texture details of the reference image relative to the image texture details of the target frame.

[0075] Step S2062: Based on the migration weight coefficient, perform a weighted average of the first texture information and the preset second texture information to generate texture feature information.

[0076] For example, the migration weight coefficient represents the salience of the image texture details of the reference image relative to the image texture details of the target frame. Specifically, it represents the degree to which texture details in the reference image are transferred to the target frame. A larger migration weight coefficient gives the texture details of the reference image a greater proportion, and the texture details of the original image a smaller proportion, in the texture transfer image generated for the target frame, so that the texture transfer image looks more like the reference image. Conversely, a smaller migration weight coefficient yields a texture transfer image that looks more like the original image. After feature extraction is performed on the target frame, the first texture information representing the image texture details of the target frame is obtained. Based on the migration weight coefficient, the first texture information and the second texture information are weighted and averaged to generate the corresponding texture feature information, which improves the flexibility of the model in generating texture transfer videos.
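Steps S2061 and S2062 amount to a linear blend of two texture representations. The sketch below assumes the texture information is a flat list of floats and that the migration weight coefficient lies in [0, 1], with the target frame's own texture weighted by one minus the coefficient; the text specifies neither the range nor the exact formula, and `blend_texture` is a hypothetical name.

```python
from typing import List


def blend_texture(
    first_texture: List[float],
    second_texture: List[float],
    migration_weight: float,
) -> List[float]:
    """Blend target-frame texture with reference-image texture.

    migration_weight is assumed to lie in [0, 1]: 1 keeps only the reference
    image's texture, 0 keeps only the target frame's own texture.
    """
    if not 0.0 <= migration_weight <= 1.0:
        raise ValueError("migration_weight is assumed to lie in [0, 1]")
    if len(first_texture) != len(second_texture):
        raise ValueError("texture vectors must have the same length")
    return [
        (1.0 - migration_weight) * a + migration_weight * b
        for a, b in zip(first_texture, second_texture)
    ]
```

Raising `migration_weight` toward 1 moves the result toward the reference image's texture, and lowering it toward 0 moves the result toward the original frame's texture, as described above.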

[0077] Step S207: Generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information represents the image texture features of the reference image.

[0078] In this embodiment, steps S201-S202 and S207 are implemented in the same way as steps S101-S102 and S104 in the embodiment shown in Figure 3 of this disclosure, and are not described in detail here.

[0079] Corresponding to the video texture transfer method in the above embodiments, Figure 12 is a structural block diagram of a video texture transfer apparatus provided in an embodiment of this disclosure. For ease of explanation, only the parts relevant to the embodiments of this disclosure are shown. Referring to Figure 12, the video texture transfer device 3 includes:

[0080] The acquisition module 31 is used to acquire the original video.

[0081] Feature extraction module 32 is used to extract features from target frames in the original video and generate first feature information of the target frames, wherein the target frames are video frames after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frames.

[0082] The feature fusion module 33 is used to fuse the first feature information of the target frame with the reference feature information to obtain the second feature information of the target frame. The reference feature information is used to characterize the image structure contour of the reference frame, which is a video frame before the target frame. The second feature information is the first feature information after removing random noise.

[0083] The texture transfer module 34 is used to generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information represents the image texture details of the reference image.

[0084] In one embodiment of this disclosure, the feature fusion module 33 is specifically used to: obtain a weight coefficient sequence corresponding to the target frame, the weight coefficient sequence including a first weight coefficient corresponding to the target frame and a second weight coefficient corresponding to the reference frame; and perform a weighted average of the first feature information of the target frame and the reference feature information of the corresponding reference frame based on the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to the reference frame to obtain the second feature information of the target frame.

[0085] In one embodiment of this disclosure, when the feature fusion module 33 obtains the weight coefficient sequence corresponding to the target frame, it is specifically used to: determine the second weight coefficient corresponding to the reference frame based on the distance between each reference frame and the target frame; and generate the weight coefficient sequence corresponding to the target frame based on the second weight coefficient corresponding to the reference frame.

[0086] In one embodiment of this disclosure, the feature fusion module 33 is specifically used for: obtaining second feature information of a reference frame; and generating reference feature information based on the second feature information of the reference frame.

[0087] In one embodiment of this disclosure, the texture transfer module 34 is specifically used to: generate a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information; and combine the texture transfer images corresponding to each target frame based on the playback sequence of each target frame in the original video to generate a texture transfer video.

[0088] In one embodiment of this disclosure, when the texture transfer module 34 generates a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information, it is specifically used to: perform texture fusion on the second feature information of the target frame and the corresponding texture feature information through a pre-trained generative adversarial network (GAN) model to generate a texture transfer image.

[0089] In one embodiment of this disclosure, the feature extraction module 32 is further configured to:

[0090] Feature extraction is performed on the target frame in the original video to generate first texture information, which represents the image texture details of the target frame; feature extraction is performed on the reference image corresponding to the target frame to generate second texture information, which represents the image texture details of the reference image; texture feature information corresponding to the target frame is generated based on the first texture information and the corresponding second texture information.

[0091] In one embodiment of this disclosure, when the feature extraction module 32 generates texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information, it is specifically used to: obtain a preset migration weight coefficient, the migration weight coefficient being used to characterize the salience of the image texture details of the reference image relative to the image texture details of the target frame; and generate texture feature information by weighting the first texture information and the preset second texture information based on the migration weight coefficient.

[0092] In one embodiment of this disclosure, when the feature extraction module 32 extracts features from the target frame and generates the first feature information of the target frame, it is specifically used to: extract features from the target frame using the encoder in the pre-trained generative adversarial network (GAN) model to generate the first feature information of the target frame.

[0093] The acquisition module 31, feature extraction module 32, feature fusion module 33, and texture transfer module 34 are connected sequentially. The video texture transfer device 3 provided in this embodiment can execute the technical solution of the above method embodiment, and its implementation principle and technical effect are similar, so it will not be described again here.

[0094] Figure 13 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present disclosure. As shown in Figure 13, the electronic device 4 includes:

[0095] Processor 41, and memory 42 communicatively connected to processor 41;

[0096] The memory 42 stores computer-executable instructions;

[0097] The processor 41 executes the computer-executable instructions stored in the memory 42 to implement the video texture transfer method in the embodiments shown in Figures 3-11.

[0098] Optionally, the processor 41 and the memory 42 are connected via a bus 43.

[0099] For details, refer to the descriptions and effects of the corresponding steps in the embodiments shown in Figures 3-11, which are not repeated here.

[0100] Figure 14 shows a structural schematic of an electronic device 900 suitable for implementing embodiments of the present disclosure. The electronic device 900 can be a terminal device or a server. The terminal device can include, but is not limited to, mobile terminals such as mobile phones, laptops, digital radio receivers, personal digital assistants (PDAs), portable Android devices (PADs), portable media players (PMPs), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Figure 14 is merely an example and should not be construed as limiting the functionality and scope of the embodiments disclosed herein.

[0101] As shown in Figure 14, the electronic device 900 may include a processing unit (e.g., a central processing unit, a graphics processing unit, etc.) 901, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the electronic device 900. The processing unit 901, ROM 902, and RAM 903 are interconnected via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

[0102] Typically, the following devices can be connected to the I/O interface 905: input devices 906 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, and gyroscopes; output devices 907 including, for example, liquid crystal displays (LCDs), speakers, and vibrators; storage devices 908 including, for example, magnetic tapes and hard disks; and communication devices 909. The communication device 909 allows the electronic device 900 to exchange data with other devices over a wireless or wired connection. Although Figure 14 shows an electronic device 900 with various devices, it should be understood that not all of the devices shown need to be implemented or present. More or fewer devices may alternatively be implemented or present.

[0103] In particular, according to embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of this disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 909, or installed from a storage device 908, or installed from a ROM 902. When the computer program is executed by a processing device 901, it performs the functions defined in the methods of embodiments of this disclosure.

[0104] It should be noted that the computer-readable medium described in this disclosure can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this disclosure, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.

[0105] The aforementioned computer-readable medium may be included in the aforementioned electronic device; or it may exist independently and not assembled into the electronic device.

[0106] The aforementioned computer-readable medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above embodiments.

[0107] Computer program code for performing the operations of this disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a Local Area Network (LAN) or a Wide Area Network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0108] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0109] The units described in the embodiments of this disclosure can be implemented in software or in hardware. The name of a unit does not necessarily limit the unit itself; for example, the first acquisition unit can also be described as "a unit that acquires at least two Internet Protocol addresses".

[0110] The functions described above in this document can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that can be used include: Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems-on-Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and so on.

[0111] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0112] In a first aspect, according to one or more embodiments of the present disclosure, a video texture transfer method is provided, comprising:

[0113] Acquire the original video; extract features from the target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information represents the image structure contour of the target frame; fuse the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information is used to represent the image structure contour of a reference frame, the reference frame is a video frame before the target frame, and the second feature information is first feature information after removing random noise; generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information represents the image texture details of the reference image.

[0114] According to one or more embodiments of this disclosure, feature fusion is performed between the first feature information of the target frame and reference feature information to obtain the second feature information of the target frame, including: obtaining a weight coefficient sequence corresponding to the target frame, wherein the weight coefficient sequence includes a first weight coefficient corresponding to the target frame and a second weight coefficient corresponding to each of the reference frames; and sequentially performing a weighted average of the first feature information of the target frame and the reference feature information of the corresponding reference frames based on the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to each of the reference frames to obtain the second feature information of the target frame.

[0115] According to one or more embodiments of this disclosure, obtaining the weight coefficient sequence corresponding to the target frame includes: determining a second weight coefficient corresponding to each of the reference frames based on the distance between each of the reference frames and the target frame; and generating a weight coefficient sequence corresponding to the target frame based on the second weight coefficient corresponding to each of the reference frames.

[0116] According to one or more embodiments of this disclosure, the method further includes: obtaining second feature information of the reference frame; and generating the reference feature information based on the second feature information of the reference frame.

[0117] According to one or more embodiments of this disclosure, generating a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information includes: generating a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information; and combining the texture transfer images corresponding to each target frame based on the playback sequence of each target frame in the original video to generate the texture transfer video.

[0118] According to one or more embodiments of this disclosure, generating a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information includes: performing texture fusion on the second feature information of each target frame and the corresponding texture feature information through a pre-trained generative adversarial network (GAN) model to generate a texture transfer image.

[0119] According to one or more embodiments of this disclosure, the method further includes: extracting features from a target frame in the original video to generate first texture information, the first texture information representing image texture details of the target frame; extracting features from a reference image corresponding to the target frame to generate second texture information, the second texture information representing image texture details of the reference image; and generating texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information.

[0120] According to one or more embodiments of this disclosure, generating texture feature information corresponding to the target frame based on first texture information and corresponding second texture information includes: obtaining a preset migration weight coefficient, the migration weight coefficient being used to characterize the salience of the image texture details of the reference image relative to the image texture details of the target frame; and generating texture feature information by weighted averaging of the first texture information and the second texture information based on the migration weight coefficient.

[0121] According to one or more embodiments of this disclosure, the step of extracting features from a target frame in the original video to generate first feature information of the target frame includes: extracting features from the target frame based on an encoder in a pre-trained generative adversarial network model to generate first feature information of the target frame.

[0122] In a second aspect, according to one or more embodiments of the present disclosure, a video texture transfer apparatus is provided, comprising:

[0123] an acquisition module, configured to acquire an original video;

[0124] a feature extraction module, configured to extract features from a target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame;

[0125] a feature fusion module, configured to fuse the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information characterizes the image structure contour of a reference frame, the reference frame is a video frame preceding the target frame, and the second feature information is the first feature information with random noise removed; and

[0126] a texture transfer module, configured to generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information characterizes the image texture details of a reference image.

[0127] According to one or more embodiments of this disclosure, the feature fusion module is specifically used for: obtaining a weight coefficient sequence corresponding to a target frame, the weight coefficient sequence including a first weight coefficient corresponding to the target frame and a second weight coefficient corresponding to a reference frame; and performing a weighted average of the first feature information of the target frame and the reference feature information of the corresponding reference frame based on the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to the reference frame to obtain the second feature information of the target frame.
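
As a rough illustration of this fusion step, the sketch below averages the target frame's feature vector with the reference frames' feature vectors using the weight coefficient sequence. Representing features as flat lists of floats and dividing by the sum of the weights are assumptions made for the example, and `fuse_features` is a hypothetical name.

```python
def fuse_features(target_feature, reference_features, first_weight, second_weights):
    """Weighted average of the target feature and its reference features.

    target_feature:     first feature information of the target frame
    reference_features: one feature vector per reference frame
    first_weight:       first weight coefficient (target frame)
    second_weights:     second weight coefficients, one per reference frame
    """
    if len(reference_features) != len(second_weights):
        raise ValueError("one second weight is needed per reference frame")
    total = first_weight + sum(second_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Start from the weighted target feature, then accumulate each reference.
    fused = [first_weight * value for value in target_feature]
    for reference, weight in zip(reference_features, second_weights):
        if len(reference) != len(target_feature):
            raise ValueError("all feature vectors must have the same length")
        for i, value in enumerate(reference):
            fused[i] += weight * value
    # Normalize so the result is an average rather than a weighted sum.
    return [value / total for value in fused]
```

Because per-frame noise is assumed to be uncorrelated between frames, averaging in this way tends to suppress it while the shared structure contour is retained.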

[0128] According to one or more embodiments of this disclosure, when the feature fusion module obtains the weight coefficient sequence corresponding to the target frame, it is specifically used to: determine the second weight coefficient corresponding to the reference frame based on the distance between each reference frame and the target frame; and generate the weight coefficient sequence corresponding to the target frame based on the second weight coefficient corresponding to the reference frame.
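
The source says only that the second weight coefficients depend on the distance between each reference frame and the target frame. One plausible rule, assumed here purely for illustration, is geometric decay, so that nearer reference frames receive larger weights. The function name `weight_sequence` and the `decay` parameter are hypothetical.

```python
def weight_sequence(target_index, reference_indices, decay=0.5):
    """Build [first_weight, second_weight_1, second_weight_2, ...].

    The target frame gets weight 1.0. Each reference frame gets
    decay ** distance, where distance is the number of frames between
    it and the target frame (an assumed rule, not taken from the source).
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must lie in (0, 1]")
    sequence = [1.0]
    for reference_index in reference_indices:
        distance = target_index - reference_index
        if distance <= 0:
            raise ValueError("reference frames must precede the target frame")
        sequence.append(decay ** distance)
    return sequence
```

The weights are left unnormalized here; a fusion step that divides by their sum would turn them into a proper weighted average.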

[0129] According to one or more embodiments of this disclosure, the feature fusion module is specifically configured to: obtain second feature information of the reference frame; and generate the reference feature information based on the second feature information of the reference frame.
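
Using the already-fused (second) feature information of the previous frame as the reference makes the smoothing recursive. The sketch below assumes a single reference frame, namely the immediately preceding one, a fixed target weight, and a first frame that is passed through unchanged. None of these choices is specified in the source, and `smooth_sequence` is a hypothetical name.

```python
def smooth_sequence(first_features, target_weight=0.5):
    """Recursively fuse each frame's features with the previous fused result.

    first_features: list of per-frame feature vectors (first feature information)
    target_weight:  weight of the current frame; the previous frame's second
                    feature information receives 1 - target_weight
    """
    if not 0.0 <= target_weight <= 1.0:
        raise ValueError("target_weight must lie in [0, 1]")
    smoothed = []
    reference = None  # second feature information of the previous frame
    for feature in first_features:
        if reference is None:
            second = list(feature)  # no earlier frame to fuse with
        else:
            second = [
                target_weight * current + (1.0 - target_weight) * previous
                for current, previous in zip(feature, reference)
            ]
        smoothed.append(second)
        reference = second
    return smoothed
```

With this arrangement each output depends on all earlier frames with geometrically shrinking influence, which is one way an abrupt per-frame change can be damped.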

[0130] According to one or more embodiments of this disclosure, the texture transfer module is specifically used to: generate a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information; and combine the texture transfer images corresponding to each target frame based on the playback sequence of each target frame in the original video to generate a texture transfer video.
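
Assembling the output amounts to ordering the per-frame results by playback position. A minimal sketch, assuming the texture transfer images are held in a dictionary keyed by frame index (a hypothetical layout, as is the name `assemble_video`):

```python
def assemble_video(transfer_images):
    """Return texture transfer images ordered by playback index.

    transfer_images: dict mapping a frame's playback index in the original
                     video to its texture transfer image (any object)
    """
    return [transfer_images[index] for index in sorted(transfer_images)]
```

Keying by playback index means frames may be processed in any order, or in parallel, before the final video is written.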

[0131] According to one or more embodiments of this disclosure, when the texture transfer module generates a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information, it is specifically used to: perform texture fusion on the second feature information of the target frame and the corresponding texture feature information through a pre-trained generative adversarial network model to generate the texture transfer image.

[0132] According to one or more embodiments of this disclosure, the feature extraction module is further configured to: extract features from a target frame in the original video to generate first texture information, wherein the first texture information characterizes the image texture details of the target frame; extract features from a reference image corresponding to the target frame to generate second texture information, wherein the second texture information characterizes the image texture details of the reference image; and generate texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information.

[0133] According to one or more embodiments of this disclosure, when the feature extraction module generates texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information, it is specifically used to: obtain a preset migration weight coefficient, the migration weight coefficient being used to characterize the salience of the image texture details of the reference image relative to the image texture details of the target frame; and generate the texture feature information by weighted averaging of the first texture information and the second texture information based on the migration weight coefficient.

[0134] According to one or more embodiments of this disclosure, when the feature extraction module extracts features from the target frame and generates the first feature information of the target frame, it is specifically used to: extract features from the target frame based on the encoder in the pre-trained generative adversarial network model and generate the first feature information of the target frame.

[0135] In a third aspect, according to one or more embodiments of the present disclosure, an electronic device is provided, including: a processor, and a memory communicatively connected to the processor;

[0136] the memory stores computer-executable instructions; and

[0137] the processor executes the computer-executable instructions stored in the memory to implement the video texture transfer method described in the first aspect and the various possible designs of the first aspect.

[0138] In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium is provided, wherein computer-executable instructions are stored therein, which, when executed by a processor, implement the video texture transfer method described in the first aspect and the various possible designs of the first aspect.

[0139] In a fifth aspect, embodiments of this disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the video texture transfer method described in the first aspect and the various possible designs of the first aspect.

[0140] The above description is merely a preferred embodiment of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described concept. One example is a technical solution formed by replacing the above features with technical features of similar function disclosed in (but not limited to) this disclosure.

[0141] Furthermore, while the operations are described in a specific order, this should not be construed as requiring these operations to be performed in the specific order shown or in a sequential order. In certain environments, multitasking and parallel processing may be advantageous. Similarly, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of this disclosure. Certain features described in the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments.

[0142] Although the subject matter has been described using language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely illustrative examples of implementing the claims.

Claims

1. A video texture transfer method, characterized by comprising: obtaining an original video; performing feature extraction on a target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame; fusing the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information characterizes the image structure contour of a reference frame, the reference frame is a video frame preceding the target frame, and the second feature information is the first feature information with random noise removed; and generating a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information characterizes the image texture details of a reference image.

2. The method according to claim 1, characterized in that fusing the first feature information of the target frame with the reference feature information to obtain the second feature information of the target frame comprises: obtaining a weight coefficient sequence corresponding to the target frame, wherein the weight coefficient sequence includes a first weight coefficient corresponding to the target frame and a second weight coefficient corresponding to each reference frame; and computing, in sequence, a weighted average of the first feature information of the target frame and the reference feature information of each corresponding reference frame according to the first weight coefficient corresponding to the target frame and the second weight coefficient corresponding to each reference frame, to obtain the second feature information of the target frame.

3. The method according to claim 2, characterized in that obtaining the weight coefficient sequence corresponding to the target frame comprises: determining the second weight coefficient corresponding to each reference frame based on the distance between that reference frame and the target frame; and generating the weight coefficient sequence corresponding to the target frame based on the second weight coefficients corresponding to the reference frames.

4. The method according to claim 1, characterized in that the method further comprises: obtaining second feature information of the reference frame; and generating the reference feature information based on the second feature information of the reference frame.

5. The method according to claim 1, characterized in that generating the texture transfer video based on the second feature information of the target frame and the corresponding texture feature information comprises: generating a texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information; and combining the texture transfer images corresponding to the target frames, based on the playback order of the target frames in the original video, to generate the texture transfer video.

6. The method according to claim 5, characterized in that generating the texture transfer image corresponding to the target frame based on the second feature information of the target frame and the corresponding texture feature information comprises: performing texture fusion on the second feature information of each target frame and the corresponding texture feature information through a pre-trained generative adversarial network model to generate the texture transfer image.

7. The method according to claim 1, characterized in that the method further comprises: performing feature extraction on the target frame in the original video to generate first texture information, wherein the first texture information characterizes the image texture details of the target frame; performing feature extraction on the reference image corresponding to the target frame to generate second texture information, wherein the second texture information characterizes the image texture details of the reference image; and generating the texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information.

8. The method according to claim 7, characterized in that generating the texture feature information corresponding to the target frame based on the first texture information and the corresponding second texture information comprises: obtaining a preset migration weight coefficient, wherein the migration weight coefficient characterizes the salience of the image texture details of the reference image relative to the image texture details of the target frame; and computing a weighted average of the first texture information and the second texture information based on the migration weight coefficient to generate the texture feature information.

9. The method according to any one of claims 1-8, characterized in that performing feature extraction on the target frame in the original video to generate the first feature information of the target frame comprises: extracting features from the target frame with an encoder in a pre-trained generative adversarial network model to generate the first feature information of the target frame.

10. A video texture transfer device, characterized by comprising: an acquisition module, configured to acquire an original video; a feature extraction module, configured to extract features from a target frame in the original video to generate first feature information of the target frame, wherein the target frame is a video frame after the Nth video frame in the original video, and the first feature information characterizes the image structure contour of the target frame; a feature fusion module, configured to fuse the first feature information of the target frame with reference feature information to obtain second feature information of the target frame, wherein the reference feature information characterizes the image structure contour of a reference frame, the reference frame is a video frame preceding the target frame, and the second feature information is the first feature information with random noise removed; and a texture transfer module, configured to generate a texture transfer video based on the second feature information of the target frame and the corresponding texture feature information, wherein the texture feature information characterizes the image texture details of a reference image.

11. An electronic device, characterized by comprising: a processor, and a memory communicatively connected to the processor; wherein the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory to implement the method as described in any one of claims 1 to 9.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the video texture transfer method as described in any one of claims 1 to 9.

13. A computer program product comprising a computer program that, when executed by a processor, implements the video texture transfer method of any one of claims 1 to 9.

Citation Information

Patent Citations

CN112883806A: Video style migration method and device based on neural network, computer equipment and storage medium
CN113793253A: Style migration system for automatically generating stylized video
