A video compression method, system, device and medium
Convolution kernel matrix operations, carried out by an MCU, are used to determine the similarity of consecutive video frames; static frames are replaced while dynamic frames are retained. This addresses the problem of static frames occupying a large amount of storage space and compresses the video without reducing its quality.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: JINAN INSPUR DATA TECH CO LTD
- Filing Date: 2022-10-14
- Publication Date: 2026-06-26
AI Technical Summary
Existing video compression technologies often reduce video quality while reducing storage space, and static graphic frames occupy a lot of storage space but provide little useful information.
The video frames are convolved using a convolution kernel matrix to determine the similarity between frames. Similar static frames are replaced while dynamic frames are retained. The MCU is used to perform frame-by-frame comparison and key-value pair storage, keeping the video resolution and frame rate constant.
It effectively reduces video storage space usage while maintaining video clarity, saving storage space without affecting picture quality.
Figure CN115776568B_ABST (abstract drawing; image not reproduced)
Description
Technical Field
[0001] This application relates to the field of video compression, and in particular to a video compression method, system, apparatus and medium.
Background Technology
[0002] With the development of the big data era, video surveillance is widely used in industries, technology, and security. However, cameras generate massive amounts of data during recording, consuming significant server storage space. Since video files are often stored on big data cluster systems, excessive data storage reduces the efficiency of servers in storing valuable data. Currently, video compression is generally achieved by altering parameters such as resolution, bitrate, and frame rate.
[0003] Currently, traditional video compression techniques alter overall resolution, bitrate, and frame rate to compress data. This reduces video quality, impacting the user's viewing experience and potentially causing the loss of important information. Furthermore, in camera surveillance scenarios, static frames contain significantly less useful information than dynamic frames, and they often constitute a large portion of the video. Consequently, static frames not only consume substantial storage space but also fail to provide users with useful information.
[0004] How to prevent static image frames from occupying a very large proportion of a video's storage space during video compression is a problem that those skilled in the art urgently need to solve.
Summary of the Invention
[0005] The purpose of this application is to provide a video compression method that convolves each frame of a video with a convolution kernel matrix and compares the convolution result of each frame with that of its previous frame. If the convolution results are the same, the two frames are considered similar; the current frame is a static frame and can be replaced. Otherwise, the current frame is a dynamic frame and is retained. The comparison then moves on to the next frame, frame by frame. This method replaces similar static frames while retaining dynamic frames, which carry more useful information, thus reducing the storage footprint of the video. Furthermore, since the dynamic frames are not altered, the video's resolution, bitrate, and frame rate remain unchanged, so storage space is saved while users still see the original clarity. This application also provides a video compression system, apparatus, and medium.
[0006] To address the aforementioned technical problems, this application provides a video compression method, comprising:
[0007] Obtain the convolution kernel matrix;
[0008] Get the video to be compressed;
[0009] Obtain the convolution result of each frame in the video to be compressed and the convolution kernel matrix;
[0010] Determine whether the convolution result of the current frame is consistent with the convolution result of the previous frame;
[0011] If so, then the previous frame replaces the current frame;
[0012] If not, then retain the current frame and return to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends.
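The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: frames are plain 2D lists of numbers, the convolution is a stride-1, no-padding sliding window, "consistent" is taken to mean exact equality, and the names `convolve` and `retained_indices` are hypothetical.

```python
def convolve(frame, kernel):
    """Slide a square kernel over the frame (stride 1, no padding)."""
    f = len(kernel)
    rows = len(frame) - f + 1
    cols = len(frame[0]) - f + 1
    return [
        [
            sum(frame[i + x][j + y] * kernel[x][y]
                for x in range(f) for y in range(f))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def retained_indices(frames, kernel):
    """Return the indices of frames kept as dynamic frames.

    Assumption: the first frame has no predecessor and is always kept.
    """
    kept = []
    previous_result = None
    for index, frame in enumerate(frames):
        result = convolve(frame, kernel)
        if previous_result is None or result != previous_result:
            kept.append(index)        # dynamic frame: retain
        # otherwise: static frame, stands in for by the previous frame
        previous_result = result
    return kept


# Tiny demonstration with 3x3 frames and a fixed 2x2 kernel.
kernel = [[1, 2], [3, 4]]
a = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
b = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
kept = retained_indices([a, a, b, b, a], kernel)  # [0, 2, 4]
```

Because a static frame has the same convolution result as its predecessor, comparing against the immediately preceding frame and comparing against the last retained frame give the same decisions under exact equality.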
[0013] Preferably, after determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, the method further includes:
[0014] The key-value pairs are appended to each frame of the video to be compressed and stored.
[0015] Preferably, storing the key-value pairs appended to each frame of the video to be compressed includes:
[0016] When the convolution result of the current frame is consistent with the convolution result of the previous frame, the key value of the current frame is assigned the value 0;
[0017] When the convolution result of the current frame is inconsistent with the convolution result of the previous frame, the key value of the current frame is assigned the value 1.
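A small sketch of this key assignment follows. The storage layout (a dictionary from frame index to key value) and the name `assign_keys` are assumptions made for illustration; the text only requires that each frame carry a 0 or 1. The first frame has no previous frame, so it is treated as dynamic here, which is also an assumption.

```python
def assign_keys(conv_results):
    """Map each frame index to 0 (static) or 1 (dynamic)."""
    keys = {}
    for index, result in enumerate(conv_results):
        if index > 0 and result == conv_results[index - 1]:
            keys[index] = 0   # same result as previous frame
        else:
            keys[index] = 1   # different result, or no previous frame
    return keys


# Stand-in convolution results; any comparable values work.
keys = assign_keys(["a", "a", "b", "b", "b", "c"])
```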
[0018] Preferably, determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame includes:
[0019] Determine whether the key value of the current frame is 0;
[0020] If so, proceed to the step of replacing the current frame with the previous frame;
[0021] Determine whether the key value of the current frame is 1;
[0022] If so, proceed to the step of retaining the current frame and returning to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends.
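Once key values exist, the replace-or-retain decision needs only the key, not the convolution result. The sketch below assumes the keys are available as a list aligned with the frames; `apply_keys` is a hypothetical name. It keeps the frame count unchanged and lets every replaced position refer to the previous frame, which is one possible reading of "the previous frame replaces the current frame".

```python
def apply_keys(frames, keys):
    """Replace key-0 frames with the previous frame; retain key-1 frames."""
    output = []
    for frame, key in zip(frames, keys):
        if key == 0 and output:
            output.append(output[-1])  # previous frame replaces current
        else:
            output.append(frame)       # dynamic frame is retained
    return output


# Strings stand in for frame data.
result = apply_keys(["A", "A2", "B", "B2", "C"], [1, 0, 1, 0, 1])
```

Whether the replaced positions are then stored once or dropped entirely is a storage decision this sketch leaves open.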
[0023] Preferably, after acquiring the video to be compressed, the method further includes:
[0024] Obtain the video quality requirement level corresponding to the video to be compressed;
[0025] If the required level is lower than the preset level, then proceed to the step of obtaining the convolution result of each frame in the video to be compressed and the convolution kernel matrix.
[0026] If the required level is higher than the preset level, then the compression of the video to be compressed will be abandoned.
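The quality gate can be expressed as a single comparison. In this sketch the levels are assumed to be integers where a larger number means a stricter quality requirement, and `should_compress` is a hypothetical name. The text does not say what happens when the two levels are equal; the sketch conservatively skips compression in that case.

```python
def should_compress(required_level, preset_level):
    """Compress only when the required quality level is below the preset."""
    return required_level < preset_level


decisions = [should_compress(2, 3), should_compress(4, 3), should_compress(3, 3)]
```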
[0027] Preferably, each value in the convolution kernel matrix is randomly generated by a random function according to the size of the convolution kernel.
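As an illustration, a square kernel of a given size can be filled from a random function as follows. The value range of -1.0 to 1.0, the optional seeded generator, and the name `random_kernel` are assumptions; the text specifies only that the values are random and that the kernel size determines how many are drawn.

```python
import random


def random_kernel(size, rng=None):
    """Return a size x size matrix of random floats in [-1.0, 1.0]."""
    if rng is None:
        rng = random.Random()
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)]
            for _ in range(size)]


# A seeded generator makes the kernel reproducible, so the same kernel
# can be applied to every frame of one video.
kernel = random_kernel(3, random.Random(42))
```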
[0028] Preferably, the convolution operation formula is as follows:
[0029] (formula not reproduced in this text)
[0030] Wherein in, out, core, and f represent the video frame matrix to be compressed, the output feature matrix, the convolution kernel matrix, and the dimension of the convolution kernel matrix, respectively; x and y represent the coordinates in the video frame matrix to be compressed; and i and j represent the coordinates in the output feature matrix calculated by the convolution operation formula.
[0031] To address the aforementioned technical problems, this application also provides a video compression system, comprising:
[0032] The first acquisition module is used to acquire the convolution kernel matrix;
[0033] The second acquisition module is used to acquire the video to be compressed;
[0034] The third acquisition module is used to acquire the convolution result of each frame in the video to be compressed and the convolution kernel matrix.
[0035] The judgment module is used to determine whether the convolution result of the current frame is consistent with the convolution result of the previous frame;
[0036] A replacement module, used to replace the current frame with the previous frame;
[0037] A retention module is used to retain the current frame.
[0038] To address the aforementioned technical problems, this application also provides a video compression device, including a memory for storing computer programs;
[0039] A processor for executing the computer program to implement the steps of the video compression method described above.
[0040] To address the aforementioned technical problems, this application also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the video compression method described above.
[0041] In the video compression method provided in this application, an MCU obtains the convolution kernel matrix and the video to be compressed. Since a convolution kernel can extract features from video frames, each frame of the video to be compressed is convolved with the convolution kernel matrix, frame by frame, to obtain a convolution result. If the convolution result of the current frame is the same as that of the previous frame, the two frames are similar; the current frame is a static frame and is replaced by the previous frame. Conversely, if the convolution result of the current frame is inconsistent with that of the previous frame, the two frames contain different information, so the current frame is retained as a dynamic frame. The process then judges the next frame against its previous frame in the same way until the last frame has been processed. By replacing similar static frames while retaining dynamic frames, which contain more useful information, the storage footprint of the video can be reduced. Furthermore, since the dynamic frames are not altered, the video's resolution, bitrate, and frame rate remain unchanged, so storage space is saved while users still see the original clarity.
Attached Figure Description
[0042] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0043] Figure 1 is a flowchart of a video compression method provided in an embodiment of this application;
[0044] Figure 2 is a structural diagram of the compressed video provided in an embodiment of this application;
[0045] Figure 3 is a structural diagram of a video compression system provided in an embodiment of this application;
[0046] Figure 4 is a structural diagram of a video compression device provided in an embodiment of this application;
[0047] Wherein, in Figure 2, 1 represents the first frame of the compressed video, 2-6 represent the second to sixth frames, 7 represents the seventh frame, 8 and 9 represent the eighth and ninth frames, and 10 represents the tenth frame.
Detailed Implementation
[0048] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.
[0049] The core of this application is to provide a video compression method, system, apparatus, and medium. Each frame of the video is convolved with a convolution kernel matrix, and its convolution result is compared with that of the previous frame. If the results are the same, the two frames are similar and the current frame is a static frame that can be replaced; otherwise, the current frame is a dynamic frame and is retained. The comparison proceeds frame by frame. This allows similar static frames to be replaced while retaining dynamic frames with more useful information, reducing the video's storage footprint. Furthermore, since no changes are made to the dynamic frames, the video's resolution, bitrate, and frame rate remain unchanged, so the method both reduces the video's storage space and preserves the original clarity of the image.
[0050] The video compression method provided in this application can be implemented by a controller, such as a microcontroller unit (MCU), or any other controller besides an MCU. This application does not limit the implementation.
[0051] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0052] Figure 1 is a flowchart of the video compression method provided in the embodiments of this application. As shown in Figure 1, the video compression method includes:
[0053] S10: Obtain the convolution kernel matrix.
[0054] Specifically, traditional video compression techniques typically compress videos by reducing resolution, bitrate, and frame rate. While this reduces storage space, it also lowers image quality. When video information later needs to be retrieved, insufficient image quality may prevent complete information from being obtained, which affects the user. Since convolution kernels can extract features from video frames, this embodiment uses a convolution kernel matrix. The matrix is randomly generated based on the kernel size set in the MCU. The kernel size is determined by factors that affect video size, such as image quality and length. For example, a larger kernel is needed to provide more computational power for compressing a video with higher resolution and longer duration. Conversely, a smaller kernel can provide sufficient computational power for compressing a video with lower resolution and shorter duration. The convolution kernel matrix is randomly generated for the following reason. If the kernel matrix had a regular distribution, then in subsequent convolution operations a current frame that differs from the previous frame could still produce the same convolution result as that frame, because identically or similarly distributed kernel values cannot tell the two images apart. Since the two frames are in fact different images, this would lead to system misjudgment, incorrect compression, and loss of important video frames. It should be noted that the convolution kernel algorithm described above is only a preferred implementation and is not specifically limited in this embodiment; other methods capable of extracting features from video image frames can also be used. This embodiment likewise does not limit the size of the convolution kernel or the method of randomly generating the convolution matrix. The size of the convolution kernel can be set according to the user's needs, and the random generation method only needs to ensure that the generated matrix is irregular and random.
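The risk described above can be seen in a toy case. The sketch below applies a kernel to a single 2x2 window; the pixel values and the "irregular" kernel are hypothetical numbers chosen by hand to stand in for a randomly generated matrix. A uniform kernel reduces each window to a plain sum, so two different images with the same pixel values in a different arrangement become indistinguishable.

```python
def window_sum(frame, kernel):
    """Weighted sum of one window; frame and kernel share the same shape."""
    return sum(p * w
               for frame_row, kernel_row in zip(frame, kernel)
               for p, w in zip(frame_row, kernel_row))


frame_a = [[1, 2], [3, 4]]
frame_b = [[4, 3], [2, 1]]     # a different image, same values rearranged

uniform = [[1, 1], [1, 1]]     # regular distribution
irregular = [[3, 7], [1, 9]]   # stand-in for a randomly generated kernel

# Uniform kernel: both frames give 10, so frame_b would be misjudged static.
collides = window_sum(frame_a, uniform) == window_sum(frame_b, uniform)

# Irregular kernel: 56 versus 44, so the change is detected.
detected = window_sum(frame_a, irregular) != window_sum(frame_b, irregular)
```

An irregular kernel lowers the chance of such collisions but does not rule them out, since the convolution output still has fewer values than the frame.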
[0055] It is evident that a randomly generated convolution matrix, used as a kernel capable of extracting video image frame features, allows the features of the video to be compressed to be extracted more effectively for subsequent processing and analysis.
[0056] S11: Obtain the video to be compressed.
[0057] Specifically, in this embodiment, the MCU acquires the video to be compressed and prepares to compress it using convolution kernel operations.
[0058] S12: Obtain the convolution result of each frame in the video to be compressed and the convolution kernel matrix.
[0059] Specifically, in this embodiment, the MCU convolves each frame of the video to be compressed with the convolution kernel matrix, frame by frame, to obtain the convolution result. The convolution operation formula is as follows:
[0060] (formula not reproduced in this text)
[0061] Where in, out, core, and f represent the video frame matrix to be compressed, the output feature matrix, the convolution kernel matrix, and the dimension of the convolution kernel matrix, respectively; x and y represent the coordinates in the video frame matrix to be compressed; and i and j represent the coordinates in the output feature matrix after the convolution operation. It should be noted that the convolution formula in this embodiment can be replaced by any other formula capable of calculating the result of the operation between the frame and the convolution kernel matrix; no special limitations are imposed in this embodiment.
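The formula itself does not survive in this text. One stride-1, no-padding form that is consistent with the symbol definitions (x and y ranging over frame coordinates, i and j over output coordinates) would be the following. It is an assumed reconstruction, not the published formula, and the zero-based indexing of core is likewise an assumption.

```latex
\[
  \mathrm{out}(i, j) \;=\; \sum_{x=i}^{i+f-1} \; \sum_{y=j}^{j+f-1}
      \mathrm{in}(x, y) \cdot \mathrm{core}(x - i,\; y - j)
\]
```

Under this form, a frame of H x W pixels and an f x f kernel give an output feature matrix of (H - f + 1) x (W - f + 1) values.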
[0062] S13: Determine whether the convolution result of the current frame is consistent with the convolution result of the previous frame. If yes, proceed to S14; otherwise, proceed to S15.
[0063] Specifically, in this embodiment, since a convolution method is used, the convolution result obtained through the convolution matrix operation represents the graphic features of the current frame. By comparing the convolution result of the current frame with that of the previous frame, the characteristics of the two graphic frames can be determined. If they are consistent, it means that the current frame, after convolution, has the same graphic frame features as the previous frame, i.e., the current frame has the same graphic information and can be considered a static frame, where the previous frame can replace the current frame. Conversely, if the results are inconsistent, it means that the current frame, after convolution, has different graphic frame features than the previous frame, i.e., the current frame has different graphic information and can be considered a dynamic frame, which is then retained. This embodiment distinguishes between static and dynamic frames by appending key-value pairs to each frame in the video. When the convolution result of the current frame is consistent with that of the previous frame, the key value of the current frame is assigned a value of 0. Conversely, when the convolution result of the current frame is inconsistent with that of the previous frame, the key value of the current frame is assigned a value of 1. At this point, the key value is appended to the current frame to describe its features. When the MCU makes a judgment, it only needs to determine whether the key value is 0 or 1 to determine whether the current frame is similar to or consistent with the previous frame, and then it can perform the operation of replacing or retaining the current frame.
[0064] As can be seen, by appending key-value pairs to the current frame of the video to be compressed, the MCU can accurately and quickly replace or retain the current frame and then determine the operation of the next frame.
[0065] S14: Replace the current frame with the previous frame.
[0066] Specifically, in this example, when it is determined that the convolution result of the current frame is the same as the convolution result of the previous frame, that is, by appending key-value pairs, the key value of the current frame is 0, which means that the current frame has similar graphic information to the previous frame and is a static frame. In order to save memory space, the previous frame replaces and overwrites the current frame, and the previous frame is retained. The memory space of the previous frame covers the memory space of the current frame, which can preserve the current video information and save space.
[0067] As can be seen, when the current frame and the previous frame have the same graphic information, replacing the current frame with the previous frame can preserve the current video information and save space.
[0068] S15: Retain the current frame and return to step S13, determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame has been processed.
[0069] Specifically, in this embodiment, when it is determined that the convolution result of the current frame is different from the convolution result of the previous frame, the key value of the current frame is assigned the value 1 through the appended key-value pair. This indicates that the current frame has different graphic information from the previous frame and is a dynamic frame. To retain useful graphic information, the dynamic frame whose key value is 1 needs to be retained. The MCU continues to judge each frame in the video to be compressed in the same way until the last frame has been judged. To help those skilled in the art better understand this embodiment, Figure 2 is a structural diagram of the compressed video provided in the embodiments of this application. As shown in Figure 2, suppose a 10-frame video needs to be compressed. When the MCU determines that the convolution result of frame 2 is the same as that of frame 1, it overwrites frame 2 with frame 1, leaving 9 frames in the video. The MCU then checks whether the convolution result of frame 3 is the same as that of the previous frame (frame 1, whose result has already been calculated with the convolution formula). If they are the same, frame 3 is overwritten with frame 1, leaving 8 frames. The MCU continues checking until a frame's convolution result differs from that of the previous frame. For example, if the convolution result of frame 7 differs from that of frame 1, the MCU leaves frame 7 unchanged. It then compares the convolution result of frame 8 with that of frame 7. If they are the same, frame 8 is overwritten with frame 7, leaving 4 frames in the video: frames 1, 7, 9, and 10. The MCU then checks whether the convolution results of frame 9 and frame 7 are consistent. If they are, frame 9 is overwritten with frame 7. The MCU then checks whether the convolution result of frame 10 is consistent with that of frame 7. If they are inconsistent, frame 10, the last frame, is retained. At this point, the output video has three frames: frame 1, frame 7, and frame 10. Frames 2 through 6, 8, and 9 have been compressed away, and the three frames of the new video are the first, seventh, and tenth frames of the original video. The compression of the video is now complete. It should be noted that the above example is only provided to help those skilled in the art better understand how video frames are replaced in this embodiment. It is not the only possible implementation, and no special limitations are placed on the total number of frames in the video or on which frame numbers are static or dynamic.
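The 10-frame walk-through can be replayed with stand-in values. In the sketch below, letters take the place of convolution results (frames 1 to 6 share one result, frames 7 to 9 another, frame 10 a third); the function name `retained_frame_numbers` and the use of 1-based frame numbers are illustrative choices.

```python
def retained_frame_numbers(conv_results):
    """Return 1-based numbers of frames whose result differs from the
    previous frame's result (the first frame is always kept)."""
    kept = []
    for number, result in enumerate(conv_results, start=1):
        if number == 1 or result != conv_results[number - 2]:
            kept.append(number)
    return kept


results = ["a"] * 6 + ["b"] * 3 + ["c"]
kept = retained_frame_numbers(results)   # [1, 7, 10]
replaced = len(results) - len(kept)      # 7 frames replaced
```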
[0070] As can be seen, after the convolution kernel is used to perform the operation, the video frames are assigned in the form of key-value pairs. Static frames are replaced and overwritten to save space and achieve video compression, while dynamic frames are retained to provide users with a better picture quality experience.
[0071] The video compression method provided in this embodiment specifically involves obtaining a convolution kernel matrix and the video to be compressed through an MCU. Since the convolution kernel can extract features from video frames, each frame of the video to be compressed is convolved with the convolution kernel matrix frame by frame to obtain the convolution result. If the convolution result of the current frame is the same as that of the previous frame, it means that the current frame and the previous frame are similar, i.e., the two frames are static frames and can be replaced, with the previous frame replacing the current frame. Conversely, if the convolution result of the current frame is inconsistent with that of the previous frame, it means that the current frame and the previous frame are different and contain different information. Therefore, the current frame is retained as a dynamic frame, and the process continues to judge whether the convolution result of the next frame is consistent with that of the previous frame until the last frame ends. This method utilizes the feature extraction capability of the convolution kernel from video frames, comparing the convolution result of each frame in the video with that of the previous frame. If the convolution results are the same, it means that the two frames are similar and are static frames that can be replaced; otherwise, they are dynamic frames, which are retained, and the next frame is compared frame by frame. By replacing similar static frames while retaining dynamic frames that contain more useful information, the memory footprint of the video can be reduced. Furthermore, since the dynamic frames are not altered, the video's resolution, bitrate, and frame rate remain unchanged, thus compressing storage space while providing users with the original clarity.
[0072] Based on the above embodiments, as a preferred embodiment, after determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, the method further includes:
[0073] The key-value pairs are appended to each frame of the video to be compressed and stored.
[0074] Specifically, in this embodiment, each frame of the video to be compressed is assigned a key value and stored by using a key-value pair appending method. When the MCU judges the video information of the video frame, it only needs to judge the value of the key-value pair to make the replacement or retention operation, without having to recalculate the convolution result using the convolution kernel matrix. In this embodiment, the key-value pair can be other values or forms that can represent additional features, and no special limitation is made in this embodiment.
[0075] As can be seen, by adding a key-value pair that can be attached to each frame of the video, the MCU can more effectively determine the operation that should be performed on that frame.
[0076] Based on the above embodiments, as a preferred embodiment, storing key-value pairs appended to each frame of the video to be compressed includes:
[0077] When the convolution result of the current frame is the same as the convolution result of the previous frame, the key value of the current frame is assigned the value 0.
[0078] When the convolution result of the current frame is inconsistent with the convolution result of the previous frame, the key value of the current frame is assigned the value 1.
[0079] Specifically, in this embodiment, after the convolution results of the current frame and the previous frame have been calculated using the convolution kernel matrix, the key value of the current frame is assigned by comparing the two results. When the convolution results are the same, the video content has not changed between the two frames, and the data of the current frame can be replaced by the previous frame. In this case, the key value is assigned the value 0. Conversely, when the convolution results are inconsistent, the content has changed between the two frames, and the key value of the current frame is assigned the value 1. The current frame is then retained, and the next frame is judged.
[0080] As can be seen, by assigning different key values according to the convolution results, the MCU no longer has to judge whether the convolution results are the same and only has to judge whether the key value is 0 or 1. This speeds up the MCU's judgment, reduces its workload, and improves the efficiency of video compression.
[0081] Based on the above embodiments, as a preferred embodiment, determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame includes:
[0082] Determine if the key value of the current frame is 0;
[0083] If so, proceed to the step of replacing the current frame with the previous frame;
[0084] Determine if the key value of the current frame is 1;
[0085] If so, proceed to the step of keeping the current frame and returning to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends.
[0086] Specifically, in this example, when the convolution result of the current frame is the same as that of the previous frame, the key value of the current frame is assigned a value of 0 by appending key-value pairs. This indicates that the current frame has similar graphic information to the previous frame and is a static frame. In this case, to save memory space, the previous frame is used to replace and overwrite the current frame, and the memory space of the previous frame is retained, allowing the memory space of the previous frame to cover the memory space of the current frame. This preserves the current video information and saves space. The process continues frame by frame. When the convolution result of the current frame is different from that of the previous frame, the key value of the current frame is assigned a value of 1 by appending key-value pairs. This indicates that the current frame has different graphic information from the previous frame and is a dynamic frame. In this case, to retain useful graphic information, the dynamic frame with a key value of 1 needs to be retained. The MCU continues to judge each frame in the video to be compressed in the same way as above until the last frame is judged.
[0087] As can be seen, by checking the key value of each frame in turn, the current frame is replaced by the previous frame when its key value is 0 and is retained when its key value is 1. The MCU continues to determine whether the convolution result of the next frame matches that of the current frame and records the outcome as a key-value pair. Compression ends after the last frame of the video to be compressed has been judged. This embodiment replaces and overwrites static frames to save space for video compression, while retaining dynamic frames to provide users with a better viewing experience.
[0088] Based on the above embodiments, as a preferred embodiment, after obtaining the video to be compressed, the method further includes:
[0089] Obtain the required video quality level for the video to be compressed;
[0090] If the required level is lower than the preset level, proceed to the step of obtaining the convolution result of each frame in the video to be compressed and the convolution kernel matrix.
[0091] If the required level is higher than the preset level, then the compression of the video to be compressed will be abandoned.
[0092] Specifically, in this embodiment, after the video to be compressed is obtained, the image quality level required for that video is also obtained. Usage needs and the storage occupied by a video vary. If the user's required image quality level for the video to be compressed is high, exceeding the preset level, every frame of the video needs to keep its higher quality. Compression of the video is therefore abandoned so that each frame maintains its original clarity, bitrate, and other parameters. Conversely, if the required image quality level is lower than the preset level, not every frame of the video needs to be retained, and static frames carrying the same video information can be replaced and compressed to save storage space. In addition, the decision can take into account the actual size of the video to be compressed or the percentage of disk space it occupies. If the actual size is small but the percentage of disk space occupied is high, compression can be performed. Similarly, if the actual size is large but the percentage of disk space occupied is not high and will not affect the storage of other data on the disk, compression can be skipped. It should be noted that the preset video quality requirement level, the actual space occupied by the video to be compressed, and the percentage of disk space occupied can all be set according to the user's needs, and no special limitations are made in this embodiment.
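The disk-share criterion can be sketched as follows. The function name `exceeds_disk_share`, the 30 percent threshold in the example, and the byte counts are all hypothetical; the text leaves these settings to the user. Integer arithmetic is used so that the comparison is exact.

```python
def exceeds_disk_share(video_bytes, disk_bytes, percent_threshold):
    """True when the video occupies at least percent_threshold percent
    of the disk, regardless of its absolute size."""
    return video_bytes * 100 >= disk_bytes * percent_threshold


GB = 1024 ** 3

# Small video on a small disk: 50 percent share, so compression is worthwhile.
small_but_crowded = exceeds_disk_share(2 * GB, 4 * GB, 30)

# Large video on a large disk: 1 percent share, so compression can be skipped.
large_but_roomy = exceeds_disk_share(100 * GB, 10000 * GB, 30)
```

How this check combines with the quality-level check is not specified in the text and is left open here.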
[0093] It is evident that determining the required video quality level, memory size, and percentage of memory occupied by the video before compression can lead to more efficient video processing.
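As an illustrative, non-limiting sketch of the pre-compression check described above, the following Python example gates compression on the required quality level and, optionally, on the share of disk space the video occupies. The names `CompressionPolicy`, `should_compress`, `preset_quality_level`, and `min_disk_fraction`, as well as the default threshold values, are hypothetical. The embodiment does not state what happens when the required level equals the preset level; the sketch assumes compression is abandoned in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionPolicy:
    """User-configurable thresholds (all default values are hypothetical)."""
    preset_quality_level: int = 3   # hypothetical preset quality level
    min_disk_fraction: float = 0.5  # hypothetical "high" share of the disk


def should_compress(required_level: int, video_bytes: int, disk_bytes: int,
                    policy: CompressionPolicy = CompressionPolicy()) -> bool:
    """Return True if static-frame compression should proceed."""
    if disk_bytes <= 0:
        raise ValueError("disk_bytes must be positive")
    # Required level above the preset level: abandon compression.
    # Assumption: a level equal to the preset is also left uncompressed.
    if required_level >= policy.preset_quality_level:
        return False
    # Optional storage check: compress only when the video occupies
    # a high percentage of the disk.
    return video_bytes / disk_bytes >= policy.min_disk_fraction


if __name__ == "__main__":
    print(should_compress(required_level=1, video_bytes=80, disk_bytes=100))
    print(should_compress(required_level=5, video_bytes=80, disk_bytes=100))
```

Keeping the thresholds in a single policy object reflects the statement that all of these values can be set according to the user's needs.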
[0094] Based on the above embodiments, as a preferred embodiment, each value in the convolution kernel matrix is randomly generated by a random function according to the size of the convolution kernel.
[0095] Specifically, in this embodiment, the size of the convolution kernel is determined according to the image quality and length of the video, which are the factors that affect the storage space the video occupies. For example, if the current video occupies a large amount of storage space, has a high resolution, and has a long duration, a larger convolution kernel is needed to provide greater computing power for processing the video to be compressed. Conversely, a smaller convolution kernel can be chosen, as it provides sufficient computing power for processing the video to be compressed. Furthermore, the convolution kernel matrix is randomly generated by a random function. If the values of the convolution kernel matrix followed a regular distribution, then in subsequent convolution operations the convolution result of the current frame could be the same as that of another frame even though the current frame is not identical to the previous frame. Because the two frames are in fact different images, this could lead to system misjudgment, incorrect compression, and loss of important video frames. It should be noted that the convolution kernel algorithm mentioned above is only a preferred implementation and is not specifically limited in this embodiment; other methods capable of extracting video image frame features can also be used. In this embodiment, no special restrictions are placed on the size of the convolution kernel or on the method of randomly generating the convolution kernel matrix. The size of the convolution kernel can be set according to the user's needs, and the generation method is not limited to the random function. It is only necessary that the generated matrix is irregular and random.
[0096] It is evident that convolving with a randomly generated convolution kernel matrix, which is capable of extracting video image frame features, allows the features of the video to be compressed to be extracted more effectively for subsequent processing and analysis.
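The reasoning above can be illustrated with a minimal Python sketch. It generates a kernel with a random function and shows why a regularly distributed kernel can misjudge two different frames as identical. The helper names `choose_kernel_size`, `make_random_kernel`, and `weighted_sum`, the scaling rule (kernel size growing with megapixels and hours of footage), and the size bounds are all hypothetical; the embodiment only states that the kernel size grows with resolution and length and can be set by the user.

```python
import random


def choose_kernel_size(width, height, duration_s, min_size=3, max_size=15):
    """Hypothetical rule: the kernel grows with resolution and duration."""
    megapixels = width * height / 1_000_000
    hours = duration_s / 3600
    size = min_size + 2 * int(megapixels + hours)  # stays odd for odd min_size
    return min(size, max_size)


def make_random_kernel(size, seed=None):
    """Fill a size x size kernel with values from a random function."""
    rng = random.Random(seed)  # seed is optional and only aids reproducibility
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def weighted_sum(patch, kernel):
    """Sum of element-wise products of one image patch and the kernel."""
    return sum(p * k
               for patch_row, kernel_row in zip(patch, kernel)
               for p, k in zip(patch_row, kernel_row))


frame_a = [[1, 2], [3, 4]]
frame_b = [[4, 3], [2, 1]]          # different image, same pixel values
uniform_kernel = [[1, 1], [1, 1]]   # a regularly distributed kernel
random_kernel = make_random_kernel(2, seed=2022)

# The uniform kernel gives 10 for both frames, so they look identical.
print(weighted_sum(frame_a, uniform_kernel) == weighted_sum(frame_b, uniform_kernel))
# With random weights the two results are expected to differ.
print(weighted_sum(frame_a, random_kernel) == weighted_sum(frame_b, random_kernel))
```

Random weights do not rule out collisions entirely, but they remove the symmetry that makes a regular kernel blind to rearranged pixel values.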
[0097] Based on the above embodiments, as a preferred embodiment, the convolution operation formula is as follows:
[0098]
[0099] Where in, out, core, and f represent the dimensions of the video frame matrix to be compressed, the convolution kernel matrix, the output feature matrix, and the convolution kernel matrix, respectively; x and y represent the coordinates in the video frame matrix to be compressed; and i and j represent the coordinates of the output feature matrix after calculation using the convolution operation formula.
[0100] Specifically, this embodiment adopts the formula
[0101]
[0102] to perform the convolution kernel operation. The parameters in, out, core, and f represent the dimensions of the video image frame matrix to be compressed, the convolution kernel matrix, the output feature matrix, and the convolution kernel matrix, respectively. x and y represent the coordinates in the video image frame matrix to be compressed, and i and j represent the coordinates of the output feature matrix after the convolution operation. These parameters are substituted into the formula to obtain the convolution result, which the MCU uses as the basis for its judgment.
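Because the formula itself is not reproduced in this text, the following Python sketch shows only one common form such an operation could take: a stride-1, no-padding sliding window in which each output element is the sum of products of the kernel and the frame patch beneath it. This form is an assumption, not the formula of the embodiment. The sketch also assumes that in is the frame matrix, core the convolution kernel matrix, out the output feature matrix, and f the kernel dimension; the function name `convolve2d` and the kernel offsets `u` and `v` are hypothetical.

```python
def convolve2d(frame, core):
    """Assumed form: out[i][j] = sum over u, v of in[i+u][j+v] * core[u][v].

    (i, j) index the output feature matrix; (x, y) = (i + u, j + v) is the
    corresponding coordinate in the frame matrix. The kernel is not flipped.
    """
    f = len(core)                       # kernel dimension (f x f)
    rows, cols = len(frame), len(frame[0])
    out_rows, out_cols = rows - f + 1, cols - f + 1
    if out_rows < 1 or out_cols < 1:
        raise ValueError("kernel is larger than the frame")
    out = [[0.0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        for j in range(out_cols):
            acc = 0.0
            for u in range(f):
                for v in range(f):
                    x, y = i + u, j + v  # coordinate in the frame matrix
                    acc += frame[x][y] * core[u][v]
            out[i][j] = acc
    return out


frame = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
core = [[1, 0],
        [0, 1]]
print(convolve2d(frame, core))  # [[6.0, 8.0], [12.0, 14.0]]
```

With a 3x3 frame and a 2x2 kernel the output feature matrix is 2x2, which is the result the MCU would then compare between consecutive frames.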
[0103] The video compression method has been described in detail in the above embodiments. This application also provides embodiments corresponding to a video compression system. It should be noted that this application describes the system-related embodiments from two perspectives: one based on functional modules and the other based on hardware.
[0104] From the perspective of functional modules, Figure 3 is a structural diagram of the video compression system provided in the embodiments of this application. As shown in Figure 3, the system includes:
[0105] The first acquisition module 10 is used to acquire the convolution kernel matrix;
[0106] The second acquisition module 11 is used to acquire the video to be compressed;
[0107] The third acquisition module 12 is used to acquire the convolution result of each frame in the video to be compressed and the convolution kernel matrix;
[0108] The judgment module 13 is used to determine whether the convolution result of the current frame is consistent with the convolution result of the previous frame;
[0109] The replacement module 14 is used to replace the current frame with the previous frame;
[0110] The retention module 15 is used to retain the current frame.
[0111] Since the embodiments of the system part correspond to the embodiments of the method part, please refer to the description of the embodiments of the method part for the embodiments of the system part, and they will not be repeated here.
[0112] The video compression system provided in this embodiment corresponds to the method described above, and therefore has the same beneficial effects as the method described above.
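The cooperation of modules 12 to 15 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation of the embodiment: the function names `convolve` and `compress` are hypothetical, the convolution is assumed to be a stride-1 sliding-window sum of products, "consistent" is taken to mean exact equality of the convolution results, the first frame is assumed to be retained because it has no previous frame, and replacement is modelled by storing a reference to the previously stored frame. The key values follow the convention stated in the claims (0 for a consistent frame, 1 otherwise).

```python
def convolve(frame, kernel):
    """Assumed stride-1, no-padding sliding-window sum of products."""
    f = len(kernel)
    return [
        [sum(frame[i + u][j + v] * kernel[u][v]
             for u in range(f) for v in range(f))
         for j in range(len(frame[0]) - f + 1)]
        for i in range(len(frame) - f + 1)
    ]


def compress(frames, kernel):
    """Return (stored_frames, keys) for a sequence of frame matrices."""
    stored, keys = [], []
    previous_result = None
    for frame in frames:
        result = convolve(frame, kernel)       # third acquisition module 12
        if previous_result is not None and result == previous_result:
            keys.append(0)                     # judgment module 13: consistent
            stored.append(stored[-1])          # replacement module 14
        else:
            keys.append(1)                     # judgment module 13: inconsistent
            stored.append(frame)               # retention module 15
        previous_result = result
    return stored, keys


kernel = [[0.3, 0.7], [0.9, 0.1]]              # stands in for a random kernel
static = [[1, 2], [3, 4]]
moving = [[1, 2], [3, 5]]
video = [static, [row[:] for row in static], moving, [row[:] for row in moving]]

stored, keys = compress(video, kernel)
print(keys)                                    # [1, 0, 1, 0]
print(len({id(frame) for frame in stored}))    # 2 distinct frames stored
```

The number of frames, and therefore the frame rate, is unchanged; only the number of distinct frame objects that need to be stored is reduced.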
[0113] From a hardware perspective, Figure 4 is a structural diagram of a video compression apparatus provided in another embodiment of this application. As shown in Figure 4, the video compression device includes: a memory 20 for storing computer programs;
[0114] and a processor 21 for implementing the steps of the video compression method mentioned in the above embodiments when executing a computer program.
[0115] The video compression device provided in this embodiment may include, but is not limited to, other devices that can achieve video compression functions.
[0116] The processor 21 may include one or more processing cores, such as a quad-core processor or an octa-core processor. The processor 21 may be implemented using at least one of the following hardware forms: Digital Signal Processor (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 21 may also include a main processor and a coprocessor. The main processor, also known as the Central Processing Unit (CPU), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a Graphics Processing Unit (GPU), which is responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, the processor 21 may also include an Artificial Intelligence (AI) processor, which handles computational operations related to machine learning.
[0117] The memory 20 may include one or more computer-readable storage media, which may be non-transitory. The memory 20 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In this embodiment, the memory 20 is used to store at least the following computer program 201, which, after being loaded and executed by the processor 21, is capable of implementing the relevant steps of the video compression method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202 and data 203, and the storage method may be temporary or permanent storage. The operating system 202 may include Windows, Unix, Linux, etc. The data 203 may include, but is not limited to, data related to the video compression method.
[0118] In some embodiments, the video compression device may further include a display screen 22, an input / output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
[0119] Those skilled in the art will understand that the structure shown in Figure 4 does not constitute a limitation on the video compression device, which may include more or fewer components than illustrated.
[0120] The video compression apparatus provided in this application includes a memory and a processor. When the processor executes a program stored in the memory, it can implement the video compression method mentioned above.
[0121] Finally, this application also provides an embodiment corresponding to a computer-readable storage medium. The computer-readable storage medium stores a computer program, which, when executed by a processor, implements the steps described in the above-described video compression method embodiment.
[0122] It is understood that if the methods in the above embodiments are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and is used to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0123] The foregoing has provided a detailed description of a video compression method, system, apparatus, and medium provided in this application. The various embodiments in the specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the apparatus disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple; relevant parts can be referred to in the method section. It should be noted that those skilled in the art can make several improvements and modifications to this application without departing from the principles of this application, and these improvements and modifications also fall within the protection scope of the claims of this application.
[0124] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
Claims
1. A video compression method, characterized in that it comprises: obtaining a convolution kernel matrix; obtaining a video to be compressed; obtaining the convolution result of each frame in the video to be compressed and the convolution kernel matrix, wherein each value in the convolution kernel matrix is randomly generated by a random function according to the size of the convolution kernel, the size of the convolution kernel is determined according to the video resolution and video length of the current video, which affect the space occupied by the video, and the size of the convolution kernel is directly proportional to the video resolution and video length of the current video; determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame; if so, replacing the current frame with the previous frame; if not, retaining the current frame and returning to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends; wherein, after determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, the method further comprises: appending a key-value pair to each frame of the video to be compressed and storing it;
the step of appending a key-value pair to each frame of the video to be compressed and storing it comprises: when the convolution result of the current frame is consistent with the convolution result of the previous frame, assigning the key value of the current frame the value 0; when the convolution result of the current frame is inconsistent with the convolution result of the previous frame, assigning the key value of the current frame the value 1; and the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame comprises: determining whether the key value of the current frame is 0; if so, proceeding to the step of replacing the current frame with the previous frame; determining whether the key value of the current frame is 1; if so, proceeding to the step of retaining the current frame and returning to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends.
2. The video compression method according to claim 1, characterized in that, after obtaining the video to be compressed, the method further comprises: obtaining the video quality requirement level corresponding to the video to be compressed; if the required level is lower than the preset level, proceeding to the step of obtaining the convolution result of each frame in the video to be compressed and the convolution kernel matrix; and if the required level is higher than the preset level, abandoning the compression of the video to be compressed.
3. The video compression method according to claim 1, characterized in that the convolution operation formula is: wherein in, out, core, and f represent the dimensions of the video frame matrix to be compressed, the convolution kernel matrix, the output feature matrix, and the convolution kernel matrix, respectively; x and y represent the coordinates in the video frame matrix to be compressed; and i and j represent the coordinates of the output feature matrix calculated by the convolution operation formula.
4. A video compression system, characterized in that it comprises: a first acquisition module, used to acquire a convolution kernel matrix; a second acquisition module, used to acquire a video to be compressed; a third acquisition module, used to acquire the convolution result of each frame in the video to be compressed and the convolution kernel matrix, wherein each value in the convolution kernel matrix is randomly generated by a random function according to the size of the convolution kernel, the size of the convolution kernel is determined according to the video resolution and video length of the current video, which affect the space occupied by the video, and the size of the convolution kernel is directly proportional to the video resolution and video length of the current video; a judgment module, used to determine whether the convolution result of the current frame is consistent with the convolution result of the previous frame; a replacement module, used to replace the current frame with the previous frame; and a retention module, used to retain the current frame; wherein, after determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, the method further comprises: appending a key-value pair to each frame of the video to be compressed and storing it;
the step of appending a key-value pair to each frame of the video to be compressed and storing it comprises: when the convolution result of the current frame is consistent with the convolution result of the previous frame, assigning the key value of the current frame the value 0; when the convolution result of the current frame is inconsistent with the convolution result of the previous frame, assigning the key value of the current frame the value 1; and the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame comprises: determining whether the key value of the current frame is 0; if so, proceeding to the step of replacing the current frame with the previous frame; determining whether the key value of the current frame is 1; if so, proceeding to the step of retaining the current frame and returning to the step of determining whether the convolution result of the current frame is consistent with the convolution result of the previous frame, until the last frame ends.
5. A video compression device, characterized in that it comprises: a memory, used to store a computer program; and a processor, configured to implement the steps of the video compression method as described in any one of claims 1 to 3 when executing the computer program.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the video compression method as described in any one of claims 1 to 3.