Video transmission system, video compression device, and video compression method

JP2026095810A · Pending · Publication Date: 2026-06-12 · SUMITOMO ELECTRIC INDUSTRIES LTD +2


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SUMITOMO ELECTRIC INDUSTRIES LTD
Filing Date: 2024-12-02
Publication Date: 2026-06-12

Application Information

Patent Timeline

02 Dec 2024

Application

12 Jun 2026

Publication

JP2026095810A

IPC: H04N19/167; H04N19/46; H04N19/85

AI Tagging

Application Domain

Digital video signal modification




AI Technical Summary

⚠Technical Problem

Existing hardware encoders, particularly low-cost or older models, lack region-specific compression functionality, making it difficult to efficiently reduce video data without specialized hardware.

⚗Method used

A video preprocessing unit applies preprocessing, such as contrast conversion, so that its effect on image quality and transmission capacity differs between regions of interest and non-interest regions. The preprocessed video is then compressed uniformly, which lets general-purpose hardware encoders achieve the equivalent of region-specific compression.

🎯Benefits of technology

Efficient reduction of video data volume is achieved without specialized hardware, balancing image quality and transmission capacity, and preventing network congestion.

✦ Generated by Eureka AI based on patent content.


Patent Text Reader

Abstract

To provide a video transmission system that can efficiently reduce the amount of video data without the need for a special hardware encoder. [Solution] The video transmission system comprises: a video preprocessing unit that generates a second video from a first video by performing preprocessing, which affects both image quality and the transmission capacity of a transmission line, on at least one of a region of interest included in the first video and a non-interest region different from the region of interest, such that the effects on the region of interest and on the non-interest region after the preprocessing differ; a video compression unit that compresses the second video and generates a compressed video to be sent to the transmission line; and a video decompression unit that decompresses the compressed video received from the transmission line.


Description

【Technical Field】【0001】 The present disclosure relates to a video transmission system, a video compression device, and a video compression method. 【Background Art】【0002】 Conventionally, a system has been proposed that reduces the amount of video code (data amount) by performing compression encoding based on different criteria between a region of interest and a non-region of interest in a video (see, for example, Patent Document 1 and Patent Document 2). 【0003】 For example, when adopting H.264 / MPEG-4 AVC or H.265 / MPEG-H HEVC as the compression encoding method, the quantization parameter QP (Quantization Parameter) of the non-region of interest is made larger than the QP of the region of interest. As a result, the image quality of the non-region of interest deteriorates compared to the image quality of the region of interest, so the data amount of the video can be reduced. 【Prior Art Documents】【Patent Documents】【0004】【Patent Document 1】 Japanese Patent Application Laid-Open No. 2018-129688 【Patent Document 2】 Japanese Patent Application Laid-Open No. 2007-13471 【Summary of the Invention】【Problems to be Solved by the Invention】【0005】 However, among commonly used hardware encoders (e.g., dedicated chips, SoCs (System on a Chip), FPGAs (Field Programmable Gate Arrays), GPUs (Graphics Processing Units)), some allow region-specific compression by specifying the QP, while others do not. Low-cost, older, or lower-grade hardware encoders often lack region-specific compression functionality, and adding this functionality later is difficult if it is not built into the chip. This challenge is also common to hardware encoders that perform region-specific compression using methods other than changing the QP, such as changing the compression ratio for each region. 
【0006】 This disclosure is made in view of these circumstances and aims to provide a video transmission system, a video compression device, and a video compression method that can efficiently reduce the amount of video data without using a special hardware encoder. [Means for solving the problem] 【0007】 A video transmission system according to one aspect of the present disclosure includes a video preprocessing unit that generates a second video from a first video by performing preprocessing on at least one of a region of interest and a non-interest region different from the region of interest included in the first video, such that the effects on both image quality and transmission capacity of the transmission line are different after the preprocessing on the region of interest and the non-interest region; a video compression unit that compresses the second video and generates a compressed video to be sent to the transmission line; and a video decompression unit that decompresses the compressed video received from the transmission line. [Effects of the Invention] 【0008】 According to this disclosure, the amount of video data can be efficiently reduced without using a special hardware encoder. [Brief explanation of the drawing] 【0009】 [Figure 1] Figure 1 is a block diagram showing the overall configuration of the video transmission system according to the embodiment of this disclosure. [Figure 2] Figure 2 is a block diagram showing the functional configuration of a video compression device according to an embodiment of this disclosure. [Figure 3] Figure 3 is a diagram illustrating an example of the process by which the region of interest determination unit determines the region of interest. [Figure 4] Figure 4 illustrates an example of the process of extracting non-focus areas by the video preprocessing unit. [Figure 5] Figure 5 is a diagram illustrating an example of contrast conversion processing by the video preprocessing unit. 
[Figure 6] Figure 6 shows the relationship between the pixel value before conversion and the pixel value after conversion. [Figure 7] Figure 7 shows the relationship between the pixel value before conversion and the pixel value after conversion. [Figure 8] Figure 8 is a diagram illustrating an example of video synthesis processing by the video preprocessing unit. [Figure 9] Figure 9 is a block diagram showing the functional configuration of a video decompression device according to an embodiment of this disclosure. [Figure 10] Figure 10 is a flowchart showing an example of the operation of a video compression device according to an embodiment of this disclosure. [Figure 11] Figure 11 is a flowchart showing an example of the operation of the video decompression device according to an embodiment of this disclosure. [Figure 12] Figure 12 shows the relationship between file size and VMAF (Video Multimethod Assessment Fusion) for each preprocessing step. [Figure 13] Figure 13 shows the processing time per frame, file size, and characteristics for each preprocessing step when a video compression device processes the same input video. [Modes for carrying out the invention] 【0010】 [Summary of the embodiments of this disclosure] First, an overview of the embodiments of the present disclosure will be listed and described. 
【0011】 (1) A video transmission system according to an embodiment of the present disclosure includes: a video preprocessing unit that generates a second video from a first video by performing preprocessing, which affects both the image quality and the transmission capacity of the transmission path, on at least one of an attention area included in the first video and a non-attention area different from the attention area, so that the effects on the attention area and on the non-attention area after the preprocessing are different; a video compression unit that compresses the second video to generate a compressed video to be sent to the transmission path; and a video decompression unit that decompresses the compressed video received from the transmission path. 【0012】 According to this configuration, preprocessing is performed on at least one of the attention area and the non-attention area so that the effects on both the image quality and the transmission capacity of the transmission path are different between the attention area and the non-attention area. The compressed video is then generated by compressing the second video obtained after the preprocessing. Therefore, even without using a special hardware encoder capable of region-by-region compression, compression processing equivalent to region-by-region compression becomes possible. Thus, the data amount of the video can be efficiently reduced. 【0013】 (2) In the above (1), the video transmission system may further include a preprocessing control unit that determines parameters of the preprocessing based on the communication status of the transmission path. 【0014】 According to this configuration, when the transmission capacity of the transmission path decreases and the transmission band is tight, packet loss or transmission delay on the transmission path can be prevented by determining the parameters so that the data amount of the compressed video becomes smaller. 
【0015】 (3) In the above (1) or (2), the video compression unit may uniformly compress the attention area and the non-attention area in the second video. 【0016】 This configuration allows for the efficient reduction of video data volume using commonly available hardware encoders. 【0017】 (4) In any of (1) to (3) above, the video transmission system may further include a video restoration unit that performs, on the video decompressed by the video decompression unit, a conversion process that is the reverse of the preprocessing. 【0018】 This configuration allows for the reconstruction of a video equivalent to the first video. 【0019】 (5) In any of (1) to (4) above, the preprocessing may be a process that reduces the contrast of the non-focus region relative to the contrast of the focus region. 【0020】 By relatively reducing the contrast of non-focus areas, the amount of color information, high-frequency components, and edges in those areas is reduced, resulting in an efficient reduction in the amount of video data. 【0021】 (6) A video compression device according to another embodiment of the present disclosure comprises: a video preprocessing unit that generates a second video from a first video by performing preprocessing, which affects both image quality and the transmission capacity of the transmission line, on at least one of a region of interest included in the first video and a non-interest region different from the region of interest, such that the effects on the region of interest and on the non-interest region after the preprocessing are different; and a video compression unit that compresses the second video and generates a compressed video to be sent to the transmission line. 【0022】 In this configuration, preprocessing is applied to at least one of the focus region and the non-focus region so that the impact on both image quality and transmission capacity of the transmission path differs between the focus region and the non-focus region. 
Furthermore, compressed video is generated by compressing the second video after preprocessing. Therefore, the amount of video data can be efficiently reduced without using special hardware encoders capable of region-specific compression processing. 【0023】 (7) A video compression method according to another embodiment of the present disclosure includes the steps of generating a second video from a first video by performing preprocessing on at least one of a region of interest and a non-interest region different from the region of interest included in the first video, such that the effects on both image quality and transmission capacity of the transmission line are different after the preprocessing on the region of interest and the non-interest region; and compressing the second video to generate a compressed video to be sent to the transmission line. 【0024】 This configuration includes the characteristic processing steps of the aforementioned video compression device. Therefore, it produces the same functions and effects as the aforementioned video compression device. 【0025】 [Details of the embodiments of this disclosure] The embodiments of this disclosure will be described below with reference to the drawings. The embodiments described below are all specific examples of this disclosure. The numerical values, shapes, materials, components, arrangement and connection configurations of components, steps, and the order of steps shown in the following embodiments are examples and do not limit this disclosure. Furthermore, components in the following embodiments that are not described in the independent claims are optional components that can be added. Also, the figures are schematic diagrams and do not necessarily represent the exact details. 【0026】 Furthermore, identical components will be assigned the same symbols. Since their functions and names are also identical, their explanations will be omitted as appropriate. 
【0027】 [Overall configuration of the video transmission system] Figure 1 is a block diagram showing the overall configuration of the video transmission system according to the embodiment of this disclosure. 【0028】 The video transmission system 100 comprises a video compression device 1, a video decompression device 2, a camera 3, and a display 4. 【0029】 The camera 3 is installed on a vehicle such as a car or motorcycle, and outputs video data (hereinafter simply referred to as "video") generated by imaging the area around the vehicle. The video consists of time-series image data (hereinafter simply referred to as "images"). 【0030】 The video compression device 1 is connected to the camera 3 and receives the video output from the camera 3. It then performs predetermined preprocessing on the received video. The preprocessing will be described later. The video compression device 1 compresses the preprocessed video. The video compression device 1 transmits multiplexed data, which consists of the compressed video and additional information (described later) containing information about the preprocessing, to the video decompression device 2 via a network 5 (transmission path) such as the Internet. 【0031】 The video decompression device 2 receives the multiplexed data from the video compression device 1 via the network 5. The video decompression device 2 separates the compressed video and the additional information from the received multiplexed data. The video decompression device 2 decompresses the separated compressed video to generate the decompressed video. The video decompression device 2 restores the original video by applying video restoration processing to the decompressed video based on the separated additional information. The video restoration processing will be described later. The video decompression device 2 outputs the restored video to the display 4 connected to the video decompression device 2. 【0032】 The display 4 receives the video from the video decompression device 2 and displays the received video on the screen. 
【0033】 [Configuration of video compression device 1] Figure 2 is a block diagram showing the functional configuration of the video compression device 1 according to the present disclosure. 【0034】 The video compression device 1 comprises a video acquisition unit 11, a focus area determination unit 12, a video preprocessing unit 13, a video compression unit 14, a communication status acquisition unit 15, a preprocessing control unit 16, a multiplexing unit 17, and a data transmission unit 18. 【0035】 Each processing unit constituting the video compression device 1 is implemented, for example, by a processing circuit (Circuitry) including one or more processors. 【0036】 <Video acquisition unit 11> The video acquisition unit 11 receives video output from the camera 3. The video acquisition unit 11 outputs the received video to the area of interest determination unit 12 and the video preprocessing unit 13. 【0037】 <Focus Area Determination Unit 12> The area of interest determination unit 12 acquires video from the video acquisition unit 11 and determines the area of interest within the video. 【0038】 For example, the focus region determination unit 12 determines the focus region of each image for each image constituting the video using a learning model. Specifically, the focus region determination unit 12 divides the images constituting the video into multiple blocks of predetermined sizes. The focus region determination unit 12 uses a learning model to determine whether or not a predetermined extraction target is included in each block image. Examples of learning models include DNN (Deep Neural Network), CNN (Convolutional Neural Network), and AutoEncoder. For example, if the extraction target is a person, images containing various people are used as training data, and the learning model's parameters are determined through machine learning techniques such as deep learning, as the model learns about people. 
The focus region determination unit 12 determines the block containing the extraction target as the focus region. 【0039】 Figure 3 is a diagram illustrating an example of the process by the focus area determination unit 12 to determine the focus area. The focus area determination unit 12 determines the area containing a person and the area containing a car in the image 50 as focus area 61 and focus area 62, respectively. 【0040】 The area of focus determination unit 12 outputs area of focus information, indicating the position and size of the area of focus for each determined image, to the preprocessing control unit 16 and the multiplexing unit 17. If the area of focus is a rectangular area, the area of focus information may include, for example, the coordinates of the upper-left corner of the area of focus, as well as the width and height of the area of focus. Alternatively, the area of focus information may also be the block number of the block containing the area of focus. 【0041】 <Video pre-processing unit 13> The video preprocessing unit 13 acquires video and area of interest information from the video acquisition unit 11 and the area of interest determination unit 12, respectively. Based on the acquired video and area of interest information, the video preprocessing unit 13 performs preprocessing on the video that affects both image quality and transmission capacity of the transmission path. In this process, the video preprocessing unit 13 performs the preprocessing on at least one of the areas of interest and non-areas of interest that are different from the area of interest included in the image that constitutes the acquired video. The video preprocessing unit 13 performs the preprocessing in such a way that the effects of each area differ between the area of interest and the non-area of interest after the preprocessing. 
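As a minimal sketch of how rectangular area of interest information (upper-left coordinates, width, and height) might be turned into a per-pixel mask that separates the area of interest from the non-area of interest: the `Rect` layout and the function name `roi_mask` are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple

# Hypothetical layout of one area-of-interest entry: (x, y, width, height),
# where (x, y) is the upper-left corner in pixel coordinates.
Rect = Tuple[int, int, int, int]


def roi_mask(width: int, height: int, rois: List[Rect]) -> List[List[bool]]:
    """Return a height x width mask that is True inside any area of interest."""
    mask = [[False] * width for _ in range(height)]
    for x, y, w, h in rois:
        # Clamp each rectangle to the image bounds before marking it.
        for row in range(max(0, y), min(height, y + h)):
            for col in range(max(0, x), min(width, x + w)):
                mask[row][col] = True
    return mask


if __name__ == "__main__":
    # A 4x3 image with one 2x2 area of interest starting at (1, 0).
    for line in roi_mask(4, 3, [(1, 0, 2, 2)]):
        print(line)
```

Pixels where the mask is False form the non-area of interest, which is the part the preprocessing would alter.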
【0042】 For example, the video preprocessing unit 13 performs a contrast conversion process as a preprocessing step for each image that makes up the video. This process makes the contrast of the non-focus areas (areas other than the focus area) in the image relatively lower than the contrast of the focus area. As an example, this preprocessing step reduces the contrast of the non-focus areas. Reducing the contrast of the non-focus areas reduces color information and high-frequency components and smooths edges. This reduces the amount of data in the compressed video when the video is compressed by the video compression unit 14, which will be described later. Furthermore, by reducing the amount of data in the compressed video that flows through the transmission line, it is possible to prevent congestion of the transmission line bandwidth. On the other hand, reducing the contrast reduces the image quality of the video. In other words, there is a trade-off between image quality and transmission line bandwidth congestion (transmission capacity). Thus, the contrast conversion process affects both image quality and the transmission capacity of the transmission line. Reducing the contrast may cause block noise in the video, but block noise also occurs when the QP is increased during video compression encoding. Therefore, by performing contrast conversion processing, it is possible to achieve results equivalent to those of QP adjustment in terms of image quality and compression ratio without significantly increasing the processing load. 【0043】 Specifically, the video preprocessing unit 13 extracts non-focus areas from the acquired video, which are areas excluding the focus areas indicated in the focus area information. 【0044】 Figure 4 illustrates an example of the process of extracting non-focus areas by the video preprocessing unit 13. 
The video preprocessing unit 13 extracts the non-focus area 63 shown in Figure 4 by removing the focus areas 61 and 62 from the image 50 shown in Figure 3. In Figure 4, the removed focus areas 61 and 62 are indicated by hatching. 【0045】 The video preprocessing unit 13 reduces the contrast of the non-focus region by converting the pixel value (pre-conversion pixel value) of each pixel constituting the extracted non-focus region to a post-conversion pixel value according to the following Equation 1. For example, if the pixel value is expressed in YUV format, the video preprocessing unit 13 may perform the conversion based on Equation 1 only on the luminance signal (Y). Alternatively, if the pixel value is expressed in RGB format, the video preprocessing unit 13 may perform the conversion based on Equation 1 on each of the R, G, and B values. 【0046】 Post-conversion pixel value = Pre-conversion pixel value × α + β …(Equation 1) Here, α: contrast value (0 < α < 1); β: contrast offset value (β ≥ 0) 【0047】 Furthermore, if the pre-conversion pixel value is between 0 and 255 (256 gradations), α and β may be determined so that the post-conversion pixel value falls within the range of 0 to 255. Also, if the post-conversion pixel value calculated using Equation 1 exceeds 255, the post-conversion pixel value may be clipped to 255. 【0048】 The values of α and β are set by the preprocessing control unit 16, which will be described later. 【0049】 Figure 5 is a diagram illustrating an example of contrast conversion processing by the video preprocessing unit 13. In Figure 5, the removed focus areas 61 and 62 are shown with hatching. The video preprocessing unit 13 reduces the contrast of the non-focus area 63, as shown in Figure 5, by converting the pixel value of each pixel in the non-focus area 63 shown in Figure 4 according to Equation 1. 【0050】 Figure 6 shows the relationship between the pixel value before conversion and the pixel value after conversion. 
In the graph shown in Figure 6, the horizontal axis represents the pixel value before conversion, and the vertical axis represents the pixel value after conversion. The solid line in Figure 6 shows the relationship in Equation 1, and the slope of this solid line is α. The intercept of the solid line with the vertical axis is β. The dashed line in Figure 6 shows the relationship between the pixel value before conversion and the pixel value after conversion (converted pixel value = pre-conversion pixel value) when no conversion is performed. 【0051】 The contrast conversion process is not limited to Equation 1. For example, the relationship between the converted pixel value and the pre-conversion pixel value may be nonlinear. 【0052】 Figure 7 shows the relationship between the pixel values before and after conversion. The horizontal and vertical axes, as well as the dashed lines in Figure 7, are the same as those shown in Figure 6. The solid lines in Figure 7 show the nonlinear relationship between the pixel values before and after conversion. By performing contrast conversion processing according to the solid lines in Figure 7, the range of pixel values for the entire image can be narrowed compared to the range of pixel values for the original image. 【0053】 Furthermore, the preprocessing performed by the video preprocessing unit 13 is not limited to contrast conversion processing; other processing that affects both image quality and transmission capacity of the transmission path may also be performed. For example, the video preprocessing unit 13 may apply a Blur filter, Median filter, Convolution filter, or Gaussian filter to non-focus areas to smooth spatial pixel values. This reduces color information, reduces high-frequency components, and smooths edges, similar to contrast conversion processing, thereby reducing the amount of data in the compressed video. 
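Returning to the linear contrast conversion of Equation 1, the following is a minimal sketch of applying it only to non-focus pixels of a single-channel image (for example the Y plane). The function names, the list-of-lists image representation, and the rounding to the nearest integer are assumptions made for illustration; the clipping to 255 follows the handling described for Equation 1.

```python
from typing import List


def reduce_contrast(pixel: int, alpha: float, beta: float) -> int:
    """Equation 1: post = pre * alpha + beta, rounded (assumed) and clipped to 0..255."""
    value = round(pixel * alpha + beta)
    return max(0, min(255, value))


def preprocess_frame(
    frame: List[List[int]],
    focus_mask: List[List[bool]],
    alpha: float,
    beta: float,
) -> List[List[int]]:
    """Leave focus pixels untouched and lower the contrast of non-focus pixels."""
    return [
        [
            pixel if focus_mask[r][c] else reduce_contrast(pixel, alpha, beta)
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(frame)
    ]


if __name__ == "__main__":
    frame = [[0, 200], [255, 100]]
    mask = [[True, False], [False, False]]  # only the top-left pixel is in focus
    print(preprocess_frame(frame, mask, alpha=0.5, beta=64))
```

With α = 0.5 and β = 64, non-focus values are squeezed into the range 64 to 192, which is the narrowing of the pixel value range that makes the region cheaper to encode.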
【0054】 Furthermore, the video preprocessing unit 13 may perform preprocessing on multiple time-series images that make up the video. For example, if there are consecutive images that do not contain the region of interest, image decimation (e.g., decimation every other frame) may be performed on the video. This reduces the amount of video data flowing through the transmission path, although it may reduce the image quality. 【0055】 Furthermore, the video preprocessing unit 13 may perform preprocessing to smooth non-focus areas along the time axis. For example, the video preprocessing unit 13 may use the moving average value of the pixel values in the non-focus areas along the time axis as the converted pixel value. This reduces color information, reduces high-frequency components, and smooths edges, resulting in a decrease in image quality, but it can reduce the amount of video data flowing through the transmission path. 【0056】 The video preprocessing unit 13 generates an image by combining the area of interest and the non-interesting area after preprocessing. 【0057】 Figure 8 is a diagram illustrating an example of video synthesis processing by the video preprocessing unit 13. The video preprocessing unit 13 extracts the areas of interest 61 and 62 from the image 50 shown in Figure 3, and generates an image 53 by combining the extracted areas of interest 61 and 62 with the non-area of interest 63 after contrast conversion processing shown in Figure 5. The video preprocessing unit 13 synthesizes the video by performing this synthesis processing for each image in the time series. 【0058】 The video preprocessing unit 13 outputs the synthesized video to the video compression unit 14. 【0059】 <Video compression unit 14> The video compression unit 14 receives the synthesized video from the video preprocessing unit 13 and performs compression encoding processing on the received video. 
For example, the video compression unit 14 compresses the video according to a lossy compression encoding scheme such as H.264 / MPEG-4 AVC or H.265 / MPEG-H HEVC. However, the compression encoding scheme is not limited to these and may be a lossless compression scheme. 【0060】 The video compression unit 14 compresses the video uniformly without distinguishing between areas of interest and areas of non-interest. In other words, the video compression unit 14 compresses the video according to a uniform standard (for example, the same image quality or the same compression ratio). For this reason, a general-purpose hardware encoder can be used as the video compression unit 14. 【0061】 The video compression unit 14 outputs the compressed video to the multiplexing unit 17. 【0062】 <Communication status acquisition unit 15> The communication status acquisition unit 15 acquires network statistics information from the data transmission unit 18, and from the acquired network statistics information, it acquires data indicating the communication status of network 5. 【0063】 When the data transmission unit 18 transmits data using SRT (Secure Reliable Transport), a type of video transmission protocol, the data transmission unit 18 can obtain network statistics information through simple communication with the video decompression device 2. Network statistics information includes data indicating the communication status of network 5, such as round-trip time (hereinafter referred to as "RTT") or the number of packet losses. For this reason, for example, the communication status acquisition unit 15 obtains RTT or the number of packet losses from the network statistics information as data indicating the communication status of network 5. 【0064】 The communication status acquisition unit 15 outputs data indicating the communication status of the acquired network 5 to the preprocessing control unit 16. 
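The communication status data can drive the preprocessing parameters. Below is a minimal sketch, assuming a piecewise-linear mapping in which the contrast value α of Equation 1 decreases as the RTT increases, and a rule in which β increases as α decreases. The thresholds, the α range, and both function names are hypothetical; the disclosure only requires some predetermined relational expression or table.

```python
def alpha_from_rtt(
    rtt_ms: float,
    rtt_low_ms: float = 50.0,    # hypothetical: at or below this, use alpha_max
    rtt_high_ms: float = 300.0,  # hypothetical: at or above this, use alpha_min
    alpha_max: float = 0.9,
    alpha_min: float = 0.3,
) -> float:
    """Map RTT to a contrast value alpha (0 < alpha < 1) that falls as RTT rises."""
    if rtt_ms <= rtt_low_ms:
        return alpha_max
    if rtt_ms >= rtt_high_ms:
        return alpha_min
    frac = (rtt_ms - rtt_low_ms) / (rtt_high_ms - rtt_low_ms)
    return alpha_max - frac * (alpha_max - alpha_min)


def beta_from_alpha(alpha: float, max_pixel: int = 255) -> float:
    """Hypothetical rule: beta grows as alpha shrinks, keeping the output centred in 0..max_pixel."""
    return (1.0 - alpha) * max_pixel / 2.0


if __name__ == "__main__":
    for rtt in (20.0, 100.0, 175.0, 400.0):
        a = alpha_from_rtt(rtt)
        print(rtt, a, beta_from_alpha(a))
```

A lookup table keyed by RTT ranges would serve the same purpose; the same shape of mapping could be applied to the number of packet losses instead of RTT.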
【0065】 <Preprocessing control unit 16> The preprocessing control unit 16 determines the parameters for the preprocessing to be performed by the video preprocessing unit 13. By changing these parameters, both image quality and the transmission capacity of the transmission path can be changed. Specifically, the preprocessing control unit 16 determines the values of α and β in equation 1 described above. For example, the preprocessing control unit 16 may determine the value of α based on data indicating the communication status of network 5 obtained from the communication status acquisition unit 15. As an example, the preprocessing control unit 16 may calculate the value of α from the RTT obtained from the communication status acquisition unit 15 using a predetermined relational expression in which the value of α decreases as the RTT increases. This allows the video preprocessing unit 13 to apply preprocessing to non-focus areas that lowers the contrast as the RTT increases, that is, as the bandwidth of network 5 becomes constrained. Alternatively, a table showing the relationship between RTT and the value of α may be used instead of the relational expression. 【0066】 As another example, the preprocessing control unit 16 may calculate the value of α from the number of packet losses obtained from the communication status acquisition unit 15 using a predetermined relational expression in which the value of α decreases as the number of packet losses increases. This allows the video preprocessing unit 13 to apply preprocessing to the non-focus area, lowering the contrast as the number of packet losses increases, that is, as the communication status of network 5 worsens. Alternatively, a table showing the relationship between the number of packet losses and the value of α may be used instead of the relational expression. 【0067】 The preprocessing control unit 16 may determine β as a predetermined constant. 
Alternatively, the preprocessing control unit 16 may determine β according to α. For example, the preprocessing control unit 16 may determine β from α using a relational expression or table data in which β increases as α decreases. 【0068】 The preprocessing control unit 16 outputs the determined α and β values to the video preprocessing unit 13 and the multiplexing unit 17. 【0069】 <Multiplexing unit 17> The multiplexing unit 17 generates multiplexed data by multiplexing the compressed video acquired from the video compression unit 14 with additional information, namely the α and β values used for preprocessing of the video, acquired from the preprocessing control unit 16, and the area of interest information acquired from the area of interest determination unit 12. The multiplexing unit 17 outputs the multiplexed data to the data transmission unit 18. 【0070】 <Data transmission unit 18> The data transmission unit 18 transmits the multiplexed data acquired from the multiplexing unit 17 to the video decompression device 2 using SRT. The data transmission unit 18 also acquires network statistics information through simple communication with the video decompression device 2. The data transmission unit 18 outputs the acquired network statistics information to the communication status acquisition unit 15. 【0071】 Furthermore, the video transmission protocol used by the data transmission unit 18 is not limited to SRT; other protocols such as RTP (Real-time Transport Protocol) may also be used. 【0072】 [Configuration of the video decompression device 2] Figure 9 is a block diagram showing the functional configuration of the video decompression device 2 according to the present disclosure. 【0073】 The video decompression device 2 comprises a data receiving unit 21, an additional information separation unit 22, a video decompression unit 23, a video restoration unit 24, and a display control unit 25. 
【0074】 Each processing unit constituting the video decompression device 2 is implemented, for example, by a processing circuit (Circuitry) including one or more processors. 【0075】 <Data receiving unit 21> The data receiving unit 21 receives the multiplexed data transmitted from the video compression device 1 via the network 5. The data receiving unit 21 outputs the received multiplexed data to the additional information separation unit 22. 【0076】 <Additional information separation unit 22> The additional information separation unit 22 acquires the multiplexed data from the data receiving unit 21 and separates the additional information (here, the values of α and β and the region of interest information) from it, thereby obtaining the compressed video and the additional information. The additional information separation unit 22 outputs the compressed video to the video decompression unit 23 and outputs the additional information to the video restoration unit 24. 【0077】 <Video decompression unit 23> The video decompression unit 23 acquires the compressed video from the additional information separation unit 22. The video decompression unit 23 decompresses the acquired compressed video according to the decoding method corresponding to the compression encoding method of the video compression device 1. This restores the video as it was before compression processing by the video compression device 1 (corresponding to image 53 in Figure 8). The video decompression unit 23 outputs the decompressed video to the video restoration unit 24. 【0078】 <Video restoration unit 24> The video restoration unit 24 performs video restoration processing to restore the video as it was before preprocessing, based on the decompressed video obtained from the video decompression unit 23 and the additional information obtained from the additional information separation unit 22.
【0079】 Specifically, the video restoration unit 24 sets the values of α and β included in the additional information in the following Equation 2, which is the inverse transform of Equation 1. 【0080】 Converted pixel value = (Original pixel value - β) × (1 / α) … (Equation 2) Here, α is the contrast value (0 < α < 1) and β is the contrast offset value (β ≧ 0). 【0081】 The video restoration unit 24 identifies the position and size of the region of interest in each image from the region of interest information included in the additional information. Based on the identified position and size, the video restoration unit 24 cuts out, from each image that makes up the decompressed video, the non-interest region (the part of the image excluding the region of interest), thereby generating an image of the non-interest region (corresponding to the image in Figure 5) and an image of the region of interest. 【0082】 The video restoration unit 24 increases the contrast of the non-interest region by converting the pixel value (pre-conversion pixel value) of each pixel constituting the non-interest region to a post-conversion pixel value according to Equation 2. For example, if the pixel value is expressed in YUV format, the video restoration unit 24 may apply the conversion based on Equation 2 only to the luminance signal (Y). Alternatively, if the pixel value is expressed in RGB format, the video restoration unit 24 may apply the conversion based on Equation 2 to each of the R, G, and B values. 【0083】 The video restoration unit 24 generates an image (corresponding to the image in Figure 3) by combining the image of the region of interest with the image of the non-interest region whose contrast has been increased. The video restoration unit 24 restores the video output by camera 3 by arranging the generated composite images in chronological order. The video restoration unit 24 outputs the restored video to the display control unit 25.
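The restoration in 【0079】 to 【0083】 can be illustrated with a minimal sketch that applies Equation 2 to the luminance plane of one image, leaving pixels inside the region of interest untouched. The Roi class, the function names, the rectangular (x, y, w, h) region layout, and the 8-bit clamping are assumptions made for this example; the specification does not prescribe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Hypothetical rectangular region of interest (position and size in pixels)."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def clamp8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range (assumed bit depth)."""
    return max(0, min(255, int(round(value))))


def restore_non_roi(luma, rois, alpha, beta):
    """Apply Equation 2, (pixel - beta) / alpha, to every pixel outside all regions."""
    if not 0.0 < alpha < 1.0 or beta < 0.0:
        raise ValueError("expected 0 < alpha < 1 and beta >= 0")
    restored = []
    for y, row in enumerate(luma):
        new_row = []
        for x, pixel in enumerate(row):
            if any(roi.contains(x, y) for roi in rois):
                new_row.append(pixel)  # region of interest: kept as decoded
            else:
                new_row.append(clamp8((pixel - beta) / alpha))
        restored.append(new_row)
    return restored
```

For instance, with α = 0.5 and β = 16, a decoded value of 66 outside the region of interest maps back to (66 - 16) / 0.5 = 100. Since the codec is lossy and pixel values are quantized to integers, the restored image approximates the camera output rather than reproducing it exactly.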
【0084】 <Display Control Unit 25> The display control unit 25 acquires video from the video restoration unit 24 and outputs the acquired video to the outside. Specifically, the display control unit 25 displays the video on the display 4 by outputting the acquired video to the display 4. 【0085】 [Processing procedure for video compression device 1] Figure 10 is a flowchart showing an example of the operation of the video compression device 1 according to the present disclosure. 【0086】 The video compression device 1 acquires video from the camera 3 (step S11). 【0087】 The video compression device 1 determines a region of interest in each image by inputting each image constituting the video acquired in step S11 into a predetermined learning model (step S12). 【0088】 The video compression device 1 acquires SRT network statistics information and obtains RTT from the acquired network statistics information as data indicating the communication status of network 5 (step S13). 【0089】 The video compression device 1 determines the values of α and β, which are parameters for the contrast conversion process according to Equation 1, based on the acquired RTT (step S14). 【0090】 The video compression device 1 extracts the non-focus region from each image that makes up the video acquired in step S11, excluding the region of interest (step S15). 【0091】 The video compression device 1 converts the contrast of the non-focus region by converting the pixel value of each pixel constituting the non-focus region according to Equation 1 (step S16). 【0092】 The video compression device 1 generates a video in which images obtained by combining the image of the region of interest determined in step S12 and the image of the non-region of interest whose contrast has been transformed in step S16 are arranged in chronological order (step S17). 【0093】 The video compression device 1 performs compression encoding on the composite video generated in step S17 (step S18). 
【0094】 The video compression device 1 generates multiplexed data by multiplexing the compressed video with additional information, namely the values of α and β in Equation 1 and the region of interest information indicating the position and size of the region of interest (step S19). 【0095】 The video compression device 1 transmits the generated multiplexed data to the video decompression device 2 (step S20). 【0096】 The video compression device 1 may repeat the process shown in Figure 10 at predetermined time intervals, or may repeat it in real time. This allows the communication status of network 5 to be reflected immediately in the parameters of the contrast conversion process, so that congestion of the communication bandwidth of network 5 can be prevented. 【0097】 [Processing procedure for video decompression device 2] Figure 11 is a flowchart showing an example of the operation of the video decompression device 2 according to the present disclosure. 【0098】 The video decompression device 2 receives multiplexed data from the video compression device 1 (step S21). 【0099】 The video decompression device 2 separates the additional information from the received multiplexed data, thereby obtaining the compressed video and the additional information (step S22). 【0100】 The video decompression device 2 decompresses the compressed video (step S23). 【0101】 The video decompression device 2 sets the values of α and β included in the additional information in Equation 2 (step S24). 【0102】 The video decompression device 2 identifies the position and size of the region of interest from the region of interest information included in the additional information.
Based on the identified position and size of the region of interest, the video decompression device 2 cuts out, from each image that makes up the decompressed video, the non-interest region (the part excluding the region of interest), thereby generating an image of the non-interest region and an image of the region of interest (step S25). 【0103】 The video decompression device 2 converts the pixel value of each pixel constituting the non-interest region according to Equation 2 (step S26). 【0104】 The video decompression device 2 combines the image of the region of interest generated in step S25 with the image of the non-interest region converted in step S26, and restores the video output by camera 3 by arranging the combined images in chronological order (step S27). 【0105】 The video decompression device 2 outputs the restored video to the display 4, thereby displaying the video on the display 4 (step S28). 【0106】 The video decompression device 2 may repeat the process shown in Figure 11 at predetermined time intervals, or may repeat it in real time. 【0107】 [Comparison of preprocessing in video compression device 1] The following compares the characteristics of the video compression device 1 when contrast conversion, Blur filtering, or Gaussian filtering is performed as preprocessing. 【0108】 Figure 12 shows the relationship between file size and VMAF for each type of preprocessing. The horizontal axis of Figure 12 shows the file size of the compressed video output from the video compression device 1. The vertical axis of Figure 12 shows the VMAF value of the video restored from the compressed video by the video decompression device 2. Here, VMAF (Video Multi-method Assessment Fusion) is a video quality evaluation index developed by Netflix, Inc.; the higher the VMAF value, the less the video quality is degraded. 【0109】 The lines shown in Figure 12 all represent the relationship between file size and VMAF for the same video.
【0110】 The polyline 80Q, indicated by the legend "No preprocessing (QP specified)," shows the relationship between file size and VMAF when region-specific compression is performed without preprocessing the video, while varying the QP of the non-interest region. Along polyline 80Q, QP decreases toward the right. In other words, polyline 80Q shows that a smaller QP results in a larger file size and a larger VMAF value. 【0111】 The polyline 80B, indicated by the legend "Blur Filtering," shows the relationship between file size and VMAF when a Blur filter is applied to the non-interest region as preprocessing, with the kernel size of the Blur filter varied, and the video is then compressed by the video compression unit 14. Along polyline 80B, the kernel size decreases toward the right. In other words, polyline 80B shows that the smaller the kernel size of the Blur filter, the larger the file size and the larger the VMAF value. 【0112】 The polyline 80G, indicated by the legend "Gaussian filtering," shows the relationship between file size and VMAF when a Gaussian filter is applied to the non-interest region as preprocessing, with the kernel size of the Gaussian filter varied, and the video is then compressed by the video compression unit 14. Along polyline 80G, the kernel size decreases toward the right. In other words, polyline 80G shows that the smaller the kernel size of the Gaussian filter, the larger the file size and the larger the VMAF value. 【0113】 The polyline 80C, indicated by the legend "Contrast Conversion Processing," shows the relationship between file size and VMAF when the contrast conversion processing of Equation 1 is applied to the non-interest region as preprocessing, and the video is then compressed by the video compression unit 14. Along polyline 80C, the value of parameter α in Equation 1 increases toward the right.
In other words, polyline 80C shows that the larger the value of parameter α in Equation 1, the larger the file size and the larger the VMAF value. 【0114】 Polylines 80Q and 80C lie close together. It can therefore be seen that contrast conversion processing and region-specific compression processing have equivalent characteristics in terms of file size and VMAF. On the other hand, polylines 80B and 80G show that Blur filtering and Gaussian filtering cause a sharp drop in the VMAF value if the kernel size is made too large, but have characteristics equivalent to region-specific compression processing when the kernel size is set appropriately. 【0115】 Figure 13 shows the processing time per frame, the file size, and the features of each type of preprocessing when the video compression device 1 processes the same input video. The "Processing name" column lists no preprocessing (QP specified), Blur filtering, Gaussian filtering, and contrast conversion processing. These correspond to the processes shown in Figure 12 as polyline 80Q, polyline 80B, polyline 80G, and polyline 80C, respectively. 【0116】 "Processing time per frame (msec)" indicates the time required to apply the preprocessing indicated by "Processing name" to one frame of image. 【0117】 "File size (Mbyte)" indicates the file size of the compressed video output from the video compression device 1 for the same input video. 【0118】 "Features" describes the characteristics of each process. 【0119】 Specifically, when region-specific compression is performed by changing the QP of the non-interest region without preprocessing, the preprocessing time per frame is 0.00 msec, and the compressed video file size is 5.1 Mbyte. When Blur filtering is applied, the preprocessing time per frame is 6.92 msec, and the compressed video file size is 7.5 Mbyte.
When Gaussian filtering is applied, the preprocessing time per frame is 5.71 msec, and the compressed video file size is 7.6 Mbyte. When contrast conversion is applied, the preprocessing time per frame is 3.18 msec, and the compressed video file size is 5.5 Mbyte. 【0120】 From these findings, it can be seen that the file size is smallest when no preprocessing is performed. Furthermore, it can be seen that contrast conversion processing has a shorter processing time per frame compared to Blur filtering and Gaussian filtering, and the file size is about the same as when no preprocessing is performed. Conversely, it can be seen that Blur filtering and Gaussian filtering take longer per frame than contrast conversion processing, and the file size is slightly larger. 【0121】 As explained above, the video compression device 1 generates compressed video by compressing video that has been preprocessed for the region of interest in the video acquired from the camera 3. Therefore, it is possible to perform compression equivalent to that performed by region-specific compression without using a special hardware encoder capable of region-specific compression. Thus, the amount of video data can be efficiently reduced. 【0122】 Furthermore, the video compression device 1 determines the parameter α in Equation 1 so that the amount of compressed video data becomes smaller when the transmission capacity of the network 5 decreases and the transmission bandwidth becomes congested. This prevents packet loss or transmission delay in the network 5. In particular, temporary congestion of the transmission bandwidth may occur due to deterioration of radio wave conditions in wireless communication or an increase in the number of terminals connected to the network 5. Even in such cases, the video compression device 1 can perform real-time video transmission without causing packet loss or transmission delay. 
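The parameter control summarized in 【0122】, and described in 【0065】 and 【0067】, can be sketched as a pair of relational expressions: one in which α decreases as the RTT increases, and one in which β increases as α decreases. The piecewise-linear form, the function names, and every numeric constant below (RTT thresholds, the α range, the mid-gray anchor of 128) are hypothetical values chosen for illustration; the specification only requires the monotonic relationships.

```python
def alpha_from_rtt(rtt_ms: float,
                   rtt_low: float = 50.0,
                   rtt_high: float = 300.0,
                   alpha_min: float = 0.3,
                   alpha_max: float = 0.95) -> float:
    """Return a contrast value alpha that decreases linearly as RTT increases.

    Below rtt_low the network is treated as unconstrained (alpha_max);
    above rtt_high alpha is held at alpha_min. All thresholds are assumptions.
    """
    if rtt_ms <= rtt_low:
        return alpha_max
    if rtt_ms >= rtt_high:
        return alpha_min
    fraction = (rtt_ms - rtt_low) / (rtt_high - rtt_low)
    return alpha_max - fraction * (alpha_max - alpha_min)


def beta_from_alpha(alpha: float, mid_gray: float = 128.0) -> float:
    """Return an offset beta that increases as alpha decreases.

    This particular choice keeps mid-gray unchanged under out = alpha * in + beta,
    which is one possible relation, not the one used in the specification.
    """
    return (1.0 - alpha) * mid_gray
```

A table indexed by RTT ranges, as the specification also permits, would replace the linear interpolation with a lookup while keeping the same monotonic behaviour. The same structure applies when the number of packet losses is used instead of the RTT.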
【0123】 Furthermore, the video compression device 1 compresses and encodes the video uniformly without distinguishing between the region of interest and the non-interest region. Therefore, the amount of video data can be reduced efficiently using commonly available hardware encoders. 【0124】 Furthermore, the video decompression device 2 applies to the non-interest region a conversion process that is the inverse of the preprocessing performed by the video compression device 1. As a result, an image equivalent to the image acquired by the video compression device 1 from the camera 3 can be reconstructed. 【0125】 Furthermore, the video compression device 1 performs preprocessing that reduces the contrast of the non-interest region relative to the contrast of the region of interest. This reduces color information in the non-interest region, reduces high-frequency components, and smooths edges, thereby efficiently reducing the amount of video data. 【0126】 [Modification 1] In the above embodiment, contrast conversion processing that reduces the contrast of the non-interest region in each image constituting the video was described as the preprocessing performed by the video preprocessing unit 13 of the video compression device 1. However, the preprocessing is not limited to this. For example, the video preprocessing unit 13 may perform, as preprocessing, contrast conversion processing that increases the contrast of the region of interest. This also makes the contrast of the non-interest region lower relative to the contrast of the region of interest. 【0127】 For example, camera 3 captures images of the area around the vehicle at low contrast by reducing the exposure compared to normal conditions. Camera 3 outputs the resulting images to the video compression device 1. 【0128】 The video preprocessing unit 13 of the video compression device 1 performs the above-described preprocessing that increases the contrast of the region of interest.
As a result, the video preprocessing unit 13 can generate an image similar to the one generated by the video preprocessing unit 13 in the above embodiment and output it to the video compression unit 14. In other words, an image is generated in which the region of interest has normal contrast and the non-interest region has low contrast. 【0129】 [Modification 2] The above embodiment and Modification 1 describe a case in which an image is classified into two types of regions: a region of interest and a non-interest region. However, the types of regions are not limited to two, and this disclosure is also applicable when an image is classified into three or more types of regions. 【0130】 For example, the video compression device 1 may classify the video into three types of regions. As an example, the region of interest determination unit 12 of the video compression device 1 determines a region of interest, a non-interest region, and an intermediate region that is neither. For example, a region of interest is a region that includes people or cars and requires high-definition transmission. A non-interest region is a region that includes the sky or the sea and does not require high-definition transmission. An intermediate region is a region that belongs to neither of these, for example a region that includes roads, trees, or buildings, for which it is difficult to determine whether high-definition transmission is required. 【0131】 The video preprocessing unit 13 performs the preprocessing described in the above embodiment or Modification 1 on the video. In doing so, the video preprocessing unit 13 treats the intermediate region in the same way as either the region of interest or the non-interest region. This allows the video preprocessing unit 13 to perform preprocessing on one or two of the three types of regions (the region of interest, the non-interest region, and the intermediate region) and not to perform preprocessing on the remaining regions.
【0132】 The video preprocessing unit 13 may perform different processing on each of the three types of regions. For example, the video preprocessing unit 13 may provide, as preprocessing, contrast conversion that reduces contrast and blur filtering. In that case, the video preprocessing unit 13 may perform no preprocessing on the region of interest, may perform both contrast conversion and blur filtering on the non-interest region, and may perform only contrast conversion on the intermediate region. 【0133】 The preprocessing applied to each region is not limited to those described above; any process that reduces the contrast of the non-interest region relative to that of the region of interest is acceptable. 【0134】 [Supplementary notes] Although a video transmission system 100 according to an embodiment of this disclosure has been described above, this disclosure is not limited to this embodiment. 【0135】 The use of camera 3 is not limited to monitoring the surroundings of the vehicle. For example, camera 3 may also be used for monitoring the driver, creating road maps, and so on. 【0136】 The installation location of camera 3 is not limited to vehicles. For example, camera 3 may be installed in a factory and used for factory monitoring. 【0137】 In the above description, the region of interest determination unit 12 of the video compression device 1 determines the region of interest in the image using a learning model, but the region of interest may be determined by other methods. For example, the region of interest determination unit 12 may detect the driver's gaze and determine the area the driver is looking at as the region of interest. 【0138】 Furthermore, the region of interest determination unit 12 may accept the position and size of the region of interest via user input. 【0139】 The region of interest may also be a predefined, fixed area.
【0140】 Furthermore, the preprocessing control unit 16 of the video compression device 1 may determine the value of α in Equation 1 based on the total area of the region of interest. In other words, the value of α may be determined from the total area of the region of interest using a relational expression or table data in which the value of α decreases as the total area of the region of interest increases. When α is the same, the amount of data in the compressed video increases as the total area of the region of interest increases, so the amount of data in the compressed video can be limited by determining the value of α according to the total area of the region of interest. This prevents congestion of the transmission bandwidth of the network 5. Note that instead of the total area of the region of interest, the ratio of the total area of the region of interest to the total area of the image may be used. 【0141】 Each process (each function) of the above-described embodiment is implemented by a processing circuit (Circuitry) including one or more processors. The processing circuit may consist of one or more memories, various analog circuits, various digital circuits, etc., in addition to the one or more processors, as well as an integrated circuit. The one or more memories store programs (instructions) that cause the one or more processors to execute each of the above processes. The one or more processors may execute each of the above processes according to the programs read from the one or more memories, or they may execute each of the above processes according to logic circuits that have been pre-designed to execute each of the above processes. The processors may be various processors suitable for computer control, such as a CPU (Central Processing Unit), GPU (Graphics Processing Unit), DSP (Digital Signal Processor), FPGA (Field Programmable Gate Array), ASIC (Application Specific Integrated Circuit), etc. 
Multiple physically separate processors may cooperate with each other to execute each of the above processes. For example, the processors installed in each of several physically separate computers may cooperate with each other via a network such as a LAN (Local Area Network), WAN (Wide Area Network), or the Internet to perform the above processes. The program may be installed in the memory via the network from an external server device, or it may be distributed on a recording medium such as a CD-ROM (Compact Disc Read Only Memory), DVD-ROM (Digital Versatile Disk Read Only Memory), or semiconductor memory, and then installed in the memory from the recording medium. 【0142】 The embodiments disclosed herein should be considered in all respects to be illustrative and not restrictive. The scope of the invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims. [Explanation of Symbols] 【0143】
1 Video compression device
2 Video decompression device
3 Camera
4 Display
5 Network
11 Video acquisition unit
12 Region of interest determination unit
13 Video preprocessing unit
14 Video compression unit
15 Communication status acquisition unit
16 Preprocessing control unit
17 Multiplexing unit
18 Data transmission unit
21 Data receiving unit
22 Additional information separation unit
23 Video decompression unit
24 Video restoration unit
25 Display control unit
50 Image
51 Image
52 Image
53 Image
61 Region of interest
62 Region of interest
63 Non-interest region
100 Video transmission system
80B Polyline
80C Polyline
80G Polyline
80Q Polyline

Claims

[Claim 1] A video transmission system comprising: a video preprocessing unit that generates a second video from a first video by performing, on at least one of a region of interest included in the first video and a non-interest region different from the region of interest, preprocessing that affects both image quality and transmission capacity of a transmission path, such that the effects on the region of interest and the non-interest region after the preprocessing are different; a video compression unit that compresses the second video and generates a compressed video to be sent to the transmission path; and a video decompression unit that decompresses the compressed video received from the transmission path.
[Claim 2] The video transmission system according to claim 1, further comprising a preprocessing control unit that determines the parameters of the preprocessing based on the communication status of the transmission path.
[Claim 3] The video transmission system according to claim 1 or claim 2, wherein the video compression unit uniformly compresses the region of interest and the non-interest region in the second video.
[Claim 4] The video transmission system according to claim 1 or claim 2, further comprising a video restoration unit that performs, on the video decompressed by the video decompression unit, a conversion process that is the inverse of the preprocessing.
[Claim 5] The video transmission system according to claim 1 or claim 2, wherein the preprocessing is a process that reduces the contrast of the non-interest region relative to the contrast of the region of interest.
[Claim 6] A video compression device comprising: a video preprocessing unit that generates a second video from a first video by performing, on at least one of a region of interest included in the first video and a non-interest region different from the region of interest, preprocessing that affects both image quality and transmission capacity of a transmission path, such that the effects on the region of interest and the non-interest region after the preprocessing are different; and a video compression unit that compresses the second video and generates a compressed video to be sent to the transmission path.
[Claim 7] A video compression method comprising: a step of generating a second video from a first video by performing, on at least one of a region of interest included in the first video and a non-interest region different from the region of interest, preprocessing that affects both image quality and transmission capacity of a transmission path, such that the effects on the region of interest and the non-interest region after the preprocessing are different; and a step of compressing the second video and generating a compressed video to be sent to the transmission path.