An artificial intelligence image processing method based on line scanning

A line-scan-based artificial intelligence image processing method uses a spatiotemporal task collaboration model, built from convolutional layers and inter-segment information transfer modules, to address the memory wall problem in deep learning image processing. It targets high image processing quality together with real-time performance, particularly in denoising and super-resolution tasks.

CN116385843B · Active · Publication Date: 2026-06-12 · SHANGHAI YUKAN TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI YUKAN TECH CO LTD
Filing Date
2022-11-22
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing deep learning image processing algorithms suffer from the memory wall problem in image sensors, resulting in huge storage requirements and power consumption, which limits real-time performance and image processing quality.

Method used

An AI image processing method based on row scanning is adopted. An encoder and decoder are constructed through convolutional layers to perform spatial task processing. A temporal task model is built by combining an inter-segment information transfer module. A spatiotemporal task collaboration model is created, and the image is segmented by horizontal rows and processed through the spatiotemporal task collaboration model.

Benefits of technology

It solves the problem of large storage requirements while retaining the quality advantages of deep learning algorithms in image processing, enabling real-time, high-quality image processing. In denoising and super-resolution tasks in particular, it achieves results comparable to the best algorithms currently available.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses an artificial intelligence image processing method and system based on line scanning, comprising the following steps: constructing an encoder and a decoder through a convolution layer, performing spatial task processing, and obtaining a spatial task model; modeling through an inter-segment information transmission module, performing inter-segment information transmission according to a time sequence, and obtaining a time task model; performing space-time task collaboration on the spatial task model and the time task model, creating a deep learning image processing algorithm architecture based on line scanning, and obtaining a space-time task collaboration model; cutting an image according to horizontal lines, grouping any multiple image lines into an arbitrary height image segment after cutting, and performing artificial intelligence image processing on the arbitrary height image segment through the space-time task collaboration model.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence image processing technology, and more specifically, to an artificial intelligence image processing method based on line scanning.

Background Technology

[0002] Image quality and real-time performance are two crucial performance indicators in image processing, particularly in photography and security surveillance, and achieving high-quality image processing in real time remains a key research focus. Traditional image processing algorithms, supported by image signal processors (ISPs), offer good real-time performance, but their processing quality is generally mediocre, especially under less-than-ideal conditions such as low light, or in challenging tasks such as super-resolution. The introduction of deep learning in recent years has significantly improved image processing quality, but it has also made real-time performance requirements harder to meet. Like other deep learning algorithms, deep learning image processing algorithms face the memory wall problem in their accelerator implementations, and the resulting power consumption and speed limitations restrict practical applications. To achieve good image processing quality, existing deep learning image processing algorithms need to extract global information from the image, which requires a large receptive field, meaning that a pixel in the output image depends on a large area of the input image. To this end, existing deep learning algorithms process the full image: the entire image is fed into the algorithm model at once and the entire image is output. Full-image processing requires intermediate data for the entire image to be cached during processing, resulting in huge storage demands. These demands increase quadratically with image resolution; for 720p, the storage requirement exceeds 100 MB. This creates a severe memory wall problem in the design of deep learning accelerators, causing significant power consumption and latency and making it difficult to exploit the image quality advantages of deep learning in practical applications. Although some technologies have attempted to address the memory wall problem through hardware design, such as in-memory computing, many issues remain before practical application. Image sensors transmit image data line by line through scanning; it is therefore necessary to propose a line-scanning-based artificial intelligence image processing method and system that at least partially solves the problems in the current technology.

Summary of the Invention

[0003] The summary of this invention introduces a series of simplified concepts, which will be further explained in detail in the detailed description section. The summary of this invention does not mean that it attempts to limit the key features and essential technical features of the claimed technical solution, nor does it mean that it attempts to determine the scope of protection of the claimed technical solution.

[0004] To at least partially solve the above problems, the present invention provides an artificial intelligence image processing method based on line scanning, comprising:

[0005] S100 constructs encoders and decoders through convolutional layers to perform spatial task processing and obtain spatial task models;

[0006] S200 uses the inter-segment information transfer module for modeling, and performs inter-segment information transfer according to the time sequence to obtain a time task model;

[0007] S300 combines spatial and temporal task models to create a deep learning image processing algorithm architecture based on line scanning, thereby obtaining a spatiotemporal task collaboration model.

[0008] S400 divides the image into horizontal rows, and then combines any number of image rows into image segments of arbitrary height. The spatiotemporal task collaboration model is used to perform artificial intelligence image processing on the image segments of arbitrary height.

[0009] Preferably, S100 includes:

[0010] S101: The encoder is constructed using the first classical convolutional layer; the decoder is constructed using the second classical convolutional layer.

[0011] S102 inputs an image segment of arbitrary height into the encoder, processes it through multiple information transmission modules, outputs it through the decoder, executes the processing of the current segment of the image segment of arbitrary height, and performs spatial task processing.

[0012] S103 performs spatial task processing by constructing an encoder and a decoder to obtain a spatial task model.

[0013] Preferably, S200 includes:

[0014] S201 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information; the inter-segment information transfer module includes: an inter-segment correlation extraction module and a relevant information fusion module;

[0015] S202, form a time sequence from multiple time values, and map the image segments of arbitrary height, from top to bottom, in order to the time values in the time sequence, from first to last;

[0016] S203, according to the time sequence, performs inter-segment information transmission through the inter-segment information transmission module, breaks the receptive field limitation, and obtains the time task model.

[0017] Preferably, S300 includes:

[0018] S301 coordinates the spatial task model and the temporal task model in a spatiotemporal manner, and sets the encoder before the inter-segment information transmission module as the input end of the image segment at any height.

[0019] S302, the decoder is set after the inter-segment information transmission module, and serves as the output end of the image segment at any height;

[0020] S303, through the interconnection of encoders, decoders and multiple inter-segment information transmission modules, creates a deep learning image processing algorithm architecture based on line scanning, and obtains a spatiotemporal task collaborative model.
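
The ordering described in S301 to S303 (encoder first, then several inter-segment modules, then the decoder) can be sketched as a small skeleton. This is only an illustration under stated assumptions: `LineScanModel` and `CarryOverModule` are hypothetical names, the encoder and decoder are trivial placeholder functions rather than convolutional layers, and `CarryOverModule` is a stand-in that merely keeps a fixed-size state between segments; it is not the gated update described in the patent.

```python
class CarryOverModule:
    """Placeholder inter-segment module (hypothetical, not the patent's design).

    It keeps one fixed-size state (one value per image column) that survives
    between segments and adds it to the current features.
    """

    def __init__(self, mix=0.5):
        self.mix = mix
        self.state = None

    def reset(self):
        self.state = None

    def __call__(self, feat):
        col_mean = [sum(col) / len(col) for col in zip(*feat)]
        if self.state is None:
            self.state = col_mean
        else:
            self.state = [self.mix * s + (1 - self.mix) * m
                          for s, m in zip(self.state, col_mean)]
        return [[v + s for v, s in zip(row, self.state)] for row in feat]


class LineScanModel:
    """Encoder -> inter-segment modules -> decoder, applied one segment at a time."""

    def __init__(self, encoder, transfer_modules, decoder):
        self.encoder = encoder
        self.transfer_modules = transfer_modules
        self.decoder = decoder

    def reset(self):
        # Call at the start of each new image so no state leaks between frames.
        for module in self.transfer_modules:
            module.reset()

    def process_segment(self, segment):
        feat = self.encoder(segment)           # input end
        for module in self.transfer_modules:   # carries state across segments
            feat = module(feat)
        return self.decoder(feat)              # output end


model = LineScanModel(
    encoder=lambda seg: [[2.0 * v for v in row] for row in seg],    # placeholder
    transfer_modules=[CarryOverModule(), CarryOverModule()],
    decoder=lambda feat: [[v / 2.0 for v in row] for row in feat],  # placeholder
)
bright = [[1.0, 1.0], [1.0, 1.0]]
dark = [[0.0, 0.0], [0.0, 0.0]]

model.process_segment(bright)
dark_after_bright = model.process_segment(dark)
model.reset()
dark_alone = model.process_segment(dark)
print(dark_after_bright)  # [[1.75, 1.75], [1.75, 1.75]]
print(dark_alone)         # [[0.0, 0.0], [0.0, 0.0]]
```

The same dark segment produces different outputs depending on what came before it, which is the toy-scale version of an output that depends on rows outside the current segment.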

[0021] Preferably, S400 includes:

[0022] S401, the image is divided into horizontal rows, and after division, any number of image rows are combined into an image segment of arbitrary height; each image segment of arbitrary height corresponds to each time value in the first time sequence; the height of the image segment of arbitrary height is set to an arbitrary height by the number of image rows it contains; if the image is divided to the last segment and the height does not meet the set height, measures can be taken to fill in the missing rows, including: directly filling in the missing rows with zeros, filling in with the data of the last row of the current segment, or performing mirror filling;

[0023] S402, arrange multiple arbitrary height image segments in spatial vertical order; each arbitrary height image segment corresponds to each time value in the second time sequence after processing;

[0024] S403: When the image sensor scans the input image line by line, it accumulates to a set height and obtains an image segment of arbitrary height, which is then sent to the spatiotemporal task collaboration model for processing by an algorithm; each image segment of arbitrary height is sequentially input into the spatiotemporal task collaboration model for processing; and artificial intelligence image processing is performed on the image segments of arbitrary height through the spatiotemporal task collaboration model.
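
The segmentation and the three fill options named in S401 can be sketched as follows. This is a minimal sketch, assuming the image is a plain list of rows; the function name `split_into_segments` and the mode names `"zero"`, `"replicate"`, and `"mirror"` are hypothetical labels chosen for illustration.

```python
def split_into_segments(image, seg_height, pad_mode="zero"):
    """Split an image (a list of rows) into segments of seg_height rows.

    A short final segment is filled using one of the options named in S401:
    "zero" (rows of zeros), "replicate" (repeat the segment's last row), or
    "mirror" (reflect the segment's rows about its bottom edge).
    """
    if seg_height <= 0:
        raise ValueError("seg_height must be positive")
    segments = []
    for top in range(0, len(image), seg_height):
        seg = [list(row) for row in image[top:top + seg_height]]
        missing = seg_height - len(seg)
        if missing > 0:
            real = len(seg)
            if pad_mode == "zero":
                seg += [[0] * len(seg[0]) for _ in range(missing)]
            elif pad_mode == "replicate":
                seg += [list(seg[-1]) for _ in range(missing)]
            elif pad_mode == "mirror":
                # Walk back up from the last real row; the index cycles if
                # more rows are missing than the segment actually contains.
                seg += [list(seg[real - 1 - (k % real)]) for k in range(missing)]
            else:
                raise ValueError(f"unknown pad_mode: {pad_mode!r}")
        segments.append(seg)
    return segments


# Six rows, segment height four: the second segment is two rows short.
image = [[r, r] for r in range(1, 7)]
print(split_into_segments(image, 4, "zero")[1])       # [[5, 5], [6, 6], [0, 0], [0, 0]]
print(split_into_segments(image, 4, "replicate")[1])  # [[5, 5], [6, 6], [6, 6], [6, 6]]
print(split_into_segments(image, 4, "mirror")[1])     # [[5, 5], [6, 6], [6, 6], [5, 5]]
```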

[0025] Preferably, S201 includes:

[0026] S2011 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information;

[0027] S2012 uses the inter-segment correlation extraction module to extract correlations and remove useless information.

[0028] S2013, through the relevant information fusion module, uses residual feature enhancement to fuse useful information with information from the current image segment at any height.

[0029] Preferably, S2012 includes:

[0030] S2112, establish a correlation extraction solution model. In the correlation extraction solution model, the image information is divided into two parts: first global information G and first information U useful for the current segment, both of which are continuously updated as image rows are input. At the same time, the size of the image information remains unchanged and equals that of the current segment information, which keeps the model storage-friendly. The global information G is updated according to the second global information G_{n-1} of the previous state, the current segment information x_n, and the second information U_{n-1} useful for the current segment of the previous state, to obtain the third global information G_n. The first information U is updated according to the second information U_{n-1}, the current segment information x_n, and the third global information G_n, to obtain the third information U_n useful for the current segment;

[0031] S2212, based on the correlation extraction solution model, construct a correlation extraction module. The architecture of the correlation extraction module comprises the following units: i, the input unit; if, the input information processing unit; o, the output unit; of, the output information processing unit; and s, the filtering unit. The o, i, and s units each consist of a convolution with a 3×3 kernel, a stride of 1, and the same number of channels as the input information, followed by a sigmoid activation function. The if unit consists of a convolution with a 3×3 kernel, a stride of 1, and the same number of channels as the input information, followed by a tanh activation function. The of unit consists of a tanh activation function. Through the inter-segment correlation extraction module, correlations are extracted and useless information is removed.
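
The text names the units and their activations but does not spell out how they are wired together. One plausible reading, and it is only an assumption, is a ConvLSTM-style cell in which G plays the role of the cell state and U the hidden state: G_n = s * G_{n-1} + i * if, and U_n = o * tanh(G_n), with every gate computed from x_n and U_{n-1}. The sketch below follows that assumed wiring with a single channel and plain Python lists; `conv3x3`, `unit`, and `correlation_step` are hypothetical names.

```python
import math


def conv3x3(plane, kernel):
    """3x3 filter, stride 1, zero padding, so the output size equals the input size."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        out[r][c] += plane[rr][cc] * kernel[dr + 1][dc + 1]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def unit(x, u_prev, kernels, act):
    """One unit: a 3x3 filter over x_n plus one over U_{n-1}, then an activation.

    Summing the two responses equals one convolution over their channel stack.
    """
    kx, ku = kernels
    a, b = conv3x3(x, kx), conv3x3(u_prev, ku)
    return [[act(p + q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def correlation_step(x, g_prev, u_prev, k):
    """Assumed ConvLSTM-style update; returns (G_n, U_n), same size as x."""
    s = unit(x, u_prev, k["s"], sigmoid)        # filtering unit
    i = unit(x, u_prev, k["i"], sigmoid)        # input unit
    cand = unit(x, u_prev, k["if"], math.tanh)  # input information processing unit
    o = unit(x, u_prev, k["o"], sigmoid)        # output unit
    h, w = len(x), len(x[0])
    g = [[s[r][c] * g_prev[r][c] + i[r][c] * cand[r][c] for c in range(w)]
         for r in range(h)]
    # The "of" unit is a bare tanh applied to the updated global information.
    u = [[o[r][c] * math.tanh(g[r][c]) for c in range(w)] for r in range(h)]
    return g, u
```

Because G_n and U_n have the same size as the current segment, the state carried between segments does not grow with image height, which matches the storage-friendly property described above.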

[0032] Preferably, S2013 includes:

[0033] S2113, through the relevant information fusion module, residual feature enhancement is adopted: the current segment information x_n and the third information U_n useful for the current segment are concatenated along the channel dimension;

[0034] S2213, the concatenated result is passed through a convolution with a 3×3 kernel, a stride of 1, and a channel count consistent with the current segment information x_n, and the result is then added as a residual to the current segment information x_n;

[0035] S2313, in this way, the useful information is fused with the information of the current image segment of arbitrary height.
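
The fusion steps S2113 to S2313 (channel concatenation, one 3×3 convolution back to the channel count of x_n, then a residual addition) can be sketched with plain Python lists. This is a minimal sketch under assumptions: features are stored as lists of channel planes, the convolution uses zero padding and no bias, and `conv3x3`, `fuse`, and the weight layout `weights[out_channel][in_channel]` are hypothetical choices made for illustration.

```python
def conv3x3(plane, kernel):
    """3x3 filter, stride 1, zero padding, so the output size equals the input size."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        out[r][c] += plane[rr][cc] * kernel[dr + 1][dc + 1]
    return out


def fuse(x, u, weights):
    """Residual fusion: x + conv(concat(x, u)).

    x and u each hold C channel planes; weights[out_c][in_c] is a 3x3 kernel,
    where in_c runs over the 2C concatenated channels (x first, then u).
    """
    stacked = x + u  # channel concatenation: 2C planes
    h, w = len(x[0]), len(x[0][0])
    fused = []
    for out_c, x_plane in enumerate(x):
        acc = [[0.0] * w for _ in range(h)]
        for in_c, plane in enumerate(stacked):
            resp = conv3x3(plane, weights[out_c][in_c])
            for r in range(h):
                for c in range(w):
                    acc[r][c] += resp[r][c]
        # Residual connection: add the convolution result to x_n itself.
        fused.append([[x_plane[r][c] + acc[r][c] for c in range(w)]
                      for r in range(h)])
    return fused


ZERO = [[0.0] * 3 for _ in range(3)]
CENTER = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
x_n = [[[1.0, 2.0], [3.0, 4.0]]]  # one channel, 2x2
u_n = [[[0.5, 0.5], [0.5, 0.5]]]  # one channel, 2x2
print(fuse(x_n, u_n, [[ZERO, CENTER]]))  # [[[1.5, 2.5], [3.5, 4.5]]]
```

With all-zero weights the module returns x_n unchanged, which is the usual motivation for a residual connection: the fused branch only has to learn a correction to the current segment's features.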

[0036] Preferably, S403 includes:

[0037] S4031, each segment of multiple arbitrary height image segments is sequentially input into the spatiotemporal task collaborative model according to the first time sequence, and processed by a deep learning image processing algorithm based on row scanning; the spatiotemporal task collaborative model only stores intermediate data related to the current segment during the processing; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence includes: assigning the first segment of the arbitrary height image segment to the first time value of the first time sequence; assigning the second segment of the arbitrary height image segment to the second time value of the first time sequence; sequentially assigning each arbitrary height image segment to each time value in the first time sequence; dividing the image by rows, and then combining any number of image rows into arbitrary height image segments; when the timing is the first time value, inputting the first segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; when the timing is the second time value, inputting the second segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence;

[0038] S4032, the image segment at any height is processed by the spatiotemporal task collaboration model using artificial intelligence image processing, and the processed image segment at any height is output sequentially according to the second time sequence; the above steps are repeated to perform artificial intelligence image processing on the image segment at any height.
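
The streaming behaviour of S4031 and S4032 can be sketched with generators: rows arrive one at a time, a segment is released once the set height is reached, and each segment is processed in arrival order while only the current segment and a carried state are held. The names `accumulate_segments`, `run_stream`, and `toy_step` are hypothetical, and `toy_step` is a trivial stand-in for the spatiotemporal task collaboration model, not the model itself.

```python
def accumulate_segments(rows, seg_height):
    """Yield a segment each time seg_height rows have arrived from the sensor."""
    buf = []
    for row in rows:
        buf.append(row)
        if len(buf) == seg_height:
            yield buf
            buf = []
    if buf:
        # Short final segment; fill it first if the model needs a fixed height.
        yield buf


def run_stream(rows, seg_height, step, state):
    """Process segments in arrival order, yielding (time index, output) pairs.

    Only the current segment and the carried state are alive at any moment.
    """
    for t, segment in enumerate(accumulate_segments(rows, seg_height), start=1):
        output, state = step(segment, state)
        yield t, output


def toy_step(segment, state):
    """Hypothetical stand-in model: remembers the brightest value seen so far."""
    state = max(state, max(max(row) for row in segment))
    return [[state - v for v in row] for row in segment], state


sensor_rows = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 0]]
results = list(run_stream(iter(sensor_rows), 2, toy_step, state=0))
for t, out in results:
    print(t, out)
# 1 [[3, 2], [1, 0]]
# 2 [[3, 2], [1, 0]]
# 3 [[0, 9]]
```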

[0039] The present invention provides an artificial intelligence image processing system based on line scanning, comprising: a system employing any of the aforementioned artificial intelligence image processing methods based on line scanning.

[0040] Compared with the prior art, the present invention has at least the following beneficial effects:

[0041] This invention provides an artificial intelligence image processing method and system based on line scanning. It constructs an encoder and decoder through convolutional layers to perform spatial task processing and obtain a spatial task model. A segment information transfer module is used for modeling, and inter-segment information transfer is performed according to a time sequence to obtain a temporal task model. The spatial and temporal task models are then combined to create a line-scan-based deep learning image processing algorithm architecture, resulting in a spatiotemporal task collaborative model. The image is segmented horizontally, and multiple image rows are combined to form image segments of arbitrary height. The spatiotemporal task collaborative model is used to perform artificial intelligence image processing on these image segments. This invention solves the problem of large storage requirements while retaining the advantages of deep learning algorithms in image processing quality. The line-scan-based deep learning image processing algorithm solves the memory problem at the algorithm level, ensuring that the deep learning image processing algorithm meets real-time requirements for practical applications. In this invention, the line scanning process is modeled as a spatiotemporal collaborative task, thus breaking the limitation of the receptive field. It achieves image processing quality comparable to or even better than the best current deep learning algorithms in denoising and super-resolution tasks with minimal storage requirements.
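
As a rough back-of-the-envelope check on the storage argument, the sketch below compares the size of one intermediate feature map for a full 720p frame with that of a single image segment. The channel count (64), the 4-byte element size, and the 16-row segment height are illustrative assumptions, not values taken from this patent.

```python
def feature_map_bytes(height, width, channels, bytes_per_value=4):
    """Bytes needed to hold one feature map of shape (channels, height, width)."""
    return height * width * channels * bytes_per_value


WIDTH, HEIGHT = 1280, 720  # 720p frame
CHANNELS = 64              # assumed number of feature channels (hypothetical)
SEGMENT_ROWS = 16          # assumed segment height (hypothetical)

full_frame = feature_map_bytes(HEIGHT, WIDTH, CHANNELS)
one_segment = feature_map_bytes(SEGMENT_ROWS, WIDTH, CHANNELS)

print(f"full frame : {full_frame / 1e6:.1f} MB")   # full frame : 235.9 MB
print(f"one segment: {one_segment / 1e6:.1f} MB")  # one segment: 5.2 MB
print(f"ratio      : {full_frame // one_segment}x")  # ratio      : 45x
```

Under these assumptions, a single full-frame feature map already exceeds 100 MB, while the per-segment buffer scales with the segment height rather than the image height.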

[0042] The present invention provides an artificial intelligence image processing method and system based on line scanning. Other advantages, objectives and features of the present invention will be partly apparent from the following description, and partly understood by those skilled in the art through study and practice of the present invention.

Description of the Drawings

[0043] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:

[0044] Figure 1 is a structural diagram of the information transmission module of the artificial intelligence image processing method and system based on line scanning described in this invention.

[0045] Figure 2 is a processing mode diagram of an embodiment of the artificial intelligence image processing method and system based on line scanning according to the present invention.

[0046] Figure 3 shows the solution model (a) and the architecture (b) of the correlation extraction module in an embodiment of the artificial intelligence image processing method and system based on line scanning described in this invention.

[0047] Figure 4 is a diagram of the line-scanning-based deep learning image processing algorithm architecture in an embodiment of the artificial intelligence image processing method and system described in this invention.

[0048] Figure 5 is a comparison chart for denoising and super-resolution tasks in an embodiment of the artificial intelligence image processing method and system based on line scanning described in this invention.

Detailed Implementation

[0049] The present invention will now be described in further detail with reference to the accompanying drawings and embodiments, so that those skilled in the art can implement it based on the description. As shown in Figures 1-5, the present invention provides an artificial intelligence image processing method based on line scanning, comprising:

[0050] S100 constructs encoders and decoders through convolutional layers to perform spatial task processing and obtain spatial task models;

[0051] S200 uses the inter-segment information transfer module for modeling, and performs inter-segment information transfer according to the time sequence to obtain a time task model;

[0052] S300 combines spatial and temporal task models to create a deep learning image processing algorithm architecture based on line scanning, thereby obtaining a spatiotemporal task collaboration model.

[0053] S400 divides the image into horizontal rows, and then combines any number of image rows into image segments of arbitrary height. The spatiotemporal task collaboration model is used to perform artificial intelligence image processing on the image segments of arbitrary height.

[0054] The working principle of the above technical solution is as follows: This invention provides an artificial intelligence image processing method based on line scanning, including:

[0055] S100 constructs encoders and decoders through convolutional layers to perform spatial task processing and obtain spatial task models;

[0056] S200 uses the inter-segment information transfer module for modeling, and performs inter-segment information transfer according to the time sequence to obtain a time task model;

[0057] S300 combines spatial and temporal task models to create a deep learning image processing algorithm architecture based on line scanning, thereby obtaining a spatiotemporal task collaboration model.

[0058] S400 divides the image into horizontal rows, and then combines any number of image rows into image segments of arbitrary height. The spatiotemporal task collaboration model is used to perform artificial intelligence image processing on the image segments of arbitrary height.

[0059] The beneficial effects of the above technical solution are as follows: This invention provides an artificial intelligence image processing method based on line scanning. It constructs an encoder and decoder through convolutional layers to perform spatial task processing and obtain a spatial task model; it models the process through an inter-segment information transfer module, performing inter-segment information transfer according to a time sequence to obtain a temporal task model; it coordinates the spatial task model and the temporal task model to create a deep learning image processing algorithm architecture based on line scanning, obtaining a spatiotemporal task coordination model; and it segments the image horizontally, combines any number of image rows into image segments of arbitrary height, and performs artificial intelligence image processing on these image segments through the spatiotemporal task coordination model. The method solves the problem of large storage requirements while retaining the advantages of deep learning algorithms in image processing quality. The line-scanning-based deep learning image processing algorithm solves the memory problem at the algorithm level, which helps the deep learning image processing algorithm meet the real-time requirements of practical applications. In this invention, the line scanning process is modeled as a spatiotemporal collaborative task: the spatial part is responsible for processing the current row, while the temporal part is responsible for transmitting inter-segment information, thereby breaking the limitation of the receptive field. In denoising and super-resolution tasks, it achieves image processing quality comparable to or even better than the best current deep learning algorithms with minimal storage requirements.

[0060] In one embodiment, S100 includes:

[0061] S101: The encoder is constructed using the first classical convolutional layer; the decoder is constructed using the second classical convolutional layer.

[0062] S102 inputs an image segment of arbitrary height into the encoder, processes it through multiple information transmission modules, outputs it through the decoder, executes the processing of the current segment of the image segment of arbitrary height, and performs spatial task processing.

[0063] S103 performs spatial task processing by constructing an encoder and a decoder to obtain a spatial task model.

[0064] The working principle of the above technical solution is as follows: S100 includes:

[0065] S101: The encoder is constructed using the first classical convolutional layer; the decoder is constructed using the second classical convolutional layer.

[0066] S102 inputs an image segment of arbitrary height into the encoder, processes it through multiple information transmission modules, outputs it through the decoder, executes the processing of the current segment of the image segment of arbitrary height, and performs spatial task processing.

[0067] S103 performs spatial task processing by constructing an encoder and a decoder to obtain a spatial task model.

[0068] The beneficial effects of the above technical solution are as follows: an encoder is constructed through a first classical convolutional layer; a decoder is constructed through a second classical convolutional layer; an image segment of arbitrary height is input into the encoder, processed by multiple information transmission modules, and output through the decoder to perform the processing of the current segment of the image segment of arbitrary height, thereby performing spatial task processing; by constructing the encoder and the decoder, spatial task processing is performed to obtain a spatial task model.

[0069] In one embodiment, S200 includes:

[0070] S201 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information; the inter-segment information transfer module includes: an inter-segment correlation extraction module and a relevant information fusion module;

[0071] S202, form a time sequence from multiple time values, and map the image segments of arbitrary height, from top to bottom, in order to the time values in the time sequence, from first to last;

[0072] S203, according to the time sequence, performs inter-segment information transmission through the inter-segment information transmission module, breaks the receptive field limitation, and obtains a time task model;

[0073] The working principle of the above technical solution is as follows: S200 includes:

[0074] S201 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information; the inter-segment information transfer module includes: an inter-segment correlation extraction module and a relevant information fusion module;

[0075] S202, form a time sequence from multiple time values, and map the image segments of arbitrary height, from top to bottom, in order to the time values in the time sequence, from first to last;

[0076] S203, according to the time sequence, performs inter-segment information transmission through the inter-segment information transmission module, breaks the receptive field limitation, and obtains a time task model;

[0077] The beneficial effects of the above technical solution are as follows: modeling is performed through the inter-segment information transmission module, which extracts correlations and fuses related information; the inter-segment information transmission module includes an inter-segment correlation extraction module and a related information fusion module; multiple time values form a time sequence, and the image segments of arbitrary height, from top to bottom, are mapped in order to the time values in the time sequence, from first to last; inter-segment information is transmitted through the inter-segment information transmission module according to the time sequence, breaking the receptive field limitation and obtaining a time task model.

[0078] In one embodiment, S300 includes:

[0079] S301 coordinates the spatial task model and the temporal task model in a spatiotemporal manner, and sets the encoder before the inter-segment information transmission module as the input end of the image segment at any height.

[0080] S302, the decoder is set after the inter-segment information transmission module, and serves as the output end of the image segment at any height;

[0081] S303, through the interconnection of encoders, decoders and multiple inter-segment information transmission modules, creates a deep learning image processing algorithm architecture based on line scanning, and obtains a spatiotemporal task collaborative model.

[0082] The working principle of the above technical solution is as follows: S300 includes:

[0083] S301 coordinates the spatial task model and the temporal task model in a spatiotemporal manner, and sets the encoder before the inter-segment information transmission module as the input end of the image segment at any height.

[0084] S302, the decoder is set after the inter-segment information transmission module, and serves as the output end of the image segment at any height;

[0085] S303, through the connection and configuration of encoder, decoder and multiple inter-segment information transmission modules, creates a deep learning image processing algorithm architecture based on line scanning to obtain a spatiotemporal task collaborative model;

[0086] Alternatively, the spatial task model and the temporal task model can be coordinated in a spatiotemporal manner. The encoder is placed before the information transmission module as the image row input end, and the decoder is placed after the information transmission module as the image row output end. By connecting the encoder, decoder and multiple information transmission modules, a deep learning image processing algorithm architecture that processes images row by row can be created, thus obtaining a spatiotemporal task coordination model.

[0087] The beneficial effects of the above technical solution are as follows: the spatial task model and the temporal task model are coordinated in a spatiotemporal manner; the encoder is set before the inter-segment information transmission module as the input end of the image segment at any height; the decoder is set after the inter-segment information transmission module as the output end of the image segment at any height; by connecting the encoder, decoder and multiple inter-segment information transmission modules, a deep learning image processing algorithm architecture based on line scanning is created to obtain a spatiotemporal task coordination model.

[0088] By coordinating spatial and temporal task models, the encoder is placed before the information transmission module as the image row input, and the decoder is placed after the information transmission module as the image row output. By connecting the encoder, decoder, and multiple information transmission modules, a deep learning image processing algorithm architecture that processes images row by row is created, resulting in a spatiotemporal task collaborative model.

[0089] In one embodiment, S400 includes:

[0090] S401, the image is divided into horizontal rows, and after division, any number of image rows are combined into an image segment of arbitrary height; each image segment of arbitrary height corresponds to each time value in the first time sequence; the height of the image segment of arbitrary height is set to an arbitrary height by the number of image rows it contains; if the image is divided to the last segment and the height does not meet the set height, measures can be taken to fill in the missing rows, including: directly filling in the missing rows with zeros, filling in with the data of the last row of the current segment, or performing mirror filling;

[0091] S402, arrange multiple arbitrary height image segments in spatial vertical order; each arbitrary height image segment corresponds to each time value in the second time sequence after processing;

[0092] S403: When the image sensor scans the input image line by line, it accumulates to a set height and obtains an image segment of arbitrary height, which is then sent to the spatiotemporal task collaboration model for processing by an algorithm; each image segment of arbitrary height is sequentially input into the spatiotemporal task collaboration model for processing; and artificial intelligence image processing is performed on the image segments of arbitrary height through the spatiotemporal task collaboration model.

[0093] The working principle of the above technical solution is as follows: S400 includes:

[0094] S401, the image is divided into horizontal rows, and after division, any number of image rows are combined into an image segment of arbitrary height; each image segment of arbitrary height corresponds to each time value in the first time sequence; the height of the image segment of arbitrary height is set to an arbitrary height by the number of image rows it contains; if the image is divided to the last segment and the height does not meet the set height, measures can be taken to fill in the missing rows, including: directly filling in the missing rows with zeros, filling in with the data of the last row of the current segment, or performing mirror filling;

[0095] S402, arrange multiple arbitrary height image segments in spatial vertical order; each arbitrary height image segment corresponds to each time value in the second time sequence after processing;

[0096] S403: When the image sensor scans the input image line by line, it accumulates to a set height and obtains an image segment of arbitrary height, which is then sent to the spatiotemporal task collaboration model for processing by an algorithm; each image segment of arbitrary height is sequentially input into the spatiotemporal task collaboration model for processing; and artificial intelligence image processing is performed on the image segments of arbitrary height through the spatiotemporal task collaboration model.

[0097] The beneficial effects of the above technical solution are as follows: the image is divided into horizontal rows, and after division, multiple image rows are combined into image segments of arbitrary height; each image segment of arbitrary height corresponds to each time value in the first time sequence; the height of the image segment of arbitrary height is set to an arbitrary height by the number of image rows it contains; if the image is divided to the last segment and does not meet the set height, measures can be taken to fill in the missing rows, including: directly filling in the missing rows with zeros, supplementing with the data of the last row of the current segment, or performing mirror supplementation; multiple image segments of arbitrary height are arranged in spatial vertical order; each image segment of arbitrary height corresponds to each time value in the second time sequence after processing; when the image sensor scans the input image row by row, the accumulated height reaches the set height, and an image segment of arbitrary height is obtained, which is then sent to the spatiotemporal task collaborative model and processed by the algorithm; each image segment of arbitrary height is sequentially input into the spatiotemporal task collaborative model for processing; artificial intelligence image processing is performed on the image segments of arbitrary height through the spatiotemporal task collaborative model.
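The row grouping and the three filling options for a short last segment (zeros, repeating the last row of the current segment, mirror filling) can be illustrated with a small sketch. The function name `split_into_segments`, the mode names, and the choice of a symmetric reflection for the mirror case are assumptions made for illustration; the text does not fix these details.

```python
from typing import List

Row = List[float]


def split_into_segments(rows: List[Row], height: int, mode: str = "zero") -> List[List[Row]]:
    """Group image rows into segments of `height` rows, padding the last one."""
    if height <= 0:
        raise ValueError("height must be positive")
    # Copy the rows so that padding never aliases the input image.
    segments = [[r[:] for r in rows[i:i + height]] for i in range(0, len(rows), height)]
    if not segments:
        return segments
    last = segments[-1]
    missing = height - len(last)
    if missing > 0:
        width = len(last[0])
        if mode == "zero":
            # Fill the missing rows directly with zeros.
            pad = [[0.0] * width for _ in range(missing)]
        elif mode == "replicate":
            # Fill with the data of the last row of the current segment.
            pad = [last[-1][:] for _ in range(missing)]
        elif mode == "mirror":
            # Symmetric reflection about the bottom edge (assumed convention).
            reflected = last[::-1] + last
            pad = [reflected[k % len(reflected)][:] for k in range(missing)]
        else:
            raise ValueError("unknown padding mode: " + mode)
        last.extend(pad)
    return segments


# A 6-row, 2-column image split into segments of height 4.
image = [[float(r)] * 2 for r in range(1, 7)]
zero_segments = split_into_segments(image, 4, "zero")
replicate_segments = split_into_segments(image, 4, "replicate")
mirror_segments = split_into_segments(image, 4, "mirror")
```

With these inputs the last segment holds rows 5 and 6 plus two filled rows, so every segment handed to the model has the same set height.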

[0098] In one embodiment, S201 includes:

[0099] S2011 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information;

[0100] S2012 uses the inter-segment correlation extraction module to extract correlations and remove useless information.

[0101] S2013, through the relevant information fusion module, uses residual feature enhancement to fuse useful information with information from the current image segment at any height.

[0102] The working principle of the above technical solution is as follows: S201 includes:

[0103] S2011 uses the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information;

[0104] S2012 uses the inter-segment correlation extraction module to extract correlations and remove useless information.

[0105] S2013, through the relevant information fusion module, uses residual feature enhancement to fuse useful information with information of the current arbitrary height image segment;

[0106] Alternatively, modeling can be performed through the information transmission module to extract correlations and fuse relevant information; the correlation extraction module can extract correlations and remove useless information; and the relevant information fusion module can use residual feature enhancement to fuse useful information with the information of the current segmented image row.

[0107] The beneficial effects of the above technical solution are as follows: modeling is performed through the inter-segment information transmission module to extract correlations and fuse relevant information; correlations are extracted and useless information is removed through the inter-segment correlation extraction module; and useful information is fused with information from the current arbitrary height image segment by using residual feature enhancement through the relevant information fusion module.

[0108] The information transmission module is used for modeling, and the correlation is extracted and related information is fused. The correlation extraction module is used to extract the correlation and remove useless information. The related information fusion module uses residual feature enhancement to fuse useful information with the information of the current segmented image row.

[0109] In one embodiment, S2012 includes:

[0110] S2112, establish a correlation extraction solution model; in the correlation extraction solution model, the image information is divided into two parts: first global information G and first information U useful for the current segment, which will be continuously updated with the input image rows; at the same time, the image information size remains unchanged, and the size is the same as that of the current segment information, maintaining storage-friendly characteristics; the global information G will be updated according to the second global information Gn-1 of the previous state, the current segment information xn, and the second information Un-1 useful for the current segment of the previous state to obtain the third global information Gn, and the first information U will be updated according to the second information Un-1, the current segment information xn, and the third global information Gn to obtain the third information Un useful for the current segment;

[0111] S2212, based on the correlation extraction solution model, a correlation extraction module is constructed. The specific architecture of the correlation extraction module includes five units: "i" is the input unit; "if" is the input information processing unit; "o" is the output unit; "of" is the output information processing unit; and "s" is the filtering unit. The "o", "i", and "s" units each consist of a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, followed by a sigmoid activation function; the "if" unit consists of a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, followed by a tanh activation function; the "of" unit consists of a tanh activation function only; through the inter-segment correlation extraction module, correlations are extracted and useless information is removed.

[0112] The working principle of the above technical solution is as follows: S2012 includes:

[0113] S2112, establish a correlation extraction solution model; in the correlation extraction solution model, the image information is divided into two parts: first global information G and first information U useful for the current segment, which will be continuously updated with the input image rows; at the same time, the image information size remains unchanged, and the size is the same as that of the current segment information, maintaining storage-friendly characteristics; the global information G will be updated according to the second global information Gn-1 of the previous state, the current segment information xn, and the second information Un-1 useful for the current segment of the previous state to obtain the third global information Gn, and the first information U will be updated according to the second information Un-1, the current segment information xn, and the third global information Gn to obtain the third information Un useful for the current segment;

[0114] S2212, based on the correlation extraction solution model, a correlation extraction module is constructed. The specific architecture of the correlation extraction module includes five units: "i" is the input unit; "if" is the input information processing unit; "o" is the output unit; "of" is the output information processing unit; and "s" is the filtering unit. The "o", "i", and "s" units each consist of a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, followed by a sigmoid activation function; the "if" unit consists of a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, followed by a tanh activation function; the "of" unit consists of a tanh activation function only; through the inter-segment correlation extraction module, correlations are extracted and useless information is removed.

[0115] The beneficial effects of the above technical solution are as follows: A correlation extraction solution model is established; in this model, image information is divided into two parts: first global information G and first information U useful for the current segment, which are continuously updated with the input image rows; simultaneously, the image information size remains unchanged, maintaining the same size as the current segment information, thus preserving storage-friendly characteristics; global information G is updated based on the second global information Gn-1 from the previous state, the current segment information xn, and the second information Un-1 useful for the current segment from the previous state to obtain third global information Gn; first information U is updated based on the second information Un-1, the current segment information xn, and the third global information Gn to obtain third information Un useful for the current segment; based on the correlation extraction solution model, a correlation extraction module is constructed. The specific architecture of the correlation extraction module includes five units: "i" is the input unit; "if" is the input information processing unit; "o" is the output unit; "of" is the output information processing unit; and "s" is the filtering unit. The "o", "i", and "s" units each consist of a convolution with a kernel size of 3*3, a stride of 1, and the same number of channels as the input information, followed by a sigmoid activation function. The "if" unit consists of a convolution with a kernel size of 3*3, a stride of 1, and the same number of channels as the input information, followed by a tanh activation function. The "of" unit consists of a tanh activation function only. Through the inter-segment correlation extraction module, correlations are extracted and useless information is removed. This helps ensure that the deep learning image processing algorithm meets the real-time requirements for practical applications.
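One possible reading of the update of G and U is an LSTM-style recurrence over segments. The sketch below is offered under that assumption only: the text names the units and their activations but does not state how they are wired together, so the equations Gn = s * Gn-1 + i * if and Un = o * tanh(Gn), the single-channel simplification, the random kernels, and the names `conv3x3`, `gate` and `correlation_step` are all hypothetical.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Map = List[List[float]]  # one feature map, rows x columns (single channel assumed)


def conv3x3(x: Map, k: Map) -> Map:
    """3x3 convolution, stride 1, zero padding: output size equals input size."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += x[rr][cc] * k[dr + 1][dc + 1]
            out[r][c] = acc
    return out


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def gate(x: Map, u: Map, kx: Map, ku: Map, act: Callable[[float], float]) -> Map:
    """act(conv(x) + conv(u)): equals one conv over the 2-channel stack of x and u."""
    a, b = conv3x3(x, kx), conv3x3(u, ku)
    return [[act(p + q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def correlation_step(x_n: Map, g_prev: Map, u_prev: Map,
                     kernels: Dict[str, Tuple[Map, Map]]) -> Tuple[Map, Map]:
    """Assumed wiring: G_n = s*G_prev + i*if ; U_n = o * tanh(G_n)."""
    i_unit = gate(x_n, u_prev, *kernels["i"], sigmoid)      # input unit
    if_unit = gate(x_n, u_prev, *kernels["if"], math.tanh)  # input information processing
    s_unit = gate(x_n, u_prev, *kernels["s"], sigmoid)      # filtering unit
    o_unit = gate(x_n, u_prev, *kernels["o"], sigmoid)      # output unit
    h, w = len(x_n), len(x_n[0])
    g_n = [[s_unit[r][c] * g_prev[r][c] + i_unit[r][c] * if_unit[r][c]
            for c in range(w)] for r in range(h)]
    # "of" unit: tanh only, applied to the updated global information.
    u_n = [[o_unit[r][c] * math.tanh(g_n[r][c]) for c in range(w)] for r in range(h)]
    return g_n, u_n


random.seed(0)


def rand_kernel() -> Map:
    return [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]


kernels = {name: (rand_kernel(), rand_kernel()) for name in ("i", "if", "s", "o")}
h, w = 2, 4
g = [[0.0] * w for _ in range(h)]
u = [[0.0] * w for _ in range(h)]
for n in range(3):  # three consecutive segments
    x_n = [[float(n + r + c) for c in range(w)] for r in range(h)]
    g, u = correlation_step(x_n, g, u, kernels)
```

Whatever the exact wiring, the property the text emphasizes holds in this sketch: G and U keep the same size as the current segment information, so storage does not grow with the number of segments processed.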

[0116] In one embodiment, S2013 includes:

[0117] S2113, through the relevant information fusion module, residual feature enhancement is adopted, and the current information xn and the third information Un useful for the current segment are channel concatenated;

[0118] S2213, the concatenated result is passed through a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the current segment information xn, followed by a ReLU activation function, and the result is then added as a residual to the current segment information xn;

[0119] S2313, the useful information is thereby fused with the information of the current image segment of arbitrary height.

[0120] The working principle of the above technical solution is as follows: S2013 includes:

[0121] S2113, through the relevant information fusion module, residual feature enhancement is adopted, and the current information xn and the third information Un useful for the current segment are channel concatenated;

[0122] S2213, the concatenated result is passed through a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the current segment information xn, followed by a ReLU activation function, and the result is then added as a residual to the current segment information xn;

[0123] S2313, the useful information is thereby fused with the information of the current image segment of arbitrary height.

[0124] The beneficial effects of the above technical solution are as follows: through the relevant information fusion module, residual feature enhancement is adopted, and the current information xn and the third information Un useful for the current segment are concatenated. After passing through a convolution with a kernel size of 3*3, a stride of 1, and the same number of channels as the current segment information xn, and a ReLU activation function, the residual is added to the current segment information xn. The useful information is fused with the information of the current arbitrary height image segment.
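The residual feature enhancement described here (channel concatenation of xn and Un, a 3*3 stride-1 convolution back to the channel count of xn, ReLU, then addition to xn) can be sketched as follows. The single-channel simplification, the hand-picked kernels, and the names `conv3x3` and `fuse` are assumptions for illustration; a real module would use learned multi-channel kernels.

```python
from typing import List

Map = List[List[float]]  # one feature map, rows x columns (single channel assumed)


def conv3x3(x: Map, k: Map) -> Map:
    """3x3 convolution, stride 1, zero padding: output size equals input size."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += x[rr][cc] * k[dr + 1][dc + 1]
            out[r][c] = acc
    return out


def fuse(x_n: Map, u_n: Map, kx: Map, ku: Map) -> Map:
    """x_n + ReLU(conv(concat(x_n, u_n))).

    A convolution over the 2-channel concatenation with one output channel is
    the sum of one convolution per input channel, which is what is computed here.
    """
    a, b = conv3x3(x_n, kx), conv3x3(u_n, ku)
    return [[xv + max(0.0, p + q) for xv, p, q in zip(xr, ra, rb)]
            for xr, ra, rb in zip(x_n, a, b)]


# Hand-picked kernels (hypothetical): pass x_n through and subtract u_n.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
negate = [[0.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]

x_n = [[1.0, 2.0], [3.0, 4.0]]
u_n = [[0.5, 3.0], [1.0, 5.0]]
fused = fuse(x_n, u_n, identity, negate)
```

Where the convolved value is negative, ReLU clamps it to zero and the current segment information passes through unchanged; elsewhere the enhancement is added on top of xn. The output keeps the size of xn.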

[0125] In one embodiment, S403 includes:

[0126] S4031, each segment of multiple arbitrary height image segments is sequentially input into the spatiotemporal task collaborative model according to the first time sequence, and processed by a deep learning image processing algorithm based on row scanning; the spatiotemporal task collaborative model only stores intermediate data related to the current segment during the processing; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence includes: assigning the first segment of the arbitrary height image segment to the first time value of the first time sequence; assigning the second segment of the arbitrary height image segment to the second time value of the first time sequence; sequentially assigning each arbitrary height image segment to each time value in the first time sequence; dividing the image by rows, and then combining any number of image rows into arbitrary height image segments; when the timing is the first time value, inputting the first segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; when the timing is the second time value, inputting the second segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence;

[0127] S4032, the image segment at any height is processed by the spatiotemporal task collaboration model using artificial intelligence image processing, and the processed image segment at any height is output sequentially according to the second time sequence; the above steps are repeated to perform artificial intelligence image processing on the image segment at any height.

[0128] The working principle of the above technical solution is as follows: S403 includes:

[0129] S4031, each segment of multiple arbitrary height image segments is sequentially input into the spatiotemporal task collaborative model according to the first time sequence, and processed by a deep learning image processing algorithm based on line scanning; the spatiotemporal task collaborative model only stores intermediate data related to the current segment during the processing; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence includes: assigning the first segment of the arbitrary height image segments to the first time value of the first time sequence; assigning the second segment of the arbitrary height image segments to the second time value of the first time sequence; sequentially assigning each arbitrary height image segment to each time value of the first time sequence; dividing the image by rows, and then combining multiple image rows into arbitrary height image segments; when the timing is the first time value, the first segment of the multiple arbitrary height image segments is input into the spatiotemporal task collaborative model; when the timing is the second time value, the second segment of the multiple arbitrary height image segments is input into the spatiotemporal task collaborative model; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model sequentially according to the first time sequence also includes: dividing the color image into multiple color image segments of arbitrary height;

[0130] If a color image segment meets the input requirements of the spatiotemporal task collaborative model set by the system, the color image segment is organized and input into the spatiotemporal task collaborative model; if a color image segment does not meet those input requirements, it is first processed and then input into the spatiotemporal task collaborative model; the processing includes: padding the end of a color image segment whose height is insufficient with the missing rows, to obtain an image segment with the same height as the other image segments of arbitrary height; each segment of the multiple image segments of arbitrary height, together with the padded image segment of the same height, is input into the spatiotemporal task collaborative model in sequence according to the first time sequence;

[0131] S4032, the image segment at any height is processed by the spatiotemporal task collaboration model using artificial intelligence image processing, and the processed image segment at any height is output sequentially according to the second time sequence; the above steps are repeated to perform artificial intelligence image processing on the image segment at any height.

[0132] The beneficial effects of the above technical solution are as follows: Each segment of multiple arbitrary height image segments is sequentially input into the spatiotemporal task collaborative model according to the first time sequence, and processed by a deep learning image processing algorithm based on line scanning; the spatiotemporal task collaborative model only stores intermediate data related to the current segment during the processing; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence includes: dividing the color image of the arbitrary height image segment into multiple color image segments of a certain width by pixels; if the color image segments meet the input requirements of the spatiotemporal task collaborative model set by the system, then organizing the color image segments and inputting them into the spatiotemporal task collaborative model; performing artificial intelligence image processing on the arbitrary height image segments after processing by the spatiotemporal task collaborative model, and outputting the processed arbitrary height image segments sequentially according to the second time sequence; the above steps are repeated to perform artificial intelligence image processing on arbitrary height image segments; it can solve the problem of large storage requirements. In this invention, the line scanning process is modeled as a spatiotemporal collaborative task; it breaks the limitation of the receptive field; and achieves image processing quality comparable to or even better than the best current deep learning algorithms in denoising and super-resolution tasks with minimal storage requirements.
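The sequential input and output described for S403 can be pictured as a streaming loop that buffers only the rows of the current segment. This is a sketch under assumptions: `stream_segments` and `CountingModel` are hypothetical names, the model is a placeholder that merely tags each segment with its time value, and zero filling is chosen arbitrarily from the filling options listed in S401.

```python
from typing import Callable, Iterable, Iterator, List

Row = List[float]
Segment = List[Row]


def stream_segments(sensor_rows: Iterable[Row], height: int,
                    model: Callable[[Segment], Segment]) -> Iterator[Segment]:
    """Accumulate scanned rows to `height`, process each segment, yield in order."""
    buffer: Segment = []
    for row in sensor_rows:
        buffer.append(row)
        if len(buffer) == height:
            yield model(buffer)   # output at the matching time value
            buffer = []           # only the current segment is ever buffered
    if buffer:
        # Short last segment: fill the missing rows with zeros (assumed choice).
        width = len(buffer[0])
        buffer.extend([0.0] * width for _ in range(height - len(buffer)))
        yield model(buffer)


class CountingModel:
    """Placeholder model: adds the current time value to every pixel."""

    def __init__(self) -> None:
        self.t = 0

    def __call__(self, segment: Segment) -> Segment:
        self.t += 1  # time value of the segment being processed
        return [[v + self.t for v in row] for row in segment]


# Five scanned rows of width 3, delivered one at a time as a sensor would.
rows = ([float(r)] * 3 for r in range(5))
outputs = list(stream_segments(rows, 2, CountingModel()))
```

Because the rows arrive through an iterator and each segment is released as soon as it is processed, the loop never holds the full image, which mirrors the storage argument made in the paragraph above.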

[0133] A line-scan-based artificial intelligence image processing system includes: a system employing any of the line-scan-based artificial intelligence image processing methods.

[0134] The working principle of the above technical solution is as follows: The present invention provides an artificial intelligence image processing system based on line scanning, comprising: a system employing any of the aforementioned artificial intelligence image processing methods based on line scanning; constructing an encoder and decoder through convolutional layers to perform spatial task processing and obtain a spatial task model; performing modeling through an inter-segment information transfer module and transferring inter-segment information according to a time sequence to obtain a temporal task model; performing spatiotemporal task collaboration between the spatial task model and the temporal task model to create a deep learning image processing algorithm architecture based on line scanning and obtain a spatiotemporal task collaboration model; dividing the image into horizontal rows, and then combining any number of image rows into image segments of arbitrary height, and performing artificial intelligence image processing on the image segments of arbitrary height through the spatiotemporal task collaboration model.

[0135] The beneficial effects of the above technical solution are as follows: The present invention provides an artificial intelligence image processing system based on line scanning, comprising: a system employing any of the aforementioned artificial intelligence image processing methods based on line scanning; the present invention's artificial intelligence image processing system based on line scanning can solve the problem of large storage requirements while retaining the advantages of deep learning algorithms in image processing quality; the deep learning image processing algorithm based on line scanning solves the memory problem at the algorithm level, which helps ensure that the deep learning image processing algorithm meets real-time requirements for practical application; in the present invention, the line scanning process is modeled as a spatiotemporal collaborative task, wherein the spatial task is responsible for processing the current segment, while the temporal task is responsible for transmitting inter-segment information, thereby breaking the limitation of the receptive field; in denoising and super-resolution tasks, it achieves image processing quality comparable to or even better than the best current deep learning algorithms with minimal storage requirements.

[0136] Although embodiments of the present invention have been disclosed above, they are not limited to the applications listed in the specification and embodiments. They can be applied to various fields suitable for the present invention. Other modifications can be easily made by those skilled in the art. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details and illustrations shown and described herein.

Claims

1. An artificial intelligence image processing method based on line scanning, characterized in that it includes: S100, constructing an encoder and a decoder through convolutional layers to perform spatial task processing and obtain a spatial task model; S200, modeling through the inter-segment information transfer module, and performing inter-segment information transfer according to the time sequence to obtain a temporal task model; S300, performing spatiotemporal task collaboration on the spatial task model and the temporal task model to create a deep learning image processing algorithm architecture based on line scanning, thereby obtaining a spatiotemporal task collaboration model; S400, dividing the image into horizontal rows, combining any number of image rows into an image segment of arbitrary height, and performing artificial intelligence image processing on the image segment of arbitrary height through the spatiotemporal task collaboration model; wherein S200 includes: S201, using the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information, the inter-segment information transfer module including an inter-segment correlation extraction module and a relevant information fusion module; S202, constructing a time sequence from multiple time values, and sequentially mapping multiple image segments of arbitrary height, from top to bottom, to the multiple time values in the time sequence, from first to last; S203, according to the time sequence, performing inter-segment information transmission through the inter-segment information transmission module, breaking the receptive field limitation, and obtaining a temporal task model; and wherein S201 includes: S2011, using the inter-segment information transfer module for modeling, extraction of correlations, and fusion of relevant information; S2012, using the inter-segment correlation extraction module to extract correlations and remove useless information; S2013, through the relevant information fusion module, using residual feature enhancement to fuse useful information with information from the current image segment of arbitrary height.

2. The artificial intelligence image processing method based on line scanning according to claim 1, characterized in that S100 includes: S101, the encoder is constructed using the first classical convolutional layer, and the decoder is constructed using the second classical convolutional layer; S102, an image segment of arbitrary height is input into the encoder, processed through multiple information transmission modules, and output through the decoder, so that the current segment of the image segment of arbitrary height is processed and spatial task processing is performed; S103, spatial task processing is performed by constructing the encoder and the decoder to obtain a spatial task model.

3. The artificial intelligence image processing method based on line scanning according to claim 1, characterized in that, The S300 includes: S301 coordinates the spatial task model and the temporal task model in a spatiotemporal manner, and sets the encoder before the inter-segment information transmission module as the input end of the image segment at any height. S302, the decoder is set after the inter-segment information transmission module, and serves as the output end of the image segment at any height; S303, through the interconnection of encoders, decoders and multiple inter-segment information transmission modules, creates a deep learning image processing algorithm architecture based on line scanning, and obtains a spatiotemporal task collaborative model.

4. The artificial intelligence image processing method based on line scanning according to claim 1, characterized in that, The S400 includes: S401, the image is divided into horizontal rows, and after division, any number of image rows are combined into an image segment of arbitrary height; each image segment of arbitrary height corresponds to each time value in the first time sequence; the height of the image segment of arbitrary height is set to an arbitrary height by the number of image rows it contains; if the image is divided to the last segment and does not meet the set height, measures are taken to fill in the missing rows, including: directly filling in the missing rows with zeros, filling in with the data of the last row of the current segment, or performing mirror filling; S402, arrange multiple arbitrary height image segments in spatial vertical order; each arbitrary height image segment corresponds to each time value in the second time sequence after processing; S403: When the image sensor scans the input image line by line, it accumulates to a set height and obtains an image segment of arbitrary height, which is then sent to the spatiotemporal task collaboration model for processing by an algorithm; each image segment of arbitrary height is sequentially input into the spatiotemporal task collaboration model for processing; and artificial intelligence image processing is performed on the image segments of arbitrary height through the spatiotemporal task collaboration model.

5. The artificial intelligence image processing method based on line scanning according to claim 1, characterized in that, S2012 includes: S2112, establish a correlation extraction solution model; in the correlation extraction solution model, the image information is divided into two parts: first global information G and first information U useful for the current segment, which will be continuously updated with the input image rows; at the same time, the image information size remains unchanged, and the size is the same as that of the current segment information, maintaining storage-friendly characteristics; the global information G will be updated according to the second global information Gn-1 of the previous state, the current segment information xn, and the second information Un-1 useful for the current segment of the previous state to obtain the third global information Gn, and the first information U will be updated according to the second information Un-1, the current segment information xn, and the third global information Gn to obtain the third information Un useful for the current segment; S2212, based on the correlation extraction solution model, constructs a correlation extraction module. 
The specific architecture of the correlation extraction module includes: i represents the input unit; if represents the input information processing unit; o represents the output unit; of represents the output information processing unit; s represents the filtering unit; where o, i, and s units are composed of convolutions with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, and a sigmoid activation function; if units are composed of convolutions with a 3*3 kernel size, a stride of 1, and the same number of channels as the input information, and a tanh activation function; of units are composed of tanh activation functions; through the inter-segment correlation extraction module, correlations are extracted and useless information is removed.

6. The artificial intelligence image processing method based on line scanning according to claim 1, characterized in that S2013 includes: S2113, through the relevant information fusion module, residual feature enhancement is adopted, and the current information xn and the third information Un useful for the current segment are channel concatenated; S2213, the concatenated result is passed through a convolution with a 3*3 kernel size, a stride of 1, and the same number of channels as the current segment information xn, and the residual is then added to the current segment information xn; S2313, the useful information is thereby fused with the information of the current image segment of arbitrary height.

7. The artificial intelligence image processing method based on line scanning according to claim 4, characterized in that, S403 includes: S4031, each segment of multiple arbitrary height image segments is sequentially input into the spatiotemporal task collaborative model according to the first time sequence, and processed by a deep learning image processing algorithm based on row scanning; the spatiotemporal task collaborative model only stores intermediate data related to the current segment during the processing; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence includes: assigning the first segment of the arbitrary height image segment to the first time value of the first time sequence; assigning the second segment of the arbitrary height image segment to the second time value of the first time sequence; sequentially assigning each arbitrary height image segment to each time value in the first time sequence; dividing the image by rows, and then combining any number of image rows into arbitrary height image segments; when the timing is the first time value, inputting the first segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; when the timing is the second time value, inputting the second segment of the multiple arbitrary height image segments into the spatiotemporal task collaborative model; inputting each segment of multiple arbitrary height image segments into the spatiotemporal task collaborative model according to the first time sequence; S4032, the image segment at any height is processed by the spatiotemporal task collaboration model using artificial intelligence image processing, and the processed image segment at any height is output sequentially according to the second time sequence; the above steps are repeated to perform artificial intelligence image processing on the image segment at any height.