Feature map up-sampling method, terminal and storage medium
A storage medium and feature map technology, applied in the field of image processing; solves the problem of insufficient feature semantic information and achieves the effect of improving image processing performance
Pending Publication Date: 2021-03-05
PENG CHENG LAB
0 Cites 0 Cited by
AI-Extracted Technical Summary
Problems solved by technology
[0004] In view of the above-mentioned defects of the prior art, the present invention provides a feature map upsampling method, a terminal and a storage medium, aiming to solve the ...
Method used
In summary, the present embodiment provides a feature map upsampling method: the feature map to be processed is divided in the channel dimension, each resulting feature map (with fewer channels than the feature map to be processed) is processed separately, and the results are then aggregated to complete the upsampling. This allows the features on each channel of the feature map to be processed independently, which extracts features with richer semantics and improves the performance of image processing.
Specifically, the feature processing module can be the feature processing module used in the upsampling process of an existing image processing neural network, that is, the feature m...
Abstract
The invention discloses a feature map up-sampling method, a terminal and a storage medium. The method comprises the steps of: segmenting a to-be-processed feature map in the channel dimension; processing each resulting feature map, whose number of channels is less than that of the to-be-processed feature map; and then aggregating the results to complete the up-sampling. The features on each channel of the to-be-processed feature map are thus processed independently, features with richer semantics can be extracted, and image processing performance is improved.
Application Domain
Image enhancement, Image analysis
Technology Topic
Pattern recognition, Image processing +1
Examples
- Experimental program(3)
Example Embodiment
[0033] Example One
[0034] The upsampling method provided by the present invention can be applied to a terminal, and the terminal can upsample a feature map using the feature map upsampling method provided by the present invention. The terminal can be, but is not limited to, various computers, mobile phones, tablets, vehicle-mounted computers, and portable wearable devices.
[0035] As shown in Figure 1, in one embodiment, the feature map upsampling method includes the steps:
[0036] S100: acquire the feature map to be processed, input the feature map to be processed into a first neural network whose training has been completed, and obtain a plurality of intermediate feature maps through the first neural network.
[0037] Specifically, the feature map upsampling provided in this embodiment is performed as part of an image processing task; image processing tasks include, but are not limited to, image segmentation and image super-resolution. Whenever an image processing task includes a feature map upsampling step, the upsampling can be performed by the feature map upsampling method provided in this embodiment. The feature map to be processed may be an initial feature map extracted from an image, or a feature map output by a preceding stage of the image processing task.
[0038] In the present embodiment, when the feature map to be processed is to be upsampled, it is first input to the first neural network. The first neural network includes a plurality of segmentation modules. The input of each segmentation module is a first input feature map; in each segmentation module the first input feature map is divided to obtain a plurality of first feature maps, which are respectively input to the feature processing modules, and the second feature maps output by the feature processing modules are acquired. The number of channels of each first feature map is 1/N of the number of channels of the first input feature map, where N is a positive integer and is the upsampling factor for the feature map to be processed: the length and width of the feature map obtained after N× upsampling are respectively N times the length and width of the initial feature map, while the number of channels remains unchanged. For example, when N equals 2, if the size of the feature map to be processed is H×W×C, where H, W, and C are the height, width, and number of channels, then the size of the target feature map obtained after 2× upsampling of the feature map to be processed is 2H×2W×C.
[0039] Specifically, the feature processing module may be the feature processing module in an existing image processing neural network; that is, the structure of the feature processing module in the feature map upsampling method provided in this embodiment is the same as that of the feature processing module in an existing upsampling process. For example, it may be a feature extraction module, and may specifically consist of a plurality of convolution operations. It is not difficult to see that the feature map upsampling method provided in this embodiment can be used in the upsampling processes of various image processing neural networks: the existing single feature processing step is replaced by segmentation, each feature map obtained by the segmentation is processed independently, and a performance improvement is thereby realized for various image processing neural networks.
[0040] The parameters of each module in the first neural network are determined during the training of the first neural network; that is, the parameters of each feature processing module are determined during the training of the first neural network. The first neural network is part of the image processing task neural network and is trained as a whole together with the other neural network modules of the image processing task neural network (such as the second neural network). The image processing task neural network is trained on multiple sets of training data, each set including a sample image and the sample target image corresponding to it. For example, when the image processing task is an image super-resolution task, the sample image is a low-resolution image and the corresponding sample target image is the high-resolution image corresponding to that low-resolution image. After training is completed, the image processing task neural network can perform the corresponding image processing task on an input image. In other words, the first neural network is trained, as part of the image processing task neural network, on multiple sets of training data, each set including a sample image and its corresponding sample target image.
[0041] It is not difficult to see from the above description that, since each feature processing module processes its input independently, the parameters of the feature processing modules can differ once the training of the first neural network is completed. That is to say, in this embodiment the feature map to be processed is not processed as a whole; instead, it is divided into multiple intermediate feature maps in the channel dimension. Compared with processing all channels with a single feature processing module sharing the same parameters, feature extraction can thus be performed by feature processing modules with different parameters, and richer semantic information can be acquired from the different channels.
[0042] The number of channels of each intermediate feature map is smaller than the number of channels of the feature map to be processed. In one possible implementation, the number of segmentation modules in the first neural network may be fixed at 1. In that case the first input feature map is the feature map to be processed and the first feature maps are the intermediate feature maps; that is, the feature map to be processed is divided directly in the channel dimension to obtain a plurality of first feature maps, which are input to the feature processing modules to output a plurality of intermediate feature maps. In other words, the feature map to be processed is divided into a plurality of intermediate feature maps.
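The direct split described above can be sketched in plain Python, with nested lists standing in for tensors (the helper name and toy data are illustrative assumptions, not from the patent):

```python
def split_channels(feature_map, parts):
    """Split a feature map, stored as a list of C channel planes,
    into `parts` equal chunks along the channel dimension."""
    c = len(feature_map)
    assert c % parts == 0, "channel count must be divisible by parts"
    step = c // parts
    return [feature_map[i * step:(i + 1) * step] for i in range(parts)]

# Toy 8-channel, 2x2 feature map: channel i is filled with the value i.
fmap = [[[i, i], [i, i]] for i in range(8)]
chunks = split_channels(fmap, 4)  # N = 2 -> N**2 = 4 chunks of 2 channels
```

Each chunk would then be handed to its own feature processing module, so the parameters acting on different channel groups can differ.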
[0043] In this embodiment, in order to extract richer semantic information, segmentation module groups are used to split the feature map to be processed gradually into more intermediate feature maps. Specifically, the first neural network comprises M segmentation module groups, the m-th segmentation module group includes N^(2(m-1)) segmentation modules, the input of the first segmentation module group is the feature map to be processed, and the inputs of the segmentation modules in each subsequent segmentation module group are respectively the feature maps output by the previous segmentation module group. Dividing the first input feature map to acquire a plurality of first feature maps includes:
[0044] S110: perform channel expansion on the first input feature map to obtain a fourth feature map, where the number of channels of the fourth feature map is N times the number of channels of the first input feature map.
[0045] S120: segment the fourth feature map to obtain N² first feature maps.
[0046] As shown in Figure 2, in each segmentation module, the first input feature map input to the segmentation module is first channel-expanded. Channel expansion can be implemented by a convolution layer. Specifically, for the first input feature map F_GF ∈ R^(H×W×C), a convolution layer is used to map the number of channels of F_GF to 2C, generating the channel-expanded fourth feature map F_EGF. The operation can be expressed as:
[0047] F_EGF = H_EC(F_GF)  (1)
[0048] where H_EC denotes the channel expansion operation.
[0049] After the first input feature map is channel-expanded to obtain the fourth feature map, the fourth feature map is divided to obtain N² first feature maps; the number of channels of each first feature map is 1/N of the number of channels of the first input feature map. Specifically, every C₄/N² channels of the fourth feature map are extracted as a new feature map, where C₄ is the number of channels of the fourth feature map. As shown in Figure 2, taking an upsampling factor of 2 as an example, in each segmentation module the first input feature map of size H×W×C is first expanded to 2 times the channel count, giving a fourth feature map of size H×W×2C, which is then divided to obtain first feature maps of size H×W×C/2.
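Steps S110 and S120 can be sketched as follows (plain Python; the 1×1-convolution stand-in `expand_channels_1x1` and its duplicate-channel weights are illustrative assumptions, not the patent's actual layer):

```python
def expand_channels_1x1(fmap, weights):
    """Stand-in for the channel-expansion operation H_EC: each output
    channel is a weighted sum of all input channels (a 1x1 convolution)."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(row[i] * fmap[i][y][x] for i in range(len(fmap)))
              for x in range(w)] for y in range(h)]
            for row in weights]

C, H, W, N = 4, 2, 2, 2
fmap = [[[c] * W for _ in range(H)] for c in range(C)]    # H x W x C
# Duplicate every channel once: a trivial stand-in for learned weights.
weights = [[1 if i == o % C else 0 for i in range(C)] for o in range(N * C)]
fourth = expand_channels_1x1(fmap, weights)               # H x W x N*C
step = len(fourth) // N ** 2                              # C/N channels each
firsts = [fourth[i * step:(i + 1) * step] for i in range(N ** 2)]
```

With C = 4 and N = 2, the 4-channel input becomes an 8-channel fourth feature map, split into four 2-channel first feature maps, matching the H×W×C → H×W×2C → 4×(H×W×C/2) bookkeeping above.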
[0050] In each segmentation module, after the first feature maps are obtained, each first feature map is input to a feature processing module to obtain a second feature map; the size of the second feature map is the same as the size of the first feature map. The operation can be expressed as follows:
[0051] F_SF^i = H_FP^i(F_EGF^i)  (2)
[0052] where H_FP^i denotes the computation of the i-th feature processing module in the segmentation module, F_EGF^i denotes the i-th first feature map obtained after segmenting the fourth feature map F_EGF, and F_SF^i denotes the i-th second feature map.
[0053] It is not difficult to see that, in this embodiment, each segmentation module outputs N² second feature maps, and the number of segmentation modules in the next segmentation module group is equal to the number of feature maps output by the previous segmentation module group: the first segmentation module group includes one segmentation module and outputs N² second feature maps; the second segmentation module group includes N² segmentation modules, and the N² second feature maps output by the first segmentation module group are respectively the inputs of the segmentation modules in the second segmentation module group. The outputs of the last segmentation module group are the intermediate feature maps. Thus, in this embodiment, N^(2M) intermediate feature maps are obtained through the first neural network, and the number of channels of each intermediate feature map is 1/N^(2M) of the number of channels of the feature map to be processed.
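The counting in this paragraph can be checked with a small bookkeeping helper (a hypothetical function, tracking shapes only):

```python
def segmentation_plan(N, M, C):
    """Per-group segmentation-module counts, plus the number and channel
    count of the intermediate feature maps, for upsampling factor N,
    M module groups, and C channels in the feature map to be processed."""
    modules_per_group = [N ** (2 * (m - 1)) for m in range(1, M + 1)]
    n_intermediate = N ** (2 * M)
    channels_intermediate = C // n_intermediate
    return modules_per_group, n_intermediate, channels_intermediate

# One group with a single module: 4 intermediate maps of 16 channels each.
groups, n_mid, c_mid = segmentation_plan(N=2, M=1, C=64)
# Two groups (N = 4 = 2**2): 1 then 16 modules, 256 one-channel maps.
groups2, n_mid2, c_mid2 = segmentation_plan(N=4, M=2, C=256)
```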
[0054] M can take any positive integer value, such as 1, 2, 3, etc. In a possible implementation, M satisfies N = X^M, where X is a positive integer; for example, X can take values such as 2 or 3.
[0055] Referring again to Figure 1, after the plurality of intermediate feature maps are obtained through the first neural network, the feature map upsampling method provided in this embodiment further includes the step:
[0056] S200: input the plurality of intermediate feature maps into a second neural network trained in advance, and obtain the target feature map through the second neural network.
[0057] The length and width of the target feature map are N times the length and width of the feature map to be processed. Specifically, the second neural network comprises at least one aggregation module; in each aggregation module, the second input feature maps are aggregated to obtain a third feature map, and the length and width of the third feature map are respectively X times the length and width of the second input feature maps.
[0058] The parameters of the aggregation modules in the second neural network are determined during the training of the second neural network. The second neural network is part of the image processing task neural network and is trained as a whole together with the other neural network modules of the image processing task neural network (such as the first neural network described above). The image processing task neural network is trained on multiple sets of training data, each set including a sample image and the sample target image corresponding to it. For example, when the image processing task is an image super-resolution task, the sample image is a low-resolution image and the corresponding sample target image is the high-resolution image corresponding to that low-resolution image. After training is completed, the image processing task neural network can perform the corresponding image processing task on an input image. In other words, the second neural network is trained, as part of the image processing task neural network, on multiple sets of training data, each set including a sample image and its corresponding sample target image.
[0059] In a possible implementation, the number of aggregation modules in the second neural network is fixed at 1; that is, the plurality of intermediate feature maps are aggregated directly to obtain the target feature map. In this embodiment, the intermediate feature maps are instead aggregated gradually. Specifically, the second neural network comprises M aggregation module groups, the m-th aggregation module group includes N^(2(M-m)) aggregation modules, the input of the first aggregation module group is the output of the first neural network, and the input of each aggregation module in each subsequent aggregation module group is the output of N² aggregation modules of the previous group. Obtaining the target feature map through the second neural network includes:
[0060] In each aggregation module of each aggregation module group:
[0061] S210: perform channel expansion on each second input feature map to obtain the fifth feature maps, where the number of channels of each fifth feature map is N times the number of channels of the second input feature map.
[0062] S220: aggregate the fifth feature maps to obtain a sixth feature map, where the number of channels of the sixth feature map is N² times the number of channels of each fifth feature map;
[0063] S230: rearrange the pixel points on the channels of the sixth feature map to obtain a third feature map, where the number of channels of the third feature map is equal to the number of channels of the fifth feature maps, and the length and width of the third feature map are X times those of the sixth feature map.
[0064] The input of each aggregation module is N² feature maps; that is, for each aggregation module there are N² second input feature maps. In each aggregation module, channel expansion is performed on each second input feature map to obtain the fifth feature maps. As shown in Figure 3, taking an upsampling factor of 2 as an example, the size of each second input feature map is H×W×C_in, and the size of the fifth feature map obtained after channel expansion of the second input feature map is H×W×C_out, where C_out = 2C_in. The channel expansion can be implemented by a convolution layer; for details, refer to the explanation of the channel expansion of the first input feature map.
[0065] In a single aggregation module, after the fifth feature maps are obtained, they are aggregated to obtain the sixth feature map. Specifically:
[0066] the channels of the fifth feature maps are combined to obtain the sixth feature map.
[0067] In a single aggregation module, as shown in Figure 3, the number of fifth feature maps is equal to the number of second input feature maps, i.e. N². The channels of the fifth feature maps are combined to obtain a sixth feature map whose number of channels is N² times the number of channels of each fifth feature map. Specifically, the channels of the fifth feature maps may be combined in the order of the fifth feature maps or in another order.
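The channel combination can be sketched as follows, with each feature map stored as a list of channel planes (the helper name and toy data are hypothetical):

```python
def merge_channels(fifth_maps):
    """Concatenate the channels of the fifth feature maps, in order,
    to form the sixth feature map."""
    sixth = []
    for fm in fifth_maps:
        sixth.extend(fm)
    return sixth

# Four 2-channel maps (1x1 spatial, map k filled with k) combine into
# one 8-channel sixth feature map: N**2 times the per-map channel count.
fifths = [[[[k]], [[k]]] for k in range(4)]
sixth = merge_channels(fifths)
```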
[0068] After the sixth feature map is acquired, the pixel points on the channels of the sixth feature map are rearranged to obtain the third feature map. This can be implemented using a PixelShuffle operation, which includes:
[0069] taking the pixels at the same spatial position on every X² channels of the sixth feature map as an X×X pixel block, so as to obtain one channel of the third feature map.
[0070] The number of channels of the third feature map is equal to the number of channels of the fifth feature maps. The pixels at the same spatial position on X² channels of the sixth feature map are extracted, giving X² pixel points, and these pixel points are combined into a pixel block of size X×X. Thus, every X² channels of the sixth feature map yield one new channel, and the length and width of the new channel are X times those of the sixth feature map. It is not difficult to see that in all of the preceding operations only the number of channels changes while the length and width remain constant; therefore, the length and width of the third feature map are X times the length and width of the fifth feature maps, and its number of channels is N times the number of channels of the second input feature map. As shown in Figure 3, if the size of the second input feature map is H×W×C_in, then the size of the third feature map is (X·H)×(X·W)×C_out, where C_out = N·C_in.
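The rearrangement described above matches the standard PixelShuffle operation; a minimal pure-Python sketch, with nested lists standing in for tensors:

```python
def pixel_shuffle_one(channels, x):
    """Rearrange x*x channel planes of size H x W into one plane of size
    (x*H) x (x*W): the pixels at the same spatial position across the
    x*x channels become one x-by-x block of the output plane."""
    assert len(channels) == x * x
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * x) for _ in range(h * x)]
    for yy in range(h):
        for xx in range(w):
            for k, plane in enumerate(channels):
                out[yy * x + k // x][xx * x + k % x] = plane[yy][xx]
    return out

# Four 1x1 channels holding 0..3 collapse into a single 2x2 plane.
plane = pixel_shuffle_one([[[0]], [[1]], [[2]], [[3]]], x=2)
# Four 2x2 channels (channel c holds 10c + position index) give a 4x4 plane.
channels = [[[c * 10 + 0, c * 10 + 1], [c * 10 + 2, c * 10 + 3]]
            for c in range(4)]
big = pixel_shuffle_one(channels, x=2)
```

The first example shows the core idea: the X² same-position pixels become one X×X block, so X² channels trade for an X-fold gain in each spatial dimension.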
[0071] The second input feature maps are the input of a given aggregation module, and the input of each aggregation module in each aggregation module group is the output of N² aggregation modules of the previous aggregation module group. The first aggregation module group includes N^(2(M-1)) aggregation modules, and its input is the output of the first neural network, namely the N^(2M) intermediate feature maps, the number of channels of each intermediate feature map being 1/N^(2M) of the number of channels of the feature map to be processed. Every N² intermediate feature maps are the input of one aggregation module in the first aggregation module group, and the output of each aggregation module is a third feature map, so the output of the first aggregation module group consists of N^(2(M-1)) third feature maps. The length and width of the third feature maps output by the first aggregation module group are respectively X times the length and width of the feature map to be processed, and their number of channels is N²/N^(2M) times the number of channels of the feature map to be processed. The second aggregation module group includes N^(2(M-2)) aggregation modules; the first aggregation module group outputs N^(2(M-1)) third feature maps, and every N² of them are the input of one aggregation module in the second aggregation module group. The length and width of the third feature maps output by the second aggregation module group are respectively X² times the length and width of the feature map to be processed, and their number of channels is N⁴/N^(2M) times the number of channels of the feature map to be processed. Proceeding in this manner, the last aggregation module group includes a single aggregation module; the length and width of the third feature map it outputs are respectively X^M times the length and width of the feature map to be processed, and its number of channels is N^(2M)/N^(2M) = 1 times, that is, equal to the number of channels of the feature map to be processed. Since N = X^M, the third feature map output by the last aggregation module is the target feature map.
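The group-by-group arithmetic above can be verified numerically; a sketch under the stated assumption N = X^M (the helper name is hypothetical, and it tracks shapes only):

```python
def aggregation_shapes(N, X, M, H, W, C):
    """Track (height, width, channels) of the group outputs through the
    aggregation stage: each group multiplies H and W by X and the channel
    count by N**2, starting from the intermediate maps' C / N**(2*M)."""
    h, w, c = H, W, C // N ** (2 * M)
    shapes = []
    for _ in range(M):
        h, w, c = h * X, w * X, c * N ** 2
        shapes.append((h, w, c))
    return shapes

# N = 2, X = 2, M = 1: a 4x4x16 feature map to be processed yields 8x8x16.
shapes = aggregation_shapes(N=2, X=2, M=1, H=4, W=4, C=16)
# N = 4 = 2**2, M = 2: spatial size ends up N times, channels unchanged.
shapes2 = aggregation_shapes(N=4, X=2, M=2, H=2, W=2, C=256)
```

In both cases the last group's output has N times the original length and width and the original channel count, matching the target feature map described in the text.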
[0072] In summary, the present embodiment provides a feature map upsampling method: the feature map to be processed is divided in the channel dimension, each feature map whose number of channels is less than that of the feature map to be processed is processed separately, and the results are then aggregated to complete the upsampling. This realizes independent processing of the features on each channel of the feature map to be processed, extracts features with richer semantics, and improves the performance of image processing.
[0073] It should be understood that although the various steps in the flowcharts given in the drawings of the present invention are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise stated herein, the execution of these steps is not strictly ordered, and they can be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or multiple stages, which are not necessarily completed at the same time but can be performed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they can be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
[0074] One of ordinary skill in the art will appreciate that all or part of the flows in the above embodiment methods can be completed by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed may include the flows of the embodiments of each method described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or an external cache. By way of illustration rather than limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), and the like.
Example Embodiment
[0075] Example Two
[0076] Based on the above embodiments, the present invention also provides a terminal. As shown in Figure 4, the terminal includes a processor 10 and a memory 20. It can be understood that Figure 4 shows only some components of the terminal; it is not required to implement all the displayed components, and more or fewer components may be implemented instead.
[0077] The memory 20 can, in some embodiments, be an internal storage unit of the terminal, such as a hard disk or memory of the terminal. The memory 20 may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card (Flash Card), etc. Further, the memory 20 may include both an internal storage unit of the terminal and an external storage device. The memory 20 is configured to store application software installed on the terminal and various types of data. The memory 20 can also be used to temporarily store data that has been output or will be output. In one embodiment, the memory 20 stores a feature map upsampling program 30, which can be executed by the processor 10, thereby implementing the feature map upsampling method of the present invention.
[0078] The processor 10 may be a central processing unit (CPU), a microprocessor, or another chip for running the program code stored in the memory 20 or processing data, such as performing the feature map upsampling method described in the first embodiment.
Example Embodiment
[0079] Example Three
[0080] The present invention also provides a storage medium in which one or more programs are stored; the one or more programs can be executed by one or more processors to implement the steps of the feature map upsampling method described above.