# Image processing device and image processing method

## An image processing device and image processing technology, applied in image data processing, image analysis, instruments, etc., that address the problems of huge parameter counts and limited scope of application.

Active Publication Date: 2014-01-22
SONY COMPUTER ENTERTAINMENT INC
6 Cites 3 Cited by

## AI-Extracted Technical Summary

### Problems solved by technology

However, since the input data is based on pixel values, the number of parameters is far larger than in general information analysis.
Therefore, there is a problem that the highe...

## Abstract

The invention provides an image processing device and an image processing method. A pixel set formation unit in an image analysis unit of the image processing device forms pixel sets from the original images subject to analysis. A principal component analysis unit of the image analysis unit performs principal component analysis in units of pixel sets. A synthesis unit synthesizes the results of analysis in units of pixel sets so as to generate images of eigenvectors of the same size as the original images. An image generation unit displays the images of the eigenvectors and stores data for an image generated by using the images of the eigenvectors in a generated image storage unit.

Application Domain

Image analysis; Character and pattern recognition

Technology Topic

Image analysis; Image storage

## Examples

### Example Embodiment

The present invention is described below with reference to a preferred embodiment. The embodiment does not limit the scope of the present invention; it is an illustration of the invention.
In this embodiment, principal component analysis is performed on a plurality of original images in their entirety, so that each eigenvector representing a principal component (called a feature vector below) can be viewed as an image. If there are N original images to be analyzed, the principal component analysis proceeds roughly as follows. First, input data $I_1$ to $I_N$ are generated, one for each original image to be analyzed.

[Formula 1]

$$
\begin{aligned}
I_1 &= (p_1(1),\, p_1(2),\, p_1(3),\, p_1(4),\, \ldots,\, p_1(m)) \\
I_2 &= (p_2(1),\, p_2(2),\, p_2(3),\, p_2(4),\, \ldots,\, p_2(m)) \\
I_3 &= (p_3(1),\, p_3(2),\, p_3(3),\, p_3(4),\, \ldots,\, p_3(m)) \\
&\;\;\vdots \\
I_N &= (p_N(1),\, p_N(2),\, p_N(3),\, p_N(4),\, \ldots,\, p_N(m))
\end{aligned}
$$
Here, $p_n(i)$ is the i-th value when the components of the pixel values of the pixels used as input data in the n-th image are listed in a predetermined pixel order. That is, the maximum value m of i is the number of pixels used as input data multiplied by the number of components per pixel; for an RGB image, the multiplier is 3. From the input data $I_1$ to $I_N$, the following correlation matrix is generated.

[Equation 2]

$$
\begin{pmatrix}
a(1)\cdot a(1) & a(1)\cdot a(2) & a(1)\cdot a(3) & \cdots & a(1)\cdot a(m) \\
a(2)\cdot a(1) & a(2)\cdot a(2) & a(2)\cdot a(3) & \cdots & a(2)\cdot a(m) \\
a(3)\cdot a(1) & a(3)\cdot a(2) & a(3)\cdot a(3) & \cdots & a(3)\cdot a(m) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a(m)\cdot a(1) & a(m)\cdot a(2) & a(m)\cdot a(3) & \cdots & a(m)\cdot a(m)
\end{pmatrix}
$$
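Here, $a(i)$ is a vector whose elements are the differences between the i-th value of each image and the average of that value over the images. Denoting the average as $\mathrm{ave}(i)$, it is expressed as follows.

[Equation 3]

$$
\begin{aligned}
a(1) &= (p_1(1)-\mathrm{ave}(1),\, p_2(1)-\mathrm{ave}(1),\, p_3(1)-\mathrm{ave}(1),\, \ldots,\, p_N(1)-\mathrm{ave}(1)) \\
a(2) &= (p_1(2)-\mathrm{ave}(2),\, p_2(2)-\mathrm{ave}(2),\, p_3(2)-\mathrm{ave}(2),\, \ldots,\, p_N(2)-\mathrm{ave}(2)) \\
a(3) &= (p_1(3)-\mathrm{ave}(3),\, p_2(3)-\mathrm{ave}(3),\, p_3(3)-\mathrm{ave}(3),\, \ldots,\, p_N(3)-\mathrm{ave}(3)) \\
&\;\;\vdots \\
a(m) &= (p_1(m)-\mathrm{ave}(m),\, p_2(m)-\mathrm{ave}(m),\, p_3(m)-\mathrm{ave}(m),\, \ldots,\, p_N(m)-\mathrm{ave}(m))
\end{aligned}
$$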
Eigenvalue decomposition is performed on the above correlation matrix to find its eigenvalues and eigenvectors (feature values and feature vectors). In descending order of eigenvalue, the corresponding eigenvectors are denoted as the first principal component e1, the second principal component e2, the third principal component e3, and so on. Generally, feature vectors with feature values above 1 are extracted as principal components. Each feature vector is an m-dimensional vector, so when the pixel arrangement is restored by reversing the process that generated the m-dimensional input data I from the original image, each feature vector forms an image.
By linearly combining the images obtained from the feature vectors e1, e2, e3, ... as follows, it is possible to restore the original image, emphasize specific components, or generate an intermediate image.

$$
G(x,y) = \mathrm{ave}(x,y) + K_1 \times e_1(x,y) + K_2 \times e_2(x,y) + K_3 \times e_3(x,y) + \cdots
$$

Here, $G(x,y)$ is the generated image, $\mathrm{ave}(x,y)$ is the average image, and $e_1(x,y), e_2(x,y), e_3(x,y), \ldots$ are the pixel values at position coordinates (x, y) in the images of the feature vectors. By changing the coefficients $K_1, K_2, K_3, \ldots$ of the linear combination, the image can be varied in many ways.
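The steps above can be sketched with NumPy as follows. This is a minimal illustration rather than the patented implementation: the function names `pca_images` and `combine`, the array layout (N, height, width, components), and the random test images are assumptions made for the example.

```python
import numpy as np

def pca_images(images):
    """images has shape (N, H, W, C). Returns (ave, eigvals, eigvecs)."""
    n = images.shape[0]
    data = images.reshape(n, -1).astype(np.float64)  # rows are I_1..I_N, length m
    ave = data.mean(axis=0)                          # ave(i) for i = 1..m
    centered = data - ave                            # column i is the vector a(i)
    corr = centered.T @ centered                     # element (i, j) = a(i) . a(j)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                # largest eigenvalue first
    return ave, eigvals[order], eigvecs[:, order]

def combine(ave, eigvecs, coeffs, shape):
    """G = ave + K1*e1 + K2*e2 + ..., reshaped back to the image layout."""
    k = len(coeffs)
    flat = ave + eigvecs[:, :k] @ np.asarray(coeffs, dtype=np.float64)
    return flat.reshape(shape)

rng = np.random.default_rng(0)
images = rng.random((5, 4, 4, 3))                    # N=5 tiny RGB images (assumed data)
ave, eigvals, eigvecs = pca_images(images)

# Projecting an original image onto all eigenvectors gives the coefficients
# K1, K2, ... that restore it; changing them emphasizes or suppresses components.
coeffs = eigvecs.T @ (images[0].reshape(-1) - ave)
restored = combine(ave, eigvecs, coeffs, images[0].shape)
```

Note that `corr` is an m×m matrix, which is only practical here because the example images are tiny.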
If principal component analysis is performed on the entire image at once, an image of each feature vector corresponding to the original image is obtained, so the user can visually choose which components of the image to emphasize. The advantage is that a desired image can be generated freely and easily. With the high-resolution images that have become common in recent years, however, the following problem arises. The dimension m of the input data I for one image is horizontal resolution × vertical resolution × number of components per pixel value. The correlation matrix of such m-dimensional input data has m rows and m columns, that is, (horizontal resolution × vertical resolution × number of components per pixel value)² elements.
For example, for an RGB image of 1024×768 pixels, the data size of the correlation matrix is (1024×768×3)×(1024×768×3)×4 bytes = about 22 TB.
To obtain the feature vectors, the correlation matrix must be temporarily expanded in memory, but data of this size is difficult to hold in ordinary memory. In this embodiment, therefore, pixel sets are formed one after another by extracting a predetermined number of pixels at a time from the original image, so that one image is divided into a plurality of pixel sets, and the pixel set is used as the unit of input data.
Figure 1 shows the configuration of the image processing system in this embodiment. In Figure 1 and in Figure 12 described later, the elements described as functional blocks that perform various processing can be implemented, in terms of hardware, by a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), memory, and other LSIs, and, in terms of software, by programs for image processing and various calculations. Those skilled in the art should therefore understand that these functional blocks can be implemented in various forms by hardware only, by software only, or by a combination of the two, and are not limited to any one of these.
The image processing system 2 includes: an input device 12 that receives instructions on image processing from a user; an image processing device 10 that performs image analysis based on principal component analysis to generate the required data; and a display device 14 that displays an input screen and the generated image. The image processing device 10 includes: an image analysis unit 20 that performs principal component analysis to obtain feature vectors; an image generation unit 32 that uses the feature vectors to generate image data that meets the user's requirements; an original image data storage unit 28 that stores the data of the original images to be analyzed; a feature vector storage unit 30 that stores the feature vectors; and a generated image data storage unit 34 that stores image data created using the feature vectors.
Since the processing performed by the image analysis unit 20 and the processing performed by the image generation unit 32 can be carried out independently, the two need not be provided in the same device and may instead be separate devices, each with its own function. The image processing device 10 may also include functions for games, content display, and various kinds of information processing that incorporate images generated using the feature vectors. The original image data may be acquired by another device and obtained via a recording medium, or obtained from a server via a network, and then stored in the original image data storage unit 28. Alternatively, the image processing device 10 may be provided in an imaging device such as a camera, and captured images may be stored immediately in the original image data storage unit 28. Alternatively, original image data may be input to the image analysis unit 20 directly from another device.
The input device 12 receives instructions from the user to start image analysis and to select the images to be analyzed, and notifies the image analysis unit 20 of them. It also receives input from the user that designates an image to be displayed, such as an image of a feature vector, and that relates to generating a new image from it, and notifies the image generation unit 32 of that input. The input device 12 may be any general input device such as a mouse, a keyboard, a controller, or a joystick, or may be a touch panel or the like mounted on the screen of the display device 14. The display device 14 may be a stand-alone display such as a liquid crystal display or a plasma display, or may be a combination of a projector that projects images and a screen, or the like.
The image analysis unit 20 includes: a pixel set forming unit 22 that forms pixel sets from the original images to be analyzed; a principal component analysis unit 24 that performs principal component analysis for each pixel set; and an integration unit 26 that integrates the analysis results of the pixel sets. The pixel set forming unit 22 reads the data of the plurality of designated images from the original image data storage unit 28 in accordance with an image analysis instruction from the user received by the input device 12. It then divides the pixels constituting these images into pixel sets of a predetermined number of pixels each.
The pixels belonging to one set may be the pixels of one block obtained by dividing the original image, or pixels extracted according to a predetermined rule such as every few pixels. They may also be randomly selected pixels. In every case, pixels at the same positions are extracted from all of the images to be analyzed. The number of pixels included in one pixel set is determined in advance in consideration of the capacity of the memory (not shown) into which the principal component analysis unit 24 expands the correlation matrix. For each image, the principal component analysis unit 24 generates a vector whose elements are the pixel values of the pixels belonging to one pixel set, and performs principal component analysis using these vectors as the aforementioned input data $I_1$ to $I_N$. In this way, feature vectors e1, e2, e3, ... of the principal components of the plurality of original images are derived for each pixel set.
The integration unit 26 integrates the feature vectors obtained for each pixel set, taking the original pixel arrangement into account. This yields images of the feature vectors of the principal components that have the size of the original image. The image data of the feature vectors generated in this manner is stored in the feature vector storage unit 30. The image generation unit 32 renders the feature vectors read from the feature vector storage unit 30 as images in accordance with an instruction from the user or the like, and displays them on the display device 14. The image generation unit 32 also displays an image generated by a linear combination of the feature vectors on the display device 14 and then receives instructions from the user to adjust the image. That is, it accepts the designation of feature vectors to emphasize, feature vectors whose influence is to be reduced, or adjustments to the coefficients. The data of the image generated in this way is output to the display device 14 and displayed, or stored in the generated image data storage unit 34.
Next, the methods by which the pixel set forming unit 22 extracts pixels when forming pixel sets from an image are described. Figure 2 schematically shows a method of dividing an image into blocks and forming a pixel set for each block. In this method, a pixel set (for example, pixel sets 52a, 52b, 52c, 52d, ...) is formed for each block obtained by dividing the original images 50a, 50b, 50c, 50d, ... to be analyzed into blocks of a predetermined size. For example, if the original image is divided into 16 in both the vertical and horizontal directions, 256 pixel sets are formed from each original image.
Then, the pixel values of the N pixel sets formed from the blocks at the same position in the N original images to be analyzed are used as the input data $I_1$ to $I_N$ of the principal component analysis. For example, when a 1024×768 pixel RGB image is divided into 16 in both the vertical and horizontal directions, the input data has 64×48×3 dimensions, and the data size of the correlation matrix is (64×48×3)×(64×48×3)×4 bytes = about 340 MB.
A correlation matrix of this size can readily be expanded in ordinary memory.
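The size comparison between whole-image analysis and per-block analysis can be checked with a few lines of arithmetic. The sketch below assumes 4 bytes per matrix element, which is what the ×4 factor in the whole-image example of about 22 TB implies; the function name is hypothetical.

```python
def correlation_matrix_bytes(width, height, components=3, bytes_per_element=4):
    """Size in bytes of an m x m matrix, where m = width * height * components."""
    m = width * height * components
    return m * m * bytes_per_element

full = correlation_matrix_bytes(1024, 768)                # whole image as one unit
block = correlation_matrix_bytes(1024 // 16, 768 // 16)   # one of the 16 x 16 blocks

print(f"whole image: {full / 1e12:.1f} TB")
print(f"one block:   {block / 1e6:.0f} MB")
```

Dividing each side by 16 shrinks m by a factor of 256 and the matrix by a factor of 256² = 65536, which is why the per-block matrix fits in ordinary memory.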
The principal component analysis unit 24 repeats principal component analysis block by block until all regions of the original image have been covered. When the image is divided into 16 in both the vertical and horizontal directions, principal component analysis is performed 256 times. In this way, the feature vectors of the principal components are obtained for each block. Figure 3 is a diagram explaining the positional relationship in the image between the unit of input data used in the principal component analysis and the feature vectors obtained from it.
First, as described above, the plurality of original images 54 to be analyzed are divided into blocks of a predetermined size. In the figure, two of these blocks are shown as representatives and labeled block A and block B; the same applies to the blocks not shown. The pixels constituting block A, block B, ... are then extracted from each original image, forming pixel sets 56a, 56b, ... for each block. Principal component analysis is performed with each of the pixel sets 56a, 56b, ... as one unit of input data.
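Block division can be sketched as a function that returns, for each block, the coordinates of its pixels; the same coordinate list is then applied to every original image to build the input vectors. The names `block_pixel_sets` and `input_vector` and the nested-list image layout are assumptions for illustration.

```python
def block_pixel_sets(height, width, blocks_y, blocks_x):
    """Return one list of (y, x) coordinates per block, in row-major block order.

    Assumes height and width are exact multiples of blocks_y and blocks_x.
    """
    bh, bw = height // blocks_y, width // blocks_x
    sets = []
    for by in range(blocks_y):
        for bx in range(blocks_x):
            sets.append([(by * bh + y, bx * bw + x)
                         for y in range(bh) for x in range(bw)])
    return sets

def input_vector(image, coords):
    """Flatten the pixel components at coords into one input vector I_n.

    image[y][x] is a tuple of components such as (R, G, B).
    """
    return [c for (y, x) in coords for c in image[y][x]]

# 1024 x 768 image divided into 16 x 16 blocks of 64 x 48 pixels each.
sets = block_pixel_sets(768, 1024, 16, 16)

# Tiny example: a 4 x 4 image whose pixel value encodes its own position.
tiny_image = [[(y, x, 0) for x in range(4)] for y in range(4)]
tiny_sets = block_pixel_sets(4, 4, 2, 2)
bottom_right = input_vector(tiny_image, tiny_sets[3])
```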
As a result, the feature vectors of the principal components are obtained for each block. In the figure, a feature vector set 58a composed of feature vectors e1A, e2A, e3A, ... is obtained for block A, and a feature vector set 58b composed of feature vectors e1B, e2B, e3B, ... is obtained for block B. As described above, the order of the feature vectors is determined by the magnitude of the corresponding feature values.
 The integration unit 26 reconstructs an image of a feature vector of a size corresponding to the image plane of the original image based on the feature vector thus obtained. Specifically, each feature vector is rearranged on the image plane of the original block, thereby generating a block image of the feature vector. Furthermore, the block images in the same order are connected in the vertical and horizontal directions according to the arrangement order of the blocks in the original image. Thereby, the image 60a of the first principal component, the image 60b of the second principal component, the image 60c of the third principal component, ... corresponding to the original image are obtained.
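The integration step amounts to scattering each element of a per-set feature vector back to the position its pixel came from. The sketch below is one possible way to write it; the function name `integrate` and the coordinate-list representation of pixel sets are assumptions, not taken from the source.

```python
def integrate(height, width, components, pixel_sets, vectors):
    """Rebuild a full-size feature vector image from per-set feature vectors.

    pixel_sets[s] lists the (y, x) coordinates of set s in extraction order;
    vectors[s] is the feature vector of the same rank obtained for set s,
    holding `components` consecutive values per pixel.
    """
    image = [[None] * width for _ in range(height)]
    for coords, vec in zip(pixel_sets, vectors):
        for k, (y, x) in enumerate(coords):
            image[y][x] = tuple(vec[k * components:(k + 1) * components])
    return image

# 2 x 4 single-component image split into a left block and a right block.
pixel_sets = [[(0, 0), (0, 1), (1, 0), (1, 1)],
              [(0, 2), (0, 3), (1, 2), (1, 3)]]
vectors = [[1, 2, 3, 4], [5, 6, 7, 8]]      # e.g. e1 for block A and e1 for block B
e1_image = integrate(2, 4, 1, pixel_sets, vectors)
```

Because the function only relies on stored coordinates, the same routine would also serve pixel sets whose pixels are not contiguous in the image.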
Figure 4 shows an example of an image restored using the feature vectors obtained by dividing the image into blocks and performing principal component analysis. The restored image 70 is obtained by linearly combining, with appropriate coefficients, the image 60a of the first principal component, the image 60b of the second principal component, and the image 60c of the third principal component obtained as described above. The original images depict a space centered on a pen holder. On close inspection, however, block boundaries are visible in the directions indicated by the arrows.
The cause is that the order of the feature vectors obtained by principal component analysis shifts from block to block. When the feature vectors of each block are sorted by the magnitude of their feature values, noise or calculation errors in the pixel values can reverse the order of feature vectors that represent the same component in different blocks. If the images of feature vectors of the same rank are joined while the order is shifted in this way, multiple components are mixed in one image. The resulting block noise appears as boundary lines.
Figure 5 shows the images of the feature vectors used to generate the restored image 70 in Figure 4. Looking at the image 72a of the first principal component, the image 72b of the second principal component, and the image 72c of the third principal component in this order, the block noise becomes progressively more noticeable. Consequently, in an image generated by linearly combining such images, block boundaries can become easy to see, as in the restored image 70, depending on the content of the image.
Figure 6 schematically shows, as another example of the method of extracting pixels when forming pixel sets, a method of extracting pixels from the original image at regular intervals. In this method, pixels (for example, the black dots in the figure) are extracted at predetermined intervals in the vertical and horizontal directions from the original images 80a, 80b, 80c, 80d, ... to be analyzed, and used as pixel sets 82a, 82b, 82c, 82d, .... The pixels not yet extracted from the original images 80a, 80b, 80c, 80d, ... (for example, the white dots in the figure) are then extracted in turn at the same interval, so that the entire image is divided into pixel sets.
Then, the pixel values of the N pixel sets extracted from the same positions in the N images to be analyzed are used as the input data $I_1$ to $I_N$ of the principal component analysis. For example, if pixels are extracted from a 1024×768 pixel RGB image with 15 pixels skipped between extracted pixels in both the vertical and horizontal directions, the data size is the same as when a pixel set is formed for each block of the 16×16 division described with reference to Figure 2. The principal component analysis unit 24 repeats principal component analysis for all of the pixel sets into which the original image has been divided. If the pixel sets 82a, 82b, 82c, 82d, ... are regarded as blocks, the subsequent processing is the same as for block division.
However, each element of a feature vector obtained by a single principal component analysis corresponds to a pixel at a discrete position in the original image. The integration unit 26 therefore returns each element of the feature vectors obtained for each pixel set to the position the corresponding pixel occupied before extraction, thereby reconstructing the image of the first principal component, the image of the second principal component, the image of the third principal component, ... corresponding to the original image.
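Extraction at regular intervals can be sketched as follows, assuming a stride of 16 (one extracted pixel followed by 15 skipped pixels) so that the set size matches the 64×48 block case. The function name `strided_pixel_sets` is hypothetical.

```python
def strided_pixel_sets(height, width, step):
    """Return step * step pixel sets; the set with offset (oy, ox) holds every
    step-th pixel starting at (oy, ox), so each pixel belongs to exactly one set."""
    return [[(y, x)
             for y in range(oy, height, step)
             for x in range(ox, width, step)]
            for oy in range(step) for ox in range(step)]

sets = strided_pixel_sets(768, 1024, 16)
first_two = sets[0][:2]      # neighbouring members of a set lie 16 pixels apart
```

Since every set stores the original coordinates of its pixels, returning feature vector elements to their positions before extraction is a direct lookup.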
Figure 7 shows an example of an image restored using the feature vectors obtained by principal component analysis of pixels extracted at regular intervals from the original image. The original images are the same as in the case of Figure 4. The restored image 84 is obtained by linearly combining, with appropriate coefficients, the images of the feature vectors of the principal components generated as described above. Compared with the restored image 70 in Figure 4, there is no noise in the form of boundary lines between larger units, but close inspection reveals striped noise, mainly in the vertical direction.
The cause is the same as for the block noise described above. The order of the feature vectors obtained by principal component analysis of each pixel set shifts between pixel sets, so that multiple components are mixed in the image of one feature vector. Figure 8 shows the images of the feature vectors in this case. Examining the image 86a of the first principal component, the image 86b of the second principal component, and the image 86c of the third principal component in detail shows that the image 86b of the second principal component in particular exhibits fine noise in the form of vertical and horizontal stripes at close to pixel scale.
Figure 9 schematically shows, as another example of the method of extracting pixels when forming pixel sets, a method of extracting pixels from the original image randomly (irregularly). In this method, pixels (for example, the black dots in the figure) are extracted at random from the original images 90a, 90b, 90c, 90d, ... to be analyzed, and used as pixel sets 92a, 92b, 92c, 92d, .... The pixels not yet extracted from the original images 90a, 90b, 90c, 90d, ... (for example, the white dots in the figure) are then extracted in turn in the same way, so that the entire image is divided into pixel sets. The pixels forming one pixel set are preferably extracted from positions that are as unbiased as possible. For this purpose, for example, random or pseudo-random numbers are generated for the position coordinates of the pixels to determine which pixels to extract.
Then, the pixel values of the N pixel sets extracted from the same positions in the N images to be analyzed are used as the input data $I_1$ to $I_N$ of the principal component analysis. For example, if 64×48=3072 pixels at a time are extracted from a 1024×768 pixel RGB image, the data size is the same as when a pixel set is formed for each block of the 16×16 division described with reference to Figure 2. The principal component analysis unit 24 repeats principal component analysis for all of the pixel sets into which the original image has been divided. If the pixel sets 92a, 92b, 92c, 92d, ... are regarded as blocks, the subsequent processing is the same as for block division.
However, as with the method of extracting pixels at regular intervals, each element of a feature vector obtained by a single principal component analysis corresponds to a pixel at a discrete position in the original image. The integration unit 26 therefore returns each element of the feature vectors obtained for each pixel set to the position the corresponding pixel occupied before extraction, thereby reconstructing the image of the first principal component, the image of the second principal component, the image of the third principal component, ... corresponding to the original image.
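Random extraction can be sketched by shuffling all pixel coordinates once with a seeded pseudo-random generator and cutting the shuffled list into fixed-size sets. This is an illustrative assumption about how the random numbers are used; the function name and the `seed` parameter are hypothetical. The same coordinate sets must be applied to all N images, and the positions must be kept so that the integration step can return each element to its original place.

```python
import random

def random_pixel_sets(height, width, set_size, seed=0):
    """Shuffle all (y, x) coordinates and split them into sets of set_size pixels.

    A fixed seed makes the partition reproducible, so the same positions can be
    extracted from every original image and restored during integration.
    """
    coords = [(y, x) for y in range(height) for x in range(width)]
    random.Random(seed).shuffle(coords)
    return [coords[i:i + set_size] for i in range(0, len(coords), set_size)]

sets = random_pixel_sets(768, 1024, 64 * 48)
```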
Figure 10 shows an example of an image restored using the feature vectors obtained by principal component analysis of pixels randomly extracted from the original image. The original images are the same as in the case of Figure 4. The restored image 94 is obtained by linearly combining, with appropriate coefficients, the images of the feature vectors of the principal components generated as described above. Compared with the restored image 70 in Figure 4 and the restored image 84 in Figure 7, no noise such as boundary lines or stripes is observed. This is because, even if the order of the feature vectors obtained by principal component analysis of each pixel set shifts between pixel sets, the positions of the corresponding pixels in the image plane are random, so the elements of the feature vectors are scattered across the image.
Note that the "random", that is, "irregular", extraction in this method need not be irregular in the strict sense of the relative positions of the pixels extracted at one time being completely unrelated. In other words, even if the relative positions of the extracted pixels follow a rule, the extraction is regarded as "irregular" in a broad sense as long as the extracted pixels are not contiguous over a visible length and are scattered enough that they do not form a definite shape such as a line or a rectangle. Within this range, extraction is still defined as "irregular" even if multiple extraction patterns are prepared in advance. Whichever of these broadly "irregular" extraction methods is used, noise such as boundary lines or stripes no longer appears, as shown in Figure 10.
Figure 11 shows the image 96a of the first principal component, the image 96b of the second principal component, and the image 96c of the third principal component in this case. Because the pixels constituting each pixel set are scattered as described above, no noise caused by dividing the pixels is visible in any of the images. As described above, there are various methods of extracting pixels to form pixel sets. In terms of the noise described above, random extraction is considered the most effective. In terms of the processing load during extraction and image reconstruction, on the other hand, simpler methods such as block division are preferable. Given these trade-offs, it is sometimes better to choose block division or extraction at regular intervals, for example when the original image itself contains many high-frequency components and the noise is therefore hard to see.
The pixel set forming unit 22 may therefore perform image analysis such as frequency analysis on the original image and adaptively select the extraction method based on the result. Different extraction methods may also be used for different areas of the original image. When the extraction method is changed adaptively in this way, a table that associates image analysis results, such as the frequency band, with the appropriate extraction method is prepared in advance so that the pixel set forming unit 22 can refer to it. The pixel set forming unit 22 notifies the integration unit 26 of the selected method, of information about the positions in the image plane of the pixels constituting each pixel set, and so on. The integration unit 26 reconstructs the images of the feature vectors based on the notified information.
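A lookup table of this kind could look like the sketch below. The thresholds, the method labels, and the `high_freq_ratio` measure are purely hypothetical placeholders, since the source does not give concrete values; only the general direction (more high-frequency content favors the simpler methods) follows the text.

```python
# Hypothetical table: (minimum high-frequency ratio, extraction method).
# The thresholds are illustrative only; the source does not specify values.
METHOD_TABLE = [
    (0.5, "block"),       # strong high-frequency content hides block noise
    (0.25, "interval"),   # moderate content: extraction at regular intervals
    (0.0, "random"),      # smooth images: random extraction avoids visible noise
]

def choose_method(high_freq_ratio):
    """Return the first method whose threshold the analysis result reaches."""
    for threshold, method in METHOD_TABLE:
        if high_freq_ratio >= threshold:
            return method
    return "random"

selected = choose_method(0.3)
```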
The image generation unit 32 reads the image data of the feature vectors generated as described so far from the feature vector storage unit 30 and displays it on the display device 14. It then changes the coefficients in accordance with the user's operations, or emphasizes or removes selected feature vectors, to generate the final image data requested by the user, and stores that data in the generated image data storage unit 34. Here, the data of the final image is a data set containing the image data of the feature vectors used in the linear combination and the coefficients applied to them.
As described above, the function of the image generation unit 32 can be separated from the function of generating feature vectors by principal component analysis and provided as a different device. Figure 12 shows the configuration of an image processing system that has the function of generating an image using images of feature vectors. The image processing system 102 includes an input device 112 that receives instructions related to image generation from a user, an image processing device 110 that generates image data according to the user's instructions, and a display device 114 that displays the required data. The input device 112 and the display device 114 have the same configuration as the input device 12 and the display device 14 in Figure 1.
The image processing device 110 includes: a feature vector storage unit 130 that stores feature vectors; a feature vector reading unit 132 that reads the feature vectors to be processed; an image adjustment unit 134 that adjusts the generated image by manipulating coefficients and the like as required by the user; and a generated image data storage unit 136 that stores the data of the generated image. The feature vector storage unit 130 and the generated image data storage unit 136 have the same configuration as the feature vector storage unit 30 and the generated image data storage unit 34 in Figure 1.
The feature vector reading unit 132 and the image adjustment unit 134 correspond to the image generation unit 32 in Figure 1. The image generation unit 32 can therefore perform the operations described below in the same manner. Images of feature vectors, obtained as one set of image data via a network, a recording medium, or the like, are stored in the feature vector storage unit 130 in advance. The images of the feature vectors may be appropriately compression-coded. The feature vector reading unit 132 obtains, via the input device 112 or the like, the name of the image data specified by the user, and reads the image data of the corresponding feature vectors from the feature vector storage unit 130. The data of the feature vectors may instead be obtained from the input device 112. The images that have been read are displayed on the display device 114 and provided to the image adjustment unit 134.
On the display device 114, for example, the images of all the feature vectors are displayed side by side as shown in Figure 5. The image generated by linearly combining them is also displayed. In addition, the coefficient applied to each feature vector is displayed in an adjustable manner. That is, a window for directly entering coefficient values, or a cursor-operated GUI, is displayed to accept input that increases or decreases the coefficients. The image adjustment unit 134 obtains, via the input device 112, the coefficient values entered by the user, the information on increases and decreases of the coefficients, and the information on the selection of feature vector images, and recalculates the linear combination accordingly. The display of the resulting image is then updated. When the user inputs an instruction to finalize the image, the image adjustment unit 134 collects the data of the feature vectors used to display the image together with the set of coefficients, and stores them in the generated image data storage unit 136 as the data of the generated image.
The original images to be processed in this embodiment may be still images or the frames of a moving image. For a moving image, coefficients can be set uniformly for multiple frames. Because the feature vectors can be viewed as images, it is possible, for example, to strengthen or weaken the rendering of shadow and light, or to emphasize only the movement of a particular object. Furthermore, by using as the original images still images captured while the direction of the light source is varied, and changing the coefficients over time, a moving image in which the direction of the light source gradually changes can be generated. In the same way, continuously changing moving images of human facial expressions, movements, changes in objects, and so on can easily be created from small-sized data consisting of feature vector images.
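Changing the coefficients over time can be illustrated with a simple linear interpolation between two coefficient sets, evaluating G = ave + K1×e1 + K2×e2 + ... for each frame. The interpolation scheme, the function names, and the toy two-pixel data are assumptions made for the sketch; the source does not prescribe how the coefficients vary.

```python
def interpolate_coeffs(start, end, frames):
    """Linearly interpolate each coefficient from start to end (frames >= 2)."""
    return [[s + (e - s) * t / (frames - 1) for s, e in zip(start, end)]
            for t in range(frames)]

def combine(ave, eigen_images, coeffs):
    """G = ave + K1*e1 + K2*e2 + ... on flat lists of pixel values."""
    return [a + sum(k * e[i] for k, e in zip(coeffs, eigen_images))
            for i, a in enumerate(ave)]

# Toy data: a two-pixel average image and two feature vector images.
ave = [10.0, 20.0]
eigen_images = [[1.0, 0.0], [0.0, 1.0]]

frames = [combine(ave, eigen_images, k)
          for k in interpolate_coeffs([0.0, 0.0], [4.0, -2.0], 3)]
```

Only the feature vector images and one coefficient set per frame need to be stored, which is what keeps the data for such a moving image small.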
According to the embodiment described above, principal component analysis is performed with the entire original image as the processing target, and images of the feature vectors are generated for the entire region. As a result, even a feature vector can be displayed as an image that is meaningful to the user, who can easily select the desired images from among them or adjust their coefficients. A desired image can thus be generated easily.
For the principal component analysis, pixels are extracted from the original image to form pixel sets, and the analysis is performed in units of these sets. As a result, even when a high-resolution image is the processing target, the situation in which the correlation matrix cannot be expanded because of insufficient memory capacity is avoided. Processing can therefore be performed stably in the same manner regardless of the resources of the image processing device, the resolution of the image, and so on.
The present invention has been described above based on an embodiment. Those skilled in the art should understand that the embodiment is an example, that various modifications are possible in the combinations of its constituent elements and processing procedures, and that such modifications are also within the scope of the present invention.
