Method for realizing classification of scene images

A scene image classification technology, applied to instruments, character and pattern recognition, computer parts, etc. It addresses the problems that existing methods cannot effectively use the various feature information of images, that scene image classification has not yet achieved satisfactory results, and that single classifiers ignore the complementary advantages of different features. The method achieves improved classification accuracy, fast computation, and good noise resistance.

Active Publication Date: 2010-08-25
INST OF AUTOMATION CHINESE ACAD OF SCI
Cites: 1 · Cited by: 41

AI-Extracted Technical Summary

Problems solved by technology

Existing scene image classification methods often use a single classifier, selecting a single-stage classifier to complete the final classification according to empirical observation of the scene images. Such methods cannot effectively use the various feature information of images and ignore the complementary advantages of different features across classifiers.

Abstract

The invention discloses a method for realizing classification of scene images. In the method, scene images are classified by two stages of classifiers: the first-stage classifier obtains candidate categories using global structural information features and identifies similar category pairs from its classification results; the second-stage classifier distinguishes the similar categories using local texture information features. By cascading the classifiers and comprehensively utilizing both the global structural information features and the local texture information features of scene images, the method achieves robust classification across different scene categories and effectively distinguishes similar ones.

Technology Topic

Classification result · Classification methods


Examples

  • Experimental program (1)

Example Embodiment

In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the present invention in detail in conjunction with specific embodiments and with reference to the accompanying drawings.
Figure 1 shows a schematic diagram of the system structure of the present invention. The basic hardware required is a computer with a 2.4 GHz CPU and 1 GB of memory; the required software is a programming environment (Visual C++ 6.0). The system structure of the present invention is implemented in a computer and includes: a scene image reading module 1, a grayscale image judgment module 2, a grayscale image conversion module 3, an image level division module 4, a local binary pattern feature calculation module 5, a local binary pattern feature quantization module 6, a histogram calculation module 7, a principal component analysis calculation module 8, a histogram feature fusion module 9, a counting module 10, a structural information feature fusion module 11, a classifier training module 12, a first-level classifier 13, a similar category pair calculation module 14, a local texture feature calculation module 15, a second-level classifier 16, and a classification result fusion module 17.
The scene image reading module 1 reads the scene image. The grayscale image judgment module 2 is connected to the scene image reading module 1; it receives the scene image and judges whether it is a color image or a grayscale image. The grayscale image conversion module 3 is connected to the grayscale image judgment module 2; it receives color images and converts them into grayscale images. The image level division module 4 is connected to the grayscale image judgment module 2 and the grayscale image conversion module 3; it divides the grayscale image according to three levels, obtaining 31 image blocks from the first-level, second-level, and third-level divisions. The local binary pattern feature calculation module 5 is connected to the image level division module 4; it calculates the structural information of each pixel in an image block to obtain an 8-dimensional local binary pattern feature. The local binary pattern feature quantization module 6 is connected to module 5; it quantizes the 8-dimensional local binary pattern feature into a 1-dimensional quantized local binary pattern feature. The histogram calculation module 7 is connected to module 6; it computes the histogram of the 1-dimensional quantized features to obtain the 255-dimensional histogram feature H_ps. The principal component analysis calculation module 8 is connected to module 7; it performs principal component analysis on H_ps to obtain the 40-dimensional histogram feature H_p. The histogram calculation module 7 is then used to compute the histogram of the 8-dimensional local binary pattern feature, yielding the 8-dimensional histogram feature H_b. The histogram feature fusion module 9 is connected to module 8; it fuses H_b and H_p into the 48-dimensional structural information feature H_f = (H_b, H_p) of one image block. The counting module 10 is connected to module 9; it counts the 48-dimensional structural information features of the 31 image blocks: while fewer than 31 blocks are finished, it sends the unprocessed blocks back to module 5; once all 31 are finished, it outputs their 48-dimensional structural information features. The structural information feature fusion module 11 is connected to module 10; it fuses the 48-dimensional structural information features of the 31 image blocks into the global structural information feature of the image, H_g = (H_f1, ..., H_f31).
The classifier training module 12 is connected to the structural information feature fusion module 11; it trains on the global structural information features to obtain the first-level classifier 13. The first-level classifier 13 classifies the scene image and obtains the classification result R_1 in probability form. The similar category pair calculation module 14 is connected to the first-level classifier 13; it counts the possible combinations of the first two candidates of R_1, i.e. the pairs (R_11, R_12), to obtain N similar category pairs. The local texture feature calculation module 15 is connected to module 14; it computes the local texture information features of the scene image for the N similar category pairs. The classifier training module 12, also connected to module 15, trains on these local texture information features to obtain the N second-level classifiers 16. The second-level classifiers 16 classify the scene image and obtain N classification results in probability form, C_i = (C_i1, C_i2), i ∈ [1, N], with C_i1 ≥ C_i2. The classification result fusion module 17 is connected to the second-level classifier 16; it fuses the result R_1 of the first-level classifier 13 with the results C_i of the second-level classifiers 16 to obtain the final classification result of the scene image.
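For concreteness, the per-block feature chain of modules 5 through 9 (8-bit local binary pattern codes, a 255-bin histogram as stated in the text, PCA down to 40 dimensions, fused with the 8-dimensional bit histogram) can be sketched as below. This is a minimal illustration, not the patented implementation: the per-pixel codes are assumed to come from an LBP computation as in the sketch given later in this document, the PCA basis is assumed to be fit on training blocks beforehand, and reading "histogram of the 8-dimensional LBP feature" as per-bit frequencies is one plausible interpretation.

```python
import numpy as np
from sklearn.decomposition import PCA

def block_structural_feature(codes: np.ndarray, pca: PCA) -> np.ndarray:
    """48-dim structural feature H_f = (H_b, H_p) for one image block.

    codes: per-pixel 8-bit local binary pattern codes (0..255) for the block.
    pca:   a PCA(n_components=40) already fit on training H_ps vectors.
    """
    # 8-dim histogram H_b: frequency of each of the 8 pattern bits
    # (one plausible reading of "histogram of the 8-dim LBP feature").
    bits = (codes.reshape(-1, 1) >> np.arange(8)) & 1
    h_b = bits.mean(axis=0)
    # 255-dim histogram H_ps of the quantized 1-dim codes, then PCA -> 40-dim H_p.
    h_ps, _ = np.histogram(codes, bins=255, range=(0, 255))
    h_ps = h_ps / max(h_ps.sum(), 1)
    h_p = pca.transform(h_ps.reshape(1, -1))[0]
    return np.concatenate([h_b, h_p])  # H_f, 48 dims
```

Running this over the 31 blocks and concatenating the results yields the global structural information feature H_g = (H_f1, ..., H_f31).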
The specific steps of the classifier training module 12 are as follows:
The global structural features output by the structural information feature fusion module 11 and the local texture features output by the local texture feature calculation module 15 are used as the input learning samples x. For scene images the task is first posed as a two-class problem; the discriminant function given by the support vector machine (SVM) is:

f(x) = \sum_{i=1}^{N} y_i \alpha_i k(x, x_i) + b

where N is the number of learning samples, y_i is the category of learning sample x_i (+1 denotes a positive sample, -1 a negative sample), b is a constant, and k(x, x_i) is a kernel function defined as:

k(x, x_i) = \Phi(x) \cdot \Phi(x_i)

where \Phi(x) and \Phi(x_i) are functions that map x into a high-dimensional space. The discriminant function above can be regarded as a linear function of the weight vector w:

w = \sum_{i=1}^{N} y_i \alpha_i \Phi(x_i)
The parameters \alpha_i, i = 1, 2, ..., N, are determined from the learning samples by solving the following optimization problem:

\min J(w) = \frac{1}{2} \|w\|^2

s.t.  y_i f(x_i) \ge 1 - \xi_i,  \xi_i \ge 0,  i = 1, 2, ..., N

where \xi_i is a slack variable. This is a quadratic programming problem, which can be solved by converting it to its dual problem:

\max W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j k(x_i, x_j)

s.t.  \sum_{i=1}^{N} \alpha_i y_i = 0,  0 \le \alpha_i \le C,  i = 1, 2, ..., N

Here C is the penalty parameter, a specified constant that controls the degree of punishment for misclassified samples and thus trades off the proportion of misclassified samples against the complexity of the algorithm.
The above quadratic programming problem can be solved with a standard optimization algorithm; in the end only some \alpha_i are non-zero, and the learning samples with \alpha_i ≠ 0 are called support vectors (SVs). For multi-class problems, a one-vs-rest method converts the multi-class problem into two-class problems: the discriminant function of each category is trained using that category's samples against those of all remaining categories, and the final label is the category whose discriminant function output is largest.
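This training scheme corresponds to a standard kernel SVM with a one-vs-rest reduction. Below is a minimal sketch using scikit-learn; the RBF kernel and the value of the penalty parameter C are illustrative assumptions, not values fixed by the method.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

def train_first_level(features: np.ndarray, labels: np.ndarray):
    """Train the first-level classifier as a one-vs-rest kernel SVM.

    features: (n_samples, d) global structural information features H_g.
    labels:   (n_samples,) scene category indices.
    """
    clf = OneVsRestClassifier(
        SVC(kernel="rbf", C=1.0, probability=True)  # C = penalty parameter
    )
    clf.fit(features, labels)
    return clf  # clf.predict_proba(x) yields the probability-form result R_1
```

The same routine can be reused for the second-level classifiers by passing local texture features with the binary labels of one similar category pair.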
The specific steps of the local texture feature calculation module 15 are as follows:
Let X be the set of local features extracted from the images of a similar category pair output by the similar category pair calculation module 14. Ordinary vector quantization solves:

\min_V \sum_{m=1}^{M} \min_{k=1..K} \|x_m - v_k\|_2^2

where V = [v_1, ..., v_K]^T is the set of visual words in the codebook (or dictionary) formed from the images of the similar category pair, generally obtained by clustering X. This quantization essentially represents x_m by its nearest visual word v_{k'}, and is therefore likely to cause large quantization errors. To reduce the quantization error, the method of the present invention represents x_m by a linear combination of all visual words; at the same time, to prevent over-fitting, the coefficients of the linear combination are restricted:

\min_{U,V} \sum_{m=1}^{M} \|x_m - u_m V\|_2^2 + \lambda \|u_m\|_0
s.t.  \|v_k\| \le 1

where U = [u_1, ..., u_M]^T is the set of linear combination coefficients and \|u_m\|_0 is the 0-norm of u_m, i.e. the number of its non-zero elements. The codebook V should be over-complete, that is, K > D. The above optimization problem is solved, and u_m is taken as the local texture feature.
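The \|u_m\|_0 penalty makes the problem combinatorial, so practical solvers approximate it, typically with orthogonal matching pursuit (OMP) or an \ell_1 relaxation. The sketch below uses scikit-learn's DictionaryLearning as one such approximation; the codebook size and the non-zero budget are assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def local_texture_features(X: np.ndarray, n_words: int = 256, n_nonzero: int = 5):
    """Learn an over-complete codebook and sparse codes; OMP approximates
    the l0 penalty in the objective above.

    X: (M, D) local descriptors from one similar category pair; n_words > D
       makes the codebook over-complete as required.
    """
    dl = DictionaryLearning(
        n_components=n_words,                 # K visual words, K > D
        transform_algorithm="omp",            # l0-style sparse coding
        transform_n_nonzero_coefs=n_nonzero,  # cap on non-zero coefficients
        fit_algorithm="lars",
    )
    U = dl.fit_transform(X)   # U: (M, K) sparse coefficients u_m
    return U, dl.components_  # components_: the learned codebook V
```

Each row of U then serves as the local texture feature u_m of the corresponding descriptor.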
As shown in Figure 2, the present invention provides a method for extracting the global structural information features of scene images and classifying them. The specific steps are as follows:
Step S1: Use the scene image reading module 1 to read the scene image, and use the grayscale image judgment module 2 to determine whether it is a color image. If it is a color image, use the grayscale image conversion module 3 to convert it into a grayscale image; if it is already a grayscale image, go to step S2.
Step S2: Use the image level division module 4 to divide the grayscale image according to three levels, obtaining the image blocks of the first-level, second-level, and third-level divisions; the three levels of division yield 31 image blocks in total.
Step S3: Use the local binary pattern feature calculation module 5 to calculate the structural information of each pixel in an image block, obtaining an 8-dimensional local binary pattern feature.
Step S4: Use the local binary pattern feature quantization module 6 to quantize the 8-dimensional local binary pattern feature into a 1-dimensional quantized feature, and use the histogram calculation module 7 to compute the histogram of the quantized features, obtaining the 255-dimensional histogram feature H_ps. Use the principal component analysis calculation module 8 to perform principal component analysis on H_ps, obtaining the 40-dimensional histogram feature H_p. Then use the histogram calculation module 7 to compute the histogram of the 8-dimensional local binary pattern feature, obtaining the 8-dimensional histogram feature H_b. Finally, use the histogram feature fusion module 9 to fuse H_b and H_p into the 48-dimensional structural information feature of the image block, H_f = (H_b, H_p).
Step S5: Use the counting module 10 to determine whether the 48-dimensional structural information features H_f of all 31 image blocks have been calculated. If not, repeat steps S3 to S4; if all are complete, go to step S6.
Step S6: Use the structural information feature fusion module 11 to fuse the 48-dimensional structural information features H_f of all 31 image blocks into the global structural information feature of the image, H_g = (H_f1, ..., H_f31).
Step S7: Use the classifier training module 12 to train on the global structural information features, obtaining the first-level classifier 13.
Step S8: Classify the scene image with the first-level classifier 13, obtaining the classification result in probability form, R_1 = (R_11, R_12, ..., R_1n), where the R_1i (i ∈ [1, n]) are arranged in descending order and n is the number of scene image categories.
Step S9: Use the similar category pair calculation module 14 to count the possible combinations of the first two candidates of the classification results R_1, i.e. the pairs (R_11, R_12), obtaining N similar category pairs, N ∈ [1, n(n-1)/2]. To reduce the subsequent computational complexity, N can generally be set to n/5.
Step S10: Use the local texture feature calculation module 15 to perform calculations for the N similar category pairs, obtaining the local texture information features of the scene image.
Step S11: Use the classifier training module 12 to train on the local texture information features, obtaining the N second-level classifiers 16.
Step S12: Classify the scene image with the second-level classifiers 16, obtaining N classification results in probability form, C_i = (C_i1, C_i2), i ∈ [1, N], where C_i1 ≥ C_i2.
Step S13: Use the classification result fusion module 17 to fuse the result R_1 obtained in step S8 with the results C_i obtained in step S12, obtaining the final classification result of the scene image (a sketch of this two-stage pipeline is given after these steps).
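At test time, steps S8 to S13 form a cascade: trust the first-level result when it is confident, otherwise consult the second-level classifier for the top-two pair. The sketch below is a simplified single-image variant under assumed interfaces (predict_proba outputs, category labels 0..n-1); the threshold and the fusion weight are illustrative, and local_feats is a hypothetical lookup of per-pair local texture features.

```python
import numpy as np

def classify_scene(hg, local_feats, first_level, pair_clfs, theta=0.6, w=0.5):
    """Simplified per-image cascade for steps S8-S13.

    hg:          global structural feature H_g of the image.
    local_feats: dict mapping a similar pair (a, b) to this image's local
                 texture feature under that pair's codebook (hypothetical).
    pair_clfs:   dict mapping the pair (a, b) to its second-level classifier.
    theta, w:    empirical threshold and fusion weight (illustrative values).
    """
    r1 = first_level.predict_proba([hg])[0]      # R_1 in probability form
    top1, top2 = np.argsort(r1)[::-1][:2]        # top-two candidates (R_11, R_12)
    pair = tuple(sorted((int(top1), int(top2))))
    if pair not in pair_clfs or r1[top1] > theta:
        return int(top1)                         # confident, or no similar pair
    clf = pair_clfs[pair]
    c = clf.predict_proba([local_feats[pair]])[0]  # C_i = (C_i1, C_i2)
    # Weighted average of first- and second-level scores for the two candidates.
    fused = {cls: w * r1[cls] + (1 - w) * c[j] for j, cls in enumerate(clf.classes_)}
    return int(max(fused, key=fused.get))
```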
The specific steps by which the image level division module 4 performs the first-level division into image blocks are as follows:
Step S211: First, the grayscale image is uniformly divided into a 4×4 grid according to its aspect ratio, obtaining 16 image blocks, as shown by the solid lines of the first-level division in Figure 3.
Step S212: After cutting off 1/8 of the image on each of its four sides, the remaining grayscale image is uniformly divided into a 3×3 grid according to its aspect ratio, obtaining 9 image blocks. The first-level division thus yields 25 image blocks in total, as shown by the dotted lines of the first-level division in Figure 3.
The specific steps by which the image level division module 4 performs the second-level division are as follows:
Step S221: First, the grayscale image is downscaled by a factor of two while preserving its aspect ratio, and the reduced image is uniformly divided into a 2×2 grid, obtaining 4 image blocks, as shown by the solid lines of the second-level division in Figure 3.
Step S222: After cutting off 1/4 of the periphery of the grayscale image, 1 image block is obtained. The second-level division thus yields 5 image blocks in total, as shown by the dotted lines of the second-level division in Figure 3.
The third-level division performed by the image level division module 4 is as follows: the grayscale image is downscaled while preserving its aspect ratio, obtaining 1 image block, as shown by the line of the third-level division in Figure 3.
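Under the reading above (25 + 5 + 1 = 31 blocks), the three-level division can be sketched in a few lines of NumPy. The downscaling here is naive subsampling, and the third-level factor is an assumption since the original text leaves it ambiguous.

```python
import numpy as np

def grid_blocks(img: np.ndarray, rows: int, cols: int) -> list:
    """Uniformly split an image into rows x cols blocks."""
    h, w = img.shape
    return [img[i * h // rows:(i + 1) * h // rows,
                j * w // cols:(j + 1) * w // cols]
            for i in range(rows) for j in range(cols)]

def three_level_division(gray: np.ndarray) -> list:
    """Return 31 blocks: 16 + 9 (level 1), 4 + 1 (level 2), 1 (level 3)."""
    h, w = gray.shape
    inner = gray[h // 8:h - h // 8, w // 8:w - w // 8]   # trim 1/8 per side
    level1 = grid_blocks(gray, 4, 4) + grid_blocks(inner, 3, 3)
    half = gray[::2, ::2]                                # 2x downscale (naive)
    center = gray[h // 4:h - h // 4, w // 4:w - w // 4]  # cut 1/4 periphery
    level2 = grid_blocks(half, 2, 2) + [center]
    level3 = [gray[::4, ::4]]                            # assumed 4x downscale
    return level1 + level2 + level3                      # 31 blocks in total
```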
The specific steps by which the local binary pattern feature calculation module 5 obtains the 8-dimensional local binary pattern feature are as follows:
Step S31: First define, for a pixel P_0 in the image block, its 8 neighborhood pixels P_i, where i = 1~8.
Step S32: Compare the grayscale intensity of pixel P_0 with that of each of the 8 neighborhood pixels P_i: if P_0 ≥ P_i, record 0; if P_0 < P_i, record 1. For each pixel this yields an 8-dimensional local binary pattern feature F_b = (f_b1, f_b2, f_b3, f_b4, f_b5, f_b6, f_b7, f_b8), with f_bi = 0 or 1, i = 1~8.
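A vectorized NumPy sketch of steps S31 and S32, together with the quantization of step S4. The neighbor ordering and the skipping of border pixels are assumptions; the comparison convention follows the reconstruction above.

```python
import numpy as np

# Assumed ordering of the 8 neighbors P_1..P_8 around the center P_0.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_features(gray: np.ndarray):
    """Per-pixel 8-dim binary feature F_b (steps S31-S32) and its
    1-dim quantized code (step S4). Border pixels are skipped."""
    g = gray.astype(np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]                               # P_0
    # Bit f_bi is 0 when P_0 >= P_i and 1 when P_0 < P_i (step S32).
    bits = np.stack([(g[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc] > center)
                     for dr, dc in OFFSETS], axis=-1).astype(np.uint32)
    codes = (bits << np.arange(8, dtype=np.uint32)).sum(axis=-1)
    return bits, codes          # bits: (H-2, W-2, 8); codes: 8-bit integers
```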
The specific steps by which the similar category pair calculation module 14 determines the similar category pairs are as follows:
Step S91: Count the number of co-occurrences C(R_11, R_12) of each pair (R_11, R_12).
Step S92: Calculate the classification accuracies P(R_11) and P(R_12) of the categories R_11 and R_12.
Step S93: Calculate the similarity S = C(R_11, R_12) / (P(R_11) · P(R_12)).
Step S94: Take the N category pairs (R_11, R_12) with the largest similarity S as the similar category pairs.
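On a held-out set, steps S91 through S94 reduce to counting and normalizing. A minimal sketch with illustrative variable names (top2_list, correct, and labels are assumed inputs, gathered from first-level predictions on validation images):

```python
from collections import Counter

def similar_pairs(top2_list, correct, labels, N):
    """Steps S91-S94 over held-out first-level predictions.

    top2_list: per-image top-two predicted categories (R_11, R_12).
    correct:   per-image bool, whether the top prediction was right.
    labels:    per-image true category.
    """
    co = Counter(tuple(sorted(p)) for p in top2_list)      # C(R_11, R_12)
    acc = {}                                               # per-class accuracy P(.)
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        acc[cls] = sum(correct[i] for i in idx) / len(idx)
    sim = {p: co[p] / (acc[p[0]] * acc[p[1]])              # S = C / (P * P)
           for p in co if acc.get(p[0]) and acc.get(p[1])}
    return sorted(sim, key=sim.get, reverse=True)[:N]      # top-N similar pairs
```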
The steps by which the classification result fusion module 17 fuses the result R_1 of the first-level classifier 13 with the results C_i of the second-level classifiers 16 are as follows:
Step S131: Find the maximum value Tmax among R_11 of the first-level classification result R_1 = (R_11, R_12, ..., R_1n) and the C_i1 of the second-level classification results C_i = (C_i1, C_i2), i ∈ [1, N].
Step S132: Determine whether Tmax is greater than θ, where θ is an empirical value set according to the scene images. If Tmax is greater than θ, the final classification result of the scene image is R_1; if Tmax is less than or equal to θ, the final classification result is the weighted average of R_1 and the C_i (i ∈ [1, N]).
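A sketch of steps S131 and S132. The value of θ, the equal weighting, and the mapping of each (C_i1, C_i2) back to its category pair are assumptions in this illustration.

```python
import numpy as np

def fuse_results(r1: np.ndarray, second, theta: float = 0.6):
    """Steps S131-S132 for one image.

    r1:     first-level probabilities R_1 over all n categories.
    second: list of ((a, b), (c_a, c_b)) second-level results, where c_a and
            c_b are the probabilities assigned to categories a and b.
    """
    tmax = max([r1.max()] + [max(c) for _, c in second])  # Tmax over R_11, C_i1
    if tmax > theta:
        return int(np.argmax(r1))                         # confident: keep R_1
    fused = r1.astype(float).copy()
    for (a, b), c in second:                              # equal-weight average
        fused[a] = (fused[a] + c[0]) / 2
        fused[b] = (fused[b] + c[1]) / 2
    return int(np.argmax(fused))
```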
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person familiar with the technology can conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.