An image recognition method, device, storage medium and electronic device

By using pre-trained human body part recognition models and image pattern recognition models, combined with part labeling and preprocessing steps, the problems of low efficiency and poor accuracy in image recognition in existing technologies are solved, and efficient and accurate image type differentiation is achieved.

CN116824247B | Active | Publication Date: 2026-06-23 | SHANGHAI UNITED IMAGING INTELLIGENCE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI UNITED IMAGING INTELLIGENCE CO LTD
Filing Date
2023-06-27
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In existing technologies, the manual recognition method for distinguishing CT and MRI images is inefficient and has a large error. In particular, when the DICOM information is filled in incorrectly, it is impossible to accurately distinguish between enhanced and plain scan images, resulting in inaccurate image recognition.

Method used

A pre-trained human body part recognition model is used to determine the part labels of each slice image in the image sequence, and an image pattern recognition model is used to determine the image type. The part labels and recognition results are used to determine the final result, including preprocessing steps such as gray value normalization, image orientation correction and cropping.

Benefits of technology

It improves the accuracy and efficiency of image recognition, reduces human intervention, and ensures accurate differentiation of image types.



Abstract

The specification discloses a method and device for image recognition, a storage medium and an electronic device. First, a sequence of images to be recognized is received, and a human body part recognition model is used to determine the part labels of each slice image in the sequence of images to be recognized, and a first image is determined according to the part labels. Then, the recognition result of each first image is determined. Finally, the final result of recognizing the images to be recognized is determined according to the recognition result of each first image. By determining the part labels of each slice image and selecting part of the slice images as the first image, the final recognition result of the images to be recognized is determined according to the recognition result of the recognized first image, so that when the images are recognized, part of the slice images are recognized using the pre-trained image pattern recognition model, the accuracy of image recognition is improved, and the efficiency of image recognition is improved.

Description

Technical Field

[0001] This application relates to the field of computer technology, and in particular to a method, apparatus, storage medium and electronic device for image recognition.

Background Technology

[0002] With the development of technology, artificial intelligence (AI) has increasingly attracted public attention. AI is also widely used in the medical field. For example, when using equipment such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) to diagnose diseases in the human body, plain or enhanced images can be obtained. In particular, the images obtained by injecting contrast agents into the human body are enhanced images. For instance, when injecting water-soluble organic iodine agents (such as 60%–76% meglumine diatrizoate) intravenously, the increased iodine concentration in the blood creates a density difference between the normal and diseased areas of an organ. During scanning, this makes the lesion more clearly visible, and the relationship between the lesion and blood vessels can be clearly displayed.

[0003] Therefore, when the diagnosis from plain CT scans is unclear or does not match the clinical diagnosis, contrast-enhanced imaging can be performed. Contrast-enhanced imaging can be used to observe vascular lesions and tissues with rich blood supply, as well as the blood supply to the lesion site. It can also be used to compare lesion tissue with normal tissue: because the two take up the contrast agent differently, they show different signal intensities, which helps determine the nature of the lesion more accurately.

[0004] Currently, whether an image is a plain scan image or an enhanced image can be determined by manual identification. However, the sheer volume of CT and MRI images makes manual identification labor-intensive and time-consuming. Differentiating images based on Digital Imaging and Communications in Medicine (DICOM) information is also problematic: DICOM data is filled in by staff, so omissions and errors are inevitable and hinder accurate differentiation. Differentiating images based on grayscale values is likewise problematic, because the image segmentation of different regions can introduce errors that affect the overall grayscale value of each region and prevent accurate differentiation.

[0005] Therefore, how to achieve efficient and accurate intelligent recognition of enhanced images and plain scan images is an urgent problem to be solved. Based on this, this specification provides an image recognition method.

Summary of the Invention

[0006] This specification provides a method, apparatus, storage medium, and electronic device for image recognition, to at least partially solve the aforementioned problems existing in the prior art.

[0007] The following technical solution is adopted in this specification:

[0008] This specification provides an image recognition method, the method comprising:

[0009] The system receives a sequence of images to be identified, determines the part labels of each slice image in the sequence of images to be identified through a pre-trained human body part recognition model, and selects at least a portion of the slice images as the first image based on the part labels.

[0010] Each of the first images is input into a pre-trained image pattern recognition model to determine the recognition result corresponding to each first image;

[0011] Based on the determined recognition results, the final result of recognizing the image to be recognized is determined; wherein the final result is either an enhanced image or a plain scan image.

[0012] Optionally, a sequence of images to be identified is received, specifically including:

[0013] Receive a 3D image to be identified, slice the 3D image, and determine each slice image corresponding to the 3D image as the image sequence to be identified.

[0014] Optionally, receiving a sequence of images to be identified, determining the part labels of each slice image in the sequence of images to be identified using a human body part recognition model, and selecting at least a portion of the slice images as the first image based on the part labels, specifically including:

[0015] Receive a 3D image to be identified and divide the 3D image into several sub-3D images;

[0016] Based on the obtained sub-3D images, the part labels of each sub-3D image are determined by a pre-trained human part recognition model;

[0017] For each part label, select at least one sub-3D image from the sub-3D images corresponding to that part label as the first image.

[0018] Optionally, selecting at least a portion of the slice images from each slice image as the first image based on the part labels specifically includes:

[0019] Obtain the part label of the part of interest;

[0020] Based on the part label of the part of interest and the part label of each slice image, the slice images of the part of interest are determined as the first images;

[0021] The first images are processed according to a preset image processing method; wherein the preset image processing method includes at least one of grayscale value normalization, image orientation correction, and image cropping.

[0022] Optionally, the recognition result is either an enhanced image or a plain scan image;

[0023] Based on the determined recognition results, the final result of recognizing the image to be recognized is determined, specifically including:

[0024] Determine the proportion of recognition results belonging to the enhanced image among the determined recognition results;

[0025] If the proportion is greater than the first preset value, then the final result of recognizing the image to be recognized is determined to be an enhanced image;

[0026] If the proportion is not greater than the first preset value, then the final result of recognizing the image to be recognized is determined to be a plain scan image.

[0027] Optionally, based on the determined recognition results, the final recognition result of the image to be recognized is determined, specifically including:

[0028] For each recognition result, determine the part label of the first image corresponding to that recognition result;

[0029] The weight corresponding to the recognition result is determined based on the preset weights of the labels for each part.

[0030] The sum of the weighted recognition results belonging to the enhanced image is determined as the first value, and the sum of the weighted recognition results belonging to the plain scan image is determined as the second value.

[0031] When the first value is greater than the second value, the final result is determined to be an enhanced image;

[0032] If the first value is not greater than the second value, the final result is determined to be a plain scan image.

[0033] Optionally, based on the determined recognition results, the final recognition result of the image to be recognized is determined, and the method further includes:

[0034] For each first image, determine a first probability that the first image belongs to an enhanced image and a second probability that it belongs to a plain scan image;

[0035] The first probability and the second probability are weighted according to the preset weights of the part labels corresponding to the first image.

[0036] Determine the sum of the weighted first probabilities corresponding to each first image, and the sum of the weighted second probabilities corresponding to each first image;

[0037] When the sum of the first probabilities is greater than the sum of the second probabilities, the final result is determined to be an enhanced image;

[0038] If the sum of the first probabilities is not greater than the sum of the second probabilities, the final result is determined to be a plain scan image.

[0039] Optionally, the part labels of each slice image in the image sequence to be identified are determined by a pre-trained human part recognition model, specifically including:

[0040] For each slice image in the image sequence to be identified, the adjacent slice images are determined, and the slice image and the determined adjacent slice images are used as the target slice image;

[0041] The target slice image is input into the human body part recognition model to obtain the human body part recognition result output by the human body part recognition model, which is used as the part label of the slice image.
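The neighbour-stacking step in [0040] and [0041] could be sketched as follows. This is only an illustration under stated assumptions: the function name `build_target_slices`, the `radius` parameter, and the policy of repeating the first or last slice at the ends of the sequence are hypothetical, since the specification does not say how many adjacent slices are used or how the ends are handled.

```python
from typing import List, Sequence

Slice = List[List[float]]  # one 2D slice stored as rows of gray values


def build_target_slices(sequence: Sequence[Slice], index: int, radius: int = 1) -> List[Slice]:
    """Return the slice at `index` together with `radius` neighbours on each side.

    Positions outside the sequence are clamped to the nearest valid index,
    so edge slices are repeated (an assumed edge policy, not from the source).
    """
    last = len(sequence) - 1
    return [
        sequence[min(max(i, 0), last)]
        for i in range(index - radius, index + radius + 1)
    ]
```

The returned list would then be passed as one multi-slice input to whatever human body part recognition model is in use, and the model output would be taken as the part label of the slice at `index`.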

[0042] This specification provides an image recognition device, comprising:

[0043] An image receiving module is used to receive a sequence of images to be identified, determine the part labels of each slice image in the sequence of images to be identified through a pre-trained human body part recognition model, and select at least a portion of the slice images as the first image based on the part labels.

[0044] The image input module is used to input each of the first images into a pre-trained image pattern recognition model to determine the recognition result corresponding to each of the first images;

[0045] The result determination module is used to determine the final result of recognizing the image to be recognized based on the determined recognition results; wherein the final result is either an enhanced image or a plain scan image.

[0046] Optionally, the image receiving module is specifically used to receive the three-dimensional image to be identified, slice the three-dimensional image, and determine each slice image corresponding to the three-dimensional image as the image sequence to be identified.

[0047] Optionally, the image receiving module is specifically used to: receive the three-dimensional image to be identified and divide the three-dimensional image into several sub-three-dimensional images; determine the part label of each sub-three-dimensional image based on the obtained sub-three-dimensional images through a pre-trained human part recognition model; and select at least one sub-three-dimensional image from the sub-three-dimensional images corresponding to each part label as the first image.

[0048] Optionally, the image receiving module is specifically used to: obtain the region label of the region of interest; determine the slice image of interest as the first image based on the region label of the region of interest and the region identification label of each slice image; and process each first image according to a preset image processing method; wherein the preset image processing method includes at least one of grayscale value normalization, image orientation correction, and image cropping.

[0049] Optionally, the recognition result is either an enhanced image or a plain scan image;

[0050] The result determination module is specifically used to determine the proportion of recognition results belonging to enhanced images among the determined recognition results; if the proportion is greater than a first preset value, the final result of the recognition of the image to be recognized is determined to be an enhanced image; if the proportion is not greater than the first preset value, the final result of the recognition of the image to be recognized is determined to be a plain scan image.

[0051] Optionally, the result determination module is specifically used to: for each recognition result, determine the part label of the first image corresponding to the recognition result; determine the weight corresponding to the recognition result according to the preset weight of each part label; determine the sum of each weighted recognition result belonging to the enhanced image as a first value, and the sum of each weighted recognition result belonging to the plain scan image as a second value; when the first value is greater than the second value, determine the final result as an enhanced image; when the first value is not greater than the second value, determine the final result as a plain scan image.

[0052] Optionally, the result determination module is further configured to: for each first image, determine a first probability that the first image belongs to an enhanced image and a second probability that it belongs to a plain scan image; weight the first probability and the second probability according to the preset weights of the part labels corresponding to the first image; determine the sum of the weighted first probabilities for each first image and the sum of the weighted second probabilities for each first image; when the sum of the first probabilities is greater than the sum of the second probabilities, determine the final result as an enhanced image; when the sum of the first probabilities is not greater than the sum of the second probabilities, determine the final result as a plain scan image.

[0053] Optionally, the image receiving module is specifically used to: for each slice image in the image sequence to be identified, determine the adjacent slice images of the slice image, and use the slice image and the determined adjacent slice images as target slice images; input the target slice image into the human body part recognition model to obtain the human body part recognition result output by the human body part recognition model, and use it as the part label of the slice image.

[0054] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described image recognition method.

[0055] This specification provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above-described image recognition method.

[0056] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:

[0057] In the image recognition method provided in this specification, a sequence of images to be recognized is first received. Then, using a pre-trained human body part recognition model, the part labels of each slice image in the sequence are determined, and a first image is identified based on the part labels. Next, the recognition result of each first image is determined. Finally, based on the recognition results of each first image, the final result of recognizing the image to be recognized is determined.

[0058] As can be seen from the above method, by determining the part labels of each slice image and selecting a portion of the slice images as the first image, the final recognition result of the image to be recognized can be determined based on the recognition results of the first images. Because only a portion of the slice images is recognized with the pre-trained image pattern recognition model, both the accuracy and the efficiency of image recognition are improved.

Description of the Drawings

[0059] The accompanying drawings, which are included to provide a further understanding of this specification and form part of this specification, illustrate exemplary embodiments and their descriptions, serving to explain this specification and do not constitute an undue limitation thereof.

[0060] In the drawings:

[0061] Figure 1 is a flowchart of an image recognition method provided in this specification;

[0062] Figure 2 is a schematic diagram of an image recognition device provided in this specification;

[0063] Figure 3 is a schematic diagram of an electronic device corresponding to Figure 1, provided in this specification.

Detailed Implementation

[0064] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this specification.

[0065] The technical solutions provided in the various embodiments of this specification are described in detail below with reference to the accompanying drawings.

[0066] Figure 1 is a flowchart of an image recognition method provided in this specification, which may specifically include the following steps:

[0067] S100: Receive the image sequence to be identified, determine the part label of each slice image in the image sequence to be identified through a pre-trained human body part recognition model, and select at least some slice images from each slice image as the first image according to the part label.

[0068] The execution subject of the technical solution in this specification can be any computing device with computing capabilities (such as a server, terminal, etc.). For ease of description, the server will be used as the execution subject in this description.

[0069] Generally, medical image sequences of the human body can be obtained through equipment such as CT or MRI. If the medical image sequence needs to be recognized to determine whether the slice image in the medical image sequence is a plain scan image or an enhanced image, the medical image sequence can be sent or input to the server as the image sequence to be recognized, and the server can receive the image sequence to be recognized.

[0070] It should be noted that the image sequence to be identified can be derived from a DICOM sequence or from 3D volume data; this specification does not impose any restrictions on the specific source.

[0071] In one or more embodiments of this specification, the server may receive a 3D image to be identified, slice the 3D image to be identified, determine each slice image corresponding to the 3D image, and use each slice image as a sequence of images to be identified. That is, when the server performs image recognition on a 3D image, i.e., the received image to be identified is a 3D image, it can slice the 3D image to obtain several slice images, i.e., 2D images, and use each slice image obtained by slicing the 3D image as a sequence of images to be identified.
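The slicing step can be illustrated with a short sketch. It assumes the three-dimensional image is held as a nested list indexed `volume[z][y][x]`; the function name `slice_volume` and the `axis` parameter are hypothetical, and the specification leaves the actual slicing method open.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Slice = List[List[float]]


def slice_volume(volume: Volume, axis: int = 0) -> List[Slice]:
    """Split a 3D volume into a list of 2D slices along one axis (0=z, 1=y, 2=x)."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:
        # One slice per z position; rows are copied so slices are independent.
        return [[row[:] for row in plane] for plane in volume]
    if axis == 1:
        # One slice per y position; each slice has one row per z.
        return [[volume[z][y][:] for z in range(depth)] for y in range(height)]
    if axis == 2:
        # One slice per x position; each slice has one row per z.
        return [
            [[volume[z][y][x] for y in range(height)] for z in range(depth)]
            for x in range(width)
        ]
    raise ValueError("axis must be 0, 1 or 2")
```

The resulting list plays the role of the image sequence to be identified in the steps that follow.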

[0072] Since the image sequence to be identified contains more than one slice image, and each slice image may come from different parts of the human body, the recognition results corresponding to slice images from different parts may be different. In other words, slice images from different parts have a certain impact on the results, and the image to be detected may only be a certain part of the human body. Therefore, in one or more embodiments of this specification, the server can input the image sequence to be identified into a pre-trained human body part recognition model to determine the part label of each slice image in the image sequence to be identified.

[0073] The human body part recognition model can be an existing model.

[0074] After determining the part labels of each slice image, the server can select at least a portion of the slice images as the first image based on the part labels.

[0075] In one or more embodiments of this specification, when the image to be identified received by the server is a three-dimensional image, the three-dimensional image can be further divided into several sub-three-dimensional images. Based on each sub-three-dimensional image, a pre-trained human body part recognition model is used to determine the part label of each sub-three-dimensional image. Then, for each part label, at least one sub-three-dimensional image is selected from the sub-three-dimensional images corresponding to that part label as the first image. For example, if the three-dimensional image A to be identified is divided into three sub-three-dimensional images a, b, and c, and the sub-three-dimensional images a, b, and c are input into the human body part recognition model, the result is: sub-three-dimensional image a represents the stomach, sub-three-dimensional image b represents the stomach, and sub-three-dimensional image c represents the heart. Since each organ may be divided into multiple sub-three-dimensional images (such as sub-three-dimensional image a and sub-three-dimensional image b), at least one sub-three-dimensional image can be selected from the sub-three-dimensional images corresponding to each part label as the first image to represent the three-dimensional image of the part corresponding to that part label (e.g., sub-three-dimensional image b can be selected to represent the three-dimensional image of the stomach). This allows the image to be identified to be recognized in subsequent steps based on a more comprehensive slice image of the human body (i.e., the first image).
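The per-label selection in this example could look like the sketch below. The function name `select_per_part`, the `(sub_image_id, part_label)` pair format, and the policy of taking the middle sub-image of each group are assumptions made for illustration; the specification only requires that at least one sub-three-dimensional image be chosen per part label.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def select_per_part(labelled: Sequence[Tuple[str, str]]) -> Dict[str, str]:
    """Pick one sub-image id for every part label.

    `labelled` holds (sub_image_id, part_label) pairs in scan order.
    The middle element of each group is chosen (an assumed policy).
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for sub_id, part in labelled:
        groups[part].append(sub_id)
    return {part: ids[len(ids) // 2] for part, ids in groups.items()}
```

With the labels from the example (a and b are the stomach, c is the heart), this policy happens to select b for the stomach and c for the heart.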

[0076] Furthermore, in one or more embodiments of this specification, when the server selects at least a portion of the slice images from each slice image as the first image based on the part labels, the server may first obtain the part labels of the parts of interest, and determine the slice images of interest as the first image based on the part labels of the parts of interest and the determined part identification labels of each slice image.
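Selecting the slices of interest amounts to a filter over the part labels. A minimal sketch, assuming the labels are plain strings and using the hypothetical name `select_first_images`:

```python
from typing import List, Sequence, Set


def select_first_images(part_labels: Sequence[str], parts_of_interest: Set[str]) -> List[int]:
    """Return the indices of the slices whose part label is a part of interest."""
    return [i for i, label in enumerate(part_labels) if label in parts_of_interest]
```

Returning indices rather than images keeps the link between each first image and its part label, which the weighted decision rules later in the description rely on.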

[0077] Furthermore, after obtaining the first image, in order to ensure that the specifications and styles of each first image are consistent, the server processes each first image using a preset image processing method.

[0078] The preset image processing methods include at least one of the following: grayscale normalization, image orientation correction, and image cropping. Of course, other image processing methods can also be used to preprocess the image, and this specification does not impose any restrictions on them.
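Two of the listed preprocessing steps can be sketched as follows. Min-max scaling to [0, 1] and a square center crop are assumed choices; the specification names grayscale normalization and cropping but does not fix the formulas, and orientation correction is left out here because it depends on how orientation is recorded for the image. The names `normalize_gray` and `center_crop` are hypothetical.

```python
from typing import List

Image = List[List[float]]


def normalize_gray(image: Image) -> Image:
    """Rescale gray values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


def center_crop(image: Image, size: int) -> Image:
    """Cut a size x size window from the middle of the image."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]
```

Applying the same functions to every first image is one way to give all inputs a consistent value range and size before recognition.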

[0079] It should be noted that when the image to be identified received by the server is a three-dimensional image, the segmentation of the three-dimensional image to obtain sub-three-dimensional images can be based on location or volume. This specification does not impose any restrictions on the specific segmentation method used, and existing mature technologies can be employed, which will not be elaborated upon here. Furthermore, this specification does not restrict the method used by the server when slicing the three-dimensional image; the specific method can be adaptively set according to the specific situation.

[0080] S102: Input each first image into a pre-trained image pattern recognition model to determine the recognition result corresponding to each first image.

[0081] Since each first image is selected from the sequence of images to be recognized, the recognition result of each first image must be determined first to determine the recognition result of the image to be recognized. Therefore, after the server obtains the first images, it can input each first image into a pre-trained image pattern recognition model to determine the recognition result of each first image. This allows the server to determine the final recognition result of the image to be recognized based on the recognition results of each first image in subsequent steps.

[0082] In one or more embodiments of this specification, the recognition result may be either an enhanced image or a plain scan image. That is, the image pattern recognition model is a model that classifies an image as either a plain scan image or an enhanced image. Specifically, when training the image pattern recognition model, sample images and their corresponding labels can be obtained first. These labels represent the actual recognition result (i.e., enhanced image or plain scan image) corresponding to the sample images. The sample images can then be input into the image pattern recognition model to be trained to obtain prediction results. Finally, the image pattern recognition model to be trained is optimized with the objective of minimizing the difference between the label corresponding to the sample image and the prediction result. A large number of sample images and labels are used in this way to obtain the pre-trained image pattern recognition model.
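The training loop described above (predict, compare with the label, adjust the model to reduce the difference) can be illustrated with a deliberately small stand-in. The sketch below is not the model of the specification, which does not disclose an architecture: it assumes each sample image has already been reduced to a single hypothetical feature (for instance a normalized mean gray value) and fits a logistic regression by gradient descent on the cross-entropy loss. All names, the learning rate, and the epoch count are assumptions.

```python
import math
from typing import Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(
    samples: Sequence[Tuple[float, int]], lr: float = 0.5, epochs: int = 2000
) -> Tuple[float, float]:
    """Fit weight and bias on (feature, label) pairs; label 1 = enhanced, 0 = plain scan."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += error * x
            grad_b += error
        # Step against the mean gradient of the cross-entropy loss.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_enhanced_probability(x: float, w: float, b: float) -> float:
    """Probability that a sample with feature x is an enhanced image."""
    return sigmoid(w * x + b)
```

A real image pattern recognition model would work on the images themselves, but the structure of the loop (forward pass, loss against the label, parameter update) is the same.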

[0083] S104: Based on the determined recognition results, determine the final result of recognizing the image to be recognized; wherein the final result is either an enhanced image or a plain scan image.

[0084] The server can determine the final result of the image to be recognized based on the recognition results of each first image.

[0085] In one or more embodiments of this specification, when determining the final result of an image to be recognized, the server may determine the proportion of recognition results belonging to the enhanced image among the recognition results of each first image. If the proportion is greater than a first preset value, the final result of recognizing the image to be recognized is determined to be an enhanced image; if the proportion is not greater than the first preset value, the final result of recognizing the image to be recognized is determined to be a plain scan image. For example, when the first preset value is 50%, the recognition result that appears most frequently among all recognition results is taken as the final result of the image to be recognized, with a tie resolved as a plain scan image.
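The proportion rule can be written in a few lines. In this sketch the labels `"enhanced"` and `"plain"`, the function name `decide_by_proportion`, and the default threshold of 0.5 are assumptions; the specification only refers to a first preset value.

```python
from typing import Sequence


def decide_by_proportion(results: Sequence[str], threshold: float = 0.5) -> str:
    """Return "enhanced" if the share of enhanced results exceeds the threshold, else "plain"."""
    if not results:
        raise ValueError("at least one recognition result is required")
    share = sum(1 for r in results if r == "enhanced") / len(results)
    return "enhanced" if share > threshold else "plain"
```

Because the comparison is strict, a share exactly equal to the threshold yields a plain scan result, matching the "not greater than" branch of the rule.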

[0086] When determining the final result of the image to be recognized, the server can also determine the part label of the first image corresponding to each recognition result, and determine the weight corresponding to the recognition result according to the preset weight of each part label. It also determines the sum of all weighted recognition results belonging to the enhanced image as a first value, and the sum of all weighted recognition results belonging to the plain scan image as a second value. Furthermore, if the first value is greater than the second value, the final result is determined to be an enhanced image; if the first value is not greater than the second value, the final result is determined to be a plain scan image.

[0087] For example, there are three first images: one shows the stomach, one shows the heart, and one shows the spleen. The recognition result of the first image corresponding to the stomach is an enhanced image, the recognition result of the first image corresponding to the heart is a plain scan image, and the recognition result of the first image corresponding to the spleen is an enhanced image. Assuming that the preset weight ratio of the stomach, heart, and spleen is 2:1:1, then the first value is 2×1+1×1=3 (stomach plus spleen), the second value is 1×1=1 (heart), and the final result is an enhanced image, that is, the image to be recognized corresponding to the first images is an enhanced image.
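The weighted vote in this example can be reproduced with the following sketch. The function name `decide_by_weighted_vote`, the `(part, result)` pair format, and the label strings are assumptions for illustration.

```python
from typing import Mapping, Sequence, Tuple


def decide_by_weighted_vote(
    results: Sequence[Tuple[str, str]], weights: Mapping[str, float]
) -> Tuple[float, float, str]:
    """Sum the part weights of enhanced and plain results and compare the two sums.

    `results` holds (part_label, recognition_result) pairs.
    Returns (first_value, second_value, final_result).
    """
    first = sum(weights[part] for part, r in results if r == "enhanced")
    second = sum(weights[part] for part, r in results if r == "plain")
    return first, second, ("enhanced" if first > second else "plain")


example = [("stomach", "enhanced"), ("heart", "plain"), ("spleen", "enhanced")]
print(decide_by_weighted_vote(example, {"stomach": 2, "heart": 1, "spleen": 1}))
```

With the 2:1:1 weights the first value is 3 and the second value is 1, so the sketch returns an enhanced result, in line with the worked example.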

[0088] Furthermore, since image pattern recognition models typically output the recognition result with the highest probability, and sometimes the probabilities of different recognition results are not significantly different, to improve the accuracy of the final result, the server, when determining the final result, can determine a first probability that the first image belongs to an enhanced image and a second probability that it belongs to a plain scan image for each first image. Then, based on the preset weights of the part labels corresponding to the first image, the first probability and the second probability are weighted respectively, and the sum of the weighted first probabilities and the sum of the weighted second probabilities for each first image are determined. Finally, if the sum of the first probabilities is greater than the sum of the second probabilities, the final result is determined to be an enhanced image; if the sum of the first probabilities is not greater than the sum of the second probabilities, the final result is determined to be a plain scan image.

[0089] For example, there are three first images: one depicting the stomach, one the heart, and one the spleen. Assume the probability that the stomach image is an enhanced image is 0.6, and the probability that it is a plain scan image is 0.4; the probability that the heart image is an enhanced image is 0.4, and the probability that it is a plain scan image is 0.6; the probability that the spleen image is an enhanced image is 0.6, and the probability that it is a plain scan image is 0.4. Assuming the preset weight ratio for the stomach, heart, and spleen is 1:3:1, the weighted first probability of the first image corresponding to the stomach is 0.6 × 1 = 0.6, and its weighted second probability is 0.4 × 1 = 0.4. Similarly, the weighted first probability of the first image corresponding to the heart is 0.4 × 3 = 1.2, and its weighted second probability is 0.6 × 3 = 1.8. The weighted first probability of the first image corresponding to the spleen is 0.6 × 1 = 0.6, and its weighted second probability is 0.4 × 1 = 0.4. Therefore, the sum of the weighted first probabilities is 0.6 + 1.2 + 0.6 = 2.4, and the sum of the weighted second probabilities is 0.4 + 1.8 + 0.4 = 2.6. Since 2.4 is not greater than 2.6, the final result is a plain scan image, meaning the image to be identified corresponding to the first images is a plain scan image.
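The same arithmetic can be expressed as a sketch. The function name `decide_by_weighted_probability` and the `(part, p_enhanced, p_plain)` tuple format are assumptions for illustration.

```python
from typing import Mapping, Sequence, Tuple


def decide_by_weighted_probability(
    predictions: Sequence[Tuple[str, float, float]], weights: Mapping[str, float]
) -> Tuple[float, float, str]:
    """Weight both class probabilities of every first image by its part weight.

    `predictions` holds (part_label, p_enhanced, p_plain) tuples.
    Returns (sum of weighted first probabilities,
             sum of weighted second probabilities, final_result).
    """
    enhanced_sum = sum(weights[part] * p_enh for part, p_enh, _ in predictions)
    plain_sum = sum(weights[part] * p_plain for part, _, p_plain in predictions)
    return enhanced_sum, plain_sum, ("enhanced" if enhanced_sum > plain_sum else "plain")
```

Called with the stomach, heart, and spleen probabilities above and the weights 1, 3, and 1, the two sums come out at about 2.4 and 2.6 (up to floating-point rounding), so the result is a plain scan image. Note that two of the three images individually lean towards "enhanced"; the heavier weight on the heart is what changes the outcome.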

[0090] Of course, in one or more embodiments of this specification, when determining the final result of the image to be identified based on the preset weights of the part labels, in order to reduce the amount of computation while ensuring the accuracy of the identification result, the server can weight only the probability of the recognition result of each first image by the preset weight of its part label, sum the weighted probabilities separately for the enhanced image results and the plain scan image results, and then determine the final identification result based on the relationship between the two sums. Suppose there are seven first images in total. Four of these images involve the chest, and three involve the abdomen. The four chest images are recognized as enhanced images with a probability of 0.6 each, while the three abdominal images are recognized as plain scan images with a probability of 0.9 each. Further assuming the preset weights for the part labels are chest:abdomen = 2:1, the weighted sum of probabilities for the enhanced images is 2 × 0.6 × 4 = 4.8, and the weighted sum of probabilities for the plain scan images is 1 × 0.9 × 3 = 2.7. Since 4.8 > 2.7, the final recognition result for the image to be identified is determined to be an enhanced image.
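The reduced-computation variant of paragraph [0090] uses only the probability of each first image's own recognition result, multiplied by the preset weight of its part label. The sketch below is illustrative: `weighted_top_result_vote` and the label strings are hypothetical names, and the inputs are the seven chest and abdomen images with weights chest:abdomen = 2:1.

```python
from typing import Dict, List, Tuple

# (part label, recognition result, probability of that recognition result)
Recognition = Tuple[str, str, float]


def weighted_top_result_vote(
    results: List[Recognition], part_weights: Dict[str, float]
) -> Tuple[str, Dict[str, float]]:
    """Add weight * probability to the sum of whichever class each image was recognized as."""
    sums = {"enhanced": 0.0, "plain_scan": 0.0}
    for part, result, probability in results:
        sums[result] += part_weights[part] * probability
    final = "enhanced" if sums["enhanced"] > sums["plain_scan"] else "plain_scan"
    return final, sums


# Four chest images recognized as enhanced (0.6), three abdominal images as plain scan (0.9).
results = [("chest", "enhanced", 0.6)] * 4 + [("abdomen", "plain_scan", 0.9)] * 3
final, sums = weighted_top_result_vote(results, {"chest": 2, "abdomen": 1})
print(final, round(sums["enhanced"], 2), round(sums["plain_scan"], 2))
```

For these inputs the enhanced sum is 4.8 and the plain scan sum is 2.7, so the comparison rule yields an enhanced image.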

[0091] Furthermore, when determining the recognition result of the image to be recognized, the server can also use the probability of the recognition result of each first image as a weight, and weight the recognition results according to the weight to determine the recognition result of the image to be recognized. For example: there are a total of seven first images. Among the first images, four involve the chest and three involve the abdomen. The four chest first images are enhanced images, and the probability of being enhanced images is 0.6. The three abdominal first images are plain scan images, and the probability of being plain scan images is 0.9. Then the weight ratio of enhanced images to plain scan images is 0.6:0.9. The sum of the probabilities of being enhanced images in the first images is 0.6×4=2.4, and the sum of the probabilities of being plain scan images in the first images is 0.9×3=2.7. Obviously, 2.4<2.7. Therefore, the final recognition result of the image to be recognized is the plain scan image.

[0092] Of course, since the four chest images are recognized as enhanced images and the probability of each being an enhanced image is 0.6, the probability of each chest image being a plain scan image is 0.4. Therefore, when performing the weighted summation, the probabilities of both recognition results can be taken into account for every first image. That is, the sum of the probabilities of the four chest images being enhanced images is 0.6 × 4 = 2.4, the sum of the probabilities of the four chest images being plain scan images is 0.4 × 4 = 1.6, the sum of the probabilities of the three abdominal images being plain scan images is 0.9 × 3 = 2.7, and the sum of the probabilities of the three abdominal images being enhanced images is 0.1 × 3 = 0.3. The total probability of the first images being enhanced images is therefore 2.4 + 0.3 = 2.7, and the total probability of the first images being plain scan images is 1.6 + 2.7 = 4.3. Obviously, 2.7 < 4.3, so the final recognition result of the image to be recognized is a plain scan image.
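The two probability-as-weight variants of paragraphs [0091] and [0092] can be compared side by side. In this sketch the helper names (`sum_top_probabilities`, `sum_both_probabilities`, `decide`) are assumptions, and the second variant assumes the two class probabilities of an image sum to 1, as the text implies with 0.6/0.4 and 0.9/0.1.

```python
from typing import Dict, List, Tuple

# (recognition result, probability of that recognition result)
Result = Tuple[str, float]


def sum_top_probabilities(results: List[Result]) -> Dict[str, float]:
    """[0091]: each image contributes only the probability of its own recognition result."""
    sums = {"enhanced": 0.0, "plain_scan": 0.0}
    for label, probability in results:
        sums[label] += probability
    return sums


def sum_both_probabilities(results: List[Result]) -> Dict[str, float]:
    """[0092]: each image also contributes the complementary probability to the other class."""
    sums = {"enhanced": 0.0, "plain_scan": 0.0}
    for label, probability in results:
        other = "plain_scan" if label == "enhanced" else "enhanced"
        sums[label] += probability
        sums[other] += 1.0 - probability  # assumes a two-class output summing to 1
    return sums


def decide(sums: Dict[str, float]) -> str:
    return "enhanced" if sums["enhanced"] > sums["plain_scan"] else "plain_scan"


# Four chest images recognized as enhanced (0.6), three abdominal images as plain scan (0.9).
results = [("enhanced", 0.6)] * 4 + [("plain_scan", 0.9)] * 3
top_sums = sum_top_probabilities(results)
both_sums = sum_both_probabilities(results)
print(decide(top_sums), decide(both_sums))
```

Both variants select the plain scan image here: the first compares 2.4 against 2.7, and the second compares 2.7 against 4.3.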

[0093] In the image recognition method provided in this specification and shown in Figure 1, the server receives a sequence of images to be recognized and, through a pre-trained human body part recognition model, determines the part label of each slice image in the sequence and selects the first images based on the part labels. Then, the recognition result of each first image is determined. Finally, based on the recognition results of the first images, the final recognition result of the image to be recognized is determined. Because the pre-trained image pattern recognition model recognizes only the slice images selected based on part labels, and the final recognition result (i.e., whether the image to be recognized is a plain scan image or an enhanced image) is determined from the recognition results of those slice images, both the accuracy and the efficiency of image recognition are increased.

[0094] Furthermore, adjacent slice images in the image sequence to be identified are highly similar, meaning that adjacent slice images generally have the same part label. Therefore, in step S100 above, when the server determines the part label of each slice image in the image sequence to be identified through the pre-trained human body part recognition model, the server can, for each slice image in the sequence, determine the slice images adjacent to it and use the slice image together with the determined adjacent slice images as the target slice images. The target slice images can then be input into the human body part recognition model, and the human body part recognition result output by the model serves as the part label of the slice image.

[0095] A slice image and its adjacent slice images can be used as target slice images, and the target slice images can be input into the human body part recognition model. This increases the amount of data in the slice images to be recognized, thus avoiding the problem of inaccurate part recognition results due to insufficient data when using a single slice image.
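The neighbor grouping of paragraphs [0094] and [0095] can be sketched as a sliding window over the sequence. Everything named here is an assumption for illustration: the text does not state how many neighbors are used, so `radius` is a hypothetical parameter, and `majority_part_model` is a stand-in that replaces the real human body part recognition model with a majority vote over noisy per-slice guesses.

```python
from collections import Counter
from typing import Callable, List, Sequence


def build_target_slices(sequence: Sequence, index: int, radius: int = 1) -> list:
    """Return the slice at `index` together with up to `radius` neighbors on each side."""
    lo = max(0, index - radius)
    hi = min(len(sequence), index + radius + 1)
    return list(sequence[lo:hi])


def label_sequence(
    sequence: Sequence, part_model: Callable[[list], str], radius: int = 1
) -> List[str]:
    """Label every slice by feeding its target slices (itself plus neighbors) to the model."""
    return [
        part_model(build_target_slices(sequence, i, radius))
        for i in range(len(sequence))
    ]


def majority_part_model(target_slices: list) -> str:
    """Stand-in model: each 'slice' is a noisy part guess; return the most common one."""
    return Counter(target_slices).most_common(1)[0][0]


# One slice in the middle of a chest scan is misjudged when considered on its own.
noisy_guesses = ["chest", "chest", "abdomen", "chest", "chest"]
labels = label_sequence(noisy_guesses, majority_part_model)
print(labels)
```

In this toy run the isolated outlier is outvoted by its neighbors, which mirrors the stated motivation: a single slice may carry too little information for a reliable part label.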

[0096] Based on the image recognition method described above, this specification also provides a corresponding schematic diagram of an image recognition device, as shown in Figure 2.

[0097] Figure 2 is a schematic diagram of an image recognition apparatus provided in an embodiment of this specification, the apparatus comprising:

[0098] The image receiving module 200 is used to receive a sequence of images to be identified, determine the part labels of each slice image in the sequence of images to be identified through a pre-trained human body part recognition model, and select at least a portion of the slice images as the first image based on the part labels.

[0099] Image input module 204 is used to input each of the first images into a pre-trained image pattern recognition model to determine the recognition result corresponding to each of the first images;

[0100] The result determination module 206 is used to determine the final result of recognizing the image to be recognized based on the determined recognition results; wherein the final result is either an enhanced image or a plain scan image.

[0101] Optionally, the image receiving module 200 is specifically used to receive a three-dimensional image to be identified, slice the three-dimensional image, and determine each slice image corresponding to the three-dimensional image as the image sequence to be identified.

[0102] Optionally, the image receiving module 200 is specifically configured to: receive the three-dimensional image to be identified and divide the three-dimensional image into several sub-three-dimensional images; determine the part label of each sub-three-dimensional image based on the obtained sub-three-dimensional images through a pre-trained human part recognition model; and select at least one sub-three-dimensional image from the sub-three-dimensional images corresponding to each part label as the first image.
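The sub-volume option of paragraph [0102] can be illustrated as below. This is only a sketch under stated assumptions: the text does not say how the three-dimensional image is divided, so splitting along the slice axis into blocks of a fixed `depth` is a choice made here, and `split_volume` and `pick_first_images` are hypothetical helpers. The part labels would come from the part recognition model; here they are supplied by hand.

```python
from typing import Dict, List, Sequence


def split_volume(volume: Sequence, depth: int) -> List[list]:
    """Divide a volume (a sequence of slices) into consecutive sub-volumes of `depth` slices."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return [list(volume[i:i + depth]) for i in range(0, len(volume), depth)]


def pick_first_images(sub_volumes: List[list], labels: List[str]) -> Dict[str, list]:
    """Keep one sub-volume per part label (the first one encountered) as a first image."""
    chosen: Dict[str, list] = {}
    for sub_volume, label in zip(sub_volumes, labels):
        chosen.setdefault(label, sub_volume)
    return chosen


# Seven slices, represented by their indices, split into blocks of three.
sub_volumes = split_volume(list(range(7)), depth=3)
# Hand-written labels standing in for the part recognition model's output.
first_images = pick_first_images(sub_volumes, ["chest", "chest", "abdomen"])
print(sub_volumes, first_images)
```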

[0103] Optionally, the image receiving module 200 is specifically configured to: acquire the part label of the part of interest; determine the slice image of interest as the first image based on the part label of the part of interest and the part identification label of each slice image; and process each first image according to a preset image processing method; wherein the preset image processing method includes at least one of grayscale value normalization, image orientation correction, and image cropping.
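The selection and preprocessing steps of paragraph [0103] might look like the following sketch. The specification names grayscale value normalization and image cropping but does not define them, so min-max scaling to [0, 1] and a center crop are assumptions made here, and all function names are hypothetical. Orientation correction is omitted because it depends on scanner metadata that the text does not describe.

```python
from typing import List, Sequence, Set

Image = List[List[float]]


def select_first_images(slices: Sequence, labels: Sequence[str], parts_of_interest: Set[str]) -> list:
    """Keep the slices whose part label matches a part of interest."""
    return [s for s, label in zip(slices, labels) if label in parts_of_interest]


def normalize_gray(image: Image) -> Image:
    """Min-max scale gray values to [0, 1]; a constant image maps to all zeros."""
    flat = [value for row in image for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in image]
    return [[(value - lo) / (hi - lo) for value in row] for row in image]


def center_crop(image: Image, height: int, width: int) -> Image:
    """Cut a height x width window from the middle of the image."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]


selected = select_first_images(["s0", "s1", "s2"], ["head", "chest", "abdomen"], {"chest", "abdomen"})
normalized = normalize_gray([[0, 50], [100, 200]])
cropped = center_crop([[r * 4 + c for c in range(4)] for r in range(4)], 2, 2)
print(selected, normalized, cropped)
```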

[0104] Optionally, the recognition result is either an enhanced image or a plain scan image;

[0105] The result determination module 206 is specifically used to determine the proportion of recognition results belonging to enhanced images among the determined recognition results; if the proportion is greater than a first preset value, then the final result of the recognition of the image to be recognized is determined to be an enhanced image; if the proportion is not greater than the first preset value, then the final result of the recognition of the image to be recognized is determined to be a plain scan image.
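The proportion rule of paragraph [0105] reduces to a single threshold comparison. In this sketch the function name `proportion_vote` and the default threshold of 0.5 are assumptions; the specification only refers to "a first preset value" without giving a number.

```python
from typing import List


def proportion_vote(results: List[str], first_preset_value: float = 0.5) -> str:
    """Return 'enhanced' if the share of enhanced results exceeds the preset value."""
    if not results:
        raise ValueError("at least one recognition result is required")
    proportion = sum(1 for r in results if r == "enhanced") / len(results)
    # A proportion equal to the threshold counts as "not greater than", so it yields plain scan.
    return "enhanced" if proportion > first_preset_value else "plain_scan"


majority_enhanced = proportion_vote(["enhanced"] * 4 + ["plain_scan"] * 3)
tie = proportion_vote(["enhanced", "plain_scan"])
print(majority_enhanced, tie)
```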

[0106] Optionally, the result determination module 206 is specifically configured to: for each recognition result, determine the part label of the first image corresponding to the recognition result; determine the weight corresponding to the recognition result according to the preset weight of each part label; determine the sum of each weighted recognition result belonging to the enhanced image as a first value, and the sum of each weighted recognition result belonging to the plain scan image as a second value; when the first value is greater than the second value, determine the final result as an enhanced image; when the first value is not greater than the second value, determine the final result as a plain scan image.
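The first value and second value of paragraph [0106] can be read as weighted vote counts, where each recognition result counts as much as the preset weight of its part label. The sketch below follows that reading; `weighted_label_vote` is a hypothetical name, and the sample results and weights are illustrative rather than taken from the specification.

```python
from typing import Dict, List, Tuple

# (part label of the first image, recognition result)
LabeledResult = Tuple[str, str]


def weighted_label_vote(
    results: List[LabeledResult], part_weights: Dict[str, float]
) -> Tuple[str, float, float]:
    """Sum part weights per recognition result and compare the two totals."""
    first_value = sum(part_weights[part] for part, result in results if result == "enhanced")
    second_value = sum(part_weights[part] for part, result in results if result == "plain_scan")
    final = "enhanced" if first_value > second_value else "plain_scan"
    return final, first_value, second_value


# Illustrative inputs: two lightly weighted enhanced votes against one heavily weighted plain scan vote.
votes = [("stomach", "enhanced"), ("heart", "plain_scan"), ("spleen", "enhanced")]
final, first_value, second_value = weighted_label_vote(votes, {"stomach": 1, "heart": 3, "spleen": 1})
print(final, first_value, second_value)
```

Here an unweighted majority would favor the enhanced image, but the heavier heart weight makes the second value larger, so the weighted rule selects the plain scan image.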

[0107] Optionally, the result determination module 206 is further configured to, for each first image, determine a first probability that the first image belongs to an enhanced image and a second probability that it belongs to a plain scan image; weight the first probability and the second probability according to the preset weights of the part labels corresponding to the first image; determine the sum of the weighted first probabilities for each first image and the sum of the weighted second probabilities for each first image; when the sum of the first probabilities is greater than the sum of the second probabilities, determine the final result as an enhanced image; when the sum of the first probabilities is not greater than the sum of the second probabilities, determine the final result as a plain scan image.

[0108] Optionally, the image receiving module 200 is specifically configured to: for each slice image in the image sequence to be identified, determine the adjacent slice images of the slice image, and use the slice image and the determined adjacent slice images as target slice images; input the target slice image into the human body part recognition model to obtain the human body part recognition result output by the human body part recognition model, and use it as the part label of the slice image.

[0109] This specification also provides a computer-readable storage medium storing a computer program that can be used to perform the image recognition method described above.

[0110] Based on the image recognition method described above, embodiments of this specification also propose an electronic device, a schematic structural diagram of which is shown in Figure 3. As shown in Figure 3, at the hardware level, the electronic device includes a processor, internal bus, network interface, memory, and non-volatile memory, and may also include other hardware required for the business. The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it to implement the image recognition method described above.

[0111] Of course, in addition to software implementation, this specification does not exclude other implementation methods, such as logic devices or a combination of hardware and software. In other words, the execution subject of the following processing flow is not limited to each logic unit, but can also be hardware or logic devices.

[0112] In the 1990s, improvements to a technology could be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the methodology). However, with technological advancements, many methodological improvements today can be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved methodology into the hardware circuit. Therefore, it cannot be said that a methodological improvement cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. Similar to the software compiler used in program development, the original code before compilation must be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.

[0113] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.

[0114] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.

[0115] For ease of description, the above devices are described in terms of function, divided into various units. Of course, in implementing this specification, the functions of each unit can be implemented in one or more software and / or hardware components.

[0116] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0117] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0118] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0119] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0120] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0121] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0122] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0123] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0124] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0125] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.

[0126] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments.

[0127] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this application.

Claims

1. A method for image recognition, characterized in that the method includes: receiving a sequence of images to be identified, determining the part label of each slice image in the sequence of images to be identified through a pre-trained human body part recognition model, and selecting at least a portion of the slice images as first images based on the part labels; inputting each first image into a pre-trained image pattern recognition model to determine the recognition result corresponding to each first image, wherein the recognition result is either an enhanced image or a plain scan image; and determining, based on the determined recognition results, the final result of recognizing the image to be recognized; wherein, in the process of determining the final result, for each recognition result, the part label of the first image corresponding to the recognition result is determined, the weight corresponding to the recognition result is determined according to the preset weight of each part label, and the final result is determined according to each recognition result and its corresponding weight; the weights corresponding to different part labels are not all the same; and the final result is either an enhanced image or a plain scan image.

2. The method as described in claim 1, characterized in that receiving the image sequence to be identified specifically includes: receiving a 3D image to be identified, slicing the 3D image, and determining each slice image corresponding to the 3D image as the image sequence to be identified.

3. The method as described in claim 1, characterized in that receiving a sequence of images to be identified, determining the part label of each slice image in the sequence through a pre-trained human body part recognition model, and selecting at least a portion of the slice images as first images based on the part labels specifically includes: receiving a 3D image to be identified and dividing the 3D image into several sub-3D images; determining, based on the obtained sub-3D images, the part label of each sub-3D image through the pre-trained human body part recognition model; and, for each part label, selecting at least one sub-3D image from the sub-3D images corresponding to that part label as a first image.

4. The method as described in claim 1, characterized in that selecting at least a portion of the slice images as first images based on the part labels specifically includes: acquiring the part label of the part of interest; determining, based on the part label of the part of interest and the part identification label of each slice image, the slice images of interest as the first images; and processing each first image according to a preset image processing method; wherein the preset image processing method includes at least one of grayscale value normalization, image orientation correction, and image cropping.

5. The method as described in claim 1, characterized in that determining, based on the determined recognition results, the final result of recognizing the image to be recognized specifically includes: determining the proportion of recognition results belonging to the enhanced image among the determined recognition results; if the proportion is greater than a first preset value, determining that the final result of recognizing the image to be recognized is an enhanced image; and if the proportion is not greater than the first preset value, determining that the final result of recognizing the image to be recognized is a plain scan image.

6. The method as described in claim 1, characterized in that determining, based on the determined recognition results, the final result of recognizing the image to be recognized specifically includes: determining the sum of the weighted recognition results belonging to the enhanced image as a first value, and the sum of the weighted recognition results belonging to the plain scan image as a second value; when the first value is greater than the second value, determining the final result to be an enhanced image; and when the first value is not greater than the second value, determining the final result to be a plain scan image.

7. The method as described in claim 1, characterized in that determining, based on the determined recognition results, the final result of recognizing the image to be recognized further includes: for each first image, determining a first probability that the first image belongs to an enhanced image and a second probability that it belongs to a plain scan image; weighting the first probability and the second probability according to the preset weight of the part label corresponding to the first image; determining the sum of the weighted first probabilities corresponding to the first images, and the sum of the weighted second probabilities corresponding to the first images; when the sum of the first probabilities is greater than the sum of the second probabilities, determining the final result to be an enhanced image; and when the sum of the first probabilities is not greater than the sum of the second probabilities, determining the final result to be a plain scan image.

8. The method as described in claim 1, characterized in that determining the part label of each slice image in the image sequence to be identified through a pre-trained human body part recognition model specifically includes: for each slice image in the image sequence to be identified, determining the slice images adjacent to the slice image, and using the slice image and the determined adjacent slice images as target slice images; and inputting the target slice images into the human body part recognition model to obtain the human body part recognition result output by the human body part recognition model, which is used as the part label of the slice image.

9. A computer-readable storage medium, characterized in that, The storage medium stores a computer program, which, when executed by a processor, implements the method described in any one of claims 1-8.

10. An electronic device, characterized in that, It includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method described in any one of claims 1-8.