Image labeling reliability prediction method and device, electronic equipment and storage medium
By calculating the similarity of annotation-related local image regions and visual features to an annotator's historical annotation results, the method addresses the large subjective differences in image annotation, enabling objective evaluation of annotation results and improving their reliability and consistency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- PEKING UNION MEDICAL COLLEGE HOSPITAL
- Filing Date
- 2022-06-15
- Publication Date
- 2026-06-30
AI Technical Summary
Current image annotation relies on manual methods, leading to significant subjective differences and a lack of objective reliability assessment methods. This is especially true in medical image recognition, where doctors' experience and areas of expertise differ, resulting in inconsistent annotation results.
The reliability of the current annotation result is predicted by calculating the similarity of the annotation-related local image regions and visual features to the annotator's historical annotation content. Multiple quantitative indicators of local image regions and visual features are used to analyze the annotation results and predict the reliability of the current image reading results.
It achieves an objective and unified evaluation of subjective annotation results, improves the reliability and consistency of annotation results, and enhances annotation quality and efficiency by calculating the similarity of historical annotation content.
Description
Technical Field
[0001] This application relates to the field of image recognition technology, specifically to an image annotation reliability prediction method, apparatus, electronic device, and storage medium.
Background Technology
[0002] In the field of image recognition based on AI (Artificial Intelligence) models, it is usually necessary to first label the sample image data, and then use the labeled sample image data for machine learning and training. To obtain effective labeled data, most existing technologies rely on manual labeling of the collected sample image data.
[0003] Manual annotation relies entirely on an individual's experience and understanding of the image content. It involves meticulously sketching target objects within the image and adding corresponding annotations to aid in identifying the image content or type (e.g., related lesion types). Existing image data annotation methods, due to their dependence on the experience and understanding of the image content by individual technicians, often result in subjective differences in the annotated sample image data from different image readers, lacking objectivity. Furthermore, existing methods, relying on manual annotation, are relatively slow and prone to errors.
[0004] In existing technologies, to address the inconsistencies and lack of objective standards in manual annotation due to subjectivity, a large number of annotation results are typically collected for comprehensive evaluation. For example, in medical image recognition, the gold standard diagnosis in a standard image library is formed through collaborative review by multiple doctors. Only with a gold standard image library can machine interpretation (i.e., automated interpretation using AI models trained by machine learning) be possible. In forming the gold standard diagnosis, the reliability of each doctor's interpretation needs to be considered. However, since doctors have different professional experience and areas of expertise, the reliability of annotations for different medical images is not entirely the same. There is no unified, quantifiable objective standard; existing technologies typically rely on the doctor's authority in the field to judge the reliability of the interpretation results. Therefore, there is no objective method for evaluating the reliability of annotation results in existing technologies.
Summary of the Invention
[0005] To address the aforementioned technical problems in the prior art, this application proposes an image annotation reliability prediction method, apparatus, electronic device, and computer-readable storage medium. By calculating the similarity between the current image reading result and historical annotation content, the reliability of the current image reading result is predicted, thus solving the problem of how to objectively evaluate subjective results.
[0006] The first aspect of this application provides an image annotation reliability prediction method, including:
[0007] Obtain local image regions related to the annotation results, and based on the local image regions related to the annotation results, obtain visual features related to the annotation results;
[0008] The similarity between the local image region, the visual features, and the annotator's historical annotation results is calculated respectively to predict the reliability of the annotator's current annotation results.
[0009] In some embodiments, before obtaining the local image region related to the annotation result, the method further includes: calculating the similarity between the annotator's current annotation result and historical annotation results.
[0010] In some embodiments, calculating the similarity between the current annotation result and the historical annotation result of the annotator includes:
[0011] The annotator's historical annotation results are divided into two subsets: an erroneous image set g0 and a correct image set g1.
[0012] Calculate the first similarity s0 between the annotator's current annotation result and the erroneous image set g0, and the second similarity s1 between the annotator's current annotation result and the correct image set g1;
[0013] The similarity between the annotator's current annotation result and historical annotation results is calculated based on the first similarity s0 and the second similarity s1.
[0014] In some embodiments, the visual features associated with the annotation results include: optic disc morphology, optic disc color, optic cup morphology, orientation of small vessels in the optic disc and optic cup, degree of defect in the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region.
[0015] In some embodiments, the method further includes managing the various annotation results using a standard database.
[0016] In some embodiments, the standard database includes a set of images to be labeled and historical labeling results.
[0017] In some embodiments, the calculation of each of the similarities is achieved by calculating vector similarity.
[0018] A second aspect of this application provides an image annotation reliability prediction apparatus, comprising:
[0019] The visual feature processing module is used to acquire local image regions related to the annotation results, and based on the local image regions related to the annotation results, acquire visual features related to the annotation results.
[0020] The annotation result reliability prediction module calculates the similarity between the local image region, the visual features, and the annotator's historical annotation results, and predicts the reliability of the annotator's current annotation result.
[0021] In some embodiments, the apparatus further includes a similarity calculation module for calculating the similarity between the current annotation result and the historical annotation result of the annotator.
[0022] In some embodiments, calculating the similarity between the current annotation result and the historical annotation result of the annotator includes:
[0023] The annotator's historical annotation results are divided into two subsets: an erroneous image set g0 and a correct image set g1.
[0024] Calculate the first similarity s0 between the annotator's current annotation result and the erroneous image set g0, and the second similarity s1 between the annotator's current annotation result and the correct image set g1;
[0025] The similarity between the annotator's current annotation result and historical annotation results is calculated based on the first similarity s0 and the second similarity s1.
[0026] In some embodiments, the visual features associated with the annotation results include: optic disc morphology, optic disc color, optic cup morphology, orientation of small vessels in the optic disc and optic cup, degree of defect in the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region.
[0027] In some embodiments, the apparatus further includes a standard database module for managing the various annotation results via a standard database.
[0028] In some embodiments, the standard database includes a set of images to be annotated and historical annotation results.
[0029] In some embodiments, the calculation of each of the similarities is achieved by calculating vector similarity.
[0030] A third aspect of this application provides an electronic device, including:
[0031] Memory and one or more processors;
[0032] The memory is communicatively connected to the one or more processors, and the memory stores instructions that can be executed by the one or more processors. When the instructions are executed by the one or more processors, the electronic device is used to implement the methods described in the foregoing embodiments.
[0033] A fourth aspect of this application provides a computer-readable storage medium having computer-executable instructions stored thereon, which, when executed by a computing device, can be used to implement the methods described in the foregoing embodiments.
[0034] A fifth aspect of this application provides a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program including program instructions which, when executed by a computer, can be used to implement the methods described in the foregoing embodiments.
[0035] In the embodiments of this application, multiple quantitative indicators of local image regions and visual features are used to analyze the annotation results. By calculating the similarity between the current image reading result and the historical annotation content, the reliability of the current image reading result is predicted, thus achieving an objective and unified evaluation of subjective results.
Attached Figure Description
[0036] The features and advantages of this application will be more clearly understood by referring to the accompanying drawings, which are illustrative and should not be construed as limiting the application in any way. In the drawings:
[0037] Figure 1 is a diagram of the physiological structure of the fundus according to some embodiments of this application;
[0038] Figure 2A is a flowchart illustrating an image annotation reliability prediction method according to some embodiments of this application;
[0039] Figure 2B is a schematic diagram illustrating the similarity between the current annotation result and the historical annotation results of the annotator, according to some embodiments of this application;
[0040] Figure 3 is a schematic diagram of the structure of an image annotation reliability prediction device according to some embodiments of this application;
[0041] Figure 4 is a schematic diagram of the logical structure of an electronic device according to some embodiments of this application;
[0042] Figure 5 is a schematic diagram of the architecture of a general-purpose computer node according to some embodiments of this application.
Detailed Implementation
[0043] In the following detailed description, numerous specific details of this application are illustrated by example to provide a thorough understanding of the relevant disclosure. However, it will be apparent to those skilled in the art that this application can be practiced without these details. It should be understood that the terms “system,” “apparatus,” “unit,” and / or “module” used in this application are one way of distinguishing different parts, elements, sections, or components at different levels in a sequential arrangement. However, these terms may be replaced with other expressions if other expressions can achieve the same purpose.
[0044] It should be understood that when a device, unit, or module is referred to as being "on," "connected to," or "coupled to" another device, unit, or module, it may be directly connected to or coupled to or communicate with other devices, units, or modules, or there may be intermediate devices, units, or modules present, unless the context explicitly indicates otherwise. For example, the term "and / or" as used herein includes any one and all combinations of one or more of the relevant listed items.
[0045] The terminology used in this application is for the purpose of describing specific embodiments only and is not intended to limit the scope of this application. As shown in the specification and claims of this application, unless the context clearly indicates otherwise, words such as "a," "an," and / or "the" do not specifically refer to the singular and may also include the plural. Generally speaking, the terms "comprising" and "including" only indicate that explicitly identified features, integers, steps, operations, elements, and / or components are included, and such expressions do not constitute an exclusive list, and other features, integers, steps, operations, elements, and / or components may also be included.
[0046] Referring to the following description and accompanying drawings, these and other features and characteristics, operating methods, functions of related structural elements, combinations of parts, and economics of manufacture of this application can be better understood, wherein the description and drawings form part of the specification. However, it is clearly understood that the drawings are for illustrative and descriptive purposes only and are not intended to limit the scope of protection of this application. It is understood that the drawings are not drawn to scale.
[0047] Various structural diagrams are used in this application to illustrate various variations of the embodiments according to this application. It should be understood that the preceding or following structures are not intended to limit this application. The scope of protection of this application is determined by the claims.
[0048] In current technologies, the annotation of sample images mainly relies on manual work, which makes the annotation results highly susceptible to subjective factors and difficult to assess in terms of reliability. Typically, in medical image recognition, the annotation data is provided by different interpreters, and the annotation results are highly dependent on the doctor's experience and personal ability. Without knowing the identity and background of the current annotator, it is virtually impossible to determine the reliability of the interpretation results, let alone control the quality of interpretation.
[0049] In view of this, embodiments of this application provide an image annotation reliability prediction method. To ensure high-quality image reading results, the method draws on each reader's historical annotations to predict annotation reliability. It takes into account that different image regions differ in importance to the recognition result. For example, in the fundus physiological structure diagram shown in Figure 1, lesions mostly appear in several key areas, and lesions in different key areas correspond to different lesion types. The reliability of the final result therefore needs to be evaluated by comprehensively considering the interpretation of multiple areas across the entire image. The technical solution of this application predicts the reliability of the current interpretation result by comprehensively considering the correct image set and the incorrect image set from the interpreter's historical training and annotation process. In addition, the embodiments of this application also consider the key areas and visual features of different lesions, and predict the reliability of the current interpretation result by calculating the similarity between the current interpretation result and the historical annotation content.
[0050] Specifically, as shown in Figure 2A, in one embodiment of this application, the image annotation reliability prediction method includes:
[0051] S201, Obtain the local image region related to the annotation result, and based on the local image region related to the annotation result, obtain the visual features related to the annotation result.
[0052] In a preferred embodiment of this application, taking the fundus image shown in Figure 1 as an example, the visual features include optic disc morphology, optic disc color, optic cup morphology, the course of small vessels in the optic disc and optic cup, the degree of damage to the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region. In the embodiments of this application, similarity along multiple dimensions is used to evaluate the annotation results, and quantitative indicators are used to objectively predict their reliability. The dimensions of similarity evaluation include not only local image regions related to the annotation results, but also visual features related to the annotation results and historical annotation results. Because the evaluation covers more dimensions, the evaluation results are more reliable.
[0053] S202, calculate the similarity between the local image region, the visual features and the annotator's historical annotation results, and predict the reliability of the annotator's current annotation results.
[0054] In a preferred embodiment of this application, the aforementioned similarity calculation is achieved by calculating vector similarity. For example, to calculate the similarity between a local image region and historical annotation results, one or more features of the local image region, such as color, texture, shape, spatial relationship, brightness, and contrast, can be extracted and combined into a vector. The corresponding features are extracted from the historical annotation results and combined into another vector, and the similarity between the two vectors is calculated (for example, via Euclidean distance), yielding the similarity between the local image region and the historical annotation results. Local image regions differ in their importance to different diseases, and the evaluation process in this embodiment emphasizes the visual features related to the annotation results. Taking fundus images as an example again, different fundus image regions differ in importance for judging different lesion types, so the similarity calculation focuses on different image regions for different analysis purposes and uses visual feature similarity directly to assist prediction, further improving the reliability of the evaluation results. As described above, fundus images contain various visual features, and different diseases show obvious lesions in different visual features. Therefore, in the embodiments of this application, the corresponding visual features are selected according to the analysis purpose, and the similarity between those visual features and historical annotation results is calculated using the similarity calculation method described above. The analysis purpose is determined by the annotation result.
For example, when the current annotation result indicates a lesion type or lesion location, the relevant visual features are selected according to that lesion type or location. Either only the similarity of the relevant visual features is calculated, or the weight of the relevant visual features is increased during the similarity calculation, so that each visual feature carries the importance appropriate to the analysis purpose. When the current annotation result indicates no lesion, all visual features can be selected for similarity calculation to improve overall prediction reliability. Alternatively, the relevant visual features in the current image can be selected item by item according to the lesion types or lesion locations in the historical annotation results, and a separate similarity calculated for each, to improve prediction reliability item by item (or the most representative one or more similarity values can be selected to represent the overall prediction reliability).
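The vector similarity described in [0054] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the optional per-feature `weights`, and the mapping from distance to similarity, `1 / (1 + d)`, are hypothetical choices. The source only states that feature vectors are compared (for example, via Euclidean distance) and that relevant visual features may be selected exclusively or given greater weight.

```python
import math
from typing import Optional, Sequence


def weighted_euclidean_distance(
    a: Sequence[float],
    b: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Euclidean distance between two feature vectors.

    `weights` is a hypothetical way to emphasize the visual features
    relevant to the current analysis purpose; a weight of 0 drops a
    feature entirely, which mimics "only the relevant features".
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(a)
    if len(weights) != len(a):
        raise ValueError("weights must match the feature vector length")
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def similarity(
    a: Sequence[float],
    b: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Map a distance to a similarity in (0, 1].

    The 1 / (1 + d) mapping is an assumption made for this sketch;
    identical vectors score 1.0 and the score decays with distance.
    """
    return 1.0 / (1.0 + weighted_euclidean_distance(a, b, weights))
```

In this sketch, passing `weights=[1.0, 0.0]` restricts the comparison to the first feature, while larger weights on selected features increase their influence without discarding the others.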
[0055] In the embodiments of this application, the annotation results are quantitatively analyzed through similarity calculations across multiple evaluation dimensions to objectively predict their reliability. Optionally, the reliability can be derived by filtering on the similarity calculation results. For example, the similarities can be sorted, and the consistency between the current annotation result and the historical annotation result or results with the highest, or relatively high, similarity (there may be several) can be output as the reliability prediction result. Alternatively, a threshold can be applied to the similarities, and the consistency of the annotations whose similarity exceeds that threshold can be statistically analyzed and output as the reliability prediction result. Alternatively, one or more similarities of the consistent annotation results can be statistically analyzed to determine the reliability of the result. The specific selection of local image regions and visual features, similarity calculations, and reliability prediction algorithms should not be considered as limitations on the implementation of this application.
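Two of the filtering options in [0055] (sorting and thresholding) can be sketched as below. The input format, a list of `(similarity, historical_label)` pairs, the parameter names `k` and `threshold`, and the convention of returning `0.0` when no historical result qualifies are all assumptions made for this illustration. "Consistency" is read here as the fraction of selected historical results whose label matches the current annotation.

```python
from typing import List, Tuple

ScoredHistory = List[Tuple[float, str]]  # (similarity, historical label)


def reliability_top_k(scored: ScoredHistory, current_label: str, k: int = 3) -> float:
    """Consistency with the k most similar historical annotations."""
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    if not top:
        return 0.0  # assumed convention: no history, no evidence
    matches = sum(1 for _, label in top if label == current_label)
    return matches / len(top)


def reliability_above_threshold(
    scored: ScoredHistory, current_label: str, threshold: float = 0.8
) -> float:
    """Consistency with historical annotations whose similarity exceeds a threshold."""
    kept = [label for sim, label in scored if sim > threshold]
    if not kept:
        return 0.0  # assumed convention: nothing similar enough
    matches = sum(1 for label in kept if label == current_label)
    return matches / len(kept)
```

Both functions return a value between 0 and 1, so they can be compared or combined with other quantitative indicators.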
[0056] In some embodiments, the method also includes calculating the similarity between the annotator's current annotation result and historical annotation results. As shown in Figure 2B, the annotator's historical annotation results are divided into two subsets: an erroneous image set g0 and a correct image set g1. The first similarity s0 between the annotator's current annotation result and the erroneous image set g0 is calculated, as is the second similarity s1 between the annotator's current annotation result and the correct image set g1. Based on the first similarity s0 and the second similarity s1, the similarity between the annotator's current annotation result and the historical annotation results is calculated, thereby predicting the reliability of the annotation result. Typically, different similarity calculations correspond to the identification of different lesions, and similarity calculations can be performed selectively according to the specific identification need. For example, for macular diseases, the similarity of the macular region between fundus images is mainly considered; for optic nerve diseases, the similarity of the optic disc region in fundus images is mainly considered; for diagnosing glaucoma, the similarity of the optic disc-optic cup region and of the superior and inferior vascular arch regions is mainly considered; for fundus vascular diseases, the similarity between vascular regions is mainly considered; for diagnosing diabetic retinopathy, the entire retinal region must be examined, along with whether lesions related to diabetic retinopathy are present.
[0057] Typically, each radiologist's experience and skills improve gradually, as does their accuracy in identifying lesion types. Considering that historical annotation results are a record of a long-term process, the final medical diagnosis can help determine whether previous annotations were correct (i.e., whether errors occurred). Therefore, in the preferred embodiment of the application, the images used in the historical annotation process are further divided into two subsets: an erroneous image set g0 and a correct image set g1. The reliability of the radiologist's diagnostic results during formal image reading can be further predicted using a similarity method between erroneous and correct images: calculating the first similarity s0 between the image and images in set g0; calculating the second similarity s1 between the image and images in set g1; and estimating the reliability of the diagnostic results based on the first and second similarities s0 and s1. This method can further improve the efficiency and accuracy of prediction, and also help identify the areas of expertise and weakness for each physician, thereby helping physicians improve their skills or correct cognitive deficiencies.
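The source does not specify how s0 and s1 are combined, so the following is only one plausible sketch. It assumes that images are represented as feature vectors, that the similarity to a set is the mean pairwise similarity to its members, and that reliability is the normalized ratio `s1 / (s0 + s1)`. All function names are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def pair_similarity(a: Vector, b: Vector) -> float:
    """Similarity in (0, 1] derived from Euclidean distance (assumed mapping)."""
    return 1.0 / (1.0 + math.dist(a, b))


def set_similarity(current: Vector, image_set: Sequence[Vector]) -> float:
    """Mean similarity between the current image and a set of images (assumed)."""
    if not image_set:
        return 0.0
    return sum(pair_similarity(current, v) for v in image_set) / len(image_set)


def estimate_reliability(
    current: Vector, g0: Sequence[Vector], g1: Sequence[Vector]
) -> float:
    """Estimate reliability from s0 (erroneous set g0) and s1 (correct set g1).

    The ratio s1 / (s0 + s1) is an assumption for this sketch: the
    closer the current image is to images the annotator previously
    labeled correctly, the higher the estimate.
    """
    s0 = set_similarity(current, g0)
    s1 = set_similarity(current, g1)
    if s0 + s1 == 0.0:
        return 0.5  # assumed neutral value when there is no history
    return s1 / (s0 + s1)
```

Under these assumptions, an image that closely resembles the annotator's past mistakes yields an estimate below 0.5, which could also flag an area of weakness for that annotator.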
[0058] In some embodiments, the technical solution of this application also manages annotation results through a standard database. Optionally, the standard database includes historical annotation results, which can be recorded and differentiated based on the annotator and the actual diagnostic results (i.e., verification of the annotation results, such as feedback / re-annotation of incorrect or correct results), thereby helping to achieve the aforementioned similarity calculation. In some optional embodiments, the standard database may also include multiple annotation results for the same image from multiple annotators, and cross-validation of these multiple annotation results can further confirm the reliability of these multiple annotation results. Preferably, after obtaining a standard database of a certain size, the sample images and corresponding annotation results in the standard database can be used to train an artificial intelligence model for machine learning to obtain a reliable image recognition model that solves a specific problem (such as fundus image screening).
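As a rough illustration of how records in such a standard database might be organized, the sketch below stores each annotation with its annotator and the verified diagnosis, then splits one annotator's history into the erroneous set g0 and the correct set g1. The `AnnotationRecord` fields and the `split_history` helper are hypothetical; the source does not prescribe a schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AnnotationRecord:
    """One historical annotation (hypothetical schema)."""
    image_id: str
    annotator_id: str
    label: str           # what the annotator marked
    verified_label: str  # the actual diagnostic result used for verification


def split_history(
    records: Iterable[AnnotationRecord], annotator_id: str
) -> Tuple[List[AnnotationRecord], List[AnnotationRecord]]:
    """Return (g0, g1): the annotator's erroneous and correct records."""
    g0: List[AnnotationRecord] = []
    g1: List[AnnotationRecord] = []
    for record in records:
        if record.annotator_id != annotator_id:
            continue
        if record.label == record.verified_label:
            g1.append(record)
        else:
            g0.append(record)
    return g0, g1
```

Keeping the annotator and the verified result on every record also makes it straightforward to compare multiple annotators' results for the same `image_id`.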
[0059] In some embodiments, standard databases or trained image recognition models can be used to train annotators, helping them continuously improve their professional skills without guidance. Furthermore, after the standard database is established, embodiments of this application also include extracting specific images from the standard database based on similarity calculations for designated annotators to re-annotate. Re-annotation can solve multiple problems simultaneously: on the one hand, it can assess the consistency of annotators' annotations in the short term, thereby eliminating a small number of erroneous results caused by accidents or negligence through joint screening; on the other hand, it can identify image types with high error or accuracy rates by statistical analysis, thereby adjusting the reliability weights of these image annotation results in a targeted manner; furthermore, for image types with high error rates, it can encourage annotators to practice / train repeatedly to improve their accuracy for that image type and enhance their individual capabilities. Specifically, during the re-annotation process, historical image review processes, review times, and whether arbitration has been conducted are also considered to determine the applicability of an image to the annotator (the trainee), thereby conducting targeted and continuous assessment of the image reviewers. The image reading time includes the duration from the start of annotation to completion, and the length of the reading time is directly proportional to the difficulty of interpreting the image. Whether the gold standard was determined through arbitration is also related to the difficulty of interpreting the image. During the re-annotation process, for images with significant differences in similarity, the image readers are evaluated based on factors such as reading time and whether the gold standard was determined through arbitration.
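The statistical step of identifying image types with high error rates can be sketched as follows. The input format, an iterable of `(image_type, is_correct)` pairs, and the `min_error_rate` cutoff are assumptions for illustration. Reading time and arbitration status, which the text also mentions, are deliberately left out of this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def error_rate_by_image_type(history: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Fraction of incorrect annotations per image type."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for image_type, is_correct in history:
        totals[image_type] += 1
        if not is_correct:
            errors[image_type] += 1
    return {t: errors[t] / totals[t] for t in totals}


def types_needing_practice(
    history: Iterable[Tuple[str, bool]], min_error_rate: float = 0.5
) -> List[str]:
    """Image types whose error rate reaches the (assumed) cutoff, sorted by name."""
    rates = error_rate_by_image_type(history)
    return sorted(t for t, rate in rates.items() if rate >= min_error_rate)
```

The resulting per-type error rates could be used either to lower the reliability weight of an annotator's results for that image type or to choose images for re-annotation and training.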
[0060] The above are specific implementation methods of the image annotation reliability prediction method provided in this application. In the embodiments of this application, the reliability of the current annotation result is predicted by analyzing the annotation results through multiple quantitative indicators, thus achieving an objective and unified evaluation of subjective results. Furthermore, the image annotation reliability prediction method in the embodiments of this application has the following characteristics: First, by comprehensively considering the correct and incorrect image sets from the radiologist's historical training and annotation process, the reliability of the current image reading result is predicted. Second, by considering the predilection regions and visual characteristics of different fundus diseases, the reliability of the current image reading result is predicted by calculating the similarity between the current image reading result and historical annotation content. Finally, by continuously and specifically extracting images for radiologist training during the annotation process, the applicability of an image to the current training subject can be predicted. Similarity, reliability, error rate, accuracy, consistency, correlation weight, image reading time, and image reading difficulty in this application can all be calculated and analyzed through unified quantitative numerical indicators, thereby effectively solving the problem of how to objectively analyze and evaluate subjective manual annotation results.
[0061] Figure 3 is a schematic diagram of an image annotation reliability prediction device according to some embodiments of this application. As shown in Figure 3, the image annotation reliability prediction device 300 includes a visual feature processing module 310 and an annotation result reliability prediction module 320; wherein,
[0062] The visual feature processing module 310 is used to acquire local image regions related to the annotation results, and acquire visual features related to the annotation results based on the local image regions related to the annotation results;
[0063] The annotation result reliability prediction module 320 calculates the similarity between the local image region, the visual features, and the annotator's historical annotation results, and predicts the reliability of the annotator's current annotation result.
[0064] In some embodiments, the apparatus further includes a similarity calculation module for calculating the similarity between the current annotation result and the historical annotation result of the annotator.
[0065] In some embodiments, calculating the similarity between the current annotation result and the historical annotation result of the annotator includes:
[0066] The annotator's historical annotation results are divided into two subsets: an erroneous image set g0 and a correct image set g1.
[0067] Calculate the similarity s0 between the annotator's current annotation result and the erroneous image set g0, and the similarity s1 between the annotator's current annotation result and the correct image set g1;
[0068] The similarity between the current annotation result and the historical annotation result of the annotator is calculated based on s0 and s1.
[0069] In some embodiments, the visual features associated with the annotation results include: optic disc morphology, optic disc color, optic cup morphology, orientation of small vessels in the optic disc and optic cup, degree of defect in the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region.
[0070] In some embodiments, the apparatus further includes a standard database module for managing the various annotation results via a standard database.
[0071] In some embodiments, the standard database includes a set of images to be annotated and historical annotation results.
[0072] In some embodiments, the calculation of each of the similarities is achieved by calculating vector similarity.
[0073] In some embodiments, the apparatus further includes an image training module for re-annotating specific images from a standard database based on the similarity calculation.
[0074] Referring to Figure 4, which is a schematic diagram of an electronic device provided in one embodiment of this application, the electronic device 400 includes:
[0075] Memory 430 and one or more processors 410;
[0076] The memory 430 is communicatively connected to the one or more processors 410. The memory 430 stores instructions 432 that can be executed by the one or more processors. The instructions 432 are executed by the one or more processors 410 to cause the one or more processors 410 to perform the methods in the foregoing embodiments of this application.
[0077] Specifically, the processor 410 and the memory 430 can be connected via a bus or other means; Figure 4 takes the connection via bus 440 as an example. The processor 410 can be a central processing unit (CPU). The processor 410 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination of the above types of chips.
[0078] The memory 430, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the cascaded asymptotic network in this embodiment. The processor 410 executes various functional applications and data processing by running the non-transitory software programs, instructions, and functional modules 432 stored in the memory 430.
[0079] The memory 430 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created by the processor 410, etc. Furthermore, the memory 430 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 430 may optionally include memory remotely located relative to the processor 410, which can be connected to the processor 410 via a network (e.g., via communication interface 420). Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
[0080] One embodiment of this application provides a computer-readable storage medium storing computer-executable instructions, which, when executed, perform the steps in the above method embodiments.
[0081] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the devices and modules described above are as given in the corresponding descriptions of the foregoing method and/or device embodiments and are not repeated here.
[0082] Although the subject matter described herein is provided in the general context of execution on a computer system in conjunction with an operating system and applications, those skilled in the art will recognize that other implementations can also be executed in conjunction with other types of program modules. Generally, program modules include routines, programs, components, data structures, and other types of structures that perform specific tasks or implement specific abstract data types. Those skilled in the art will understand that the subject matter described herein can be practiced using other computer system configurations, including handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframes, etc., and can also be used in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In a distributed computing environment, program modules may reside on both local and remote memory storage devices.
[0083] Those skilled in the art will recognize that the units and method steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0084] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. For example, the technical solution of this application can typically be implemented and/or propagated through at least one general-purpose computer node 510 as shown in Figure 5. As shown in Figure 5, the general-purpose computer node 510 includes a computer system/server 512, peripherals 514, and a display device 516. The computer system/server 512 includes a processing unit 520, an input/output interface 522, a network adapter 524, and a memory 530, with internal data transmission usually carried over a bus. The memory 530 is usually composed of various storage devices, such as RAM (Random Access Memory) 532, cache 534, and a storage system 536 (generally composed of one or more large-capacity non-volatile storage media). The program 540 that implements some or all of the functions of the technical solution of this application is stored in the memory 530 and usually exists in the form of multiple program modules 542.
[0085] The aforementioned computer-readable storage media include volatile and non-volatile, removable and non-removable physical media implemented in any manner or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Specifically, computer-readable storage media include, but are not limited to, USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid-state storage technologies, CD-ROMs, digital versatile discs (DVDs), HD-DVDs, Blu-ray or other optical storage devices, magnetic tapes, disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and can be accessed by a computer.
[0086] In summary, this application proposes an image annotation reliability prediction method, apparatus, electronic device, and computer-readable storage medium. The embodiments of this application use multiple quantitative indicators of local image regions and visual features to analyze the annotation results. By calculating the similarity between the current image reading result and historical annotation content, the reliability of the current image reading result is predicted, achieving an objective and unified evaluation of subjective results.
[0087] It should be understood that the specific embodiments described above are merely illustrative or explanatory of the principles of this application and do not constitute a limitation thereof. Therefore, any modifications, equivalent substitutions, improvements, etc., made without departing from the spirit and scope of this application should be included within the protection scope of this application. Furthermore, the appended claims are intended to cover all variations and modifications falling within the scope and boundaries of the appended claims, or equivalent forms of such scope and boundaries.
Claims
1. A method for predicting the reliability of image annotations, characterized in that the method comprises: obtaining local image regions related to the current annotation result and, based on the local image regions related to the current annotation result, obtaining visual features related to the current annotation result; calculating, respectively, the similarity between the local image region, the visual features, and the annotator's historical annotation results, to predict the reliability of the annotator's current annotation result; wherein, before obtaining the local image region related to the annotation result, the method further comprises: calculating the similarity between the annotator's current annotation result and historical annotation results; the dimensions of the similarity evaluation include not only the local image regions related to the annotation results, but also the visual features related to the annotation results and the historical annotation results; calculating the similarity between the annotator's current annotation result and historical annotation results comprises: dividing the annotator's historical annotation results into two subsets, an erroneous image set g0 and a correct image set g1; calculating the first similarity s0 between the annotator's current annotation result and the erroneous image set g0, and the second similarity s1 between the annotator's current annotation result and the correct image set g1; and calculating the similarity between the annotator's current annotation result and historical annotation results based on the first similarity s0 and the second similarity s1; the visual features related to the annotation results include at least one of the following: optic disc shape, optic disc color, optic cup shape, orientation of small vessels in the optic disc and optic cup, degree of defect in the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region;
the method further comprises: managing each of the annotation results through a standard database; each of the aforementioned similarities is calculated as a vector similarity; and the method further comprises training annotators using the standard database, including extracting specific images from the standard database based on the similarity calculations and having designated annotators re-annotate them.
2. An image annotation reliability prediction apparatus for implementing the method of claim 1, characterized in that the apparatus comprises: a visual feature processing module, used to acquire local image regions related to the annotation results and, based on the local image regions related to the annotation results, acquire visual features related to the current annotation result; and an annotation result reliability prediction module, used to calculate the similarity between the local image region, the visual features, and the annotator's historical annotation results, and to predict the reliability of the annotator's current annotation result.
3. The apparatus according to claim 2, characterized in that the apparatus further includes a similarity calculation module, used to calculate the similarity between the current annotation result and the historical annotation results of the annotator.
4. The apparatus according to claim 3, characterized in that calculating the similarity between the annotator's current annotation result and historical annotation results includes: dividing the annotator's historical annotation results into two subsets, an erroneous image set g0 and a correct image set g1; calculating the first similarity s0 between the annotator's current annotation result and the erroneous image set g0, and the second similarity s1 between the annotator's current annotation result and the correct image set g1; and calculating the similarity between the annotator's current annotation result and historical annotation results based on the first similarity s0 and the second similarity s1.
5. The apparatus according to claim 2, characterized in that the visual features related to the annotation results include: optic disc morphology, optic disc color, optic cup morphology, orientation of small blood vessels in the optic disc and optic cup, degree of damage to the nerve fiber layer near the superior and inferior vascular arches, and related lesions in the retinal region.
6. The apparatus according to claim 2, characterized in that the apparatus further includes a standard database module for managing the various annotation results through a standard database.
7. The apparatus according to any one of claims 2-6, characterized in that each of the aforementioned similarities is calculated as a vector similarity.
8. An electronic device, characterized in that it comprises: a memory and one or more processors; the memory is communicatively connected to the one or more processors and stores instructions that can be executed by the one or more processors; when the instructions are executed by the one or more processors, the electronic device implements the method according to claim 1.
9. A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a computing device, implement the method according to claim 1.