A method and apparatus for data annotation

By using an open model as an auxiliary tool, accurate annotation data can be generated from a small amount of manually labeled data that is then used to optimize the model. This addresses the long turnaround and high cost of existing inventory counting methods and achieves efficient, accurate annotation that adapts to multiple scenarios.

CN122244481A · Pending · Publication Date: 2026-06-19 · BEIJING JINGDONG YUANSHENG TECH CO LTD +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: BEIJING JINGDONG YUANSHENG TECH CO LTD
Filing Date: 2024-12-17
Publication Date: 2026-06-19

Figure CN122244481A_ABST

Abstract

This invention discloses a data annotation method and apparatus, relating to the field of computer technology. One specific embodiment of the method includes: acquiring an image to be annotated and first annotation data corresponding to a first annotation object in the image; inputting the image to be annotated and the first annotation data into an open model to obtain second annotation data corresponding to a second annotation object in the image other than the first annotation object; correcting the second annotation data to obtain third annotation data identifying the incorrectly annotated entries, and optimizing the open model based on the third annotation data and the image to be annotated; and performing data annotation on the image to be annotated based on the optimized open model to obtain the data annotation result. Through multiple rounds of interaction with the open model, this embodiment can quickly optimize and apply the open model without being limited to a single application scenario, effectively improving annotation efficiency while ensuring annotation accuracy.

Description

Technical Field

[0001] This invention relates to the field of computer technology, and in particular to a method and apparatus for data annotation.

Background Technology

[0002] Inventory counting is a crucial step in logistics. Existing methods typically rely on manual counting, supplemented by automated equipment such as cameras for cargo identification to improve efficiency. However, camera-based inventory counting solutions face challenges such as cargo occlusion in densely packed scenarios and visual confusion among multiple product categories. High-precision cargo recognition models are essential for effectively identifying and counting diverse goods. Furthermore, most existing annotation methods rely on manual labeling, and training a high-precision model requires extensive and detailed annotation data, which is time-consuming and costly to produce. Moreover, different models need to be trained for different scenarios or shooting angles to achieve accurate recognition, significantly increasing annotation costs.

Summary of the Invention

[0003] In view of this, embodiments of the present invention provide a data annotation method and apparatus that use increasingly mature open models as auxiliary tools. Through multiple interaction processes with the open model, only the first annotation data corresponding to a portion of the first annotation objects needs to be obtained to quickly optimize and apply the open model. It is not limited to a single application scenario, effectively improving annotation efficiency while ensuring annotation accuracy.

[0004] To achieve the above objectives, according to one aspect of the present invention, a data annotation method is provided.

[0005] An embodiment of the present invention provides a data annotation method comprising: acquiring an image to be annotated and first annotation data corresponding to a first annotation object in the image to be annotated; inputting the image to be annotated and the first annotation data into an open model to obtain second annotation data corresponding to a second annotation object in the image to be annotated other than the first annotation object; correcting the second annotation data to obtain third annotation data identifying the incorrectly annotated entries, and optimizing the open model based on the third annotation data and the image to be annotated; and performing data annotation on the image to be annotated based on the optimized open model to obtain a data annotation result.

[0006] Optionally, obtaining the image to be labeled and the first labeling data corresponding to the first labeling object in the image to be labeled includes: obtaining the image to be labeled based on the user interaction interface; determining the labeling parameters corresponding to one or more first labeling objects according to the user's labeling operation on the user interaction interface; and generating the first labeling data according to the first labeling object and the corresponding labeling parameters.

[0007] Optionally, correcting the second annotation data to obtain the third annotation data identifying the incorrectly annotated entries includes: displaying the second annotation data through the user interaction interface; and determining the incorrectly annotated third annotation data from the second annotation data based on the user's modification operation on the user interaction interface; wherein the modification operation indicates that the user has modified the second annotation object and/or the annotation parameters corresponding to the second annotation object in the second annotation data.

[0008] Optionally, optimizing the open model based on the third annotation data and the image to be annotated includes: repeatedly executing the following steps until the loss function value of the open model meets a preset threshold: inputting the third annotation data and the image to be annotated into the open model to obtain fourth annotation data; using the fourth annotation data as new second annotation data for correction, and inputting the corrected new third annotation data and the image to be annotated into the open model.

[0009] Optionally, the step of performing data annotation on the image to be annotated based on the optimized open model to obtain data annotation results includes: performing data annotation on the image to be annotated based on the optimized open model to obtain target annotation data; and determining the data annotation results according to the annotation parameters corresponding to each target annotation object in the target annotation data.

[0010] Optionally, the annotation parameters include one or more of the following: the type of annotation graphic, the size of the annotation graphic, the coordinates of the center point of the annotation graphic, and the type of annotation object; determining the data annotation result based on the annotation parameters corresponding to each target annotation object in the target annotation data includes: statistically obtaining the quantity distribution of each annotation type based on the number of target annotation objects and the annotation type corresponding to each target annotation object, and using the quantity distribution as the data annotation result; and / or, determining the occlusion situation of each target annotation object based on the number of target annotation objects and the type of annotation graphic, the size of the annotation graphic, and the coordinates of the center point of the annotation graphic corresponding to each target annotation object, and using the occlusion situation as the annotation result.

[0011] Optionally, the step of obtaining the image to be labeled based on the user interaction interface includes: displaying sample image sets corresponding to multiple application scenarios using the user interaction interface; determining the target application scenario from the multiple application scenarios based on the user's selection operation on the user interaction page, and determining the image to be labeled from the sample image set corresponding to the target application scenario.

[0012] Optionally, after obtaining the data annotation results, the method further includes: in response to the data annotation results indicating the quantity distribution of each annotation type, if the quantity distribution indicates that the distribution of multiple annotation types is uneven, removing the image to be annotated from the sample image set.

[0013] To achieve the above objectives, according to another aspect of the present invention, a data annotation apparatus is provided.

[0014] An embodiment of the present invention provides a data annotation apparatus comprising:

[0015] an acquisition module, configured to acquire the image to be annotated and the first annotation data corresponding to the first annotation object in the image to be annotated;

[0016] a first data annotation module, configured to input the image to be annotated and the first annotation data into the open model to obtain the second annotation data corresponding to the second annotation object in the image to be annotated other than the first annotation object;

[0017] an optimization module, configured to correct the second annotation data to obtain the third annotation data identifying the incorrectly annotated entries, and to optimize the open model based on the third annotation data and the image to be annotated;

[0018] a second data annotation module, configured to perform data annotation on the image to be annotated based on the optimized open model to obtain the data annotation result.

[0019] To achieve the above objectives, according to another aspect of the present invention, an electronic device for data annotation is provided.

[0020] An electronic device for data annotation according to an embodiment of the present invention includes: one or more processors; and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement a data annotation method according to an embodiment of the present invention.

[0021] To achieve the above objectives, according to another aspect of the present invention, a computer-readable storage medium is provided.

[0022] An embodiment of the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements a data annotation method according to an embodiment of the present invention.

[0023] One embodiment of the above invention has the following advantages or beneficial effects: using increasingly mature open models as auxiliary tools, through multiple interaction processes with the open models, it is only necessary to obtain the first annotation data corresponding to a portion of the first annotation objects, so that the open models can be quickly optimized and applied, without being limited to a single application scenario. While effectively improving annotation efficiency, it also ensures the accuracy of annotation.

[0024] Further effects of the above non-conventional optional implementations will be explained below in conjunction with specific embodiments.

Description of the Drawings

[0025] The accompanying drawings are provided to better understand the invention and are not intended to unduly limit the scope of the invention. Wherein:

[0026] Figure 1 is a schematic diagram of the main flow of the data annotation method according to an embodiment of the present invention;

[0027] Figure 2 is a schematic diagram of the main process for obtaining manually annotated first annotation data according to an embodiment of the present invention;

[0028] Figure 3 is a schematic diagram of the main process for obtaining an image to be annotated according to an embodiment of the present invention;

[0029] Figure 4 is a schematic diagram of the main process for modifying the second annotation data according to an embodiment of the present invention;

[0030] Figure 5 is a schematic diagram of the main process for optimizing an open model according to an embodiment of the present invention;

[0031] Figure 6 is a schematic diagram of the main process for determining data annotation results according to an embodiment of the present invention;

[0032] Figure 7 is a schematic diagram of the main modules of a data annotation apparatus according to an embodiment of the present invention;

[0033] Figure 8 is an exemplary system architecture diagram in which embodiments of the present invention can be applied;

[0034] Figure 9 is a schematic diagram of the structure of a computer system suitable for implementing terminal devices or servers of the present invention.

Detailed Implementation

[0035] The following description, in conjunction with the accompanying drawings, illustrates exemplary embodiments of the present invention, including various details to aid understanding. These details should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.

[0036] It should be noted that, unless otherwise specified, the embodiments of the present invention and the technical features thereof can be combined with each other.

[0037] It should be noted that the collection, gathering, updating, analysis, processing, use, transmission, and storage of user personal information involved in this disclosed technical solution all comply with relevant laws and regulations, are used for legitimate purposes, and do not violate public order and good morals. Necessary measures are taken to prevent unauthorized access to user personal information data and to safeguard user personal information security, network security, and national security.

[0038] Figure 1 is a schematic diagram illustrating the main steps of a data annotation method according to an embodiment of the present invention.

[0039] As shown in Figure 1, the data annotation method of this invention mainly includes the following steps:

[0040] Step S101: Obtain the image to be annotated and the first annotation data corresponding to the first annotation object in the image to be annotated;

[0041] Step S102: Input the image to be annotated and the first annotation data into the open model to obtain the second annotation data corresponding to the second annotation objects in the image to be annotated, excluding the first annotation objects; wherein the open model is a general model trained on a public dataset;

[0042] Step S103: Correct the second annotation data to obtain the third annotation data identifying the incorrectly annotated entries, and optimize the open model based on the third annotation data and the image to be annotated;

[0043] Step S104: Perform data annotation on the image to be annotated based on the optimized open model to obtain the data annotation results.
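
Steps S101 to S104 can be sketched as a single function. This is only an illustration under stated assumptions: `Annotation`, `StubOpenModel`, `annotate`, and the `correct` callback are hypothetical names, and the stub stands in for a real open model, whose interface the description does not specify. The stub "optimizes" by simply remembering corrections per box, which is far simpler than real fine-tuning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


@dataclass(frozen=True)
class Annotation:
    box: Box
    label: str  # type of the annotated object


@dataclass
class StubOpenModel:
    """Hypothetical stand-in for an open model (not a real API)."""
    candidates: List[Annotation]  # what the stub "detects" in the image
    overrides: Dict[Box, str] = field(default_factory=dict)

    def predict(self, image: str, known: List[Annotation]) -> List[Annotation]:
        # Annotate every detected object that the user has not already annotated.
        known_boxes = {a.box for a in known}
        return [Annotation(c.box, self.overrides.get(c.box, c.label))
                for c in self.candidates if c.box not in known_boxes]

    def optimize(self, image: str, corrections: List[Annotation]) -> None:
        # Assumed behaviour: remember the corrected label for each box.
        for c in corrections:
            self.overrides[c.box] = c.label


def annotate(image: str,
             first: List[Annotation],
             model: StubOpenModel,
             correct: Callable[[List[Annotation]], List[Annotation]]) -> List[Annotation]:
    second = model.predict(image, first)  # S102: model annotates the remaining objects
    third = correct(second)               # S103: user supplies corrections for wrong entries
    model.optimize(image, third)          # S103: model is optimized with the corrections
    return model.predict(image, first)    # S104: annotate again with the optimized model
```

In this sketch the `correct` callback plays the role of the human reviewer; in the described method that step happens through the user interaction interface.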

[0044] The first annotation objects can be annotated manually. To minimize the cost of manual annotation, only a small number of first annotation objects in the image to be annotated need to be annotated. Note that this application uses the open model as an auxiliary annotation tool, so the accuracy of the first annotation data must be ensured so that the open model can learn from it effectively and annotate the second annotation objects in the image based on what it has learned. Because the open model learns automatically from the first annotation data, the more object types and annotation methods the first annotation data covers, the better. Taking a logistics scenario as an example, if the first annotation data only contains item type 1 annotated using method A, then after the image to be annotated and the first annotation data are input into the open model, the model can only learn to annotate the other item type 1 items in the image using method A, and cannot effectively learn the other item types. If, however, the first annotation data contains both item type 1 annotated using method A and item type 2 annotated using method B, then the resulting second annotation data will also include the other item type 2 items annotated using method B.

[0045] Specifically, the manually annotated first annotation data can be obtained through the user interaction interface; that is, step S101 may, as shown in Figure 2, include:

[0046] Step S201: Obtain the image to be labeled based on the user interface;

[0047] Step S202: Based on the user's annotation operation on the user interaction interface, determine the annotation parameters corresponding to one or more first annotation objects respectively;

[0048] Step S203: Generate first annotation data based on the first annotation object and the corresponding annotation parameters.
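
As an illustration of steps S202 and S203, the sketch below turns a click-and-drag operation into annotation parameters and collects them into first annotation data. The field names follow the annotation parameters listed elsewhere in this description (graphic type, graphic size, center-point coordinates, object type); the class, the function names, and the dictionary layout are assumptions made for the example.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class AnnotationParams:
    shape: str        # type of annotation graphic, e.g. "rectangle"
    width: float      # size of the annotation graphic
    height: float
    center_x: float   # coordinates of the center point of the graphic
    center_y: float
    object_type: str  # type of the annotated object


def params_from_drag(start: Point, end: Point, object_type: str,
                     shape: str = "rectangle") -> AnnotationParams:
    """S202: derive annotation parameters from a mouse drag (start -> end)."""
    (x0, y0), (x1, y1) = start, end
    return AnnotationParams(shape, abs(x1 - x0), abs(y1 - y0),
                            (x0 + x1) / 2, (y0 + y1) / 2, object_type)


def build_first_annotation_data(image_id: str,
                                operations: List[Tuple[Point, Point, str]]) -> Dict:
    """S203: one entry per first annotation object the user annotated."""
    return {"image": image_id,
            "annotations": [asdict(params_from_drag(*op)) for op in operations]}
```

For instance, a drag from (10, 20) to (50, 60) yields a 40 by 40 rectangle centered at (30, 40).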

[0049] For example, a user can first select an image as the image to be annotated through the user interaction interface, then select each first annotation object by clicking and dragging the mouse on the image, and add and confirm its annotation parameters through the toolbar, thereby annotating the first annotation objects one by one.

[0050] In one optional embodiment, the selection of the image to be annotated may, as shown in Figure 3, include:

[0051] Step S301: Display sample image sets corresponding to multiple application scenarios using a user interface;

[0052] Step S302: Based on the user's selection operation on the user interaction page, determine the target application scenario from multiple application scenarios, and determine the image to be labeled from the sample image set corresponding to the target application scenario.

[0053] In each application scenario, a sample image set can be provided, including multiple images with different shooting angles, different cargo density, and different warehouse scenarios, so that users can select images that are closer to their actual needs for image annotation, thereby optimizing and training the open model.

[0054] In an optional embodiment, after the user finishes annotating the first annotation objects, a trigger operation from the user is needed to confirm that all first annotation objects have been annotated, after which the first annotation data can be input into the open model. Therefore, in an optional embodiment, after step S203 and before step S102, the method further includes: calling the open model through a preset interface according to the user's model call operation on the user interaction interface.

[0055] The user interface can provide multiple open models for users to choose from, such as the YOLOv8-DET detection model and the SAM-H segmentation model, allowing users to select the most suitable open model based on their specific application scenario. It is understood that since the open models in this embodiment are typically located in the cloud or on a third-party server, rather than being locally stored, different preset interfaces can be set for different open models to enable model invocation. For example, after a user selects an open model, they can click the "Run" button to invoke the model. When the user needs to change the model, they can click the "Cancel" button to end the current model invocation, and then select another open model and click the "Run" button again to invoke it.

[0056] Furthermore, the required input data format may differ for different open models. Therefore, after step S203, the process may further include storing the first annotation data according to the format requirements of the open model. This process allows users to convert between different data formats to meet the needs of different open models. For example, a format storage selection box can be provided on the user interface to store the first annotation data in a local database or local cache according to the user-selected format. The specific storage format can be any of JSON, XML, or TXT formats.
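
A minimal sketch of storing the same annotations in the three formats mentioned above. The description does not define the schema of any of these formats, so the JSON layout, the XML element names, and the one-line-per-object TXT layout are assumptions chosen for illustration; `serialize` is a hypothetical helper.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict, List


def serialize(annotations: List[Dict], fmt: str) -> str:
    """Render annotation dictionaries as JSON, XML, or TXT text."""
    if fmt == "json":
        return json.dumps({"annotations": annotations}, indent=2)
    if fmt == "xml":
        root = ET.Element("annotations")
        for a in annotations:
            obj = ET.SubElement(root, "object")
            for key, value in a.items():
                ET.SubElement(obj, key).text = str(value)
        return ET.tostring(root, encoding="unicode")
    if fmt == "txt":
        # Assumed layout: "<object_type> <center_x> <center_y> <width> <height>"
        return "\n".join(
            f"{a['object_type']} {a['center_x']} {a['center_y']} "
            f"{a['width']} {a['height']}"
            for a in annotations)
    raise ValueError(f"unsupported format: {fmt}")
```

The returned string could then be written to a local database or cache, depending on the format the selected open model expects.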

[0057] In a further optional embodiment, the specific requirements for the input data content vary depending on the open model. Taking a large AI model as an example, after the first annotation data is obtained, a corresponding prompt also needs to be generated based on the first annotation data and the image to be annotated. The prompt mainly includes specific task instructions, contextual information, input data, and the output format or type, so that the AI model can output the second annotation data based on the prompt.
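
The four prompt components named above (task instructions, contextual information, input data, output format) could be assembled as in the following sketch. The wording of each section and the function name `build_prompt` are assumptions; the description does not give a prompt template.

```python
import json
from typing import Dict, List


def build_prompt(image_ref: str, first_annotations: List[Dict]) -> str:
    """Assemble a text prompt from the first annotation data (illustrative wording)."""
    sections = [
        ("Task", "Label every object in the image that is not already labeled, "
                 "following the labeling style of the examples."),
        ("Context", f"Image reference: {image_ref}"),
        ("Input data", json.dumps(first_annotations)),
        ("Output format", "A JSON list of objects with the same keys as the input data."),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)
```

How the image itself is passed to the model (inline, by URL, or as a separate input) depends on the model's interface and is left open here.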

[0058] Through the above process, even when the user only manually annotates a portion of the first annotated objects, the open model can be used to obtain the second annotated data corresponding to the second annotated objects. However, since the open model is trained on a public dataset, and public datasets often differ from the user's actual application scenario, it is difficult to guarantee the accuracy of the second annotated data even if data learning is performed based on the correct first annotated data (positive sample data). Therefore, after obtaining the second annotated data, this invention can further refine the open model through a human-computer interaction process, making the optimized open model more suitable for the current application scenario, thereby improving the accuracy of data annotation.

[0059] In an optional embodiment, step S103 may, as shown in Figure 4, include:

[0060] Step S401: Display the second labeled data through the user interface;

[0061] Step S402: Based on the user's modification operation on the user interaction interface, determine the third annotation data with incorrect annotation from the second annotation data; wherein, the modification operation indicates the user's modification of the second annotation object and / or the annotation parameters corresponding to the second annotation object in the second annotation data.
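
One way to derive the third annotation data in step S402 is to compare the second annotation data shown to the user with the version the user leaves behind after editing. The sketch below assumes each annotation has an identifier and that deleting an entry (for example, a shelf annotated as an item) removes its identifier; both conventions, and the function name, are hypothetical.

```python
from typing import Dict, List, Optional


def find_third_annotation_data(second: Dict[int, Dict],
                               edited: Dict[int, Dict]) -> List[Dict]:
    """Return the entries of `second` that the user changed or deleted.

    Each result keeps the original (wrong) annotation and the user's
    correction; `corrected` is None when the user deleted the object.
    """
    third = []
    for ann_id, original in second.items():
        corrected: Optional[Dict] = edited.get(ann_id)
        if corrected != original:
            third.append({"id": ann_id, "wrong": original, "corrected": corrected})
    return third
```

Entries the user did not touch are left out, which matches the point that only a portion of the errors needs to be corrected.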

[0062] It will be understood that errors in the second annotation data may lie in the annotation object itself, such as a shelf being annotated as an item, or in the annotation parameters, such as item type 1 being annotated as item type 2 because goods obscure the label. In either case, the erroneous second annotation data needs to be corrected manually so that the open model can use the third annotation data as negative samples for further learning. Note that users do not need to correct every erroneous entry in the second annotation data; correcting only a portion is sufficient. The open model can still learn from the third annotation data and re-annotate other second annotation data that contains the same errors.

[0063] For example, in the second labeled data, there are multiple instances of incorrect item classification due to partial occlusion, such as item type 1 being incorrectly labeled as item type 2. In this case, the user can modify the item classification of one of the second labeled data entries to obtain the third labeled data. After the open model optimizes based on the third labeled data and re-identifies the image to be labeled, the open model can then determine that the partially occluded item is actually item type 1 and correctly label it. It is evident that through user modifications, a rapid fine-tuning mechanism for the open model can be achieved with minimal manual annotation, effectively improving the accuracy of data labeling.

[0064] Considering the low efficiency of manual error identification, not all error types may be found in a single pass. For example, a pass might only detect cases where item type 1 is incorrectly annotated as item type 2, without discovering cases where item type 3 is also incorrectly annotated as item type 2. Therefore, in an optional embodiment, step S103 can also include, as shown in Figure 5, repeating steps S501 and S502 until the loss function value of the open model meets the preset threshold:

[0065] Step S501: Input the third annotation data and the image to be annotated into the open model to obtain the fourth annotation data;

[0066] Step S502: Correct the fourth annotation data as the new second annotation data, and input the corrected new third annotation data and the image to be annotated into the open model.

[0067] By repeating the data correction process described above, as many erroneous entries in the second annotation data as possible can be identified, ensuring the accuracy of the optimized open model. The loss function value of the open model can be obtained directly from the platform providing the open model; users do not need to perform any calculation and only need to compare the loss function value with the preset threshold.
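
The loop over steps S501 and S502 might look like the sketch below. It assumes that "meets the preset threshold" means the loss is less than or equal to the threshold, and it adds a `max_rounds` safety limit that is not part of the description. `TunableModel`, `HalvingStub`, and `optimize_until_threshold` are hypothetical names; the stub simply halves its loss on each call so the loop can be run.

```python
from typing import Callable, List, Protocol


class TunableModel(Protocol):
    def refine(self, image: str, third: List[dict]) -> List[dict]: ...
    def loss(self) -> float: ...


def optimize_until_threshold(model: TunableModel, image: str, third: List[dict],
                             correct: Callable[[List[dict]], List[dict]],
                             threshold: float, max_rounds: int = 10) -> int:
    """Repeat S501/S502 and return the number of rounds that were run."""
    for round_no in range(1, max_rounds + 1):
        fourth = model.refine(image, third)  # S501: obtain fourth annotation data
        if model.loss() <= threshold:        # loss as reported by the model platform
            return round_no
        third = correct(fourth)              # S502: correct it into new third data
    return max_rounds


class HalvingStub:
    """Stand-in model whose loss halves on every refinement round."""

    def __init__(self, start_loss: float) -> None:
        self._loss = start_loss

    def refine(self, image: str, third: List[dict]) -> List[dict]:
        self._loss *= 0.5
        return third

    def loss(self) -> float:
        return self._loss
```

With a starting loss of 0.8 and a threshold of 0.15, the stub needs three rounds (0.4, 0.2, 0.1) before the loop stops.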

[0068] In an optional embodiment, step S104 may, as shown in Figure 6, include:

[0069] Step S601: Based on the optimized open model, perform data annotation on the image to be annotated to obtain target annotation data;

[0070] Step S602: Determine the data annotation result based on the annotation parameters corresponding to each target annotation object in the target annotation data.

[0071] It is understood that the target annotation data actually includes multiple target annotation objects and the annotation parameters corresponding to each target annotation object. However, in this embodiment of the invention, to facilitate user viewing of the results, the obtained data annotation result refers to the result obtained after statistical analysis based on the target annotation data, and does not refer to the target annotation data itself. Of course, the target annotation data can also be directly used as the data annotation result feedback according to the user's actual needs.

[0072] In one optional embodiment, the annotation parameters provided by this invention include one or more of the following: the type of annotation graphic, the size of the annotation graphic, the coordinates of the center point of the annotation graphic, and the type of annotation object. Depending on the target to be statistically analyzed, step S602 may include:

[0073] (i) Based on the number of target labeled objects and the labeling type corresponding to each target labeled object, the quantity distribution of each labeling type is statistically obtained, and the quantity distribution is used as the data labeling result;

[0074] (ii) Based on the number of target labeled objects, the type of labeled graphic corresponding to each target labeled object, the size of the labeled graphic, and the coordinates of the center point of the labeled graphic, determine the occlusion of each target labeled object, and use the occlusion as the labeling result.
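
The two statistics in (i) and (ii) can be sketched as follows. The quantity distribution is a straightforward count per object type. The description does not say how occlusion is computed, so the overlap-based measure here is an assumption: for rectangular graphics, it reports the largest fraction of each object's box that is covered by any single other box. All function names and dictionary keys are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def quantity_distribution(annotations: List[Dict]) -> Counter:
    """(i) Number of target annotation objects per object type."""
    return Counter(a["object_type"] for a in annotations)


def _corners(a: Dict) -> Tuple[float, float, float, float]:
    # Convert center point and size into (left, top, right, bottom).
    return (a["center_x"] - a["width"] / 2, a["center_y"] - a["height"] / 2,
            a["center_x"] + a["width"] / 2, a["center_y"] + a["height"] / 2)


def overlap_ratio(a: Dict, b: Dict) -> float:
    """Fraction of box `a` that lies inside box `b`."""
    ax0, ay0, ax1, ay1 = _corners(a)
    bx0, by0, bx1, by1 = _corners(b)
    w = min(ax1, bx1) - max(ax0, bx0)
    h = min(ay1, by1) - max(ay0, by0)
    if w <= 0 or h <= 0:
        return 0.0
    return (w * h) / (a["width"] * a["height"])


def occlusion(annotations: List[Dict]) -> List[float]:
    """(ii) Assumed occlusion measure: largest overlap with any other object."""
    return [max((overlap_ratio(a, b)
                 for j, b in enumerate(annotations) if j != i), default=0.0)
            for i, a in enumerate(annotations)]
```

Note that box overlap alone cannot tell which of two overlapping objects is in front; a real system would need additional information to decide that.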

[0075] To facilitate users' intuitive viewing of the data annotation results, a user interaction page can be used to visualize the results. For example, quantity distribution can be represented using bar charts or pie charts, while occlusion can be represented using occlusion rate or other indicators. In addition to the above two types of annotation results, other quantitative statistical results such as the number of target annotated objects and the total number of annotation types can also be used as data annotation results; this invention does not impose specific limitations on these.

[0076] In a further optional embodiment, in response to the data annotation results indicating the quantity distribution corresponding to each annotation type, the method may further include: removing the image to be annotated from the sample image set when the quantity distribution indicates that the distribution of multiple annotation types is uneven. It is understood that for machine learning processes, if there are too many data annotation results of one type and too few data annotation results of other types, the open model can only effectively learn from the type with the most data annotation results, while other types of data cannot be effectively learned due to the lack of data annotation results. Naturally, the probability of errors in subsequent annotation processes will increase. Therefore, in this embodiment of the invention, images with a relatively even distribution of each annotation type are selected for annotation as much as possible to enable better training of the open model.
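
The description does not define when a distribution counts as "uneven". The sketch below assumes one simple criterion, the ratio between the most and least frequent annotation types exceeding a configurable limit, and removes the image from the sample image set in that case. `is_uneven`, `prune_sample_set`, and the default `max_ratio` are illustrative assumptions.

```python
from typing import Dict, List


def is_uneven(distribution: Dict[str, int], max_ratio: float = 3.0) -> bool:
    """Assumed criterion: most frequent type outnumbers the rarest by more than max_ratio."""
    counts = list(distribution.values())
    if len(counts) < 2:
        return False  # a single type cannot be unbalanced against another
    if min(counts) == 0:
        return True   # a type with no objects at all is treated as uneven
    return max(counts) / min(counts) > max_ratio


def prune_sample_set(sample_set: List[str], image: str,
                     distribution: Dict[str, int],
                     max_ratio: float = 3.0) -> List[str]:
    """Drop `image` from the sample set when its type distribution is uneven."""
    if not is_uneven(distribution, max_ratio):
        return list(sample_set)
    return [img for img in sample_set if img != image]
```

Other imbalance measures, such as entropy of the type distribution, would fit the same role; the ratio is used here only because it is easy to reason about.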

[0077] The data annotation method according to embodiments of the present invention uses increasingly mature open models as auxiliary tools. Through multiple interaction processes with the open models, it only needs to obtain the first annotation data corresponding to a portion of the first annotation objects to quickly optimize and apply the open models. It is not limited to a single application scenario, effectively improving annotation efficiency while ensuring annotation accuracy.

[0078] Figure 7 is a schematic diagram of the main modules of a data annotation apparatus according to an embodiment of the present invention.

[0079] As shown in Figure 7, the data annotation apparatus 700 of this embodiment includes:

[0080] The acquisition module 701 is used to acquire the image to be labeled and the first labeling data corresponding to the first labeling object in the image to be labeled;

[0081] The first data annotation module 702 is used to input the image to be annotated and the first annotation data into an open model to obtain the second annotation data corresponding to the second annotation object in the image to be annotated, excluding the first annotation object; wherein, the open model is a general model trained based on a public dataset;

[0082] The optimization module 703 is used to correct the second annotation data to obtain the third annotation data identifying the incorrectly annotated entries, and to optimize the open model based on the third annotation data and the image to be annotated;

[0083] The second data annotation module 704 performs data annotation on the image to be annotated based on the optimized open model, and obtains the data annotation result.

[0084] In an optional embodiment of the present invention, the acquisition module 701 is further configured to: acquire an image to be labeled based on a user interface; determine labeling parameters corresponding to one or more first labeling objects according to the user's labeling operation on the user interface; and generate the first labeling data according to the first labeling objects and the corresponding labeling parameters.

[0085] In an optional embodiment of the present invention, the optimization module 703 is further configured to: display the second annotation data through the user interface; and determine the third annotation data with erroneous annotation from the second annotation data based on the user's modification operation on the user interface; wherein the modification operation indicates that the user has made modifications to the second annotation object and / or the annotation parameters corresponding to the second annotation object in the second annotation data.

[0086] In an optional embodiment of the present invention, the optimization module 703 is further configured to perform the following steps in a loop until the loss function value of the open model meets a preset threshold: inputting the third labeled data and the image to be labeled into the open model to obtain the fourth labeled data; correcting the fourth labeled data as new second labeled data, and inputting the corrected new third labeled data and the image to be labeled into the open model.

[0087] In an optional embodiment of the present invention, the second data annotation module 704 is further configured to: perform data annotation on the image to be annotated based on the optimized open model to obtain target annotation data; and determine the data annotation result according to the annotation parameters corresponding to each target annotation object in the target annotation data.

[0088] In an optional embodiment of the present invention, the annotation parameters include one or more of the following: the type of annotation graphic, the size of the annotation graphic, the coordinates of the center point of the annotation graphic, and the type of annotation object; the second data annotation module 704 is further configured to: statistically obtain the quantity distribution of each of the annotation types according to the number of the target annotation objects and the annotation type corresponding to each target annotation object, and use the quantity distribution as the data annotation result; and / or, determine the occlusion situation of each of the target annotation objects according to the number of the target annotation objects, the type of annotation graphic corresponding to each target annotation object, the size of the annotation graphic, and the coordinates of the center point of the annotation graphic, and use the occlusion situation as the annotation result.
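The two results described above can be illustrated with a short sketch. The quantity distribution is a per-type count of the target annotation objects. For the occlusion situation, the embodiment does not state a specific test, so the sketch assumes rectangular annotation graphics and flags an object as possibly occluded when its rectangle, reconstructed from the center point and size, overlaps another rectangle. The dictionary keys (`object_type`, `width`, `height`, `center_x`, `center_y`) are hypothetical names.

```python
from collections import Counter
from typing import Dict, List, Tuple


def quantity_distribution(annotations: List[dict]) -> Dict[str, int]:
    """Count the target annotation objects of each object type."""
    return dict(Counter(a["object_type"] for a in annotations))


def _bounds(a: dict) -> Tuple[float, float, float, float]:
    """Axis-aligned bounds (x0, y0, x1, y1) from center point and size."""
    half_w, half_h = a["width"] / 2, a["height"] / 2
    return (a["center_x"] - half_w, a["center_y"] - half_h,
            a["center_x"] + half_w, a["center_y"] + half_h)


def occlusion_flags(annotations: List[dict]) -> List[bool]:
    """Flag each rectangle that overlaps at least one other rectangle.

    Overlap is an assumed proxy for occlusion; rectangles that only
    touch along an edge are not counted as overlapping.
    """
    boxes = [_bounds(a) for a in annotations]
    flags = [False] * len(boxes)
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            ax0, ay0, ax1, ay1 = boxes[i]
            bx0, by0, bx1, by1 = boxes[j]
            if ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1:
                flags[i] = flags[j] = True
    return flags
```

The pairwise comparison is quadratic in the number of target annotation objects, which is usually acceptable for a single image; other annotation graphic types would need their own overlap test.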

[0089] In an optional embodiment of the present invention, the acquisition module 701 is further configured to: display sample image sets corresponding to multiple application scenarios using a user interaction interface; determine a target application scenario from the multiple application scenarios based on the user's selection operation on the user interaction interface; and determine the image to be labeled from the sample image set corresponding to the target application scenario.

[0090] In an optional embodiment of the present invention, the acquisition module 701 is further configured to, in response to the data annotation result indicating the quantity distribution of each annotation type, remove the image to be annotated from the sample image set when the quantity distribution indicates that the distribution of multiple annotation types is uneven.
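The removal step can be sketched as below. The embodiment does not define when a quantity distribution counts as uneven, so the criterion used here, namely that the most frequent annotation type occurs more than `max_ratio` times as often as the least frequent one, is purely an assumption, as are the function names and the default ratio.

```python
from typing import Dict, List


def is_uneven(distribution: Dict[str, int], max_ratio: float = 3.0) -> bool:
    """Assumed criterion: largest count exceeds max_ratio times the smallest."""
    counts = list(distribution.values())
    if len(counts) < 2:
        # With fewer than two annotation types there is nothing to compare.
        return False
    return max(counts) > max_ratio * min(counts)


def prune_sample_set(
    sample_set: List[str],
    image_id: str,
    distribution: Dict[str, int],
    max_ratio: float = 3.0,
) -> List[str]:
    """Return the sample image set without image_id if its distribution is uneven."""
    if is_uneven(distribution, max_ratio):
        return [s for s in sample_set if s != image_id]
    return list(sample_set)
```

Returning a new list instead of mutating the input leaves the original sample image set available to the caller, for example for logging which images were removed.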

[0091] The data annotation apparatus according to embodiments of the present invention uses increasingly mature open models as auxiliary tools. Through multiple interaction processes with the open models, it can quickly optimize and apply the open models by only acquiring the first annotation data corresponding to a portion of the first annotation objects. It is not limited to a single application scenario, effectively improving annotation efficiency while ensuring annotation accuracy.

[0092] Figure 8 shows an exemplary system architecture 800 to which the data annotation method or data annotation apparatus of embodiments of the present invention can be applied.

[0093] As shown in Figure 8, system architecture 800 may include terminal devices 801, 802, and 803, a network 804, and a server 805. Network 804 serves as the medium providing communication links between terminal devices 801, 802, and 803 and server 805. Network 804 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.

[0094] Users can use terminal devices 801, 802, and 803 to interact with server 805 via network 804 to receive or send data. Various communication client applications can be installed on terminal devices 801, 802, and 803, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social media platforms.

[0095] Terminal devices 801, 802, and 803 can be various electronic devices with displays and web browsing capabilities, including but not limited to smartphones, tablets, laptops, and desktop computers.

[0096] Server 805 can be a server that provides various services, such as a backend management server that provides support for the images to be labeled and the first annotation data that users submit using terminal devices 801, 802, and 803. The backend management server can analyze and process the received images to be labeled and first annotation data, and feed back the processing results (such as data annotation results) to the terminal devices.

[0097] It should be noted that the data annotation method provided in the embodiments of the present invention is generally executed by server 805, and correspondingly, the data annotation device is generally set in server 805.

[0098] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 8 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0099] Reference is now made to Figure 9, which shows a schematic diagram of the structure of a computer system 900 suitable for implementing a terminal device of embodiments of the present invention. The terminal device shown in Figure 9 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present invention.

[0100] As shown in Figure 9, the computer system 900 includes a central processing unit (CPU) 901, which can perform various appropriate actions and processes based on programs stored in read-only memory (ROM) 902 or programs loaded from storage section 908 into random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the system 900. The CPU 901, ROM 902, and RAM 903 are interconnected via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

[0101] The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a cathode ray tube (CRT) or liquid crystal display (LCD), speakers, and the like; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, is installed on the drive 910 as needed so that computer programs read from it can be installed into the storage section 908 as needed.

[0102] In particular, according to the embodiments disclosed in this invention, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments disclosed in this invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via communication section 909, and / or installed from removable medium 911. When the computer program is executed by central processing unit (CPU) 901, it performs the functions defined above in the system of this invention.

[0103] It should be noted that the computer-readable medium shown in this invention can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In this invention, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In this invention, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.

[0104] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0105] The modules described in the embodiments of the present invention can be implemented in software or hardware. The described modules can also be provided in a processor; for example, a processor can be described as including an acquisition module, a first data annotation module, an optimization module, and a second data annotation module. The names of these modules do not, in some cases, limit the modules themselves; for example, the acquisition module can also be described as "a module that acquires an image to be annotated and first annotation data corresponding to a first annotation object in the image to be annotated."

[0106] In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist independently without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to perform: acquiring an image to be labeled and first labeling data corresponding to a first labeling object in the image to be labeled; inputting the image to be labeled and the first labeling data into an open model to obtain second labeling data corresponding to a second labeling object in the image to be labeled other than the first labeling object, wherein the open model is a general model trained based on a public dataset; correcting the second labeling data to obtain incorrectly labeled third labeling data, and optimizing the open model based on the third labeling data and the image to be labeled; and performing data labeling on the image to be labeled based on the optimized open model to obtain a data labeling result.

[0107] According to the technical solution of the present invention, the increasingly mature open model is used as an auxiliary tool. Through multiple interaction processes with the open model, only the first annotation data corresponding to a portion of the first annotation objects needs to be obtained to quickly optimize and apply the open model. It is not limited to a single application scenario. While effectively improving annotation efficiency, it also ensures the accuracy of annotation.

[0108] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can occur depending on design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.

Claims

1. A data annotation method, characterized in that it comprises: obtaining an image to be labeled and first labeling data corresponding to a first labeling object in the image to be labeled; inputting the image to be labeled and the first labeling data into an open model to obtain second labeling data corresponding to a second labeling object in the image to be labeled other than the first labeling object; correcting the second labeling data to obtain incorrectly labeled third labeling data, and optimizing the open model based on the third labeling data and the image to be labeled; and performing data annotation on the image to be labeled based on the optimized open model to obtain a data annotation result.

2. The method according to claim 1, characterized in that the step of obtaining the image to be labeled and the first labeling data corresponding to the first labeling object in the image to be labeled comprises: obtaining the image to be labeled based on a user interface; determining, based on an annotation operation performed by the user on the user interaction interface, the annotation parameters respectively corresponding to one or more first annotation objects; and generating the first annotation data based on the first annotation objects and the corresponding annotation parameters.

3. The method according to claim 2, characterized in that the step of correcting the second labeled data to obtain the incorrectly labeled third labeled data comprises: displaying the second labeled data through the user interface; and determining, based on the user's modification operation on the user interaction interface, the incorrectly annotated third annotation data from the second annotation data; wherein the modification operation indicates that the user has modified the second annotation object and/or the annotation parameters corresponding to the second annotation object in the second annotation data.

4. The method according to claim 1, characterized in that the step of optimizing the open model based on the third labeled data and the image to be labeled comprises repeating the following steps until the loss function value of the open model meets a preset threshold: inputting the third annotation data and the image to be annotated into the open model to obtain fourth annotation data; and correcting the fourth annotation data, treated as new second annotation data, and inputting the new third annotation data obtained after correction and the image to be annotated into the open model.

5. The method according to claim 1, characterized in that the step of performing data annotation on the image to be labeled based on the optimized open model to obtain the data annotation result comprises: performing data annotation on the image to be labeled based on the optimized open model to obtain target labeled data; and determining the data annotation result based on the annotation parameters corresponding to each target annotation object in the target annotation data.

6. The method according to claim 5, characterized in that the annotation parameters include one or more of the following: the type of annotation graphic, the size of the annotation graphic, the coordinates of the center point of the annotation graphic, and the type of annotation object; and the step of determining the data annotation result based on the annotation parameters corresponding to each target annotation object in the target annotation data comprises: statistically obtaining the quantity distribution of each labeling type based on the number of target labeled objects and the labeling type corresponding to each target labeled object, and using the quantity distribution as the data labeling result; and/or determining the occlusion status of each target labeled object based on the number of target labeled objects and on the type of labeled graphic, the size of the labeled graphic, and the coordinates of the center point of the labeled graphic corresponding to each target labeled object, and using the occlusion status as the labeling result.

7. The method according to claim 2, characterized in that the step of obtaining the image to be labeled based on the user interface comprises: displaying, using the user interface, sample image sets corresponding to multiple application scenarios; determining a target application scenario from the multiple application scenarios based on the user's selection operation on the user interaction interface; and determining the image to be labeled from the sample image set corresponding to the target application scenario.

8. The method according to claim 7, characterized in that, after obtaining the data annotation result, the method further comprises: in response to the data annotation result indicating the quantity distribution of each annotation type, removing the image to be annotated from the sample image set if the quantity distribution indicates that the distribution of multiple annotation types is uneven.

9. A data annotation device, characterized in that it comprises: an acquisition module configured to acquire an image to be labeled and first labeling data corresponding to a first labeling object in the image to be labeled; a first data annotation module configured to input the image to be annotated and the first annotation data into an open model to obtain second annotation data corresponding to a second annotation object in the image to be annotated other than the first annotation object; an optimization module configured to correct the second labeled data to obtain incorrectly labeled third labeled data, and to optimize the open model based on the third labeled data and the image to be labeled; and a second data annotation module configured to perform data annotation on the image to be annotated based on the optimized open model to obtain a data annotation result.

10. An electronic device for data annotation, characterized in that it comprises: one or more processors; and a storage device for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method as described in any one of claims 1-8.

11. A computer-readable medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the method as described in any one of claims 1-8.

12. A computer program product comprising a computer program that, when executed by a processor, implements the method as described in any one of claims 1-8.