Data annotation system, method, and data annotation manager

Through the data annotation manager and the basic computing unit storage repository in the data annotation system, users can upload and select annotation models themselves. This solves the problem that existing technologies rely on technical personnel to integrate annotation models, and provides more flexible data annotation and a richer selection of annotation models.

CN115204256B · Active · Publication Date: 2026-06-26 · YINWANG INTELLIGENT TECHNOLOGIES CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: YINWANG INTELLIGENT TECHNOLOGIES CO LTD
Filing Date: 2020-04-30
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing data annotation systems can only integrate technical personnel's annotation models through hard coding, and cannot use users' own annotation models for annotation, resulting in poor flexibility.

Method used

A data annotation system is provided, including a data annotation manager, an annotation model storage repository, and a basic computing unit storage repository. Users can upload the basic parameter data of annotation models. The data annotation manager allocates hardware resources to a basic computing unit to construct a target computing unit, which assembles the user-defined annotation model and uses it for annotation.

Benefits of technology

This system enhances the flexibility of the data annotation system, allowing users to select and change annotation models to meet personalized needs and improve annotation efficiency and flexibility.



Abstract

Embodiments of this application disclose a data annotation system, a data annotation method, and a data annotation manager, belonging to the technical field of machine learning. The system includes a data annotation manager, an annotation model storage repository, and a basic computing unit storage repository. The data annotation manager receives a data annotation request, obtains a target basic computing unit from the basic computing unit storage repository, allocates hardware resources to it to establish a target computing unit, obtains the first storage path information of the basic parameter data of a first annotation model, and sends it to the target computing unit. The target computing unit uses the first storage path information to obtain the basic parameter data of the annotation model to be used from the annotation model storage repository, combines the target model inference framework and the basic parameter data of the first annotation model into the first annotation model, and uses the first annotation model to annotate the data to be annotated. With this application, a richer selection of annotation models is available, and users are better served with data annotation.

Description

[0001] This application is a divisional application. The original application has the application number 202080005146.9 and the original application date is April 30, 2020. The entire contents of the original application are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of machine learning technology, and in particular to a data annotation system, method and data annotation manager.

Background Technology

[0003] A usable machine learning model requires both model building and training. Model training typically involves collecting and labeling a large amount of sample data. Each sample and its corresponding label are then used as a training set to train the model. Therefore, sample labeling is an essential step in model training.

[0004] Currently, users who need annotation can upload the data they want annotated to the data annotation system. The system can then use pre-integrated annotation models to annotate the uploaded data. The annotation models in the data annotation system are typically integrated into the system by technical personnel through hard coding.

[0005] In the process of developing this application, the inventors discovered that the prior art has at least the following problems:

[0006] Currently, annotation models can only be integrated into data annotation systems by technical personnel through hard-coding, and users cannot use their own annotation models for annotation. If users need to use their own annotation models to annotate sample data, this is not possible within the current data annotation system. Therefore, the current data annotation system has poor flexibility, only supporting annotation models integrated by technical personnel.

Summary of the Invention

[0007] To address the problem of poor flexibility in data annotation systems in related technologies, which can only use annotation models integrated by technicians, this application provides a data annotation system, method, and data annotation manager. The technical solution is as follows:

[0008] Firstly, a data annotation system is provided, comprising a data annotation manager, an annotation model storage repository, and a basic computing unit storage repository, wherein:

[0009] The data annotation manager is used to receive data annotation requests sent by clients, wherein the data annotation requests carry model identifiers and hardware resource allocation information of the first annotation model; to obtain target basic computing units from the basic computing unit storage repository, wherein the target basic computing unit includes a target model inference framework and a hardware driver invocation tool corresponding to the first annotation model; to allocate hardware resources to the target basic computing unit based on the hardware resource allocation information, and to establish the target computing unit; and to obtain the first storage path information of the basic parameter data corresponding to the model identifier of the first annotation model from the correspondence between the stored model identifiers and the storage path information of the basic parameter data, and to send the first storage path information to the target computing unit.

[0010] The target computing unit is configured to obtain the basic parameter data of the annotation model to be used from the annotation model storage warehouse through the first storage path information, wherein the basic parameter data of the first annotation model includes the trained values of the trainable parameters in the first annotation model; combine the target model inference framework and the basic parameter data of the first annotation model to obtain the first annotation model; obtain the data to be annotated; input the data to be annotated into the first annotation model to annotate the data to be annotated.

[0011] In the scheme shown in the embodiments of this application, the data annotation system can be implemented in a single server. For example, the data annotation manager is a functional module in the single server, while the annotation model storage repository and the basic computing unit storage repository are storage areas in the single server. Of course, the data annotation system can also be a server cluster, in which the data annotation manager, annotation model storage repository, and basic computing unit storage repository can be deployed on different servers in the server cluster.

[0012] The aforementioned basic computing unit can be a program that includes a model inference framework, hardware driver calling tools, and environment files supporting language execution. The model inference framework can be Convolutional Architecture for Fast Feature Embedding (Caffe), TensorFlow, PyTorch, etc. These model inference frameworks can be stored by technical personnel in the basic computing unit storage repository of the data annotation system. Users can also choose the hardware resources used during annotation; for example, users can specify the number of central processing units (CPUs) and graphics processing units (GPUs). That is, users can specify hardware resources according to their actual needs, rather than relying on hardware resources pre-allocated by the data annotation system, which better meets user requirements.

[0013] When users have annotation needs, the data annotation manager in the data annotation system can select a basic computing unit that includes the target model inference framework and allocate hardware resources to the basic computing unit to build the target computing unit.

[0014] Subsequently, the target computation unit can obtain the basic parameter data of the first annotation model and combine it with the target model inference framework to form the first annotation model. Then, the first annotation model can be used to annotate the data to be annotated. It can be seen that in the solution shown in the embodiments of this application, it is not necessary to integrate the annotation model into the data annotation system in a hard-coded manner, making the data annotation system more flexible and allowing for a wider range of available annotation models.
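
The flow described in paragraphs [0009] to [0014] can be sketched roughly as follows. This is only an illustrative sketch, not the implementation disclosed in this application: every class, function, and dictionary name below is hypothetical, the "model inference framework" is reduced to a function that turns parameter data into a callable model, and hardware allocation is reduced to recording CPU and GPU counts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; the application does not define a concrete API.
Params = Dict[str, float]
Model = Callable[[float], str]


@dataclass
class BasicComputingUnit:
    framework_id: str
    # Stand-in for a model inference framework: parameter data -> usable model.
    build_model: Callable[[Params], Model]


@dataclass
class TargetComputingUnit:
    base: BasicComputingUnit
    cpus: int
    gpus: int

    def annotate(self, params_path: str, model_store: Dict[str, Params],
                 data: List[float]) -> List[str]:
        params = model_store[params_path]      # fetch basic parameter data by path
        model = self.base.build_model(params)  # framework + parameters -> model
        return [model(x) for x in data]


class DataAnnotationManager:
    def __init__(self, unit_repo: Dict[str, BasicComputingUnit],
                 unit_for_model: Dict[str, str],
                 path_for_model: Dict[str, str],
                 model_store: Dict[str, Params]) -> None:
        self.unit_repo = unit_repo            # basic computing unit storage repository
        self.unit_for_model = unit_for_model  # model identifier -> unit identifier
        self.path_for_model = path_for_model  # model identifier -> storage path
        self.model_store = model_store        # annotation model storage repository

    def handle_request(self, model_id: str, cpus: int, gpus: int,
                       data: List[float]) -> List[str]:
        base = self.unit_repo[self.unit_for_model[model_id]]
        unit = TargetComputingUnit(base, cpus=cpus, gpus=gpus)  # record resources
        path = self.path_for_model[model_id]                    # first storage path
        return unit.annotate(path, self.model_store, data)


def threshold_framework(params: Params) -> Model:
    """Toy framework: labels a number by comparing it with a trained threshold."""
    return lambda x: "positive" if x >= params["threshold"] else "negative"


manager = DataAnnotationManager(
    unit_repo={"unit-1": BasicComputingUnit("toy", threshold_framework)},
    unit_for_model={"model-a": "unit-1"},
    path_for_model={"model-a": "/models/model-a"},
    model_store={"/models/model-a": {"threshold": 0.5}},
)
labels = manager.handle_request("model-a", cpus=2, gpus=0, data=[0.2, 0.9])
```

In this sketch the model is assembled only when a request arrives, which mirrors the point of paragraph [0014]: nothing about a particular annotation model is hard-coded into the manager.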

[0015] In one possible implementation, the data annotation manager is further configured to:

[0016] Receive the model identifier and basic parameter data of the first annotation model sent by the client;

[0017] The basic parameter data of the first annotation model is stored in the annotation model storage warehouse, and the model identifier of the first annotation model and the first storage path information of the basic parameter data of the first annotation model are stored in correspondence with each other.

[0018] In the scheme shown in the embodiments of this application, the user uploads the basic parameter data of the first annotation model to the data annotation system. The data annotation manager in the data annotation system can receive the basic parameter data of the first annotation model uploaded by the user and store it in the annotation model storage warehouse of the data annotation system, and store the model identifier of the first annotation model and the first storage path information of the basic parameter data of the first annotation model in correspondence with each other.

[0019] When annotating the data to be annotated, users can choose to combine the basic parameter data of their uploaded first annotation model with the target model inference framework provided by the data annotation system to generate the first annotation model.
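
The upload step in paragraphs [0016] to [0018] amounts to storing the parameter data and recording where it was stored. A minimal sketch, assuming a local directory stands in for the annotation model storage warehouse and that the hypothetical `AnnotationModelStore` class, its method names, and the `.bin` file naming are illustrative choices only:

```python
import tempfile
from pathlib import Path
from typing import Dict


class AnnotationModelStore:
    """Hypothetical store: writes uploaded parameter data and records its path."""

    def __init__(self, root: Path) -> None:
        self.root = root
        # Correspondence between model identifier and storage path information.
        self.path_for_model: Dict[str, Path] = {}

    def upload(self, model_id: str, parameter_data: bytes) -> Path:
        # Note: model_id is used as a file name without sanitising, for brevity.
        path = self.root / f"{model_id}.bin"
        path.write_bytes(parameter_data)      # store the basic parameter data
        self.path_for_model[model_id] = path  # store identifier -> path
        return path

    def load(self, model_id: str) -> bytes:
        # Look up the storage path by model identifier, then read the data.
        return self.path_for_model[model_id].read_bytes()


store = AnnotationModelStore(Path(tempfile.mkdtemp()))
stored_path = store.upload("my-detector", b"\x00\x01\x02")
```

A later annotation request then only needs to carry the model identifier; the path is resolved on the system side.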

[0020] In one possible implementation, the data annotation request also carries a data identifier for the data to be annotated, and the data annotation manager is further configured to:

[0021] In the correspondence between stored data identifiers and data storage path information, obtain the second storage path information corresponding to the data identifier of the data to be labeled;

[0022] Send the second storage path information to the target computing unit;

[0023] The target computing unit is used for:

[0024] The data to be labeled is obtained using the second storage path information.

[0025] In one possible implementation, the data annotation request also carries a framework identifier of the target model inference framework, and the data annotation manager is used for:

[0026] Based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework is obtained from the basic computing unit storage repository.

[0027] In the solution illustrated in this application embodiment, when a user uploads their own basic parameter data for an annotation model, the user can also select the model inference framework to use. Upon receiving a data annotation request, the data annotation manager in the data annotation system selects the target basic computing unit containing the target model inference framework based on the framework identifier carried within the request.
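
Selecting a basic computing unit by framework identifier, as in paragraphs [0025] to [0027], could look like the sketch below. The `BasicComputingUnit` fields, the unit identifiers, and the lowercase framework identifiers are assumptions made for illustration; the application does not specify how identifiers are formatted.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class BasicComputingUnit:
    unit_id: str
    frameworks: FrozenSet[str]  # identifiers of the frameworks packaged in the unit


def select_unit(repo: List[BasicComputingUnit], framework_id: str) -> BasicComputingUnit:
    """Return the first unit in the repository that contains the requested framework."""
    for unit in repo:
        if framework_id in unit.frameworks:
            return unit
    raise LookupError(f"no basic computing unit provides {framework_id!r}")


repo = [
    BasicComputingUnit("unit-caffe", frozenset({"caffe"})),
    BasicComputingUnit("unit-dl", frozenset({"tensorflow", "pytorch"})),
]
chosen = select_unit(repo, "pytorch")
```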

[0028] In one possible implementation, the data annotation manager is further configured to:

[0029] During the process of the target computing unit annotating the data to be annotated using the first annotation model, an annotation model replacement request sent by the client is received, wherein the annotation model replacement request carries the model identifier of the second annotation model; in the correspondence between the stored model identifier and the storage path information of the basic parameter data, the third storage path information of the basic parameter data corresponding to the model identifier of the second annotation model is obtained; a model replacement instruction is sent to the target computing unit, wherein the model replacement instruction carries the third storage path information;

[0030] The target computing unit is used to stop labeling unlabeled data to be labeled; obtain the basic parameter data of the second labeling model from the labeling model storage warehouse through the third storage path information; replace the basic parameter data of the first labeling model in the target model inference framework with the basic parameter data of the second labeling model to obtain the second labeling model, wherein the model inference framework corresponding to the second labeling model and the first labeling model are the same; input the unlabeled data to be labeled into the second labeling model to label the unlabeled data to be labeled.

[0031] In the solution illustrated in this application embodiment, during the annotation of the data to be annotated, the user can choose to change the annotation model used for the data that has not yet been annotated. For example, after using the first annotation model to annotate the data for ten minutes, the user may need to use the second annotation model to annotate the remaining data. At this time, the user can send an annotation model replacement request to the data annotation system. Based on this request, the data annotation manager in the data annotation system can instruct the target computing unit to replace the basic parameter data of the first annotation model in the target model inference framework with the basic parameter data of the second annotation model, forming the second annotation model. The second annotation model can then be used to continue annotating the remaining data. In this process, the user only needs to select the annotation model to change to, without having to re-upload the data to be annotated, which improves annotation efficiency.
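
The replacement behaviour in paragraphs [0029] to [0031] can be illustrated with a small single-threaded sketch: the target computing unit checks for a pending replacement instruction between items, rebuilds the model from the new parameter data within the same framework, and continues with the items not yet annotated. All names are hypothetical, and the `after_each` callback is only a device to simulate a client request arriving mid-run; a real system would presumably receive the instruction over a network interface.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Params = Dict[str, float]
Model = Callable[[float], str]


class TargetComputingUnit:
    def __init__(self, build_model: Callable[[Params], Model],
                 model_store: Dict[str, Params], params_path: str) -> None:
        self.build_model = build_model  # stand-in for the model inference framework
        self.model_store = model_store
        self.model = build_model(model_store[params_path])
        self.pending_path: Optional[str] = None

    def request_replacement(self, params_path: str) -> None:
        """Model replacement instruction carrying the third storage path."""
        self.pending_path = params_path

    def run(self, data: List[float],
            after_each: Optional[Callable[[int], None]] = None) -> List[Tuple[float, str]]:
        queue: Deque[float] = deque(data)
        results: List[Tuple[float, str]] = []
        while queue:
            if self.pending_path is not None:
                # Same framework, new parameter data -> second annotation model.
                self.model = self.build_model(self.model_store[self.pending_path])
                self.pending_path = None
            x = queue.popleft()
            results.append((x, self.model(x)))
            if after_each is not None:
                after_each(len(results))
        return results


def threshold_framework(params: Params) -> Model:
    return lambda x: "high" if x >= params["threshold"] else "low"


store = {"/models/first": {"threshold": 0.5}, "/models/second": {"threshold": 0.8}}
unit = TargetComputingUnit(threshold_framework, store, "/models/first")


def client_changes_model(done: int) -> None:
    if done == 2:  # simulate the user switching models after two items
        unit.request_replacement("/models/second")


results = unit.run([0.6, 0.7, 0.6, 0.9], after_each=client_changes_model)
```

Items already annotated keep their labels from the first model; only the remaining items are processed by the second model, and the input data is never re-uploaded.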

[0032] Secondly, a data annotation method is provided, the method comprising:

[0033] Receive a data annotation request sent by the client, wherein the data annotation request carries the model identifier and hardware resource allocation information of the first annotation model;

[0034] Obtain the target basic computing unit from the basic computing unit storage repository, wherein the target basic computing unit includes the target model inference framework and hardware driver calling tool corresponding to the first annotation model;

[0035] Based on the hardware resource allocation information, hardware resources are allocated to the target basic computing unit to establish the target computing unit;

[0036] In the correspondence between the stored model identifier and the storage path information of the basic parameter data, the first storage path information of the basic parameter data corresponding to the model identifier of the first annotation model is obtained, and the first storage path information is sent to the target computing unit so that the target computing unit can obtain the basic parameter data of the annotation model to be used from the annotation model storage warehouse through the first storage path information, combine the target model inference framework and the basic parameter data of the first annotation model to obtain the first annotation model, obtain the data to be annotated, and input the data to be annotated into the first annotation model to annotate it. The basic parameter data of the first annotation model includes the trained values of the trainable parameters in the first annotation model.

[0037] In one possible implementation, the method further includes:

[0038] Receive the model identifier and basic parameter data of the first annotation model sent by the client;

[0039] The basic parameter data of the first annotation model is stored in the annotation model storage warehouse, and the model identifier of the first annotation model and the first storage path information of the basic parameter data of the first annotation model are stored in correspondence with each other.

[0040] In one possible implementation, the data annotation request also carries a data identifier for the data to be annotated, and the method further includes:

[0041] In the correspondence between stored data identifiers and data storage path information, obtain the second storage path information corresponding to the data identifier of the data to be labeled;

[0042] The second storage path information is sent to the target computing unit so that the target computing unit can obtain the data to be labeled through the second storage path information.

[0043] In one possible implementation, the data annotation request also carries the framework identifier of the target model inference framework, and the step of obtaining the target basic computing unit from the basic computing unit storage repository includes:

[0044] Based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework is obtained from the basic computing unit storage repository.

[0045] In one possible implementation, the method further includes:

[0046] During the process of the target computing unit annotating the data to be annotated using the first annotation model, an annotation model replacement request sent by the client is received, wherein the annotation model replacement request carries the model identifier of the second annotation model;

[0047] In the correspondence between the stored model identifier and the storage path information of the basic parameter data, obtain the third storage path information of the basic parameter data corresponding to the model identifier of the second annotation model;

[0048] A model replacement instruction is sent to the target computing unit, wherein the model replacement instruction carries the third storage path information, so that the target computing unit stops labeling the unlabeled data to be labeled. Through the third storage path information, the basic parameter data of the second labeling model is obtained from the labeling model storage warehouse. The basic parameter data of the first labeling model in the target model inference framework is replaced with the basic parameter data of the second labeling model to obtain the second labeling model. The unlabeled data to be labeled is input into the second labeling model to label the unlabeled data to be labeled. The model inference frameworks corresponding to the second labeling model and the first labeling model are the same.

[0049] Thirdly, a data annotation apparatus is provided, the apparatus comprising:

[0050] The receiving module is used to receive a data annotation request sent by the client, wherein the data annotation request carries the model identifier and hardware resource allocation information of the first annotation model;

[0051] The acquisition module is used to acquire a target basic computing unit from the basic computing unit storage warehouse, wherein the target basic computing unit includes a target model inference framework and a hardware driver calling tool corresponding to the first annotation model;

[0052] The allocation module is used to allocate hardware resources to the target basic computing unit based on the hardware resource allocation information, and to establish the target computing unit;

[0053] The sending module is used to obtain the first storage path information of the basic parameter data corresponding to the model identifier of the first annotation model from the correspondence between the stored model identifier and the storage path information of the basic parameter data, and send the first storage path information to the target computing unit so that the target computing unit can obtain the basic parameter data of the annotation model to be used from the annotation model storage warehouse through the first storage path information, combine the target model inference framework and the basic parameter data of the first annotation model to obtain the first annotation model, obtain the data to be annotated, and input the data to be annotated into the first annotation model to annotate it. The basic parameter data of the first annotation model includes the trained values of the trainable parameters in the first annotation model.

[0054] In one possible implementation, the receiving module is further configured to:

[0055] Receive the model identifier and basic parameter data of the first annotation model sent by the client; store the basic parameter data of the first annotation model in the annotation model storage warehouse, and store the model identifier of the first annotation model and the first storage path information of the basic parameter data of the first annotation model in correspondence with each other.

[0056] In one possible implementation, the data annotation request also carries a data identifier of the data to be annotated, and the acquisition module is further configured to:

[0057] In the correspondence between stored data identifiers and data storage path information, obtain the second storage path information corresponding to the data identifier of the data to be labeled;

[0058] The second storage path information is sent to the target computing unit so that the target computing unit can obtain the data to be labeled through the second storage path information.

[0059] In one possible implementation, the data annotation request also carries the framework identifier of the target model inference framework, and the acquisition module is used to:

[0060] Based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework is obtained from the basic computing unit storage repository.

[0061] In one possible implementation, the device further includes:

[0062] The replacement module is used to receive a labeling model replacement request sent by the client during the process of the target computing unit labeling the data to be labeled using the first labeling model, wherein the labeling model replacement request carries a model identifier of the second labeling model.

[0063] In the correspondence between the stored model identifier and the storage path information of the basic parameter data, obtain the third storage path information of the basic parameter data corresponding to the model identifier of the second annotation model;

[0064] A model replacement instruction is sent to the target computing unit, wherein the model replacement instruction carries the third storage path information, so that the target computing unit stops labeling the unlabeled data to be labeled. Through the third storage path information, the basic parameter data of the second labeling model is obtained from the labeling model storage warehouse. The basic parameter data of the first labeling model in the target model inference framework is replaced with the basic parameter data of the second labeling model to obtain the second labeling model. The unlabeled data to be labeled is input into the second labeling model to label the unlabeled data to be labeled. The model inference frameworks corresponding to the second labeling model and the first labeling model are the same.

[0065] Fourthly, a data annotation manager is provided, the data annotation manager including a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to perform the operations performed by the data annotation method as described in the second aspect above.

[0066] Fifthly, a computer-readable storage medium is provided, wherein at least one instruction is stored therein, the instruction being loaded and executed by a processor to perform the operations performed by the data annotation method as described in the second aspect above.

[0067] In the scheme shown in this application embodiment, when a user has annotation needs, they can send a data annotation request to the data annotation system. The data annotation manager in the data annotation system can receive the data annotation request. It then retrieves the basic parameter data of the first annotation model from the annotation model storage repository and the target basic computing unit, which includes the target model inference framework, from the basic computing unit storage repository. Simultaneously, hardware resources are allocated to this target basic computing unit to construct a target computing unit. This target computing unit can combine the target model inference framework and the basic parameter data of the first annotation model into a first annotation model. Afterward, the target computing unit can use this first annotation model to annotate the data to be annotated. This avoids the need to hard-code the annotation model into the data annotation system. The sources of the basic parameter data for the annotation model can be diverse, not limited to integration by technical personnel, making the annotation model more flexible and offering a richer selection of annotation models, thus providing better data annotation services to users. Furthermore, the hardware resources are specified by the user, better meeting their annotation needs.

Attached Figure Description

[0068] Figure 1 is a schematic diagram of an implementation environment provided in an embodiment of this application;

[0069] Figure 2 is a schematic diagram of the structure of a data annotation manager provided in an embodiment of this application;

[0070] Figure 3 is a flowchart of a data annotation method provided in an embodiment of this application;

[0071] Figure 4 is a schematic diagram of an annotation model storage warehouse provided in an embodiment of this application;

[0072] Figure 5 is a schematic diagram of a data annotation system provided in an embodiment of this application;

[0073] Figure 6 is an interaction flowchart of a data annotation manager and a target computing unit provided in an embodiment of this application;

[0074] Figure 7 is a schematic diagram of a data annotation device provided in an embodiment of this application.

Detailed Implementation

[0075] This application provides a data annotation method, which can be implemented by a data annotation manager in a data annotation system. This data annotation system can be a single server or a server cluster. Figure 1 illustrates an implementation environment provided in an embodiment of this application. This environment may include a client and a data annotation system. Users can select an annotation model to use through the client, upload data to be annotated to the data annotation system, and send a data annotation request. The data annotation system can assemble the user-selected annotation model and use the assembled annotation model to annotate the user-uploaded data. Furthermore, users can also upload basic parameter data of an annotation model to the data annotation system through the client, so that the data annotation system can subsequently combine this user-uploaded basic parameter data with a model inference framework and annotate the user-uploaded data.

[0076] The structure of the above data annotation manager is shown in Figure 2. Referring to Figure 2, the data annotation manager may include a processor 210, a receiver 220, and a transmitter 230, where the receiver 220 and the transmitter 230 can be connected to the processor 210. The receiver 220 and transmitter 230 may be network interface cards (NICs). The receiver 220 can receive data annotation requests from clients, and the transmitter 230 can send annotation result data to clients. The processor 210 may be the control center of the data annotation manager, connecting various parts of the data annotation manager, such as the receiver 220 and transmitter 230, using various interfaces and lines. In this application, the processor 210 may be a central processing unit (CPU), and optionally, the processor 210 may include one or more processing units. The processor 210 may also be a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device. The data annotation manager may also include a memory 240, which can be used to store software programs and modules. The processor 210 performs annotation processing on the data to be annotated by reading the software code and modules stored in the memory 240.

[0077] Figure 3 is a flowchart of a data annotation method provided in an embodiment of this application. The method may include the following steps:

[0078] Step 301: Receive the data annotation request sent by the client.

[0079] In implementation, users can log in to their pre-registered target account in the data annotation system via the client. After successfully logging in, users can upload the data to be annotated through the client. For example, the data to be annotated can be an image. Before uploading, users can package and name the data according to the data specifications of the data annotation system. After receiving the data uploaded by the client, the data annotation manager in the data annotation system can store the data, and store the correspondence between the data identifier and the storage path information of the data to be annotated.

[0080] After the client uploads the data to be labeled to the data labeling system, the system can send a list of data to be labeled to the client. This list can include data identifiers for data recently uploaded by the user, as well as data identifiers for data previously uploaded. Here, the data identifier is the name the user gave to the data to be labeled before uploading it. The user can select the data identifiers of the data to be labeled from the list.

[0081] In addition, the client displays an annotation model upload option. Users can select this option to upload the basic parameter data of an annotation model to the data annotation system. The basic parameter data includes the trained values of the trainable parameters (weights) in the annotation model. Users can package and name the basic parameter data according to the model specifications of the data annotation system before uploading it. When uploading the basic parameter data, the client can send the account identifier of the target account, the model identifier of the annotation model, and the basic parameter data together to the data annotation system. The data annotation system then stores the uploaded basic parameter data in the annotation model storage repository, and stores the model identifier, the account identifier of the target account, and the storage path information in correspondence with each other.

[0082] The client can also display an annotation model selection option. When a user selects this option, the client is triggered to send a request to the data annotation system to retrieve the annotation model list. This request can include the account identifier of the currently logged-in target account. Upon receiving this request, the data annotation system retrieves the model identifiers of the public annotation models and the model identifiers corresponding to the account identifier of the target account, forming an annotation model list for that target account, which is then returned to the client. The user can then select the model identifier of the first annotation model to use from this list.
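
Building the annotation model list described in paragraph [0082] is essentially a merge of the public model identifiers with those uploaded under the requesting account. A minimal sketch, in which the function name, the dictionary layout, and the example identifiers are all assumed for illustration:

```python
from typing import Dict, List, Set


def list_models_for_account(account_id: str,
                            public_models: List[str],
                            models_by_account: Dict[str, List[str]]) -> List[str]:
    """Public model identifiers first, then the account's own uploads, no duplicates."""
    own = models_by_account.get(account_id, [])
    seen: Set[str] = set()
    result: List[str] = []
    for model_id in [*public_models, *own]:
        if model_id not in seen:
            seen.add(model_id)
            result.append(model_id)
    return result


models = list_models_for_account(
    "alice",
    public_models=["public-detector", "public-segmenter"],
    models_by_account={"alice": ["alice-ocr"], "bob": ["bob-classifier"]},
)
```

Models uploaded by other accounts (here `bob-classifier`) are not included in the list returned to the client.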

[0083] After selecting the data to be labeled and the first labeling model, the user can select the labeling start option in the client, which triggers the client to send a data labeling request to the data labeling system. This data labeling request can carry the data identifier of the data to be labeled and the model identifier of the first labeling model. The data labeling manager in the data labeling system can receive this data labeling request.

[0084] Step 302: Obtain the target basic computing unit from the basic computing unit storage warehouse.

[0085] There are various model inference frameworks, such as Convolutional Architecture for Fast Feature Embedding (Caffe), TensorFlow, and PyTorch.

[0086] In implementation, the basic parameter data and corresponding model inference framework of the labeled model in the data annotation system can be stored separately. Specifically, the basic parameter data of the labeled model can be stored in the labeled model storage repository, while the model inference framework can be packaged into a basic computing unit and stored in the basic computing unit storage repository. Each basic computing unit may include at least one model inference framework, and may also include a Toolkit for calling hardware drivers, a Runtime supporting language execution, and a data annotation manager interaction module, etc.

[0087] There are several ways to obtain the target basic computing unit, which includes the target model inference framework corresponding to the first labeling model. Several of them are described below.

[0088] Method 1: The data annotation system can store the correspondence between the model identifiers of all annotation models available to the user and the identifiers of the basic computing units.

[0089] For common annotation models, technical personnel can configure the correspondence between model identifiers and basic computing unit identifiers in the data annotation system. The basic computing unit corresponding to the model identifier of an annotation model should include the model inference framework corresponding to that annotation model. For an annotation model corresponding to user-uploaded basic parameter data, the user can specify the corresponding model inference framework when uploading the basic parameter data. The client can then send the model identifier of that annotation model and the framework identifier of the user-specified model inference framework to the data annotation system. The data annotation system then stores the correspondence between the model identifier of that annotation model and the identifier of the basic computing unit containing the user-specified model inference framework.

[0090] After receiving a data annotation request, the data annotation manager can determine the identifier of the corresponding basic computing unit based on the model identifier carried in the request. If multiple basic computing unit identifiers are determined, the identifier of the target basic computing unit can be randomly selected from them.

[0091] Then, the data annotation manager can retrieve the target basic computing unit from the basic computing unit storage repository.
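Method 1 amounts to a table lookup followed by a random choice among the candidates. The sketch below assumes the correspondence is held as a dictionary from model identifier to a list of basic computing unit identifiers. The function name `select_unit_id` and the optional `rng` parameter are hypothetical.

```python
import random
from typing import Dict, List, Optional


def select_unit_id(
    model_to_units: Dict[str, List[str]],
    model_id: str,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick the identifier of the target basic computing unit for a model."""
    candidates = model_to_units.get(model_id, [])
    if not candidates:
        raise KeyError(f"no basic computing unit registered for {model_id!r}")
    # When several units match, one is selected at random, as in Method 1.
    chooser = rng if rng is not None else random
    return chooser.choice(candidates)
```

Passing a seeded `random.Random` instance makes the selection reproducible, which can be convenient when testing.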

[0092] Method 2: The data annotation system can store the correspondence between the model identifiers of the common annotation model and the identifiers of the basic calculation units.

[0093] After a user selects the first annotation model through the client, if the selected first annotation model is a common annotation model, the user can select the annotation start option to trigger the client to send a data annotation request to the data annotation system. This data annotation request can carry the data identifier of the data to be annotated and the model identifier of the first annotation model. The data annotation manager in the data annotation system can obtain the identifier of the target basic computing unit based on the model identifier of the first annotation model in the data annotation request, within the correspondence between model identifiers of common annotation models and identifiers of basic computing units. Furthermore, the target basic computing unit can be retrieved from the basic computing unit storage repository.

[0094] If the user selects an annotation model corresponding to user-uploaded basic parameter data as the first annotation model, the client can navigate to a model inference framework selection interface. This interface displays a list of model inference frameworks, including the framework identifiers of the various model inference frameworks. The user can select the framework identifier of the target model inference framework as needed. After selecting the target model inference framework, the user can select the annotation start option in the client to trigger the client to send a data annotation request to the data annotation system. This request can carry the data identifier of the data to be annotated, the framework identifier of the target model inference framework, and the model identifier of the first annotation model. In this case, the data annotation system stores the correspondence between framework identifiers of model inference frameworks and identifiers of basic computing units. Based on the framework identifier in the data annotation request, the data annotation manager can obtain the identifier of the corresponding target basic computing unit from this correspondence, and then retrieve the target basic computing unit from the basic computing unit storage repository.
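The two branches of Method 2 can be sketched as a single resolution function. The request is modeled here as a plain dictionary with the hypothetical keys `model_id` and `framework_id`. The two lookup tables stand in for the stored correspondences and are assumptions about representation only.

```python
from typing import Dict, Mapping, Optional


def resolve_unit_id(
    request: Mapping[str, Optional[str]],
    common_model_to_unit: Dict[str, str],
    framework_to_unit: Dict[str, str],
) -> str:
    """Resolve the target basic computing unit identifier under Method 2."""
    model_id = request["model_id"]
    # Common annotation model: the unit is found from the model identifier.
    if model_id in common_model_to_unit:
        return common_model_to_unit[model_id]
    # User-uploaded model: the request must also carry a framework identifier.
    framework_id = request.get("framework_id")
    if framework_id is None:
        raise ValueError("a framework identifier is required for user-uploaded models")
    return framework_to_unit[framework_id]
```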

[0095] Step 303: Based on the hardware resource allocation information, allocate hardware resources to the target basic computing unit and establish the target computing unit.

[0096] In implementation, a hardware resource allocation option can be displayed on the client side. Before selecting the annotation start option, users can first select the hardware resource allocation option to enter the hardware resource allocation interface. In the hardware resource allocation interface, users can input the required hardware resource allocation information according to their actual needs. The hardware resource allocation information can include the number of CPUs, the number of Graphics Processing Units (GPUs), etc. Correspondingly, the hardware resource allocation information can also be carried in the data annotation request sent by the client. After receiving the data annotation request, the data annotation manager can allocate hardware resources to the target basic computing unit according to the hardware resource allocation information carried in it, thereby constructing a target computing unit.
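A minimal sketch of step 303 follows, assuming the hardware resource allocation information is just a CPU count and a GPU count. The data classes and the availability check are illustrative assumptions; this application does not specify how the manager tracks free hardware.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HardwareAllocation:
    cpus: int
    gpus: int = 0


@dataclass(frozen=True)
class BasicComputingUnit:
    unit_id: str
    frameworks: Tuple[str, ...]  # model inference frameworks packaged in the unit


@dataclass(frozen=True)
class TargetComputingUnit:
    base: BasicComputingUnit
    allocation: HardwareAllocation


def establish_target_unit(
    base: BasicComputingUnit,
    requested: HardwareAllocation,
    available: HardwareAllocation,
) -> TargetComputingUnit:
    """Bind the requested hardware to a basic computing unit."""
    if requested.cpus < 1 or requested.gpus < 0:
        raise ValueError("at least one CPU and a non-negative GPU count are required")
    # Checking against free capacity is an assumption added for illustration.
    if requested.cpus > available.cpus or requested.gpus > available.gpus:
        raise ValueError("requested hardware exceeds what is available")
    return TargetComputingUnit(base=base, allocation=requested)
```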

[0097] Step 304: Based on the correspondence between the stored model identifiers and the storage path information of the basic parameter data, obtain the first storage path information of the basic parameter data corresponding to the model identifier of the first labeled model, and send the first storage path information to the target computing unit. This enables the target computing unit to retrieve the basic parameter data of the labeled model to be used from the labeled model storage repository using the first storage path information, combine the target model inference framework and the basic parameter data of the first labeled model to obtain the first labeled model, obtain the data to be labeled, input the data to be labeled into the first labeled model, and label the data to be labeled.

[0098] In implementation, the data annotation manager can obtain the model identifier of the first annotation model carried in the data annotation request, and obtain the first storage path information of the basic parameter data of the first annotation model from the correspondence between the stored model identifier and the storage path information of the basic parameter data. Then, the data annotation manager can send the first storage path information of the basic parameter data of the first annotation model to the target computing unit.

[0099] After receiving the first storage path information, the target computing unit can obtain the basic parameter data of the first annotation model from the annotation model storage warehouse according to the first storage path information.

[0100] In the labeled model storage repository, the basic parameter data of each labeled model can be packaged together with the corresponding labeled inference script and the dependent files of the labeled inference script for storage. Here, the combination of basic parameter data, corresponding labeled inference script, and dependent files of the labeled inference script can be referred to as the model base file.

[0101] Correspondingly, when the target computing unit obtains the basic parameter data of the first labeling model, it can obtain the target model base file containing that basic parameter data. The model base files of the common labeling models in the labeling model storage repository can be written and packaged by technical personnel. The model base files uploaded by users through the client, however, need to be written and packaged by the users according to the model specifications of the data labeling system.

[0102] The annotation inference script in the model base file needs to provide the following interfaces: annotation model loading interface, unannotated data preprocessing interface, data annotation interface, and annotation result data processing interface. Specifically, the annotation model loading interface loads the annotation model into memory; the unannotated data preprocessing interface preprocesses the unannotated data, such as format conversion, to make the data compatible with the annotation model; the data annotation interface instructs how to annotate the data, such as parallel annotation or serial annotation; and the annotation result data processing interface converts the format of the annotation result data output by the annotation model to meet the user's format requirements for the annotation result data.
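The four interfaces can be pictured as an abstract base class that every annotation inference script implements. The sketch below is hypothetical: the method names, the `KeywordScript` toy model, and the `run_script` driver are illustrative assumptions and do not reflect the model specifications of any actual system.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class AnnotationInferenceScript(ABC):
    """The four interfaces an annotation inference script provides."""

    @abstractmethod
    def load_model(self, parameter_data: Dict[str, Any]) -> Any: ...

    @abstractmethod
    def preprocess(self, raw_item: Any) -> Any: ...

    @abstractmethod
    def annotate(self, model: Any, items: List[Any]) -> List[Any]: ...

    @abstractmethod
    def postprocess(self, result: Any) -> Any: ...


class KeywordScript(AnnotationInferenceScript):
    """Toy script: labels text by whether it contains a stored keyword."""

    def load_model(self, parameter_data):
        keyword = parameter_data["keyword"]
        return lambda text: keyword in text

    def preprocess(self, raw_item):
        return str(raw_item).lower()  # format conversion before annotation

    def annotate(self, model, items):
        return [model(item) for item in items]  # serial annotation

    def postprocess(self, result):
        return {"label": "match" if result else "no_match"}


def run_script(script: AnnotationInferenceScript,
               parameter_data: Dict[str, Any],
               raw_items: List[Any]) -> List[Any]:
    """Call the interfaces in order: load, preprocess, annotate, postprocess."""
    model = script.load_model(parameter_data)
    prepared = [script.preprocess(item) for item in raw_items]
    return [script.postprocess(r) for r in script.annotate(model, prepared)]
```

Because the driver only depends on the abstract interfaces, a user-uploaded script and a common script can be executed by the same computing unit code.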

[0103] As shown in Figure 4, the annotation model storage repository can store the model base files of the common annotation models and the model base files of the annotation models uploaded by each user. The common model base files can include the model base files of annotation model 1, annotation model 2, annotation model 3, and annotation model 4. The model base files uploaded by user 1 include those of annotation model 5 and annotation model 6. The model base files uploaded by user 2 include those of annotation model 7 and annotation model 8.

[0104] After obtaining the basic parameter data of the first labeled model, the target computing unit can add the basic parameter data to the target model inference framework to obtain the first labeled model.

[0105] The data annotation manager can obtain the data identifier of the data to be annotated carried in the data annotation request, and obtain the second storage path information of the data to be annotated based on the correspondence between the stored data identifier and the data storage path information.

[0106] The data annotation manager can send the second storage path information of the data to be annotated to the target computing unit. The target computing unit can then retrieve the data to be annotated based on this second storage path information.

[0107] Then, the target computation unit can execute the annotation inference script of the target model base file, call the annotation model loading interface in the annotation inference script, and load the first annotation model into memory. Before inputting the data to be annotated into the first annotation model, the preprocessing interface for the data to be annotated in the annotation inference script can also be called to preprocess the data. Here, preprocessing can be format conversion, that is, converting the data to be annotated into a format that the first annotation model can annotate. Then, the data annotation interface in the annotation inference script can be called to input the preprocessed data to be annotated into the first annotation model. Then, after the first annotation model outputs the annotation result data corresponding to each data to be annotated, the annotation result data processing interface in the annotation inference script can be called to perform post-annotation processing on the annotation result data. Here, post-annotation processing can be format conversion, for example, converting the output JavaScript Object Notation (JSON) format annotation result data into Extensible Markup Language (XML) format.
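The JSON-to-XML post-annotation processing mentioned above can be sketched with the Python standard library. The function name and the `annotation` root element are assumptions, and the sketch only handles a flat JSON object whose keys are valid XML element names.

```python
import json
import xml.etree.ElementTree as ET


def json_annotation_to_xml(json_text: str) -> str:
    """Convert one flat JSON annotation result into an XML string."""
    record = json.loads(json_text)
    root = ET.Element("annotation")  # root element name is an assumption
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


print(json_annotation_to_xml('{"label": "cat", "score": 0.9}'))
# <annotation><label>cat</label><score>0.9</score></annotation>
```

Nested objects or lists would need a recursive conversion, which is omitted here for brevity.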

[0108] After annotation is completed, the target calculation unit can send the annotation result data to the data annotation manager, which then returns it to the client.

[0109] In one possible implementation, during the process of labeling the data to be labeled using the first labeling model, the labeling model can be changed, and the changed labeling model can be used to label the unlabeled data to be labeled. Accordingly, the processing can be as follows:

[0110] The client application displays a labeling model replacement option. Users can select this option to access the labeling model selection interface, which shows a list of replaceable labeling models. This list includes model identifiers for labeling models with the same inference framework as the first labeling model. Users can then select the model identifier of the second labeling model from this list. The client can then send a labeling model replacement request to the data labeling system.

[0111] The data annotation manager in the data annotation system receives an annotation model replacement request, which carries the model identifier of the second annotation model. The data annotation manager retrieves the third storage path information of the basic parameter data corresponding to the model identifier of the second annotation model from the stored correspondence between model identifiers and basic parameter data storage path information. It then sends an annotation model replacement instruction to the target computing unit, which may also carry the third storage path information. Upon receiving the annotation model replacement instruction, the target computing unit stops annotating the unannotated data. It then retrieves the basic parameter data of the second annotation model from the annotation model storage repository using the third storage path information. Alternatively, it can retrieve the model base file containing the basic parameter data of the second annotation model. The target computing unit can then replace the basic parameter data of the first annotation model in the target model inference framework with the basic parameter data of the second annotation model to obtain the second annotation model. It can then execute the annotation inference script in the model base file containing the basic parameter data of the second annotation model to annotate the unannotated data using the second annotation model.
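The replacement behavior can be illustrated by a simplified, single-threaded sketch in which the computing unit checks for a pending replacement before each item. Everything here is an assumption for illustration: the framework is modeled as a plain function of `(parameters, item)`, the repository as a dictionary keyed by storage path, and the `on_item` callback stands in for a replacement instruction arriving during annotation.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Framework = Callable[[Dict[str, Any], str], str]


class TargetComputingUnit:
    """Keeps one inference framework and swaps only the parameter data."""

    def __init__(self, framework: Framework, repository: Dict[str, Dict[str, Any]]) -> None:
        self.framework = framework
        self.repository = repository          # storage path -> basic parameter data
        self._params: Optional[Dict[str, Any]] = None
        self._pending_path: Optional[str] = None

    def load_parameters(self, storage_path: str) -> None:
        self._params = self.repository[storage_path]

    def request_replacement(self, storage_path: str) -> None:
        # Records the storage path carried by a replacement instruction.
        self._pending_path = storage_path

    def annotate_all(
        self,
        items: List[str],
        on_item: Optional[Callable[[str], None]] = None,
    ) -> List[Tuple[str, str]]:
        if self._params is None and self._pending_path is None:
            raise RuntimeError("no parameter data loaded")
        results = []
        for item in items:
            # Items not yet annotated are handled by the replacement model.
            if self._pending_path is not None:
                self.load_parameters(self._pending_path)
                self._pending_path = None
            results.append((item, self.framework(self._params, item)))
            if on_item is not None:
                on_item(item)
        return results
```

Because both models share the same inference framework, only the parameter data is exchanged, and the framework object is reused unchanged.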

[0112] In one possible implementation, to improve the accuracy of the annotation model, after the first annotation model is used to annotate the data to be annotated, the annotation results can be manually verified and adjusted. The verified and adjusted annotation results are used as output sample data, and the data to be annotated that was input into the first annotation model is used as input sample data. The first annotation model is then trained using the output sample data and the input sample data to update the values of the trainable parameters in the basic parameter data of the first annotation model, thereby optimizing the first annotation model.

[0113] The following describes, with reference to Figure 6, the interaction flow between the data annotation manager and the target computing unit in the data annotation system. As shown in Figure 6, during the annotation process, the interaction between the data annotation manager and the target computing unit may include the following steps:

[0114] Step 601: The data annotation manager sends the first storage path information of the basic parameter data of the first annotation model to the target calculation unit.

[0115] Step 602: The target computing unit retrieves the first basic model file containing the basic parameter data of the first labeled model from the labeled model storage repository according to the first storage path information. The basic parameter data of the first labeled model in the first basic model file is added to the target model inference framework to generate the first labeled model.

[0116] Step 603: The data annotation manager sends the second storage path information of the data to be annotated to the target computing unit.

[0117] Step 604: The target calculation unit obtains the data to be labeled based on the second storage path information.

[0118] Step 605: The target calculation unit executes the annotation inference script in the first basic model file and annotates the data to be annotated through the first annotation model.

[0119] Step 606: The data annotation manager sends an annotation model replacement instruction to the target calculation unit. The annotation model replacement instruction carries the third storage path information of the basic parameter data of the second annotation model.

[0120] Step 607: The target calculation unit stops annotating the unannotated data to be labeled. Based on the third storage path information, it retrieves the second basic model file, which includes the basic parameter data of the second labeled model, from the labeled model storage repository. The basic parameter data of the first labeled model in the target model inference framework is replaced with the basic parameter data of the second labeled model in the second basic model file, generating the second labeled model.

[0121] Step 608: The target calculation unit executes the annotation inference script in the second basic model file and annotates the unannotated data to be annotated through the second annotation model.

[0122] In the scheme shown in this application embodiment, when a user has annotation needs, they can send a data annotation request to the data annotation system. The data annotation manager in the data annotation system can receive the data annotation request. It then retrieves the basic parameter data of the first annotation model from the annotation model storage repository and the target basic computing unit, which includes the target model inference framework, from the basic computing unit repository. Simultaneously, hardware resources are allocated to this target basic computing unit to construct a target computing unit. This target computing unit can combine the target model inference framework and the basic parameter data of the first annotation model into a first annotation model. Afterward, the target computing unit can use this first annotation model to annotate the data to be annotated. This avoids the need to hard-code the annotation model into the data annotation system. The sources of the basic parameter data for the annotation model can be diverse, not limited to integration by technical personnel, making the annotation model more flexible and offering a richer selection of annotation models. Furthermore, the hardware resources are specified by the user, better meeting the user's annotation needs.
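The manager-side handling summarized above can be sketched as three lookups that produce the information sent on to the computing unit. The class and field names below (`AnnotationRequest`, `Dispatch`, `DataAnnotationManager.handle`) are hypothetical, and each stored correspondence is modeled as a dictionary.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AnnotationRequest:
    model_id: str
    data_id: str
    cpus: int
    gpus: int


@dataclass(frozen=True)
class Dispatch:
    """What the manager hands over once the target computing unit exists."""
    unit_id: str
    cpus: int
    gpus: int
    first_storage_path: str   # basic parameter data of the first annotation model
    second_storage_path: str  # data to be annotated


class DataAnnotationManager:
    def __init__(
        self,
        model_to_unit: Dict[str, str],
        model_paths: Dict[str, str],
        data_paths: Dict[str, str],
    ) -> None:
        self._model_to_unit = model_to_unit
        self._model_paths = model_paths
        self._data_paths = data_paths

    def handle(self, request: AnnotationRequest) -> Dispatch:
        unit_id = self._model_to_unit[request.model_id]   # step 302
        first_path = self._model_paths[request.model_id]  # step 304
        second_path = self._data_paths[request.data_id]
        # Step 303: the user-specified hardware is attached to the unit.
        return Dispatch(unit_id, request.cpus, request.gpus, first_path, second_path)
```

Note that the manager passes only storage paths. The parameter data and the data to be annotated are fetched by the target computing unit itself.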

[0123] This application also provides a data annotation system. As shown in Figure 5, the data annotation system may include a basic computing unit storage repository, a data annotation manager, and an annotation model storage repository, wherein:

[0124] A data annotation manager is used to receive data annotation requests sent by clients, wherein the data annotation requests carry model identifiers and hardware resource allocation information of the first annotation model; obtain a target basic computing unit from the basic computing unit storage repository, wherein the target basic computing unit includes a target model inference framework and a hardware driver invocation tool corresponding to the first annotation model; allocate hardware resources to the target basic computing unit based on the hardware resource allocation information, and establish the target computing unit; obtain the first storage path information of the basic parameter data corresponding to the model identifier of the first annotation model from the correspondence between the stored model identifiers and the storage path information of the basic parameter data, and send the first storage path information to the target computing unit. Specifically, this data annotation manager can implement the processing performed by the data annotation manager in steps 301-304 and other implicit steps above, and its specific implementation method will not be elaborated here.

[0125] The target computation unit is configured to retrieve the basic parameter data of the annotation model to be used from the annotation model storage repository using the first storage path information, wherein the basic parameter data of the first annotation model includes the trained values of the trainable parameters in the first annotation model; combine the target model inference framework and the basic parameter data of the first annotation model to obtain the first annotation model; retrieve the data to be labeled; and input the data to be labeled into the first annotation model to label the data. Specifically, this target computation unit can implement the processing performed by the target computation unit in step 304 above, and its specific implementation method will not be elaborated here.

[0126] Based on the same technical concept, embodiments of the present invention also provide a data annotation apparatus, which can be applied in the data annotation system described in the embodiment corresponding to Figure 5 to implement the function of the data annotation manager. As shown in Figure 7, the data annotation apparatus includes:

[0127] The receiving module 710 is used to receive a data annotation request sent by the client, wherein the data annotation request carries the model identifier and hardware resource allocation information of the first annotation model. Specifically, it can implement the receiving function in step 301 above, as well as other implicit steps.

[0128] The acquisition module 720 is used to acquire a target basic computing unit from the basic computing unit storage repository. The target basic computing unit includes the target model inference framework and hardware driver invocation tool corresponding to the first labeled model. Specifically, it can implement the acquisition function in step 302 above, as well as other implicit steps.

[0129] The allocation module 730 is used to allocate hardware resources to the target basic computing unit based on the hardware resource allocation information, and establish the target computing unit. Specifically, it can implement the allocation function in step 303 above, as well as other implicit steps.

[0130] The sending module 740 is used to obtain the first storage path information of the basic parameter data corresponding to the model identifier of the first labeled model from the correspondence between the stored model identifier and the storage path information of the basic parameter data, and send the first storage path information to the target computing unit. This allows the target computing unit to obtain the basic parameter data of the labeled model to be used from the labeled model storage repository through the first storage path information, combine the target model inference framework and the basic parameter data of the first labeled model to obtain the first labeled model, obtain the data to be labeled, input the data to be labeled into the first labeled model, and label the data to be labeled. The basic parameter data of the first labeled model includes the trained values of the trainable parameters in the first labeled model. Specifically, this can implement the sending function in step 304 above, as well as other implicit steps.

[0131] In one possible implementation, the receiving module 710 is further configured to:

[0132] Receive the model identifier and basic parameter data of the first labeled model sent by the client;

[0133] The basic parameter data of the first annotation model is stored in the annotation model storage warehouse, and the model identifier of the first annotation model and the first storage path information of the basic parameter data of the first annotation model are stored accordingly.

[0134] In one possible implementation, the data annotation request also carries a data identifier of the data to be annotated, and the acquisition module 720 is further configured to:

[0135] In the correspondence between stored data identifiers and data storage path information, obtain the second storage path information corresponding to the data identifier of the data to be labeled;

[0136] The second storage path information is sent to the target computing unit so that the target computing unit can obtain the data to be labeled through the second storage path information.

[0137] In one possible implementation, the data annotation request also carries the framework identifier of the target model inference framework, and the acquisition module 720 is used for:

[0138] Based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework is obtained from the basic computing unit storage repository.

[0139] In one possible implementation, the device further includes:

[0140] The replacement module is used to receive a labeling model replacement request sent by the client during the process of the target computing unit labeling the data to be labeled using the first labeling model, wherein the labeling model replacement request carries a model identifier of the second labeling model.

[0141] In the correspondence between the stored model identifier and the storage path information of the basic parameter data, obtain the third storage path information of the basic parameter data corresponding to the model identifier of the second labeled model;

[0142] A model replacement instruction is sent to the target computing unit, wherein the model replacement instruction carries the third storage path information, so that the target computing unit stops labeling the unlabeled data to be labeled. Through the third storage path information, the basic parameter data of the second labeling model is obtained from the labeling model storage warehouse. The basic parameter data of the first labeling model in the target model inference framework is replaced with the basic parameter data of the second labeling model to obtain the second labeling model. The unlabeled data to be labeled is input into the second labeling model to label the unlabeled data to be labeled. The model inference frameworks corresponding to the second labeling model and the first labeling model are the same.

[0143] It should be noted that the data annotation apparatus provided in the above embodiments is only illustrated by the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the data annotation manager can be divided into different functional modules to complete all or part of the functions described above. In addition, the data annotation apparatus and the data annotation method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments, which will not be repeated here.

[0144] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When these computer program instructions are loaded and executed on a device, they generate all or part of the processes or functions described in the embodiments of this application. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic cable, digital subscriber line) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to the device or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, and magnetic tape), an optical medium (e.g., Digital Video Disk (DVD), etc.), or a semiconductor medium (e.g., solid-state drive, etc.).

[0145] Those skilled in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware or by a program instructing related hardware. The program can be stored in a computer-readable storage medium, such as a read-only memory, a disk, or an optical disk.

[0146] The above description is merely one embodiment of this application and is not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.

Claims

1. A data annotation system, characterized in that, The data annotation system includes a data annotation manager and an annotation model storage repository, wherein: The data annotation manager is used to receive a data annotation request sent by a client, wherein the data annotation request carries a model identifier of a first annotation model, determines the first storage location indication information of the basic parameter data corresponding to the first annotation model, and sends the first storage location indication information to the target computing unit, wherein the target computing unit includes a target model inference framework corresponding to the first annotation model. The target computing unit is configured to: obtain basic parameter data corresponding to the first annotation model from the annotation model storage warehouse according to the first storage location indication information; wherein the basic parameter data stored in the annotation model storage warehouse includes basic parameter data of common annotation models and basic parameter data of annotation models uploaded by users; the basic parameter data corresponding to the first annotation model includes the trained values of trainable parameters in the first annotation model, the trained values being obtained by training the first annotation model, and the trainable parameters including weights; combine the target model inference framework and the basic parameter data corresponding to the first annotation model to obtain the first annotation model; obtain data to be labeled; and input the data to be labeled into the first annotation model to label the data to be labeled.

2. The data annotation system according to claim 1, characterized in that, The data annotation system also includes a basic computing unit storage repository, and the data annotation request also carries hardware resource allocation information. The data annotation manager is further used for: Obtain the target basic computing unit from the basic computing unit storage repository, wherein the target basic computing unit includes the target model inference framework and hardware driver invocation tool corresponding to the first labeled model; allocate hardware resources to the target basic computing unit based on the hardware resource allocation information, and establish the target computing unit.

3. The data annotation system according to claim 1 or 2, characterized in that the data annotation manager is further used to: receive the model identifier and the basic parameter data of the first annotation model sent by the client; and store the basic parameter data corresponding to the first annotation model in the annotation model storage repository, and store, in correspondence with each other, the model identifier of the first annotation model and the first storage location indication information of the basic parameter data corresponding to the first annotation model.

4. The data annotation system according to claim 1 or 2, characterized in that the data annotation request further carries a data identifier of the data to be annotated, and the data annotation manager is further configured to: obtain, from a stored correspondence between data identifiers and data storage location indication information, second storage location indication information corresponding to the data identifier of the data to be annotated; and send the second storage location indication information to the target computing unit; and the target computing unit is used to obtain the data to be annotated according to the second storage location indication information.

5. The data annotation system according to claim 1 or 2, characterized in that the data annotation request further carries a framework identifier of the target model inference framework, and the data annotation manager is used to: obtain, based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework from the basic computing unit storage repository.

6. The data annotation system according to claim 1 or 2, characterized in that the data annotation manager is further used to: receive, while the target computing unit is annotating the data to be annotated using the first annotation model, an annotation model replacement request sent by the client, wherein the annotation model replacement request carries a model identifier of a second annotation model; obtain, from a stored correspondence between model identifiers and storage location indication information of basic parameter data, third storage location indication information of the basic parameter data corresponding to the model identifier of the second annotation model; and send a model replacement instruction to the target computing unit, wherein the model replacement instruction carries the third storage location indication information; and the target computing unit is used to: stop annotating the data to be annotated that has not yet been annotated; obtain the basic parameter data of the second annotation model according to the third storage location indication information; replace the basic parameter data corresponding to the first annotation model in the target model inference framework with the basic parameter data of the second annotation model to obtain the second annotation model, wherein the second annotation model and the first annotation model correspond to the same model inference framework; and input the data to be annotated that has not yet been annotated into the second annotation model to annotate it.

7. A data annotation method, characterized in that the method includes: receiving a data annotation request sent by a client, wherein the data annotation request carries a model identifier of a first annotation model; and determining first storage location indication information of the basic parameter data corresponding to the first annotation model, and sending the first storage location indication information to a target computing unit, wherein the target computing unit includes a target model inference framework corresponding to the first annotation model, so that the target computing unit obtains the basic parameter data corresponding to the first annotation model from an annotation model storage repository according to the first storage location indication information, combines the target model inference framework and the basic parameter data corresponding to the first annotation model to obtain the first annotation model, obtains data to be annotated, and inputs the data to be annotated into the first annotation model to annotate the data to be annotated; wherein the basic parameter data stored in the annotation model storage repository includes basic parameter data of common annotation models and basic parameter data of annotation models uploaded by users, the basic parameter data corresponding to the first annotation model includes the trained values of trainable parameters in the first annotation model, the trained values are obtained by training the first annotation model, and the trainable parameters include weights.

8. The method according to claim 7, characterized in that the data annotation request further carries hardware resource allocation information, and the method further includes: obtaining a target basic computing unit from a basic computing unit storage repository, wherein the target basic computing unit includes the target model inference framework and a hardware driver invocation tool corresponding to the first annotation model; and allocating hardware resources to the target basic computing unit based on the hardware resource allocation information to establish the target computing unit.

9. The method according to claim 7 or 8, characterized in that the method further includes: receiving the model identifier and the basic parameter data of the first annotation model sent by the client; and storing the basic parameter data corresponding to the first annotation model in the annotation model storage repository, and storing, in correspondence with each other, the model identifier of the first annotation model and the first storage location indication information of the basic parameter data corresponding to the first annotation model.

10. The method according to claim 7 or 8, characterized in that the data annotation request further carries a data identifier of the data to be annotated, and the method further includes: obtaining, from a stored correspondence between data identifiers and data storage location indication information, second storage location indication information corresponding to the data identifier of the data to be annotated; and sending the second storage location indication information to the target computing unit, so that the target computing unit obtains the data to be annotated according to the second storage location indication information.

11. The method according to claim 8, characterized in that the data annotation request further carries a framework identifier of the target model inference framework, and the step of obtaining the target basic computing unit from the basic computing unit storage repository includes: obtaining, based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework from the basic computing unit storage repository.

12. The method according to claim 7 or 8, characterized in that the method further includes: receiving, while the target computing unit is annotating the data to be annotated using the first annotation model, an annotation model replacement request sent by the client, wherein the annotation model replacement request carries a model identifier of a second annotation model; obtaining, from a stored correspondence between model identifiers and storage location indication information of basic parameter data, third storage location indication information of the basic parameter data corresponding to the model identifier of the second annotation model; and sending a model replacement instruction to the target computing unit, wherein the model replacement instruction carries the third storage location indication information, so that the target computing unit stops annotating the data to be annotated that has not yet been annotated, obtains the basic parameter data of the second annotation model from the annotation model storage repository according to the third storage location indication information, replaces the basic parameter data corresponding to the first annotation model in the target model inference framework with the basic parameter data of the second annotation model to obtain the second annotation model, and inputs the data to be annotated that has not yet been annotated into the second annotation model to annotate it, wherein the second annotation model and the first annotation model correspond to the same model inference framework.

13. A data annotation method, characterized in that the method is applied to a computing unit, and the method includes: receiving first storage location indication information sent by a data annotation manager, wherein the first storage location indication information indicates the storage location of the basic parameter data corresponding to a first annotation model; obtaining, according to the first storage location indication information, the basic parameter data corresponding to the first annotation model from an annotation model storage repository, wherein the basic parameter data stored in the annotation model storage repository includes basic parameter data of common annotation models and basic parameter data of annotation models uploaded by users, the basic parameter data corresponding to the first annotation model includes the trained values of trainable parameters in the first annotation model, the trained values are obtained by training the first annotation model, and the trainable parameters include weights; combining a target model inference framework corresponding to the first annotation model and the basic parameter data corresponding to the first annotation model to obtain the first annotation model; and obtaining data to be annotated, and inputting the data to be annotated into the first annotation model to annotate the data to be annotated.

14. The method according to claim 13, characterized in that obtaining the data to be annotated includes: receiving second storage location indication information, corresponding to the data to be annotated, sent by the data annotation manager; and obtaining the data to be annotated according to the second storage location indication information.

15. The method according to claim 13, characterized in that the method further includes: receiving a model replacement instruction sent by the data annotation manager, wherein the model replacement instruction carries third storage location indication information; stopping annotation of the data to be annotated that has not yet been annotated; obtaining the basic parameter data of a second annotation model according to the third storage location indication information; replacing the basic parameter data corresponding to the first annotation model in the target model inference framework with the basic parameter data corresponding to the second annotation model to obtain the second annotation model, wherein the second annotation model and the first annotation model correspond to the same model inference framework; and inputting the data to be annotated that has not yet been annotated into the second annotation model to annotate it.

16. The method according to any one of claims 13 to 15, characterized in that the computing unit is established by the data annotation manager by allocating hardware resources to a target basic computing unit according to hardware resource allocation information, the hardware resource allocation information is sent to the data annotation manager by a client, and the target basic computing unit includes the target model inference framework and a hardware driver invocation tool corresponding to the first annotation model.

17. A data annotation apparatus, characterized in that the apparatus includes: a receiving module, used to receive a data annotation request sent by a client, wherein the data annotation request carries a model identifier of a first annotation model; and a sending module, used to determine first storage location indication information of the basic parameter data corresponding to the first annotation model, and send the first storage location indication information to a target computing unit, wherein the target computing unit includes a target model inference framework corresponding to the first annotation model, so that the target computing unit obtains the basic parameter data corresponding to the first annotation model from an annotation model storage repository according to the first storage location indication information, combines the target model inference framework and the basic parameter data corresponding to the first annotation model to obtain the first annotation model, obtains data to be annotated, and inputs the data to be annotated into the first annotation model to annotate the data to be annotated; wherein the basic parameter data stored in the annotation model storage repository includes basic parameter data of common annotation models and basic parameter data of annotation models uploaded by users, the basic parameter data corresponding to the first annotation model includes the trained values of trainable parameters in the first annotation model, the trained values are obtained by training the first annotation model, and the trainable parameters include weights.

18. The apparatus according to claim 17, characterized in that the apparatus further includes: an acquisition module, used to obtain a target basic computing unit from a basic computing unit storage repository, wherein the target basic computing unit includes the target model inference framework and a hardware driver invocation tool corresponding to the first annotation model; and an allocation module, used to allocate hardware resources to the target basic computing unit based on hardware resource allocation information to establish the target computing unit.

19. The apparatus according to claim 17 or 18, characterized in that the receiving module is further configured to: receive the model identifier and the basic parameter data of the first annotation model sent by the client; and store the basic parameter data corresponding to the first annotation model in the annotation model storage repository, and store, in correspondence with each other, the model identifier of the first annotation model and the first storage location indication information of the basic parameter data corresponding to the first annotation model.

20. The apparatus according to claim 17 or 18, characterized in that the data annotation request further carries a data identifier of the data to be annotated, and the acquisition module is used to: obtain, from a stored correspondence between data identifiers and data storage location indication information, second storage location indication information corresponding to the data identifier of the data to be annotated; and send the second storage location indication information to the target computing unit, so that the target computing unit obtains the data to be annotated according to the second storage location indication information.

21. The apparatus according to claim 17 or 18, characterized in that the data annotation request further carries a framework identifier of the target model inference framework, and the acquisition module is used to: obtain, based on the framework identifier of the target model inference framework, the target basic computing unit containing the target model inference framework from the basic computing unit storage repository.

22. The apparatus according to claim 17 or 18, characterized in that the apparatus further includes: a replacement module, used to: receive, while the target computing unit is annotating the data to be annotated using the first annotation model, an annotation model replacement request sent by the client, wherein the annotation model replacement request carries a model identifier of a second annotation model; obtain, from a stored correspondence between model identifiers and storage path information of basic parameter data, third storage path information of the basic parameter data corresponding to the model identifier of the second annotation model; and send a model replacement instruction to the target computing unit, wherein the model replacement instruction carries the third storage path information, so that the target computing unit stops annotating the data to be annotated that has not yet been annotated, obtains the basic parameter data of the second annotation model from the annotation model storage repository according to the third storage path information, replaces the basic parameter data corresponding to the first annotation model in the target model inference framework with the basic parameter data of the second annotation model to obtain the second annotation model, and inputs the data to be annotated that has not yet been annotated into the second annotation model to annotate it, wherein the second annotation model and the first annotation model correspond to the same model inference framework.

23. A data annotation manager, characterized in that the data annotation manager includes a processor and a memory, the memory storing at least one instruction that is loaded and executed by the processor to implement the operations of the data annotation method according to any one of claims 7 to 12.

24. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction that is loaded and executed by a processor to implement the operations of the data annotation method according to any one of claims 7 to 12.
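
For illustration only, the flow recited in claims 1, 3 and 6 (a manager resolves a model identifier to a storage location, and a computing unit loads the parameters, combines them with its inference framework, annotates, and later swaps models) might be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the patented implementation: the class names `ModelRepository`, `ComputingUnit` and `AnnotationManager`, the `models/<id>` location scheme, and the toy `threshold_framework` are all hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, float]            # "basic parameter data": trained weights
Model = Callable[[float], str]       # an assembled annotation model


def threshold_framework(params: Params) -> Model:
    """Toy stand-in for a model inference framework (assumption):
    combines parameters into a model that labels a number by sign."""
    w, b = params["weight"], params["bias"]
    return lambda x: "positive" if w * x + b > 0 else "negative"


@dataclass
class ModelRepository:
    """Annotation model storage repository: storage location -> parameters."""
    blobs: Dict[str, Params] = field(default_factory=dict)


@dataclass
class ComputingUnit:
    """Target computing unit holding one inference framework."""
    framework: Callable[[Params], Model]
    repository: ModelRepository
    model: Optional[Model] = None

    def load_model(self, location: str) -> None:
        # Fetch parameters by storage location and combine with the framework.
        # Calling this again replaces the current parameters (model swap).
        self.model = self.framework(self.repository.blobs[location])

    def annotate(self, items: List[float]) -> List[Tuple[float, str]]:
        if self.model is None:
            raise RuntimeError("no annotation model loaded")
        return [(x, self.model(x)) for x in items]


@dataclass
class AnnotationManager:
    """Data annotation manager: keeps model identifier -> storage location."""
    repository: ModelRepository
    locations: Dict[str, str] = field(default_factory=dict)

    def upload_model(self, model_id: str, params: Params) -> None:
        location = f"models/{model_id}"      # hypothetical location scheme
        self.repository.blobs[location] = params
        self.locations[model_id] = location

    def handle_request(self, model_id: str, unit: ComputingUnit) -> None:
        # Used for both the initial annotation request and a replacement
        # request: resolve the identifier and send the location to the unit.
        unit.load_model(self.locations[model_id])


repo = ModelRepository()
manager = AnnotationManager(repo)
manager.upload_model("model-a", {"weight": 1.0, "bias": 0.0})
manager.upload_model("model-b", {"weight": -1.0, "bias": 0.0})

unit = ComputingUnit(threshold_framework, repo)
data = [2.0, -3.0, 5.0, -1.0]

manager.handle_request("model-a", unit)
first_part = unit.annotate(data[:2])     # annotated with the first model

manager.handle_request("model-b", unit)  # replacement mid-job
remaining = unit.annotate(data[2:])      # not-yet-annotated data, second model
```

In this sketch the swap works only because both models share the same framework, which mirrors the condition in claim 6 that the first and second annotation models correspond to the same model inference framework. Hardware resource allocation (claim 2) is deliberately omitted.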