Corpus tagging method and device, electronic equipment and storage medium

A corpus labeling technology, applied in the computer field, that addresses the problem that labeling speed needs to be improved and achieves faster labeling.

Pending Publication Date: 2022-05-13
MASHANG CONSUMER FINANCE CO LTD

AI Technical Summary

Problems solved by technology

The annotation speed of related corpus annotation methods still needs to be improved.

Detailed Description of Embodiments

[0022] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.

[0023] Training a voice robot requires a large amount of labeled corpus, but verifying the labeled corpus is too costly, so the corpus can be labeled by combining verification with AI-assisted labeling. Here, the labeled corpus is corpus that has been annotated.

[0024] Figure 1 is a schematic flowchart of a related corpus labeling method. As shown in Figure 1, the AI model is first used to predict the unlabeled data to obtain the ini...

Abstract

The embodiments of the invention disclose a corpus labeling method and device, electronic equipment and a storage medium. The method comprises: obtaining the corpus to be labeled and labeling it with a labeling model to obtain an initial labeling result, where the initial labeling result comprises the initial labeled corpora corresponding to the corpus to be labeled and labeling information, and the labeling information comprises a label value and a credibility for each initial labeled corpus; classifying the initial labeled corpora by label value to obtain a plurality of classified corpus sets, where all initial labeled corpora in any one classified corpus set have the same label value; sorting the initial labeled corpora in each classified corpus set by credibility to obtain a plurality of sorted classified corpus sets; sending the sorted classified corpus sets to a client for verification; and receiving the verified labeled corpus returned by the client as the target labeled corpus.
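
The steps in the abstract (label with a model, group by label value, sort each group by credibility) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the names `LabeledItem`, `label_corpus`, `classify_by_label`, `sort_by_credibility` and the keyword-based `toy_model` are all hypothetical, and the sort direction is an assumption, since the abstract does not say whether high or low credibility comes first. Sending the sorted sets to a client and receiving the verified result is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class LabeledItem:
    text: str           # one initial labeled corpus item
    label: str          # label value assigned by the labeling model
    credibility: float  # model confidence attached to that label value


# Hypothetical model interface: text -> (label value, credibility).
Model = Callable[[str], Tuple[str, float]]


def label_corpus(texts: Iterable[str], model: Model) -> List[LabeledItem]:
    """Run the labeling model over the corpus to get the initial labeling result."""
    return [LabeledItem(t, *model(t)) for t in texts]


def classify_by_label(items: Iterable[LabeledItem]) -> Dict[str, List[LabeledItem]]:
    """Group items so that every item in a set shares the same label value."""
    groups: Dict[str, List[LabeledItem]] = defaultdict(list)
    for item in items:
        groups[item.label].append(item)
    return dict(groups)


def sort_by_credibility(
    groups: Dict[str, List[LabeledItem]], ascending: bool = True
) -> Dict[str, List[LabeledItem]]:
    """Sort each classified set by credibility (direction is an assumption)."""
    return {
        label: sorted(members, key=lambda it: it.credibility, reverse=not ascending)
        for label, members in groups.items()
    }


def toy_model(text: str) -> Tuple[str, float]:
    """Stand-in for a real labeling model; purely illustrative keyword rules."""
    if "loan" in text:
        return ("loan_inquiry", 0.9 if "rate" in text else 0.6)
    return ("other", 0.5)


if __name__ == "__main__":
    corpus = ["what is the loan rate", "loan please", "hello"]
    sorted_sets = sort_by_credibility(classify_by_label(label_corpus(corpus, toy_model)))
    for label, members in sorted_sets.items():
        print(label, [(m.text, m.credibility) for m in members])
```

Grouping items with the same label value and ordering them by credibility means a reviewer sees similar items together, which is presumably where the claimed speed-up in verification comes from.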

Description

Technical field

[0001] The application belongs to the field of computer technology, and in particular relates to a corpus tagging method, device, electronic equipment and storage medium.

Background

[0002] Training a voice robot requires a large amount of labeled corpus, but verifying the labeled corpus is too costly, so the corpus can be labeled by combining verification with AI-assisted labeling. The annotation speed of related corpus annotation methods still needs to be improved.

Summary of the invention

[0003] In view of the above problems, the present application proposes a corpus tagging method, device, electronic equipment, and storage medium to address them.

[0004] In a first aspect, an embodiment of the present application provides a corpus tagging method, the method comprising: obtaining the corpus to be tagged, tagging the corpus to be tagged through a tagging model, and obtaining an initial ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/117, G06K9/62
CPC: G06F40/117, G06F18/241
Inventor: 耿福明, 吴海英, 权圣, 蒋宁, 王洪斌
Owner MASHANG CONSUMER FINANCE CO LTD