Corpus labeling method, device, server and storage medium

A corpus labeling technology, applied in the field of information processing. It addresses the problems that a corpus receives only a single labeling result, that the accuracy of labeling results is difficult to judge, and that a labeler's cognitive level and operating habits affect the quality of corpus labeling. Its effect is to ensure high-quality, accurate labeling results.

Active Publication Date: 2021-11-19
CHINANETCENT TECH

AI Technical Summary

Problems solved by technology

[0003] However, the inventors found at least the following problem in the related technologies: in the traditional corpus labeling method, each corpus is generally labeled by a single labeler, so the labeler's cognitive level and operating habits greatly affect the quality of the labeling. As a result, each corpus has only a single labeling result, and it is difficult to judge the accuracy of that result.



Embodiment Construction

[0022] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the implementation modes of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are provided in each implementation mode to help readers better understand the present application. However, the technical solution claimed in this application can also be realized without these technical details, and with various changes and modifications based on the following implementation modes. The division into the following embodiments is for convenience of description and should not limit the specific implementation of the present invention; the embodiments can be combined with and refer to each other where they do not contradict.

[0023] The first embodi...


Abstract

The embodiments of the present invention relate to the technical field of information processing, and in particular to a corpus labeling method, device, server and storage medium. The corpus labeling method includes: obtaining an even number of manual labeling results for the initial corpus, and a model labeling result for the initial corpus, wherein the model labeling result is obtained from a preset labeling model, and the preset labeling model is trained on several manually labeled initial corpora; and, among all the labeling results (the manual labeling results and the model labeling result), taking the unique labeling result that meets a preset condition as the final labeling result of the initial corpus. By adopting the embodiment of the present invention, high-quality labeling results for the initial corpus can be obtained, and the influence of a single labeler on the quality of corpus labeling can be reduced.
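The selection step in the abstract can be illustrated with a minimal sketch. The text shown here does not say what the preset condition is, so the sketch assumes, purely for illustration, that the condition is "the label that occurs most often among all results, with no tie". The function name `final_label` and its parameters are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def final_label(manual_labels: Sequence[Hashable],
                model_label: Hashable) -> Optional[Hashable]:
    """Pick one final label from manual results plus one model result.

    Assumption (not from the patent text): the "preset condition" is
    "strictly most frequent label among all results".
    Returns None when no single label satisfies that condition.
    """
    # The abstract calls for an even number of manual labeling results.
    if len(manual_labels) == 0 or len(manual_labels) % 2 != 0:
        raise ValueError("expected a positive, even number of manual labels")

    # Pool the manual results with the single model result.
    counts = Counter(list(manual_labels) + [model_label])
    ranked = counts.most_common()

    top_label, top_count = ranked[0]
    # A tie for first place means no label is uniquely the most frequent.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label
```

Under this assumed condition, an even number of manual results plus one model result gives an odd total, so a two-class labeling task can never end in a tie. With three or more classes a tie is still possible, and the sketch returns `None` so the corpus can be handled some other way.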

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of information processing, and in particular to a corpus labeling method, device, server and storage medium.

Background

[0002] Natural language processing refers to a computer receiving input in the form of natural language, processing and computing on that input internally through user-defined algorithms, and returning the results the user expects. It is commonly applied in fields such as text retrieval, machine translation, and question answering. Users usually define the algorithm by building an algorithm model, and the model must be trained on a large amount of labeled original corpus. Labeling the original corpus refers to processing it and marking additional codes that represent various language features on the corresponding language component...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06K9/62
CPC: G06F16/3344, G06F18/10, G06F18/214
Inventor: 宣劭文, 李金锋
Owner: CHINANETCENT TECH