Text annotation method and device

A text labeling technology, applied in the field of text labeling methods and devices, that addresses the high labor cost and inconsistent results of manual labeling. It reduces manual labeling cost and keeps labeling results consistent.

Active Publication Date: 2019-06-21
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, manually labeling text with semantic slots, for example by experts in a specific field, incurs a high labor cost, and when there are many labelers, the labeling results tend to be inconsistent.



Examples


Example 1

[0070] See Figure 1, which is a schematic flowchart of the text labeling method provided in this embodiment. The method includes the following steps:

[0071] S101: Obtain the target text to be labeled.

[0072] In this embodiment, the text that needs semantic slot labeling is defined as the target text. This embodiment does not limit the language of the target text; for example, the target text can be Chinese text or English text. Nor does it limit the length of the target text; for example, the target text can be sentence-level text or chapter-level text.

[0073] S102: Determine the specific field to which the target text belongs.

[0074] In this embodiment, after the target text to be labeled is obtained in step S101, the target text can be semantically analyzed to determine the specific field to which it belongs. For example, the specific field can be film and television, music, or medicine. W...
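The visible text does not say how the semantic analysis in S102 is carried out. As a minimal sketch only, the following substitutes a simple keyword-overlap vote for that analysis. The DOMAIN_KEYWORDS table, its keyword lists, and the determine_domain name are all hypothetical and do not come from the patent.

```python
from typing import Optional

# Hypothetical keyword lists per domain (assumption, not from the source).
DOMAIN_KEYWORDS = {
    "music": {"song", "sing", "album", "listen"},
    "film_and_television": {"movie", "film", "watch", "episode"},
    "medicine": {"symptom", "dose", "tablet", "fever"},
}


def determine_domain(target_text: str) -> Optional[str]:
    """Return the domain whose keywords overlap most with the text, or None."""
    tokens = set(target_text.lower().split())
    # Score each domain by the number of its keywords found in the text.
    scores = {name: len(tokens & words) for name, words in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # No keyword hit at all means the domain is left undetermined.
    return best if scores[best] > 0 else None
```

A real system would more plausibly use a trained classifier here; the keyword vote only illustrates where S102 sits between obtaining the text and labeling it.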

Example 2

[0095] This embodiment will introduce the specific implementation process of step S103 "use the structured data in the specific field to mark each entry in the target text with semantic slots" in the first embodiment.

[0096] See Figure 2, which is a schematic flowchart of using the structured data in a specific field, as provided by this embodiment, to perform semantic slot labeling on each entry in the target text. The flow includes the following steps:

[0097] S201: Retrieve each value under each field in the structured data in a specific field, and obtain each value matching the target text as each retrieval value.

[0098] In this embodiment, a retrieval method can be used to match against each value under each field in the structured data in the specific field. Specifically, text matching or pinyin matching can be used to retrieve, from the structured data, each value that matches the target text; here, each matching value is defined as a retri...
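The retrieval in S201 can be sketched as follows, assuming the structured data is a list of records that map field names to values and that "text matching" means case-insensitive substring matching. The sample records and the retrieve_matching_values name are invented for illustration, and pinyin matching is not shown.

```python
from typing import Dict, List, Tuple

# Invented sample of structured data for a music domain (assumption):
# each record maps a field name to the value under that field.
MUSIC_RECORDS: List[Dict[str, str]] = [
    {"singer": "Lin Yue", "song": "Night Harbor"},
    {"singer": "Chen Mo", "song": "Paper Boat"},
]


def retrieve_matching_values(
    target_text: str, records: List[Dict[str, str]]
) -> List[Tuple[str, str]]:
    """Return distinct (field, value) pairs whose value occurs in the text."""
    text = target_text.lower()
    matches: List[Tuple[str, str]] = []
    for record in records:
        for field, value in record.items():
            pair = (field, value)
            # Plain substring test stands in for "text matching".
            if value and value.lower() in text and pair not in matches:
                matches.append(pair)
    return matches
```

Each returned pair keeps the field name alongside the value, since the field is what later serves as the semantic slot for the matched entry.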

Example 3

[0188] This embodiment introduces a text labeling device; for related content, refer to the method embodiments above.

[0189] See Figure 7, which is a schematic diagram of the composition of the text labeling device provided in this embodiment. The device 700 includes:

[0190] a target text acquisition unit 701, configured to acquire the target text to be labeled;

[0191] a specific field determining unit 702, configured to determine the specific field to which the target text belongs;

[0192] a semantic slot labeling unit 703, configured to use the structured data in the specific field to perform semantic slot labeling on each entry in the target text.

[0193] In an implementation of this embodiment, the semantic slot labeling unit 703 includes:

[0194] A retrieval value acquisition subunit, configured to retrieve each value under each field in the structured data in the specific field, and obtain each value matching the target text as ...
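The composition of device 700 can be pictured as three pluggable units called in sequence. The sketch below is one possible arrangement under that assumption; the class name, attribute names, and callable signatures are hypothetical, and only the unit numbers 701 to 703 come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A labeling result as (entry, semantic slot) pairs (assumed representation).
Slots = List[Tuple[str, str]]


@dataclass
class TextLabelingDevice:
    acquire_text: Callable[[], str]            # plays the role of unit 701
    determine_domain: Callable[[str], str]     # plays the role of unit 702
    label_slots: Callable[[str, str], Slots]   # plays the role of unit 703

    def run(self) -> Slots:
        """Acquire the text, determine its domain, then label its entries."""
        text = self.acquire_text()
        domain = self.determine_domain(text)
        return self.label_slots(text, domain)
```

Keeping each unit as a separate callable mirrors the way the device description maps one unit to one method step.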


Abstract

The invention discloses a text annotation method and device. The method comprises: obtaining a target text to be annotated, determining the specific field to which the target text belongs, and performing semantic slot annotation on each entry in the target text using the structured data in that specific field. The annotation basis is thus the structured data of the specific field to which the target text belongs. Because the structured data comprises fields and the values under each field, and each field generally represents a semantic slot in the specific field, the structured data can be used to annotate each entry in the target text with a semantic slot without manual annotation, which reduces the manual annotation cost. In addition, the correspondence between the fields and the field values of the structured data is fixed, so performing semantic slot annotation based on the structured data ensures the consistency of the annotation results.
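To make the idea of fields acting as semantic slots concrete, here is a minimal sketch that marks character spans of the target text with the field whose value they match. The longest-match-first rule, the case-sensitive matching, the span format, and the label_spans name are assumptions for illustration; the visible text does not describe how overlapping or ambiguous matches are resolved.

```python
from typing import Dict, List, Tuple


def label_spans(
    target_text: str, records: List[Dict[str, str]]
) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, field) spans for values found in the text."""
    # Distinct (value, field) pairs, longest value first, so a longer match
    # wins over a shorter one that it contains (assumed tie-breaking rule).
    pairs = sorted(
        {(v, f) for record in records for f, v in record.items() if v},
        key=lambda p: (-len(p[0]), p),
    )
    taken = [False] * len(target_text)
    spans: List[Tuple[int, int, str]] = []
    for value, field in pairs:
        start = target_text.find(value)
        while start != -1:
            end = start + len(value)
            # Only label characters that no earlier match has claimed.
            if not any(taken[start:end]):
                taken[start:end] = [True] * (end - start)
                spans.append((start, end, field))
            start = target_text.find(value, start + 1)
    return sorted(spans)
```

Because the field for a given value is read straight from the structured data, the same value receives the same slot on every run, which is the consistency property the abstract describes.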

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a text labeling method and device.

Background

[0002] With the rapid progress of speech-related technologies and the rapid rise of artificial intelligence, the semantic understanding technology that supports human-machine dialogue has attracted more and more attention. Current semantic understanding technologies include rule-based text processing schemes and statistical model schemes based on deep learning, but these schemes require a large amount of manually labeled data to fully cover the ways users phrase their requests. Without that coverage, the semantic understanding system cannot understand user requests well, and the user experience is poor.

[0003] When data is manually labeled, each entry in the text is generally given a label by hand. In one labeling method, the semantic slot to which each entry in the text belongs is...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/35
Inventors: 梅林海, 杨强, 陈志刚
Owner: IFLYTEK CO LTD