
A Bilingual Text Annotation Method

A bilingual text annotation technology, applied in semantic analysis, natural language translation, instruments, etc. It addresses the problems of inconsistent and incomplete text information and of differing text analysis results across the two languages, and it improves the segmentation results.

Active Publication Date: 2019-03-15
INST OF AUTOMATION CHINESE ACAD OF SCI
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0023] In the Chinese sentence at the source end (S), the entire sentence is a single basic discourse unit, marked e1, whereas the corresponding English sentence at the target end (T) is divided into two units, e1 and e2. As this example shows, although the Chinese and English sentences are semantically consistent, the discourse analysis results obtained for them are completely different. Consequently, even when discourse information is to be added in a practical application, the source-end discourse information is inconsistent with the target-end discourse information, or incomplete. As a result, bilingual natural language processing tasks based on discourse analysis can use only source-end or only target-end discourse information.



Examples


Embodiment Construction

[0053] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0054] The basic idea of the present invention is to make proper use of the discourse information at both ends of a bilingual sentence pair, and to propose a bilingual text annotation method that improves the consistency of bilingual discourse analysis results. For example, Figure 2 gives two discourse analysis results obtained by analyzing a Chinese sentence and an English sentence with the same meaning using Rhetorical Structure Theory. The two sentences are segmented according to different rules. In English, the basic discourse unit is defined as a grammatically well-formed clause, and the definition of the basic discourse unit in Chinese is similar. However, due to the differences between the two languages, mainly in terms of usage hab...



Abstract

The invention discloses a bilingual discourse annotation method. The method comprises the following steps: 1, performing automatic segmentation, automatic word alignment and automatic discourse analysis on the source-language sentence and the target-language sentence of a bilingual sentence pair to obtain word alignment information and a discourse analysis tree for each of the two ends; 2, obtaining the correspondences among the basic discourse units in the sentences at the two ends according to the word alignment information and the discourse analysis trees obtained in step 1; and 3, constructing a bilingual discourse structure according to the basic discourse units in the sentences at the two ends and the correspondences among them obtained in step 2. With this method, bilingual parallel sentences can be given discourse analyses with relatively high consistency. The method was verified on a Chinese-English language pair in an annotation experiment. Compared with existing monolingual discourse analysis methods, the bilingual discourse annotation method yields discourse analysis results with a higher degree of consistency, and the consistency of both discourse segmentation information and discourse structure information is greatly improved.
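Step 2 of the abstract can be illustrated with a small sketch. The abstract does not say how the correspondences among basic discourse units (EDUs) are derived, so the link-counting rule below is an assumption made for illustration only, not the patented procedure. The names `edu_of_token`, `align_edus`, and `min_links`, the half-open token spans, and the toy sentence pair are all hypothetical.

```python
from collections import defaultdict


def edu_of_token(edu_spans):
    """Map each token index to the index of the EDU whose [start, end) span holds it."""
    lookup = {}
    for edu_idx, (start, end) in enumerate(edu_spans):
        for tok in range(start, end):
            lookup[tok] = edu_idx
    return lookup


def align_edus(src_spans, tgt_spans, word_alignment, min_links=1):
    """Pair source and target EDUs that share at least `min_links` word-alignment links.

    ASSUMPTION: counting links is a simple heuristic chosen for this sketch;
    the source text does not specify the actual correspondence rule.
    """
    src_lookup = edu_of_token(src_spans)
    tgt_lookup = edu_of_token(tgt_spans)
    counts = defaultdict(int)
    for s_tok, t_tok in word_alignment:
        # Ignore links that point outside every annotated EDU span.
        if s_tok in src_lookup and t_tok in tgt_lookup:
            counts[(src_lookup[s_tok], tgt_lookup[t_tok])] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_links)


# Toy data (hypothetical): the source sentence is one EDU covering tokens 0..5,
# while the target sentence is split into two EDUs covering tokens 0..3 and 4..7.
example_src_spans = [(0, 6)]
example_tgt_spans = [(0, 4), (4, 8)]
example_links = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 6), (5, 7)]

print(align_edus(example_src_spans, example_tgt_spans, example_links))
# prints [(0, 0), (0, 1)]
```

In this toy case the single source EDU corresponds to both target EDUs, which is the kind of one-to-many mismatch that step 3 would then have to reconcile when building the bilingual discourse structure.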

Description

Technical Field

[0001] The invention relates to the technical field of natural language processing, and in particular to a novel text annotation method for bilingual scenarios.

Background Art

[0002] In natural language processing tasks, the basic units of text can be ordered from small to large as words, phrases, and sentences, which finally form a discourse. The purpose of discourse analysis is to analyze and understand sentences at the semantic level as a whole.

[0003] Similar to syntactic analysis, discourse analysis is an intermediate step in many natural language processing tasks, and it is used in tasks such as automatic summarization, question answering, machine translation, machine understanding, and text generation. The main reasons why discourse technology has attracted attention are as follows: (1) Unlike syntactic analysis, which uses words as the most basic analysis unit, discourse analysis uses basic discourse units as the basic unit, and the segmentation of bas...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G06F17/28
CPC: G06F40/30, G06F40/58
Inventor: 张家俊, 刘洋, 宗成庆
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI