
A Chinese zero pronoun resolution method and system

A method and system for Chinese zero pronoun resolution that addresses the low accuracy of automatic syntactic analysis, which makes it difficult for zero pronoun recognition and resolution to reach the accuracy that applications require.

Pending Publication Date: 2019-01-08
HARBIN INST OF TECH +1

AI Technical Summary

Problems solved by technology

Zero pronoun recognition and resolution algorithms often rely on syntactic analysis, but automatic syntactic analysis has low accuracy, which makes it difficult for zero pronoun recognition and resolution to reach the accuracy that applications require.



Examples


Embodiment 1

[0061] The present embodiment provides a method for resolving Chinese zero pronouns, in which zero pronoun resolution comprises two processes: zero pronoun identification and zero pronoun resolution. As shown in Figure 1, the method includes:

[0062] S101. Obtain candidate zero pronoun markers by preprocessing the target corpus;

[0063] Further, said preprocessing the target corpus to obtain the candidate zero pronoun mark includes:

[0064] Divide the target data set according to the data set division method, and obtain the zero pronoun marks on the training set, test set and validation set.

[0065] Specifically, the target data set is the OntoNotes 5.0 data set, and OntoNotes 5.0 is divided according to the data set division method of the CoNLL-2012 Shared Task coreference resolution evaluation task; the OntoNotes 5.0 data set itself contains zero pronoun mark information, and CoNLL-2012 provides the training, validation, and testing three-part data set di...
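The collection of zero pronoun marks per split described in [0064] can be illustrated with a minimal sketch. It assumes each document is already tagged with its split name and tokenized into sentences, and that zero pronouns appear as a `*pro*` token; the document layout, the marker string, and the function name `collect_zero_pronoun_marks` are assumptions of this sketch, not details given in the text.

```python
from collections import defaultdict

# Assumption: zero pronouns are represented by this token in the input.
ZERO_PRONOUN_TOKEN = "*pro*"


def collect_zero_pronoun_marks(documents):
    """Group zero pronoun positions by data set split.

    Each document is assumed to be a dict with keys "id", "split" and
    "sentences" (a list of token lists). Returns a dict mapping each
    split name to a list of (doc_id, sentence_index, token_index).
    """
    marks = defaultdict(list)
    for doc in documents:
        for s_idx, sentence in enumerate(doc["sentences"]):
            for t_idx, token in enumerate(sentence):
                if token == ZERO_PRONOUN_TOKEN:
                    marks[doc["split"]].append((doc["id"], s_idx, t_idx))
    return dict(marks)
```

Keeping the split name on each document means the same routine serves the training, validation and test sets without separate code paths.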

Embodiment 2

[0113] The present embodiment provides a Chinese zero pronoun resolution system which, as shown in Figure 5, includes:

[0114] The preprocessing module 110 is used to obtain the candidate zero pronoun mark by preprocessing the target corpus;

[0115] Further, the preprocessing module 110 includes:

[0116] The zero pronoun marking unit 111 is configured to divide the target data set according to the data set division method to obtain the zero pronoun marks on the training set, test set and validation set.

[0117] The zero pronoun identification module 120 is used to identify the position of the candidate zero pronoun; the result of the position identification is combined with the preset optimization rule to obtain the target zero pronoun;

[0118] Further, the zero pronoun recognition module 120 includes:

[0119] The context semantic feature acquisition unit 121 takes the word vectors of the candidate zero pronoun's context as input and utilizes the bidir...
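The identification step in [0117], where the position identification result is combined with preset optimization rules, can be sketched as a score threshold followed by rule filtering. The text here does not specify the rules or the classifier, so the threshold, the rule signature, and the example rule `not_sentence_final` are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# (gap position in the sentence, score from a position classifier)
Candidate = Tuple[int, float]
# A rule inspects the sentence and a gap position and accepts or rejects it.
Rule = Callable[[Sequence[str], int], bool]


def select_target_zero_pronouns(
    tokens: Sequence[str],
    candidates: Sequence[Candidate],
    rules: Sequence[Rule],
    threshold: float = 0.5,
) -> List[int]:
    """Keep candidate positions that pass the score threshold and every rule."""
    targets = []
    for position, score in candidates:
        if score < threshold:
            continue  # rejected by the position classifier
        if all(rule(tokens, position) for rule in rules):
            targets.append(position)
    return targets


def not_sentence_final(tokens: Sequence[str], position: int) -> bool:
    # Hypothetical rule for illustration only: reject a gap that
    # follows the last token of the sentence.
    return position < len(tokens)
```

Passing the rules in as callables keeps the classifier output and the rule set independent, so either can be replaced without touching the other.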



Abstract

The invention discloses a Chinese zero pronoun resolution method and system. The method includes: obtaining candidate zero pronoun marks by preprocessing a target corpus; identifying the positions of the candidate zero pronouns; combining the position identification result with preset optimization rules to obtain the target zero pronouns; obtaining a set of expression pairs from all target zero pronouns and candidate antecedents; obtaining the probability of an anaphoric relationship between each target zero pronoun and its candidate antecedents and sorting these probabilities; and obtaining the corresponding zero pronoun resolution results according to the sorting results. The invention combines preset optimization rules with syntactic analysis to identify zero pronouns accurately, and completes zero pronoun resolution using a deep learning method.
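The pairing and ranking steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented model: the `score` callable is a hypothetical stand-in for the deep learning component that outputs the probability of an anaphoric relationship, and the function names are assumptions of the sketch.

```python
from collections import defaultdict


def rank_antecedents(pairs, score):
    """Score every (zero_pronoun, antecedent) pair and sort per zero pronoun.

    `pairs` is an iterable of (zero_pronoun, antecedent) tuples and `score`
    is a callable returning a probability for one pair. Returns a dict that
    maps each zero pronoun to a list of (antecedent, probability), sorted
    from most to least probable.
    """
    ranked = defaultdict(list)
    for zero_pronoun, antecedent in pairs:
        ranked[zero_pronoun].append((antecedent, score(zero_pronoun, antecedent)))
    for candidates in ranked.values():
        candidates.sort(key=lambda item: item[1], reverse=True)
    return dict(ranked)


def resolve(pairs, score):
    """Pick the highest-probability antecedent for each zero pronoun."""
    ranking = rank_antecedents(pairs, score)
    return {zp: candidates[0][0] for zp, candidates in ranking.items()}
```

Separating ranking from selection leaves room for a different decision policy, such as rejecting a resolution when the top probability is low.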

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a Chinese zero pronoun resolution method and system.

Background technique

[0002] Zero pronoun resolution is a special case of pronoun resolution that handles zero anaphora, a phenomenon that is widespread in natural language texts, especially in Chinese. In a text, a part that the reader can infer from the context may be omitted. The omitted part generally fills a syntactic role in the sentence and refers back to a linguistic unit in the preceding text; it is represented by a zero pronoun. Zero pronoun resolution is the process of recovering the reference from a zero pronoun to a preceding linguistic unit, and is sometimes referred to as omission recovery.

[0003] Compared with explicit pronoun resolution, the biggest problem of zero pronoun resolution is the lack of explicit pronoun representation, so it is more difficult and cha...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/27
CPC: G06F40/216, G06F40/211, G06F40/289
Inventors: 刘秉权, 孙承杰, 栾克鑫, 游世学, 杜新凯
Owner: HARBIN INST OF TECH