Reinforcement learning based anaphora resolution method

A technology relating to anaphora resolution and reinforcement learning, applied in the field of natural language processing, which can solve problems such as mediocre resolution performance, poor results, and weak model generalization ability, and achieves the effect of improving model performance and accuracy.

Inactive Publication Date: 2019-08-16
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT +1

AI Technical Summary

Problems solved by technology

Common anaphora resolution models include the mention-pair model, the mention-ranking model, and the entity-mention model. The mention-pair model usually extracts information only from two isolated mentions to decide whether they corefer, which is far from sufficient; in particular, when the candidate antecedent mention lacks an informative description, the results are even worse. Relying solely on the features of mention pairs therefore often yields mediocre resolution performance.
On the other hand, during the training of resolution models, most models are trained with heuristic loss functions. For different languages and different target data sets, the hyperparameters of the loss function often have to be tuned manually, so the generalization ability of the model is not strong.
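For illustration only (this formula is not taken from the patent), a typical heuristic loss of this kind is the slack-rescaled max-margin objective used by mention-ranking models such as Clark and Manning (2016); the three error-type costs below are exactly the sort of hyperparameters that must be re-tuned by hand for each language and data set. A minimal PyTorch sketch for a single mention's candidate scores:

    import torch

    # Error-type costs: the hand-tuned hyperparameters the passage refers to.
    # Typical mention-ranking values (cf. Clark & Manning 2016); they normally
    # have to be re-tuned for every new language or target data set.
    COST_FALSE_NEW = 0.8       # anaphoric mention wrongly labelled "new entity"
    COST_FALSE_ANAPHOR = 0.4   # non-anaphoric mention wrongly given an antecedent
    COST_WRONG_LINK = 1.0      # anaphoric mention linked to a wrong antecedent

    def heuristic_margin_loss(scores, gold_mask, new_entity_idx):
        """Slack-rescaled max-margin loss for one mention.

        scores:         (n_candidates,) model scores for every candidate antecedent,
                        including a dummy "new entity" candidate.
        gold_mask:      (n_candidates,) bool tensor marking the correct decisions.
        new_entity_idx: index of the dummy "new entity" candidate.
        """
        best_gold = scores[gold_mask].max()
        cost = torch.full_like(scores, COST_WRONG_LINK)   # default error: wrong link
        if gold_mask[new_entity_idx]:                     # mention is non-anaphoric:
            cost[:] = COST_FALSE_ANAPHOR                  # any link is a "false anaphor"
        else:
            cost[new_entity_idx] = COST_FALSE_NEW         # "new entity" would be a miss
        cost[gold_mask] = 0.0                             # correct decisions cost nothing
        # Penalise candidates that score within a margin of the best correct one.
        return (cost * torch.clamp(1.0 + scores - best_gold, min=0.0)).max()

Replacing such a hand-weighted hinge with a reward measured directly on the resolution output is what the reward-based training described below is intended to achieve.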

Method used


Examples


Embodiment 1

[0069] This embodiment takes the model training process as an example. The training corpus is the CoNLL 2012 English data set, for example: "[I(12)] noticed that many friends around [me(12)] received [it(119)]. It seems that almost everyone received [this SMS(119)]." As shown by mark (12) and mark (119), [I(12)] refers to [me(12)], and [it(119)] refers to [this SMS(119)]. The word vectors and related features of [I(12)] and [me(12)] are concatenated to obtain an i-dimensional vector h0; h0 is used as the input of the model, and the neural network is trained by the reinforcement learning method to obtain the anaphora resolution model.
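A minimal sketch of what one such training step could look like, assuming a simple feed-forward pair scorer and a REINFORCE-style policy-gradient update; the layer sizes, reward definition, and sampling scheme are illustrative assumptions, not the patent's exact configuration:

    import torch
    import torch.nn as nn

    class PairScorer(nn.Module):
        """Scores one (candidate antecedent, anaphor) pair from the i-dimensional
        concatenated vector h0 = [antecedent vector; anaphor vector; features]."""
        def __init__(self, input_dim: int, hidden_dim: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(input_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, h0: torch.Tensor) -> torch.Tensor:
            return self.net(h0).squeeze(-1)          # one score per pair

    def reinforce_step(model, optimizer, h0_pairs, gold_antecedent):
        """One REINFORCE update for a single anaphor.

        h0_pairs:        (n_candidates, input_dim) h0 vectors, one per candidate antecedent
        gold_antecedent: index of the correct antecedent among the candidates
        """
        scores = model(h0_pairs)                     # (n_candidates,)
        probs = torch.softmax(scores, dim=0)         # policy over candidate antecedents
        action = torch.multinomial(probs, 1).item()  # sample one linking decision
        # Illustrative reward: +1 for the correct link, -1 otherwise. In practice the
        # reward could instead be a coreference metric computed on the resolved output.
        reward = 1.0 if action == gold_antecedent else -1.0
        loss = -reward * torch.log(probs[action])    # policy-gradient loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return reward

For the sentence above, h0_pairs would hold one h0 row per candidate antecedent of the anaphor, and gold_antecedent would point at the row built from its true antecedent, e.g. the pair marked (12).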

Embodiment 2

[0071] This embodiment takes the model prediction process as an example. The test corpus is "[My sister] has [a dog] and [she] loves [it] very much." The mentions obtained through preprocessing are [My sister], [a dog], [she] and [it]. Their word vectors and related features are combined to obtain the i-dimensional vector h0, which is used as the model input; the model then scores and ranks the candidates. The running results are: [My sister][a dog] scores -1.66, [My sister][she] scores 8.06, [My sister][it] scores -1.83; the highest score is selected as the resolution result, that is, [she] refers to [My sister]. Continuing to score and rank: [a dog][she] scores 2.92, [a dog][it] scores 6.61, [a dog][My sister] scores -1.66, and the highest score is again selected as the resolution result, that is, [it] refers to [a dog]. [she] and [it] follow the same process as above, and finally the resolution result [[she][My sister]], [[it][a dog]] is obtained. Among t...
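The ranking logic of this embodiment can be sketched as follows; the pair scores are the ones reported above, but the helper function and data structures are illustrative assumptions rather than the patent's implementation:

    from typing import Callable, Dict, List, Tuple

    def resolve_anaphors(
        anaphors: List[str],
        candidates: Dict[str, List[str]],
        score: Callable[[str, str], float],
    ) -> List[Tuple[str, str]]:
        """For each anaphor, score every candidate antecedent and keep the best one."""
        chains = []
        for anaphor in anaphors:
            ranked = sorted(
                ((score(cand, anaphor), cand) for cand in candidates[anaphor]),
                reverse=True,
            )
            best_score, best_antecedent = ranked[0]
            chains.append((anaphor, best_antecedent))
        return chains

    # Scores taken from the embodiment's reported run; the lookup table is only a
    # stand-in for the trained neural scorer.
    example_scores = {
        ("My sister", "she"): 8.06, ("a dog", "she"): 2.92,
        ("My sister", "it"): -1.83, ("a dog", "it"): 6.61,
    }
    result = resolve_anaphors(
        anaphors=["she", "it"],
        candidates={"she": ["My sister", "a dog"], "it": ["My sister", "a dog"]},
        score=lambda antecedent, anaphor: example_scores[(antecedent, anaphor)],
    )
    print(result)   # [('she', 'My sister'), ('it', 'a dog')]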



Abstract

The invention discloses a reinforcement learning based anaphora resolution method, which comprises the following steps: data preprocessing: carrying out word segmentation, sentence segmentation, part-of-speech tagging, lemmatization, named entity recognition, syntactic analysis and word vector conversion on text data to obtain candidate antecedents, anaphors and their related features; constructing a neural network model: by combining the word vectors and the related features, the model learns the features of anaphoric pairs and the relevant semantic information, better ranks and scores the candidate antecedents and anaphors, and finally obtains an anaphora chain; and using the trained model to carry out anaphora resolution, inputting text data and outputting a resolution chain. The method performs deep learning training with a reward measurement mechanism to overcome the defects of a heuristic loss function, which improves the model's effectiveness; hyper-parameters are set automatically for data sets in different languages, which avoids the need for manual setting, improves the practicability of the model, and expands its application range.
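As a sketch of the preprocessing step only, the snippet below runs the listed analyses with spaCy and concatenates the two mention vectors with a few hand-picked pair features to form h0; the use of spaCy and the particular features are assumptions made for the example, not the patent's implementation:

    import numpy as np
    import spacy

    # A single spaCy pipeline provides tokenisation, sentence splitting, POS tags,
    # lemmas, named entities, dependency parses and word vectors in one pass.
    nlp = spacy.load("en_core_web_md")

    def pair_vector(antecedent_span, anaphor_span) -> np.ndarray:
        """Build the i-dimensional input h0 for one (candidate antecedent, anaphor) pair:
        the two mention vectors plus a few illustrative relational features."""
        features = np.array([
            anaphor_span.start - antecedent_span.end,                   # token distance
            float(antecedent_span.sent == anaphor_span.sent),           # same sentence?
            float(antecedent_span.root.tag_ == anaphor_span.root.tag_), # same POS tag?
        ], dtype=np.float32)
        return np.concatenate([antecedent_span.vector, anaphor_span.vector, features])

    doc = nlp("My sister has a dog and she loves it very much.")
    h0 = pair_vector(doc[0:2], doc[6:7])   # ("My sister", "she")
    print(h0.shape)                        # (603,) with the 300-dim md vectors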

Description

technical field

[0001] The present invention relates to the field of natural language processing, and more specifically to a reinforcement learning based anaphora resolution method.

Background technique

[0002] Anaphora is a ubiquitous expression in natural language. To avoid repetition, people habitually use pronouns, titles, and abbreviations to refer to previously mentioned entities, which makes language concise and coherent. However, the large number of such references increases the difficulty of natural language processing; anaphora resolution is the task of identifying the different expressions of the same entity in a text. It plays an extremely important supporting role for natural language processing applications such as information extraction, automatic summarization, automatic question answering, machine translation, and machine reading comprehension.

[0003] The main methods for anaphora resolution are the following:

[0004] Resoluti...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F17/27, G06N3/08
CPC: G06N3/08, G06F40/30
Inventor: 赵忠华, 李舟军, 赵志云, 杨泽, 赵硕, 王禄恒, 付培国, 孙利远, 万欣欣
Owner NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT