A Verb Phrase Omission Resolution Method Based on Deep Learning

A deep learning technique for verb phrase omission resolution, applied in natural language data processing, instruments, biological neural network models, and related fields. It addresses the low accuracy of trigger word judgment and antecedent phrase recognition, and improves recognition accuracy.

Active Publication Date: 2022-02-22
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the low accuracy of trigger word judgment and antecedent phrase recognition in existing verb phrase omission resolution methods.


Examples


Specific Embodiment 1

[0025] Specific Embodiment 1: The deep learning-based verb phrase omission resolution method of this embodiment comprises the following steps:

[0026] Step 1: determine the sentences contained in Dataset 1 (Penn Treebank 2 Wall Street Journal) and Dataset 2 (An annotated corpus for the analysis of VP ellipsis, by Johan Bos and Jennifer Spenader);

[0027] Preprocessing the sentences in Dataset 1 yields the OpenNMT encoder;

[0028] When preprocessing the sentences in Dataset 2, each verb phrase and adjective phrase in a sentence is taken in turn as a candidate antecedent phrase of that sentence, and the sentence is divided accordingly into four parts: the candidate antecedent phrase, the part before the candidate antecedent phrase, the part after the candidate antecedent phrase, and the trigger word;
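The four-part split described in [0028] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the names `SplitSentence` and `split_on_candidate`, the half-open token span, and the choice to leave the trigger word inside the "after" part are all assumptions, since the text does not specify these details.

```python
from typing import List, NamedTuple, Tuple


class SplitSentence(NamedTuple):
    """Hypothetical container for the four parts named in [0028]."""
    before: List[str]     # tokens before the candidate antecedent phrase
    candidate: List[str]  # the candidate antecedent phrase itself
    after: List[str]      # tokens after the candidate antecedent phrase
    trigger: str          # the trigger word


def split_on_candidate(tokens: List[str],
                       span: Tuple[int, int],
                       trigger_idx: int) -> SplitSentence:
    """Split a tokenized sentence around one candidate antecedent phrase.

    `span` is a half-open token range [start, end); this indexing
    convention is an assumption made for the sketch.
    """
    start, end = span
    if not 0 <= start < end <= len(tokens):
        raise ValueError("candidate span is out of range")
    if not 0 <= trigger_idx < len(tokens):
        raise ValueError("trigger index is out of range")
    if start <= trigger_idx < end:
        raise ValueError("trigger word lies inside the candidate phrase")
    return SplitSentence(
        before=tokens[:start],
        candidate=tokens[start:end],
        after=tokens[end:],  # assumption: the trigger is not removed here
        trigger=tokens[trigger_idx],
    )


def split_all(tokens: List[str],
              spans: List[Tuple[int, int]],
              trigger_idx: int) -> List[SplitSentence]:
    """Take each candidate phrase in turn, giving one split per candidate."""
    return [split_on_candidate(tokens, span, trigger_idx) for span in spans]


if __name__ == "__main__":
    sentence = "John likes apples and Mary does too".split()
    for parts in split_all(sentence, [(1, 2), (1, 3)], trigger_idx=5):
        print(parts)
```

Each candidate yields its own split, so a sentence with several verb or adjective phrases produces several training or inference instances for the same trigger word.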

[0029] Step 2, extract the auxiliary verb features, syntactic features, context features and sentence-level fe...

Specific Embodiment 2

[0053] Specific Embodiment 2: This embodiment further limits the deep learning-based verb phrase omission resolution method described in Embodiment 1. The sentences of Dataset 2 in Step 1 (An annotated corpus for the analysis of VP ellipsis, by Johan Bos and Jennifer Spenader) are annotated by Johan Bos and Jennifer Spenader with antecedent phrases and trigger words, and every sentence in Dataset 2 contains a verb phrase omission.

[0054] Because every sentence in Dataset 2 contains a verb phrase omission, the dataset is used to train the trigger word judgment model in Step 2 and the antecedent phrase recognition model in Step 3. Johan Bos and Jennifer Spenader are the authors of the English-language publication (An annotated corpus for the analysis of VP ellipsis, by Johan Bos and Jennifer Spenader) from which Dataset 2 is taken.

Specific Embodiment 3

[0055] Specific Embodiment 3: This embodiment further limits the deep learning-based verb phrase omission resolution method described in Embodiment 2. In this embodiment, Dataset 1 and Dataset 2 are preprocessed as follows:

[0056] Use word_tokenize from the NLTK toolkit to tokenize the sentences in Dataset 1; train OpenNMT-py on the tokenized sentences of Dataset 1 to obtain the OpenNMT encoder;

[0057] The OpenNMT encoder has two outputs: the hidden state corresponding to the last word, and the hidden states corresponding to each word;
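To make the two outputs in [0057] concrete, here is a toy recurrent encoder in plain Python. It is not OpenNMT-py and does not reproduce its API or architecture; the function `rnn_encode`, the weight shapes, and the random embeddings are hypothetical and serve only to show how a final hidden state and a per-word sequence of hidden states relate to each other.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def make_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weight matrix (illustrative values only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_encode(embeddings: List[Vector],
               w_in: Matrix,
               w_rec: Matrix) -> Tuple[Vector, List[Vector]]:
    """Run a simple tanh RNN over a sentence.

    Returns (final_state, per_word_states):
      final_state     - hidden state after the last word
      per_word_states - one hidden state for every word
    """
    hidden_size = len(w_rec)
    h: Vector = [0.0] * hidden_size
    states: List[Vector] = []
    for x in embeddings:
        h = [
            math.tanh(
                sum(w_in[j][k] * x[k] for k in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return h, states


rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM = 3, 2
w_in = make_matrix(HIDDEN_DIM, EMBED_DIM, rng)
w_rec = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)

# Four made-up word embeddings standing in for a four-word sentence.
sentence = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(4)]
final_state, per_word_states = rnn_encode(sentence, w_in, w_rec)

if __name__ == "__main__":
    print("final state:", final_state)
    print("number of per-word states:", len(per_word_states))
```

In this sketch the final state is simply the last entry of the per-word states; a sentence-level feature would typically use the former, while features tied to individual words would index into the latter.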

[0058] Extract Dataset 2 as annotated by Johan Bos and Jennifer Spenader, label each sentence in the extracted Dataset 2 using BIOEST, and divide each labeled sentence into four parts: the antecedent phrase, the part before the antecedent phrase, the part after the antecedent phrase, and the trigger word; the antecedent phrase, the part befo...


Abstract

A deep learning-based verb phrase omission resolution method, belonging to the technical field of computer artificial intelligence. The invention solves the low accuracy of trigger word judgment and antecedent phrase recognition in existing verb phrase omission resolution methods. The invention first preprocesses the determined Dataset 1 and Dataset 2. In the trigger word judgment stage, it additionally extracts sentence context features and sentence-level features, converts the extracted sentence features into vectors that are input to a support vector machine, and determines the trigger word of the input sentence from the output of the support vector machine. Finally, a multi-layer perceptron identifies the correct antecedent phrase among the multiple candidate antecedent phrases generated for the trigger word. Because context features and sentence-level features are added when extracting sentence features, the accuracy of trigger word judgment can reach about 90%, and the accuracy of antecedent phrase recognition can exceed 85%. The invention can be applied in the technical field of computer artificial intelligence.
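The two-stage flow in the abstract (first decide which word is a trigger, then pick one antecedent among several candidates) can be outlined as below. This is only a control-flow sketch: `toy_is_trigger` and `toy_score` are hypothetical stand-ins for the support vector machine and the multi-layer perceptron, and their rules (a small auxiliary word list, preferring the candidate that ends closest before the trigger) are invented for illustration and are not the patent's features.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # half-open token range [start, end), an assumption

TriggerFn = Callable[[List[str], int], bool]
ScoreFn = Callable[[List[str], Span, int], float]


def resolve(tokens: List[str],
            candidate_spans: List[Span],
            is_trigger: TriggerFn,
            score: ScoreFn) -> List[Tuple[int, Optional[Span]]]:
    """Stage 1: keep the positions judged to be trigger words.
    Stage 2: for each trigger, choose the highest-scoring candidate span."""
    results: List[Tuple[int, Optional[Span]]] = []
    for idx in range(len(tokens)):
        if not is_trigger(tokens, idx):
            continue
        best = max(candidate_spans,
                   key=lambda span: score(tokens, span, idx),
                   default=None)
        results.append((idx, best))
    return results


# --- Toy stand-ins (NOT the SVM or MLP described in the abstract) ---
TOY_AUXILIARIES = {"do", "does", "did"}  # illustrative word list


def toy_is_trigger(tokens: List[str], idx: int) -> bool:
    return tokens[idx].lower() in TOY_AUXILIARIES


def toy_score(tokens: List[str], span: Span, trigger_idx: int) -> float:
    start, end = span
    if end > trigger_idx:
        return float("-inf")          # candidate must end before the trigger
    return -float(trigger_idx - end)  # closer to the trigger scores higher


if __name__ == "__main__":
    sentence = "John likes apples and Mary does too".split()
    print(resolve(sentence, [(1, 2), (1, 3)], toy_is_trigger, toy_score))
```

Passing the two models in as callables keeps the selection logic separate from whatever classifier and scorer are actually trained, so either stage can be swapped without touching the other.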

Description

Technical field

[0001] The invention belongs to the technical field of computer artificial intelligence, and in particular relates to a deep learning-based verb phrase omission resolution method.

Background

[0002] A chatbot is a computer program that uses natural language processing technology to simulate human communication and hold conversations with humans. The origin of chatbots can be traced back to the article "Computing Machinery and Intelligence", published by Turing in Mind in 1950. The article proposed the classic Turing Test, which for decades has been regarded as the ultimate goal of computer artificial intelligence. In chatbots, multi-turn dialogue is a core module. Verb phrase ellipsis is an anaphoric construction in which a component that has already been stated is omitted. In English, an instance of verb phrase ellipsis consists of two parts: the trigger word and the antecedent phrase. Trigger words, usually auxiliary or modal verbs, indicate the pres...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/289, G06F40/284, G06F40/211, G06N3/04
CPC: G06F40/211, G06F40/289, G06F40/284, G06N3/044
Inventor: 张伟男, 刘元兴, 宋皓宇, 刘挺
Owner: HARBIN INST OF TECH