Method for analyzing implicit type discourse relation based on hierarchical depth semantics

A discourse relation analysis technique, applied in semantic analysis, semantic tool creation, natural language data processing, etc. It addresses over-fitting and data sparseness, as well as the inability of existing methods to make effective use of the deep semantic information of implicit discourse relation arguments. It reduces misjudgments, produces fast and accurate results, and achieves mutually optimized effects.

Active Publication Date: 2017-01-11
BEIJING INSTITUTE OF TECHNOLOGY
2 Cites, 31 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the over-fitting and data sparseness problems of existing implicit discourse relationship analysis methods, which stem mainly from the scale of the data and from the models themselves, that is, to solve the problem ...


Examples

Embodiment 1

[0040] This embodiment describes the flow of the method proposed by the present invention, as shown in Figure 1.

[0041] As Figure 1 shows, the proposed method comprises four modules: a preprocessing part, corresponding to the corpus preprocessing in step 1; a vector initialization part, corresponding to the multi-level semantic vector initialization in step 2; a feature extraction part, corresponding to the generation and expansion of the useful word pair table in step 3 and the hierarchical depth semantic representation of the implicit discourse relationship in step 4.1; and a classification part, corresponding to the neural network model parameter training and the implicit discourse relationship category scoring in steps 4.2 to 4.3.

[0042] The wide arrows indicate the data flow of the training corpus, and the narrow arrows indicate the data flow of ...

Embodiment 2

[0044] This embodiment describes the classification system architecture of the method proposed by the present invention. Figure 2 is the framework diagram of the proposed implicit discourse relationship classification system.

[0045] As Figure 2 shows, the classification system corresponds to step 4: the hierarchical depth semantic representation of the implicit discourse relationship, the training of the neural network model parameters, and the scoring of implicit discourse relationship categories. The inputs, from left to right, are the implicit discourse relationship distribution vector (the product of the prior probability of the implicit discourse relationship and the transition matrix), the implicit discourse relationship argument pair vector, and the implicit discourse relationship useful word pair vector. After the multi-level sema...
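The three inputs named in [0045] can be illustrated with a minimal sketch. It assumes the three vectors are simply concatenated and scored by a single tanh hidden layer followed by a softmax; the source does not state the network layout, so that choice, the function names (`vec_mat`, `build_input`, `score_relations`), the vector sizes, and all numeric values are hypothetical. The four labels are the top-level PDTB sense classes.

```python
import math
import random

# Top-level PDTB sense classes, used here as the relation labels.
RELATIONS = ["Comparison", "Contingency", "Expansion", "Temporal"]

def vec_mat(v, m):
    """Row vector times matrix: result[j] = sum_i v[i] * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def build_input(prior, transition, arg_pair_vec, word_pair_vec):
    """Relation distribution vector (prior x transition matrix),
    concatenated with the argument pair and useful word pair vectors.
    Concatenation is an assumption, not taken from the patent text."""
    relation_vec = vec_mat(prior, transition)
    return relation_vec + arg_pair_vec + word_pair_vec

def score_relations(x, w_hidden, w_out):
    """One tanh hidden layer, then softmax scores per relation (assumed layout)."""
    hidden = [math.tanh(h) for h in vec_mat(x, w_hidden)]
    return softmax(vec_mat(hidden, w_out))

# Toy values only; none of these numbers come from the source.
prior = [0.2, 0.3, 0.4, 0.1]                 # prior probability per relation
transition = [[1.0, 0.0, 0.0],               # hypothetical 4 x 3 transition matrix
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]]
arg_pair_vec = [0.1, -0.2]                   # hypothetical argument pair vector
word_pair_vec = [0.3, 0.05]                  # hypothetical useful word pair vector

x = build_input(prior, transition, arg_pair_vec, word_pair_vec)

rng = random.Random(0)                       # untrained random weights
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(len(x))]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(len(RELATIONS))] for _ in range(5)]

scores = score_relations(x, w_hidden, w_out)
best = RELATIONS[scores.index(max(scores))]
print(dict(zip(RELATIONS, scores)), best)
```

With untrained random weights the predicted label is meaningless; the sketch only shows how the three input vectors could be assembled and turned into one score per relation category.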

Embodiment 3

[0047] This embodiment describes the process of running the implicit discourse relationship analysis based on hierarchical depth semantics on a PC using the proposed method, corresponding to steps 1 to 4 in the Summary of the Invention.

[0048] This embodiment is based on the annotated English corpus Penn Discourse Treebank (PDTB) and its annotated categories, together with the unlabeled corpora Central News Agency of Taiwan, English Service (CNA) and Xinhua News Agency, English Service (XIN). Following the sequence of steps in the Summary of the Invention, it introduces the corpus preprocessing method, the multi-level semantic vector initialization method, the method of generating and expanding useful word pairs, and the implicit discourse relationship model training and category scoring methods.

[0049] A) Corpus preprocessing, the implementation steps are as follows:

[0050] 1. According to the statistical results...

Abstract

The invention relates to a method for analyzing implicit discourse relations based on hierarchical depth semantics, and belongs to the technical field of natural language processing applications. The method comprises the following steps. First, labeled and unlabeled corpora are combined to expand the scale of the training corpus, which addresses the under-learning caused by a training corpus that is too small. Then, according to a certain rule, a depth semantic vector is initialized for each level of the training corpus, and the word pairs that are useful for classification are sorted by information gain value and used as the basis for subsequent feature selection. Finally, a scoring function is designed that combines the multi-level depth semantic information of the discourse relation argument pairs to be classified; a neural network is used to train the model parameters and fit the category labels of the implicit discourse relations, and the model with the best performance is selected, which completes the analysis of the implicit discourse relation. The method overcomes the misjudgments of traditional methods based on discrete features and improves the accuracy of implicit discourse relation category labels, so that a user can obtain the analysis result of an implicit discourse relation quickly and accurately.
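The abstract's step of sorting word pairs by information gain can be sketched as follows. This is a minimal illustration under stated assumptions: word pairs are taken as the cross product of whitespace tokens from the two arguments, and information gain uses the standard definition (label entropy minus the conditional entropy given presence or absence of the pair). The source does not give its exact formula or tokenization, and the function names and toy samples are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def word_pairs(arg1, arg2):
    """Cross product of lowercased tokens from the two arguments (assumed scheme)."""
    return set(product(arg1.lower().split(), arg2.lower().split()))

def rank_word_pairs(samples):
    """samples: list of (arg1, arg2, label). Returns [(pair, gain)], best first."""
    labels = [label for _, _, label in samples]
    pair_sets = [word_pairs(a1, a2) for a1, a2, _ in samples]
    base = entropy(labels)
    n = len(labels)
    gains = {}
    for pair in set().union(*pair_sets):
        present = [lab for ps, lab in zip(pair_sets, labels) if pair in ps]
        absent = [lab for ps, lab in zip(pair_sets, labels) if pair not in ps]
        conditional = (len(present) / n) * entropy(present) \
                    + (len(absent) / n) * entropy(absent)
        gains[pair] = base - conditional
    # Sort by descending gain, breaking ties alphabetically for determinism.
    return sorted(gains.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy labeled argument pairs; not taken from PDTB.
samples = [
    ("it rained", "we stayed", "Contingency"),
    ("it rained", "we left", "Contingency"),
    ("he smiled", "she frowned", "Comparison"),
    ("he laughed", "she frowned", "Comparison"),
]

ranked = rank_word_pairs(samples)
for pair, gain in ranked[:5]:
    print(pair, round(gain, 4))
```

In this toy data, a pair such as ("it", "we") occurs in exactly the Contingency samples, so it separates the two classes perfectly and receives the maximum gain of 1 bit, while a pair seen in only one sample scores lower. A top-k cut of the ranked list would then serve as the useful word pair table.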

Description

Technical field

[0001] The invention relates to an implicit discourse relationship analysis method, in particular to an implicit discourse relationship analysis method based on hierarchical depth semantics, and belongs to the technical field of natural language processing applications.

Background

[0002] As an important task in the field of natural language processing applications, discourse relationship analysis has been studied continuously by scholars and plays an important role in statistical machine translation, information extraction, sentiment analysis and other fields. Discourse relationship analysis builds on lexical and syntactic analysis and aims to identify and classify inter-sentence relations at the discourse level; the analysis of implicit discourse relations, which lack discourse connectives, is a particular focus and difficulty. As the semantic analysis of natural language has gradually become the mainstream of academ...

Application Information

IPC(8): G06F17/27, G06F17/30
CPC: G06F16/35, G06F16/36, G06F40/211, G06F40/284, G06F40/30
Inventors: 鉴萍, 佘萧寒, 黄河燕
Owner: BEIJING INSTITUTE OF TECHNOLOGY