Text graph construction method based on text content characteristics

A text graph construction method based on text content features. It addresses the inability of existing graph construction methods to accurately express the semantic features of text, and it makes text graph construction more flexible.

Active Publication Date: 2020-09-08
NORTHWESTERN POLYTECHNICAL UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a text graph construction method based on text content features, in order to solve the problem that prior-art graph construction methods cannot accurately express the semantic features of text.


Examples

Embodiment 1

[0038] This embodiment discloses a method for constructing a text graph based on text content features, which converts a given text into a text graph.

[0039] When constructing edges, the method provided in this embodiment does not depend on co-occurrence relationships, yet it preserves the semantic relationships between word nodes.

[0040] The method is carried out according to the following steps:

[0041] Step 1, obtain the text to be converted;

[0042] Any ordinary text will do, whether a single sentence or a whole article. Both Chinese and English text are acceptable, and the corresponding text processing methods are as follows;

[0043] Step 2, perform text preprocessing on the text to be converted to obtain the preprocessed text; the preprocessing consists of word segmentation, cleaning, and standardization, performed in that order;

[0044] The preprocessed text comprises multiple words;
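As a rough illustration of Step 2 for English text, the sketch below runs word segmentation, cleaning, and standardization in that order. The function name `preprocess`, the stop-word list, and the specific cleaning rules (dropping stop words and purely numeric tokens, lowercasing) are assumptions made for this example; the source does not spell out which rules it applies.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Segment, clean, and standardize English text into a list of words."""
    # Word segmentation: extract runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    # Cleaning (assumed rules): drop stop words and purely numeric tokens.
    tokens = [t for t in tokens
              if t.lower() not in STOP_WORDS and not t.isdigit()]
    # Standardization (assumed rule): lowercase every remaining word.
    return [t.lower() for t in tokens]


words = preprocess("The graph is built from the words of a text in 2020.")
print(words)
```

Chinese text would need a dedicated word segmenter in place of the regular expression, since Chinese words are not separated by spaces.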

[0045] In this embodiment, the ...

Embodiment 2

[0082] In this embodiment, the method provided by the present invention is verified experimentally, taking classification as an example: a text graph is constructed with the method provided by the present invention, and a graph attention network (abbreviated here as GAN) is then used for learning and classification. GAN is a graph neural network based on the attention mechanism; see the paper "Graph Attention Networks". Text-GAN(1) constructs graphs with the first method (steps 4.1 to 4.3 in Embodiment 1) and uses pre-trained word vectors; Text-GAN(2) constructs graphs with the second method (steps I to III in Embodiment 1) and uses pre-trained word vectors; Text-GAN(2)-rand also uses the second graph construction method (steps I to III in Embodiment 1) but initializes the word vectors randomly. Text-GCN, from the paper "Graph Convolutional Networks for Text Classification", is used as the comparison algorithm. Similarly, the first is to construct a graph based o...

Abstract

The invention discloses a text graph construction method based on text content features. When the edges of a text graph are constructed, dependence on co-occurrence relationships is avoided while the semantic relationships between word nodes are preserved, so that the semantic features of the text can be expressed accurately. In addition, two graph construction methods are provided, and the appropriate one can be selected according to the practical application. The graph obtained with the first method can have a global node, namely the node with the maximum degree, which has a connecting edge to every other node; however, if the graph has many nodes whose weights differ only slightly, this method leaves too large a gap between the degree value set for the intermediate nodes and their node weight values. The second method mitigates this defect to a certain extent, but the graph it constructs may not be connected. The method can therefore be selected flexibly according to actual requirements, for example whether the subsequent learning algorithm requires a connected graph or uses the features of the global node to represent the graph, which improves the flexibility of text graph construction.
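The two graph properties the abstract uses to choose between the methods, connectivity and the presence of a global node, can be checked on any constructed graph. The sketch below is not the patent's construction procedure; it assumes an undirected graph without self-loops, stored as a dictionary that maps every node to the set of its neighbours, and the names `is_connected` and `global_node` are illustrative.

```python
from collections import deque


def is_connected(adj):
    """Return True if the undirected graph {node: set(neighbours)} is connected."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    # Breadth-first search from an arbitrary start node.
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(adj)


def global_node(adj):
    """Return the maximum-degree node if it touches every other node, else None."""
    if not adj:
        return None
    best = max(adj, key=lambda n: len(adj[n]))
    return best if len(adj[best]) == len(adj) - 1 else None


# A star-shaped graph has a global node; two separate pairs do not.
star = {"text": {"graph", "word", "edge"},
        "graph": {"text"}, "word": {"text"}, "edge": {"text"}}
split = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}}
print(is_connected(star), global_node(star))
print(is_connected(split), global_node(split))
```

A check like this could run before handing the graph to a learning algorithm that requires connectivity or reads graph-level features from a global node.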

Description

Technical field

[0001] The invention relates to a text graph construction method, in particular to a text graph construction method based on text content features.

Background art

[0002] With the continuous development of deep learning, algorithms in the image field are becoming more and more mature, and in recent years graph neural networks have been widely used in the image field. As a result, many researchers have begun to apply graph neural network algorithms to the text field for natural language processing. To apply algorithms for structured data to unstructured data, it is first necessary to generate graph-structured representations from unstructured data such as text.

[0003] Most existing graph construction algorithms segment the text, regard the words as nodes in the graph, and add edges between word nodes according to the co-occurrence of words within the same window of the text, or, for words that appear in the same sentence, add connecting edges be...
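For contrast with the invention, the window-based co-occurrence construction that the background attributes to existing algorithms can be sketched as follows. This is a minimal illustration of the prior approach, not of the patented method; the function name `cooccurrence_graph`, the default window size, and the use of co-occurrence counts as edge weights are assumptions.

```python
from collections import defaultdict


def cooccurrence_graph(words, window=3):
    """Count how often two distinct words fall inside the same sliding window.

    Returns a dict mapping a sorted (word_a, word_b) pair to its count.
    """
    edges = defaultdict(int)
    for i, left in enumerate(words):
        # Pair the current word with the words that follow it in the window.
        for right in words[i + 1:i + window]:
            if left != right:
                edges[tuple(sorted((left, right)))] += 1
    return dict(edges)


tokens = ["graph", "built", "from", "words", "text"]
edges = cooccurrence_graph(tokens, window=2)
print(edges)
```

With `window=2` only adjacent words are linked, so words that are related in meaning but far apart in the text receive no edge.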


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/36, G06F40/253, G06F40/284, G06F40/289
CPC: G06F16/367, G06F40/289, G06F40/284, G06F40/253
Inventors: 杨黎斌, 梅欣, 戴航, 蔡晓妍
Owner: NORTHWESTERN POLYTECHNICAL UNIV