Chinese text feature extracting method with text mood fusion function

A feature extraction and text processing technology, applied in semantic analysis, special data processing applications, instruments, etc., that can solve the problems of high cost, sparseness, and high dimensionality of text representation

Inactive Publication Date: 2018-02-23
YUNNAN UNIV
Cites: 2; Cited by: 11

AI Technical Summary

Problems solved by technology

[0002] The volume of text from the Internet, e-commerce, and other fields grows sharply every day, and processing and understanding this massive text data manually is expensive.
To mine useful knowledge patterns from massive text quickly and efficiently, processing and understanding text with artificial-intelligence techniques is a better choice. The key to intelligent analysis of massive text is representing the semantic features of the text effectively. The most commonly used text representation method is the Bag of Words (BOW) model. Although the bag-of-words model is simple and practical, the resulting text representation is often high-dimensional and sparse.
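The sparsity of a bag-of-words representation can be illustrated with a minimal sketch. The three pre-tokenized toy sentences and the helper names `bow_vectors` and `sparsity` are assumptions made for illustration; they do not come from the patent.

```python
from collections import Counter

def bow_vectors(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0] * len(vocab)  # one slot per vocabulary entry
        for tok, count in Counter(doc).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors

def sparsity(vectors):
    """Fraction of zero entries across all vectors."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0)
    return zeros / total

# Toy corpus (illustrative only), already segmented into words.
docs = [
    ["这", "本", "书", "很", "好", "啊"],
    ["快递", "太", "慢", "了", "吧"],
    ["价格", "便宜", "质量", "也", "不错", "呢"],
]
vocab, vectors = bow_vectors(docs)
print(len(vocab), round(sparsity(vectors), 3))
```

Each vector is as wide as the whole vocabulary, yet a document fills only the slots for its own words. With a realistic vocabulary of tens of thousands of words, almost every entry is zero, which is the high-dimensional, sparse behaviour described above.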

Examples

Embodiment Construction

[0044] Specific embodiments of the present invention will be described below in conjunction with the accompanying drawings, so that those skilled in the art can better understand the present invention.

[0045] As shown in Figure 1, a Chinese text feature extraction method that fuses text mood comprises: step (1), generation of the word sets and mood word sets of a massive text collection, in which the word set and the mood word set of each text are generated from the text set and the text mood word set; step (2), word embedding model construction, in which text feature vectors and mood word (modal particle) feature vectors are obtained by training Skip-gram and CBOW models; step (3), text word representation model construction, in which a Bi-LSTM layer generates the contextual semantic features of the words of each text, these features are combined with the initialized word vectors to form the local feature vectors of the text, and the intermediate global features of the text are then obtained through 2-dimensional convolution and 1-dimensional pooling; step (4), text representation model construction
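Steps (1) and (2) can be sketched in a few lines, under stated assumptions. The `MODAL_PARTICLES` lexicon, the function names, and the toy sentence are hypothetical; the patent's actual mood word list and training setup are not given in this excerpt. The second function only generates the (center, context) pairs that a Skip-gram model is trained on; it does not train embeddings.

```python
# Hypothetical modal-particle lexicon; the patent's actual list is not shown here.
MODAL_PARTICLES = {"啊", "吧", "呢", "吗", "了", "嘛"}

def split_word_sets(tokens):
    """Step (1) sketch: separate a tokenized text into ordinary words and mood words."""
    words = [t for t in tokens if t not in MODAL_PARTICLES]
    particles = [t for t in tokens if t in MODAL_PARTICLES]
    return words, particles

def skipgram_pairs(tokens, window=2):
    """Step (2) sketch: build (center, context) training pairs for a Skip-gram model."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

tokens = ["快递", "太", "慢", "了", "吧"]  # toy, pre-segmented sentence
words, particles = split_word_sets(tokens)
pairs = skipgram_pairs(words, window=1)
print(words, particles, pairs)
```

In a full pipeline the word set and the mood word set would each feed a separately trained embedding model, and the resulting vectors would be passed to the Bi-LSTM and convolution layers of step (3).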

[0046] Th...

Abstract

The invention discloses a Chinese text feature extraction method that fuses text mood. For variable-length text, the method obtains a text feature representation that fuses mood features, syntactic features, and semantic features. First, a text word set and a mood word set are constructed, each set is transformed into word embedding form, and the corresponding vector models are obtained. Second, text features are selected along the time-step dimension and the feature dimension of the text word embedding representation, and the mood features are fused into the time-step dimension of the selected text features, yielding a text feature representation that accurately represents the semantics. The method makes full use of the contribution of modal particles to text semantics to fuse the mood, syntactic, and semantic features into the text feature representation. The resulting representation is low-dimensional and continuous, so it represents text semantics better and more effectively supports natural language processing tasks such as text analysis, language translation, and relation extraction.
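One possible reading of the second stage is sketched below. It is an assumption, not the patented procedure: selection is approximated by keeping the time steps with the largest squared L2 norm, and fusion is approximated by appending mood vectors as extra time steps. The function names and the toy numbers are hypothetical.

```python
def select_time_steps(features, k):
    """Keep the k time steps (rows) with the largest squared L2 norm, in original order.

    This norm-based rule is a stand-in for the patent's selection step.
    """
    ranked = sorted(
        range(len(features)),
        key=lambda i: sum(x * x for x in features[i]),
        reverse=True,
    )
    keep = sorted(ranked[:k])  # restore temporal order
    return [features[i] for i in keep]

def fuse_along_time(text_features, mood_features):
    """Append mood vectors as additional time steps; feature widths must match."""
    width = len(text_features[0])
    if any(len(row) != width for row in text_features + mood_features):
        raise ValueError("all vectors must have the same feature dimension")
    return text_features + mood_features

# Toy matrices: 4 time steps x 2 features for the text, 1 mood vector.
text = [[0.1, 0.0], [0.9, 0.4], [0.2, 0.1], [0.5, 0.7]]
mood = [[0.3, 0.3]]
selected = select_time_steps(text, k=2)
fused = fuse_along_time(selected, mood)
print(fused)
```

The fused matrix keeps the feature width of the text representation and grows only along the time-step axis, which is what lets mood information travel alongside the word features without increasing the per-step dimension.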

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to a Chinese text feature extraction method that fuses text mood. Based on massive Chinese text, Chinese mood features are fused into the text features to better represent the semantics of Chinese text.

Background

[0002] The volume of text from the Internet, e-commerce, and other fields grows sharply every day. Processing and understanding this massive text data manually is expensive, and the cost outweighs the benefit. To mine useful knowledge patterns from massive text quickly and efficiently, processing and understanding text with artificial-intelligence techniques is a better choice. The key to intelligent analysis of massive text is representing the semantic features of the text effectively. The most commonly used text representation method is the Bag of Words (BOW) model. Although the bag of words ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
CPC: G06F40/30
Inventor: 郭延哺, 金宸, 姬晨, 邓春云, 李维华, 王顺芳
Owner: YUNNAN UNIV