Text semantic analysis method

A text semantic analysis technology, applied to vocabulary-level and sentence-level semantic analysis of text data. It addresses two problems: existing word segmentation and part-of-speech tagging cannot meet the needs of specialized fields, and existing work does not analyze different sentence types. It achieves automatic tagging of new words, reduced complexity, and improved processing efficiency.

Pending Publication Date: 2019-01-25
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

Although Stanford Parser has done some research on the dependency syntax analysis of English text, it does not involve the analysis of differen




Embodiment Construction

[0054] In order to enable those skilled in the art to better understand the solutions of the present invention, the implementation of the solutions of the present invention will be described in detail below in conjunction with the accompanying drawings in the examples of the present invention.

[0055] As shown in Figure 1, the present invention discloses a text semantic analysis method and system, which mainly involves text semantic processing at two granularities:

[0056] S1: Perform lexical-level semantic analysis on the input unstructured text data.

[0057] S2: Perform sentence-level semantic analysis on the input unstructured text data.
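
The abstract characterizes the sentence-level step (S2) as extracting subject-predicate-object structures from dependency relations, with noun expansion. The excerpt does not give the extraction rules, so the following is only a minimal sketch under stated assumptions: the dependency parse is written by hand rather than produced by a parser, the labels follow the Stanford/Universal Dependencies conventions (nsubj, dobj or obj, compound, amod, nn), only simple active declarative sentences are handled, and the names Token, expand_noun, and extract_svo are hypothetical.

```python
from typing import NamedTuple

class Token(NamedTuple):
    index: int   # 1-based position in the sentence
    text: str
    head: int    # index of the governing token, 0 for the root
    deprel: str  # dependency label

# Assumed label sets; a real parser's tag set may differ.
SUBJECT_LABELS = {"nsubj"}
OBJECT_LABELS = {"dobj", "obj"}
EXPANSION_LABELS = {"compound", "amod", "nn"}

def expand_noun(tokens, noun):
    """Join a noun with its direct compound/adjectival modifiers, in sentence order."""
    parts = [t for t in tokens
             if t.head == noun.index and t.deprel in EXPANSION_LABELS]
    parts.append(noun)
    return " ".join(t.text for t in sorted(parts, key=lambda t: t.index))

def extract_svo(tokens):
    """Return (subject, predicate, object) triples for verbs that have both."""
    by_index = {t.index: t for t in tokens}
    triples = []
    for subj in tokens:
        if subj.deprel not in SUBJECT_LABELS:
            continue
        verb = by_index[subj.head]
        for obj in tokens:
            if obj.head == verb.index and obj.deprel in OBJECT_LABELS:
                triples.append(
                    (expand_noun(tokens, subj), verb.text, expand_noun(tokens, obj))
                )
    return triples

# Hand-written parse of "The parser extracts dependency relations".
sentence = [
    Token(1, "The", 2, "det"),
    Token(2, "parser", 3, "nsubj"),
    Token(3, "extracts", 0, "root"),
    Token(4, "dependency", 5, "compound"),
    Token(5, "relations", 3, "dobj"),
]
triples = extract_svo(sentence)
```

In this sketch the determiner "The" is not part of the expanded subject, while the compound modifier "dependency" is attached to the object. Passive, interrogative, and other sentence types would need additional rules that this excerpt does not describe.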

[0058] As shown in Figure 2, the specific process of the lexical-level semantic analysis in the present invention consists of steps S1-1 to S1-5.

[0059] Due to the versatility of the present invention, text data from different data sources can be processed. The format of the in...


Abstract

A text semantic analysis method and system that performs semantic analysis of text data at the lexical level and the sentence level. For lexical-level analysis, the invention first applies an improved word segmentation algorithm to overcome the limitation of segmenting English words only by spaces. Second, TF-IDF modeling is performed on the segmented text to obtain a weight for each word. The text is then vectorized as a sum of the word vectors trained by Word2Vec, weighted by the TF-IDF values, and finally document similarity is computed. Because the similarity calculation considers both each word's contribution to the document content and its semantics, the result is more accurate and provides a good foundation for subsequent text clustering. For sentence-level analysis, the invention extracts subject-predicate-object structures based on word segmentation, part-of-speech tagging, syntactic analysis, and dependency relations. The invention extracts subject-predicate-object structures from a variety of sentence types and implements a noun expansion function, so that the results agree more closely with manual extraction.
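
The lexical-level pipeline in the abstract (TF-IDF weights, a weighted sum of word vectors, then document similarity) can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the TF-IDF formula is one common smoothed variant, cosine similarity is assumed as the similarity measure, and the two-dimensional vectors in TOY_VECTORS are made-up stand-ins for trained Word2Vec embeddings. All function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return weights

def doc_vector(weights, word_vectors, dim):
    """Weighted sum of word vectors; terms without a vector are skipped."""
    vec = [0.0] * dim
    for term, w in weights.items():
        if term in word_vectors:
            for i, x in enumerate(word_vectors[term]):
                vec[i] += w * x
    return vec

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy 2-d vectors standing in for trained Word2Vec embeddings (made up).
TOY_VECTORS = {
    "semantic": [1.0, 0.0],
    "analysis": [0.9, 0.1],
    "text": [0.8, 0.2],
    "market": [0.0, 1.0],
    "price": [0.1, 0.9],
}

docs = [
    ["semantic", "analysis", "text"],
    ["text", "analysis"],
    ["market", "price"],
]
weights = tfidf_weights(docs)
vectors = [doc_vector(w, TOY_VECTORS, dim=2) for w in weights]
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

With these toy inputs, the first two documents share vocabulary and point in nearly the same direction, so sim_related comes out higher than sim_unrelated. In practice the vectors would come from a Word2Vec model trained on the target corpus.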

Description

Technical Field

[0001] The invention relates to a text semantic analysis method in natural language processing, and in particular to a method and system for vocabulary-level and sentence-level semantic analysis of text data.

Background Technique

[0002] With the continuous development of Internet and information technology and the advent of the big data era, data in specific technical fields keeps growing richer, the total volume of data keeps increasing, and the relationships among data are becoming more complex. How to extract valuable information from large-scale text data accurately and quickly has become a challenge at this stage.

[0003] Text segmentation is a necessary step in natural language processing, and good word segmentation has a crucial impact on subsequent modeling and analysis. Existing English word segmentation methods divide English words based on spaces. Although they have been widely used, for a specific ...
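
To illustrate why space-only splitting falls short for domain text, the sketch below merges multi-word terms by greedy longest match against a lexicon. The patent's improved segmentation algorithm is not described in this excerpt, so this is a swapped-in, commonly used technique rather than the inventors' method. DOMAIN_TERMS and segment are hypothetical names, and the lexicon entries are invented for the example.

```python
import re

# Hypothetical domain lexicon of multi-word terms (lowercased, space-separated).
DOMAIN_TERMS = {"natural language processing", "word vector"}
MAX_TERM_LEN = max(len(term.split()) for term in DOMAIN_TERMS)

def segment(text, terms=DOMAIN_TERMS, max_len=MAX_TERM_LEN):
    """Split into words, then greedily merge the longest lexicon match."""
    # Words are runs of letters/digits; internal hyphens are kept.
    words = re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text.lower())
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate span first, down to two words.
        for span in range(min(max_len, len(words) - i), 1, -1):
            candidate = " ".join(words[i:i + span])
            if candidate in terms:
                tokens.append(candidate)
                i += span
                break
        else:
            # No multi-word term starts here; emit the single word.
            tokens.append(words[i])
            i += 1
    return tokens

tokens = segment("Word vector models support natural language processing.")
```

Space-based splitting would yield seven separate words for this sentence, whereas the sketch keeps "word vector" and "natural language processing" as single tokens, which matters for any later TF-IDF weighting.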


Application Information

IPC(8): G06F17/27
CPC: G06F40/211, G06F40/284, G06F40/30
Inventor: 谢前前, 李欣, 黄鲁成
Owner: BEIJING UNIV OF TECH