A text semantic analysis method and system perform semantic analysis of text data at both the lexical level and the
sentence level. For lexical-level semantic analysis, the invention first applies an improved word segmentation
algorithm to overcome the limitation of segmenting English text by spaces alone. Second, TF-IDF modeling is performed on the segmented words to obtain
a weight for each word. The text is then vectorized as the sum of the word vectors trained by Word2Vec,
each scaled by its TF-IDF weight, and finally the
document similarity is computed. Because the invention takes into account both each word's contribution to the document content and its semantics when computing document similarity, the result is more accurate and provides a good foundation for subsequent text clustering.
For sentence-level semantic analysis, the invention extracts subject-predicate-object structures based on
text segmentation, part-of-speech tagging, syntactic analysis and dependency relations.
The invention extracts subject-predicate-object structures comprehensively across various
sentence types and implements a
noun expansion function, so the results agree more closely with
manual extraction.
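
The lexical-level steps described above (TF-IDF weighting, weighted summation of word vectors, document similarity) can be illustrated with a minimal Python sketch. It is not the invention's implementation: the toy two-dimensional vectors stand in for trained Word2Vec embeddings, the smoothed IDF formula and the use of cosine similarity are assumptions, and the names tfidf_weights, doc_vector and cosine are hypothetical. The documents are assumed to be already segmented into words.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: TF-IDF weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Smoothed IDF (an assumed variant) keeps every weight positive.
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[word])) + 1.0)
            for word, count in counts.items()
        })
    return weights


def doc_vector(doc_weights, word_vectors, dim):
    """Sum the word vectors of a document, each scaled by its TF-IDF weight."""
    vec = [0.0] * dim
    for word, weight in doc_weights.items():
        if word in word_vectors:  # skip out-of-vocabulary words
            for i, value in enumerate(word_vectors[word]):
                vec[i] += weight * value
    return vec


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 2-dimensional vectors standing in for trained Word2Vec embeddings.
WORD_VECTORS = {
    "cat": [1.0, 0.1], "dog": [0.9, 0.2], "pet": [0.8, 0.3],
    "stock": [0.1, 1.0], "market": [0.2, 0.9],
}
# Already-segmented documents (hypothetical sample data).
DOCS = [["cat", "pet"], ["dog", "pet"], ["stock", "market"]]

WEIGHTS = tfidf_weights(DOCS)
VECTORS = [doc_vector(w, WORD_VECTORS, dim=2) for w in WEIGHTS]
SIM_PETS = cosine(VECTORS[0], VECTORS[1])   # two pet-related documents
SIM_MIXED = cosine(VECTORS[0], VECTORS[2])  # pet document vs. finance document
```

With this sample data the two pet-related documents score a higher similarity than the pet and finance pair, which is the behaviour a downstream clustering step would rely on.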
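
The sentence-level extraction can be sketched in the same spirit. The example below assumes that segmentation, part-of-speech tagging and dependency parsing have already been performed, so the parse is supplied by hand rather than produced by a parser. The Token type, the label names (nsubj, obj, compound, amod) and the reading of "noun expansion" as attaching a noun's direct modifiers are all assumptions made for illustration, not details taken from the invention.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label relative to the head
    head: int  # index of the head token (the root points to itself)


# Modifier labels folded into the noun; an assumed, illustrative set.
NOUN_MODIFIERS = {"compound", "amod"}


def expand_noun(tokens: List[Token], index: int) -> str:
    """Return the noun at `index` plus its direct modifiers, in sentence order."""
    members = [index] + [
        i for i, tok in enumerate(tokens)
        if tok.head == index and tok.dep in NOUN_MODIFIERS
    ]
    return " ".join(tokens[i].text for i in sorted(members))


def extract_svo(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Collect (subject, predicate, object) triples for every verb."""
    triples = []
    for v, tok in enumerate(tokens):
        if tok.pos != "VERB":
            continue
        subjects = [i for i, t in enumerate(tokens)
                    if t.head == v and t.dep == "nsubj"]
        objects = [i for i, t in enumerate(tokens)
                   if t.head == v and t.dep == "obj"]
        for s in subjects:
            for o in objects:
                triples.append(
                    (expand_noun(tokens, s), tok.text, expand_noun(tokens, o))
                )
    return triples


# Hand-annotated parse of "The research team developed a segmentation algorithm".
SENTENCE = [
    Token("The", "DET", "det", 2),
    Token("research", "NOUN", "compound", 2),
    Token("team", "NOUN", "nsubj", 3),
    Token("developed", "VERB", "ROOT", 3),
    Token("a", "DET", "det", 6),
    Token("segmentation", "NOUN", "compound", 6),
    Token("algorithm", "NOUN", "obj", 3),
]

TRIPLES = extract_svo(SENTENCE)
```

Without the expansion step the triple would reduce to the bare heads ("team", "developed", "algorithm"); attaching the modifiers yields the fuller phrases a human annotator would typically write down.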