Text processing method and system based on meaning group division

A text processing technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problem of recognizing semantic tendency in Chinese text, and achieves a simple method, savings in time and cost, and high precision.

Pending Publication Date: 2019-11-01
BEIJING RUNUP INFORMATION TECH

AI Technical Summary

Problems solved by technology

[0014] In order to solve the above technical problems, the purpose of the present invention is to provide a text processing tec

Examples

Embodiment 1

[0063] Embodiment 1: dividing sentences into multiple meaning groups. Referring to Figure 2, the input is sentences and the output is meaning groups:

[0064] Step 21, divide the sentence into multiple meaning groups at commas and semicolons;

[0065] Step 22, perform word segmentation on each meaning group to obtain its words, and read the words one by one to judge whether each is a transition word; if so, intercept (cut out) the complete meaning group in which the transition word occurs and then perform step 23; otherwise perform step 23 directly;

[0066] Step 23, judge whether all sentences have been processed; if so, execute step 24, otherwise return to step 21;

[0067] Step 24, the meaning group division is completed.
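
Steps 21 to 24 can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the transition-word list `TRANSITION_WORDS`, the function name `split_meaning_groups`, and the whitespace tokenizer are all assumptions, and "intercepting" the meaning group is interpreted here as starting a new meaning group at the transition word. For Chinese text, a real word segmenter would be passed in as `tokenize`.

```python
import re

# Hypothetical transition-word list; the source does not enumerate one.
TRANSITION_WORDS = {"but", "however", "yet"}


def split_meaning_groups(sentence, tokenize=str.split):
    """Split one sentence into meaning groups (each a list of words)."""
    groups = []
    # Step 21: split at commas and semicolons (ASCII and full-width forms).
    for clause in re.split(r"[,;，；]", sentence):
        words = tokenize(clause)
        if not words:
            continue
        current = []
        # Step 22: read the words one by one; a transition word closes the
        # group collected so far and opens a new one (assumed interpretation).
        for word in words:
            if word.lower() in TRANSITION_WORDS and current:
                groups.append(current)
                current = []
            current.append(word)
        groups.append(current)
    return groups


def split_all(sentences, tokenize=str.split):
    """Steps 23-24: repeat until every sentence has been processed."""
    return [split_meaning_groups(s, tokenize) for s in sentences]
```

Calling `split_meaning_groups("the food was good but the service was slow, I left early")` yields three groups: one before "but", one starting at "but", and one after the comma.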

Embodiment 2

[0068] Embodiment 2: meaning group tendency (emotional word) algorithm. Obtain the article whose semantic tendency is to be analyzed. The article includes multiple paragraphs, and each paragraph includes multiple sentences. Divide each sentence into continuous language fragments that each express a single meaning, which serve as semantic meaning groups, and perform word segmentation on the meaning groups to obtain candidate words. Tendency weights are then assigned to the sentence according to the meaning group tendency algorithm, as detailed below:

[0069] Emotional words are the main components affecting the tendency of a meaning group. A vocabulary of tendency words (emotional words) can be constructed from existing Chinese emotional thesauri, which include, but are not limited to, lists of commendatory and derogatory words and their synonyms, the extreme value table of Chinese emotional words, Tsinghua University Li Jun Chinese Complimentar...
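
The lookup of candidate words in the emotional word list might look like the sketch below. The lexicon entries, their weights, and the name `find_tendency_words` are hypothetical; the source does not give concrete weights, and English words stand in for the Chinese vocabulary.

```python
# Hypothetical sentiment lexicon: word -> tendency weight
# (positive values commendatory, negative values derogatory).
SENTIMENT_LEXICON = {
    "comfortable": 1.0,
    "good": 1.0,
    "slow": -1.0,
    "terrible": -2.0,
}


def find_tendency_words(candidate_words, lexicon=SENTIMENT_LEXICON):
    """Return (position, word, weight) for each candidate found in the lexicon."""
    return [
        (index, word, lexicon[word])
        for index, word in enumerate(candidate_words)
        if word in lexicon
    ]
```

Keeping the position of each match makes it possible to inspect the words in front of a tendency word later, for example when looking for negative words or degree adverbs.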

Embodiment 3

[0073] Embodiment 3: meaning group negative word algorithm. Orientation analysis of a text starts from finding the emotional words in a sentence; the orientation of the emotional words determines the orientation of the sentence, and thereby the orientation of the entire text. In practice, however, a negative word that modifies an emotional word changes its tendency and emotional polarity. For example, in "I am very uncomfortable today", "comfortable" is a commendatory word, but because it is modified by the negative word "not" ("uncomfortable" being "not comfortable"), its tendency and emotional polarity turn negative. Chinese also exhibits multiple negation: when negative words appear an odd number of times, the expression is negative; when they appear an even number of times, it is affirmative. All it takes is not enough to...
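
The odd/even rule for multiple negation can be sketched as a small parity check. The negation list `NEGATION_WORDS`, the function name `negation_weight`, and the look-back `window` of three words are assumptions made for illustration; the source text is truncated before it specifies how far in front of the emotional word negations are counted.

```python
# Hypothetical negation-word list (English stand-ins for Chinese negatives).
NEGATION_WORDS = {"not", "no", "never"}


def negation_weight(words, index, window=3):
    """Return -1 if an odd number of negative words precede words[index]
    within the look-back window, otherwise +1."""
    start = max(0, index - window)
    count = sum(1 for word in words[start:index] if word in NEGATION_WORDS)
    return -1 if count % 2 == 1 else 1
```

With this rule, a single "not" in front of "comfortable" flips the polarity, while a double negation leaves it unchanged.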

Abstract

The invention relates to a text processing method and system based on meaning group division. The method comprises the following steps: obtaining an article whose semantic tendency is to be analyzed, wherein the article comprises paragraphs and the paragraphs comprise sentences; dividing the sentences into continuous language segments that each express a single meaning, the continuous language segments serving as semantic meaning groups; performing word segmentation on the semantic meaning groups to obtain candidate words; obtaining a sentiment word library, allocating a tendency weight to each sentiment word in the library, and constructing a sentiment word list; retrieving the candidate words in the sentiment word list and extracting the sentiment words corresponding to the candidate words as the tendency words of the sentences; analyzing the degree adverbs and negative words in front of the tendency words, respectively; assigning degree weights and negative weights to the tendency words, and multiplying the negative weight, the degree weight and the tendency weight of each tendency word to obtain the meaning group tendency component of the semantic meaning group; and collecting the tendency components of all meaning groups in a sentence to serve as the sentence tendency component, and obtaining the semantic tendency component of the article from the sentence tendency components to serve as the semantic tendency analysis result of the article.
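
The weight multiplication and aggregation described in the abstract can be illustrated with the following sketch. It is written under several assumptions that the source does not confirm: the lexicon, degree-adverb weights, and negation list are hypothetical; degree adverbs and negative words are searched in a fixed `window` of words in front of the tendency word; "collecting" the meaning group components is taken to be a sum; and the article component is taken to be the mean of the sentence components.

```python
def group_tendency(words, lexicon, degree_weights, negations, window=3):
    """Meaning group component: sum over tendency words of
    negative weight * degree weight * tendency weight."""
    total = 0.0
    for index, word in enumerate(words):
        if word not in lexicon:
            continue
        preceding = words[max(0, index - window):index]
        # Degree weight: product of weights of preceding degree adverbs.
        degree = 1.0
        for prev in preceding:
            degree *= degree_weights.get(prev, 1.0)
        # Negative weight: -1 for an odd number of preceding negations.
        negation_count = sum(1 for prev in preceding if prev in negations)
        negative = -1.0 if negation_count % 2 == 1 else 1.0
        total += negative * degree * lexicon[word]
    return total


def sentence_tendency(groups, lexicon, degree_weights, negations):
    """Sentence component: sum of its meaning group components (assumed)."""
    return sum(
        group_tendency(g, lexicon, degree_weights, negations) for g in groups
    )


def article_tendency(sentences, lexicon, degree_weights, negations):
    """Article component: mean of sentence components (assumed aggregation)."""
    if not sentences:
        return 0.0
    scores = [
        sentence_tendency(s, lexicon, degree_weights, negations)
        for s in sentences
    ]
    return sum(scores) / len(scores)
```

For example, with a lexicon `{"comfortable": 1.0}`, degree weights `{"very": 1.5}`, and negations `{"not"}`, the group `["i", "am", "very", "not", "comfortable"]` scores -1.5: the tendency weight 1.0 is scaled by 1.5 and its sign is flipped by the single negation.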

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing, and in particular to a text processing method and system based on meaning group division.

Background technology

[0002] At present, there are three major branches of technical methods for text processing:

[0003] The first is based on linguistic theory: according to linguists' understanding of language phenomena, it uses rules to describe or explain ambiguous behavior or ambiguous characteristics, and is called the rule school. The rule school's method is based on Chomsky's theory of language. It describes a language through a series of principles that the language must abide by, in order to judge whether a sentence is correct (follows the language principles) or wrong (violates the language principles). The rule school first studies a large number of language phenomena and summarizes a series of language rules. Then form a set of responsible rule sets - language analy...

Application Information

IPC(8): G06F17/27
Inventor: 杜登斌, 丁雨
Owner BEIJING RUNUP INFORMATION TECH