Text processing method and device

A text processing method and device, applied in the field of data processing, that address problems such as inaccurate determination of the meaning of ambiguous words and achieve the effect of disambiguation.

Status: Inactive
Publication Date: 2018-06-12
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

Maintaining the quality of a disambiguation dictionary requires a lot of manual effort.
If the quality of the dictionary is not good enough, the meaning of ambiguous words is determined inaccurately during disambiguation.


Embodiment Construction

[0028] Embodiments of the present invention provide a text processing method, device, and system for conveniently classifying text data.

[0029] Figure 1 is a flowchart of a text processing method provided by an embodiment of the present invention. Referring to Figure 1, the method of this embodiment includes:

[0030] Step 101: Obtain the text to be classified;

[0031] Step 102: Segment the text to be classified to obtain a word segmentation result;

[0032] Step 103: Construct the target feature vector according to the word segmentation result;

[0033] Step 104: Use a pre-established SVM classifier to analyze the target feature vector and obtain the target label, wherein the SVM classifier is established according to the correspondences between at least two classes of feature vectors and labels, each feature vector is constructed from text information, and correspondences of different classes are labeled differ...
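Steps 101 to 104 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation claimed by the patent: the whitespace tokenizer stands in for a real word segmenter, the bag-of-words counts are one possible feature vector, the linear SVM is trained with plain sub-gradient descent on the regularized hinge loss, and the corpus, the labels "company" and "fruit", and all function names are hypothetical.

```python
from collections import Counter

def segment(text):
    """Step 102: whitespace tokenizer standing in for a real word segmenter (assumption)."""
    return text.lower().split()

def build_vocab(token_lists):
    """Assign each distinct training word a column index."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(tokens, vocab):
    """Step 103: bag-of-words count vector; out-of-vocabulary words are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokens).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec

def decision(w, b, x):
    """Signed score of a linear classifier: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=20, lr=0.1, lam=0.01):
    """Sub-gradient descent on the L2-regularized hinge loss; y values are +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * decision(w, b, xi) < 1.0
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrink
            if violated:  # hinge loss is active: step toward the correct side
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

# Hypothetical training corpus: texts containing the ambiguous word "apple",
# each paired with the label of the intended sense.
TRAIN = [
    ("apple released new phone", "company"),
    ("ate sweet apple pie", "fruit"),
    ("apple stock price rose", "company"),
    ("fresh apple fruit juice", "fruit"),
]
SIGN = {"company": 1.0, "fruit": -1.0}

token_lists = [segment(text) for text, _ in TRAIN]
VOCAB = build_vocab(token_lists)
X = [vectorize(tokens, VOCAB) for tokens in token_lists]
y = [SIGN[label] for _, label in TRAIN]
W, B = train_linear_svm(X, y)

def classify(text):
    """Steps 101-104 for one input text: segment, vectorize, score, map to a label."""
    score = decision(W, B, vectorize(segment(text), VOCAB))
    return "company" if score >= 0 else "fruit"

print(classify("Apple stock rose today"))  # company
print(classify("fresh apple pie"))         # fruit
```

The label returned by `classify` plays the role of the target label in Step 104: it identifies which meaning of the ambiguous word the input text uses. A production system would more likely rely on an established SVM library and a language-appropriate segmenter than on this hand-written training loop.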


Abstract

An embodiment of the invention discloses a text processing method and device for conveniently classifying text data. The method comprises: obtaining a text to be classified; segmenting the text to be classified to obtain a word segmentation result; constructing a target feature vector according to the word segmentation result; and using a pre-established SVM classifier to analyze the target feature vector to obtain a target label. The SVM classifier is established according to the correspondences between at least two classes of feature vectors and labels, each feature vector is constructed from text information, and correspondences of different classes carry different labels. Because text information is used when the SVM classifier is established, the text to be classified can be classified, and the classification result is the target label. That is, the SVM classifier outputs the target label to identify the text to be classified, so the target label gives the text to be classified a unique identification.

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a text processing method and device.

Background

[0002] In the field of text analysis, a sentence often has multiple interpretations. For example, when a sentence includes an ambiguous word, that word has multiple meanings, but generally only one of them applies in the sentence. When a machine analyzes the sentence, it needs to determine the exact meaning of the ambiguous word in that sentence.

[0003] Existing methods generally use a disambiguation dictionary to solve this problem. That is, for a polysemous word, context information is constructed for the word, and a text is judged to use a given meaning according to which dictionary words appear in it.

[0004] Existing methods for determining the meaning of ambiguous words in sentences rely on the quality of the disambiguation lexicon. The quality of the dis...
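The dictionary-based approach described in the background can be sketched as follows, to show why it depends on lexicon quality. The lexicon layout, its entries, and the function name are assumptions made for illustration; the source does not specify how the disambiguation dictionary is structured.

```python
# Hypothetical disambiguation lexicon: ambiguous word -> meaning -> context words.
LEXICON = {
    "apple": {
        "company": {"phone", "stock", "released"},
        "fruit": {"pie", "juice", "sweet"},
    }
}

def disambiguate(word, text, lexicon=LEXICON):
    """Pick the meaning whose context words occur most often in the text.

    Returns None when no context word from the lexicon appears at all.
    """
    tokens = set(text.lower().split())
    best, best_hits = None, 0
    for meaning, context in lexicon.get(word, {}).items():
        hits = len(tokens & context)
        if hits > best_hits:
            best, best_hits = meaning, hits
    return best

print(disambiguate("apple", "Apple released a new phone"))  # company
print(disambiguate("apple", "sweet apple pie"))             # fruit
print(disambiguate("apple", "I like apple"))                # None
```

The last call returns `None` because none of the hand-maintained context words occurs in the text, which is the kind of gap that makes this approach sensitive to how well the lexicon is maintained.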

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30, G06K9/62
CPC: G06F16/35, G06F40/30, G06F40/289, G06F18/2411
Inventor: 郭秦龙
Owner: BEIJING GRIDSUM TECH CO LTD