Text handling method and system

A text processing method and system, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the low reliability of training data that cannot be accurately classified by hand and the resulting drop in classification accuracy, reducing the impact of noisy training data and improving accuracy and reliability.

Status: Inactive
Publication Date: 2007-08-22
HUAWEI TECH CO LTD
0 Cites, 40 Cited by

AI Technical Summary

Problems solved by technology

[0005] In the prior art described above, in the traditional text feature extraction method, each document in the training document set is strongly correlated with its corresponding category.
For SMS texts, a large number of messages are needed as the training data set to train the model. Because the volume of training text is so large, it is impossible to classify every piece of training data accurately by hand, so the training text set itself includes a large amount of noise data and has low reliability. Extracting SMS features from such a training set with the traditional feature extraction method causes the extracted feature set to contain more noise features, reducing the training features extracted from the tra...

Embodiment Construction

[0025] In this embodiment of the present invention, the feature vector space produced during text feature extraction is adaptively optimized: noise features are removed, and an optimal low-dimensional feature space is finally obtained.

[0026] Specifically, an embodiment of the present invention provides a text processing method for use in text feature extraction. The method includes:

[0027] Step A: during text training, classify the training texts using the trained model parameters and delete the wrongly classified training texts, so that the new training text set retains only the correctly classified training texts; then construct a new feature set from the correctly classified training texts. This step can be performed after the model parameters have been trained on the training text set and the feature set obtained from feature representation.

[0028] Step B: train the model parameters based on the new training text se...
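
The two steps can be illustrated with a minimal sketch. The source does not name a classifier or a feature selection rule, so the multinomial Naive Bayes model with add-one smoothing and the document-frequency threshold (`min_df`) used here are assumptions chosen only to keep the example self-contained. The function names `build_features`, `train`, `classify`, and `refine`, and the toy SMS data, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_features(docs, min_df=1):
    """Feature set: tokens found in at least min_df documents (assumed rule)."""
    df = Counter(tok for text, _ in docs for tok in set(text.split()))
    return {tok for tok, n in df.items() if n >= min_df}


def train(docs, features):
    """Fit a multinomial Naive Bayes model (stand-in for the unnamed model)."""
    counts = defaultdict(Counter)
    for text, label in docs:
        counts[label].update(t for t in text.split() if t in features)
    return {
        "prior": Counter(label for _, label in docs),
        "counts": counts,
        "features": features,
        "n": len(docs),
    }


def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    feats = model["features"]
    best, best_score = None, -math.inf
    for label in sorted(model["prior"]):
        counts = model["counts"][label]
        total = sum(counts.values()) + len(feats)  # add-one smoothing
        score = math.log(model["prior"][label] / model["n"])
        for tok in text.split():
            if tok in feats:
                score += math.log((counts[tok] + 1) / total)
        if score > best_score:
            best, best_score = label, score
    return best


def refine(docs, min_df=1):
    """Step A: drop misclassified training texts and rebuild the feature set.
    Step B: retrain on the new training set and the new feature set."""
    features = build_features(docs, min_df)
    model = train(docs, features)
    kept = [(t, y) for t, y in docs if classify(model, t) == y]
    if not kept:  # nothing survived; fall back to the initial model
        return model, docs, features
    new_features = build_features(kept, min_df)
    return train(kept, new_features), kept, new_features


# Toy training set; the last message is deliberately mislabeled (noise).
docs = [
    ("football match goal", "sports"),
    ("football team win", "sports"),
    ("goal score match", "sports"),
    ("stock market price", "finance"),
    ("bank stock loan", "finance"),
    ("market price loan", "finance"),
    ("football goal team lottery", "finance"),
]

new_model, kept, new_features = refine(docs)
print(len(docs), "training texts before,", len(kept), "after")
print("features removed:", sorted(build_features(docs) - new_features))
```

In this toy run the mislabeled message is classified against its own label, so it is deleted in Step A, and the token that occurred only in that message drops out of the rebuilt feature set before retraining in Step B. A real system would substitute its own classifier, tokenizer, and feature selection criterion.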

Abstract

The invention discloses a text processing method for use in text feature extraction. The method classifies the training texts based on the trained model parameters and deletes the wrongly classified training texts, so that the new training text set retains only the correctly classified training texts, and constructs a new feature set from the correctly classified training texts. It then trains the model parameters based on the new training text set and the new feature set. The invention also provides a text processing system.

Description

Technical field

[0001] The invention relates to the technical field of intelligent text information processing, in particular to a text processing method and a text processing system.

Background technique

[0002] Mobile phone text messages have great potential and prospects as a way of advertising, but judging from the current situation, spam text messages in mobile phone text messages have caused serious nuisance problems. In order to solve this problem, advertisement publishers need to adopt effective methods to obtain relevant information of advertisement audiences, so as to deliver targeted and responsive SMS advertisements.

[0003] In order to obtain the relevant information of the advertising audience, it is necessary to mine the user's interest points from a large number of user text messages. How to quickly and effectively obtain users' points of interest from a large number of user text messages is a current problem, and text mining of text messages is just a met...

Application Information

IPC(8): G06F17/21, G06F17/30
Inventor: 尚明生, 林劼, 傅彦, 邵刚
Owner: HUAWEI TECH CO LTD