A method for intelligent text classification

An intelligent text classification technology, applied in special data processing applications, instruments, and electrical digital data processing, that addresses slow and inaccurate classification, speeds up extraction, and has a wide range of applications.

Active Publication Date: 2011-12-21
BEIJING JINHER SOFTWARE
3 Cites, 31 Cited by

AI Technical Summary

Problems solved by technology

[0003] The present invention addresses problems such as inaccurate classification and slow classification speed when classifying the text of products on the Internet, and provides a method ...




Embodiment Construction

[0023] The present invention will be further described below in conjunction with the accompanying drawings, so that those of ordinary skill in the art can implement it after referring to this specification.

[0024] As shown in Figure 1, the intelligent text classification method of the present invention comprises the following steps:

[0025] Step 1. Prepare a certain amount of training text and divide the training texts into multiple categories. Create a text information linked list (LIST) named m_TextInfoLIst in system memory, and save all text strings in memory in TXT format, where the TEXTINFO data type is:

[0026]

[0027]

[0028] Step 2: Create a word segmentation list (LIST) in the system, traverse the text information linked list, segment each text with the Chinese word segmentation algorithm, calculate the weight value W of each word from the word frequency, word length, and part of speech of the segmented word, and save i...
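Steps 1 and 2 can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the fields of the `TextInfo` record, the whitespace-based `segment` placeholder (standing in for the Chinese word segmentation algorithm), the `POS_FACTOR` table, and the weight formula itself. The text only states that W depends on word frequency, word length, and part of speech; it does not give the formula.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TextInfo:
    """Hypothetical stand-in for the TEXTINFO record; the fields are assumed."""
    category: str
    content: str


# Assumed part-of-speech factors; the source gives no values.
POS_FACTOR: Dict[str, float] = {"noun": 1.0, "verb": 0.8, "other": 0.5}


def segment(text: str) -> List[str]:
    """Placeholder segmenter that splits on whitespace.

    A real system would call a Chinese word segmentation algorithm here.
    """
    return text.split()


def word_weights(text: str, pos_of: Dict[str, str]) -> Dict[str, float]:
    """Compute an illustrative weight W for every distinct word in `text`.

    Assumed formula: W = relative frequency * word length * POS factor.
    """
    words = segment(text)
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    weights: Dict[str, float] = {}
    for word, count in counts.items():
        frequency = count / total
        factor = POS_FACTOR[pos_of.get(word, "other")]
        weights[word] = frequency * len(word) * factor
    return weights


# Usage: traverse a small text information list and weight each text.
text_info_list = [TextInfo("electronics", "phone price phone buy")]
pos_lookup = {"phone": "noun", "price": "noun", "buy": "verb"}
for info in text_info_list:
    print(info.category, word_weights(info.content, pos_lookup))
```

Multiplying the three factors is only one possible way to combine them; a weighted sum or a TF-IDF style term would fit the same description.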


Abstract

The invention discloses a method for intelligently classifying texts. In the training stage, a large set of training texts is prepared; each text is segmented into words; feature items are extracted and stored; a weight is computed for each feature item; and each text is converted into a text vector and stored in a classifier, so that a feature item set and a classifier set are finally formed. In the classification stage, the text to be classified is segmented, and the feature items in the feature item set are matched against it by default. The feature item weights are then computed, so that feature items that do not match the text are filtered out and those that match are kept as the features of the text. These feature items are converted into a text vector, which is compared with the vectors in the classifier by a similarity algorithm, and the category of the text is determined according to the similar text vector. With this method, texts can be classified and extracted more accurately and faster than with the prior art.
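The classification stage described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name the similarity algorithm, so cosine similarity is swapped in here, and choosing the single most similar stored vector is likewise an assumption. The names `to_vector`, `cosine_similarity`, and `classify`, and the sample data, are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple


def to_vector(weights: Dict[str, float], feature_items: List[str]) -> List[float]:
    """Project word weights onto the feature item set.

    Words that are not feature items are dropped (filtered out);
    feature items absent from the text get weight 0.
    """
    return [weights.get(item, 0.0) for item in feature_items]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity, used here as an assumed similarity algorithm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(
    weights: Dict[str, float],
    feature_items: List[str],
    classifier: List[Tuple[str, List[float]]],
) -> Tuple[Optional[str], float]:
    """Return the category of the most similar stored vector and its score."""
    vector = to_vector(weights, feature_items)
    best_category: Optional[str] = None
    best_score = -1.0
    for category, stored_vector in classifier:
        score = cosine_similarity(vector, stored_vector)
        if score > best_score:
            best_category, best_score = category, score
    return best_category, best_score


# Usage with made-up data: two stored training vectors, one new text.
feature_items = ["phone", "price", "goal", "match"]
classifier = [
    ("electronics", [0.6, 0.4, 0.0, 0.0]),
    ("sports", [0.0, 0.0, 0.5, 0.5]),
]
new_text_weights = {"phone": 0.7, "screen": 0.3}  # "screen" is not a feature item
print(classify(new_text_weights, feature_items, classifier))
```

A nearest-neighbour vote over several similar vectors would also match the abstract's wording; the single best match is used only to keep the sketch short.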

Description

Technical field

[0001] The invention relates to classification technology for data mining in the field of artificial intelligence, and in particular to classification technology applied to classifying texts in Internet products.

Background

[0002] The rapid development of the Internet has led to exponential growth of text data on the network, so how to process this text information efficiently has become an important research topic. As an important part of text information processing, automatic classification technology has attracted widespread attention. As China's Internet penetration rate rises and the number of Internet users grows, websites contain a large amount of Chinese-language information, most of it in text form, so classifying texts accurately is of great practical significance. Classification technology is also used in many places in Internet products, su...


Application Information

IPC(8): G06F17/30, G06F17/27
Inventor: 吕福军, 李军锋, 李跃海
Owner: BEIJING JINHER SOFTWARE