A multi-level classification system and method based on news text information

A multi-level classification technology for text information, applied in text database clustering/classification, unstructured text data retrieval, and special data processing applications. It addresses the unbalanced distribution of sample data, which reduces the accuracy of news text information classification methods and prevents minority-class samples from being identified accurately, and it thereby improves both classification accuracy and classification efficiency.

Active Publication Date: 2020-07-21
北京时间有限公司

AI Technical Summary

Problems solved by technology

[0003] However, in the process of implementing the present invention, the inventors found at least the following problem in the prior art: in many specific application scenarios, the distribution of sample data may be unbalanced for various reasons.
When the data are unbalanced, the non-hierarchical news text information classification methods that the prior art implements with machine learning algorithms pay too much attention to majority-class samples. As a result, minority-class samples cannot be identified accurately, which reduces the overall accuracy of these classification methods.
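
The effect described above can be illustrated with a small calculation. The following is a minimal sketch, not taken from the patent: the category names and the 95/5 split are hypothetical, and the "classifier" simply predicts the majority class for every sample.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of samples of class `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


# Hypothetical unbalanced sample: 95 majority-class items, 5 minority-class items.
y_true = ["sports"] * 95 + ["science"] * 5
# A degenerate classifier that always outputs the majority class.
y_pred = ["sports"] * 100

overall = accuracy(y_true, y_pred)                    # high despite the failure below
minority_recall = recall(y_true, y_pred, "science")   # no minority sample is found
print(overall, minority_recall)
```

Overall accuracy looks good while the minority class is never recognized, which is why a per-class measure is useful when the data are unbalanced.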

Examples

Embodiment 1

[0018] Figure 1 shows a multi-level classification system based on news text information provided by the present invention. The system includes a training module 110, a multi-level classification module 120, and a result determination module 130.

[0019] The training module 110 is used to train on the preset training sample set with multiple machine learning algorithms for each level of news text information classification, and to determine, according to the training results, the number and type of classifiers corresponding to each level of classification.

[0020] When news text information is classified, different items can be assigned to different categories according to their content. To make the classification accurate and fine-grained, a multi-level classification system can be adopted. The multi-level classification system may increase in order according to the abstraction degree of the c...

Embodiment 2

[0030] Figure 2 shows a multi-level classification system based on news text information provided by the present invention. The system includes a training module 210, an evaluation module 220, a multi-level classification module 230, a model update module 240, and a result determination module 250.

[0031] The training module 210 is used to train on the preset training sample set with multiple machine learning algorithms for each level of news text information classification, and to determine, according to the training results, the number and type of classifiers corresponding to each level of classification.

[0032] Specifically, the training module 210 generates a training sample set from the obtained label data, extracts the training feature words contained in the training sample set, and assigns corresponding weights to the extracted training feature words. The training feature words and their weights generate corresponding training feature vectors, an...
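
The step of turning weighted feature words into feature vectors can be sketched as follows. The source text does not say which weighting scheme is used; this sketch assumes TF-IDF purely for illustration, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(docs):
    """Collect the distinct feature words of all documents in a fixed order."""
    return sorted({word for doc in docs for word in doc})


def tfidf_vectors(docs):
    """Return (vocabulary, vectors); each vector holds one weight per feature word.

    Assumption: weight = term frequency * (log(N / document frequency) + 1).
    Each document must be a non-empty list of already segmented words.
    """
    vocab = build_vocabulary(docs)
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    idf = {word: math.log(n_docs / doc_freq[word]) + 1.0 for word in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[word] / len(doc) * idf[word] for word in vocab])
    return vocab, vectors


# Hypothetical, already tokenized news snippets.
docs = [["goal", "match", "goal"], ["match", "election"]]
vocab, vectors = tfidf_vectors(docs)
print(vocab)
print(vectors[0])
```

Words that are frequent in one document but rare across the set ("goal" here) receive a larger weight than words shared by every document ("match").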

Embodiment 3

[0050] Figure 3 shows a multi-level classification method based on news text information provided by the present invention. The method includes:

[0051] Step S310: For each level of news text information classification, train on the preset training sample set with multiple machine learning algorithms, and determine the number and type of classifiers corresponding to each level of classification according to the training results.

[0052] Specifically, in the solution provided in this embodiment, a corresponding training sample set must be preset for each node of each level, and the data in each training sample set should contain all, or at least most, of the features of the corresponding node's category data. The training sample set of each node is then trained with a variety of classification algorithms, and the optimal classification algorithm is selected for each node, so as to determine the number and type of classifiers corresponding to each level ...
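
The per-node selection described in [0052] could look roughly like the sketch below. It is an illustration under stated assumptions, not the patent's implementation: the two candidate algorithms are toy stand-ins, the selection metric (macro-averaged recall on a held-out set) is an assumption chosen because it is sensitive to minority classes, and all names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


class MajorityClassifier:
    """Toy candidate: always predicts the most frequent training label."""
    name = "majority"

    def fit(self, docs, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, doc):
        return self.label


class KeywordVoteClassifier:
    """Toy candidate: scores each label by how often it saw the document's words."""
    name = "keyword_vote"

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.fallback = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, doc):
        scores = {label: sum(counts[w] for w in doc)
                  for label, counts in self.word_counts.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else self.fallback


def macro_recall(model, docs, labels):
    """Average of per-class recall, so every class counts equally."""
    recalls = []
    for label in set(labels):
        idx = [i for i, l in enumerate(labels) if l == label]
        hits = sum(model.predict(docs[i]) == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def select_classifier(candidates, train, valid):
    """Fit every candidate on `train` and keep the best scorer on `valid`."""
    best_model, best_score = None, -1.0
    for factory in candidates:
        model = factory().fit(*train)
        score = macro_recall(model, *valid)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score


CANDIDATES = [MajorityClassifier, KeywordVoteClassifier]

# Hypothetical node "sports" with an unbalanced training set (3 football, 1 tennis).
train = ([["goal", "striker"], ["goal", "penalty"], ["goal", "league"], ["serve", "racket"]],
         ["football", "football", "football", "tennis"])
valid = ([["penalty", "goal"], ["racket", "ace"]], ["football", "tennis"])
nodes = {"sports": (train, valid)}

chosen = {node: select_classifier(CANDIDATES, tr, va) for node, (tr, va) in nodes.items()}
print({node: (model.name, score) for node, (model, score) in chosen.items()})
```

Running the same selection for every node yields one chosen classifier per node, which in turn fixes how many classifiers, and of which type, each level of the hierarchy uses.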

Abstract

The invention discloses a multi-level classification system and method based on news text information, and relates to the technical field of file classification. The system includes a training module, a multi-level classification module, and a result determination module. For each level of news text information classification, the training module trains on a preset training sample set with multiple machine learning algorithms and determines, according to the training results, the number and type of classifiers corresponding to each level of classification. The multi-level classification module configures the corresponding multi-level classification model according to the number and type of classifiers that the training module determined for each level. The result determination module inputs the news text information to be classified into the multi-level classification model and takes the model's output as the final classification result. In this way, the system and method address, in a targeted manner, the inaccurate classification results caused by unbalanced sample data, and improve both classification accuracy and classification efficiency.
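
As a rough illustration of how a configured multi-level model might route one news item from the top level down to a final category, consider the sketch below. The category tree, the rule-based stand-in classifiers, and the function name `classify_multilevel` are all hypothetical; in the described system each node would hold the classifier selected during training.

```python
def classify_multilevel(doc, node_classifiers, root="root"):
    """Walk the category tree from `root`, asking each node's classifier
    which child the document belongs to, until a leaf category is reached.

    Assumption: `node_classifiers` describes a tree (no cycles); a category
    without an entry in the mapping is treated as a leaf.
    """
    path = []
    node = root
    while node in node_classifiers:
        node = node_classifiers[node](doc)
        path.append(node)
    return path


# Hypothetical two-level hierarchy with rule-based stand-ins for trained classifiers.
node_classifiers = {
    "root": lambda doc: "sports" if "match" in doc else "politics",
    "sports": lambda doc: "football" if "goal" in doc else "tennis",
}

sports_path = classify_multilevel(["match", "goal", "striker"], node_classifiers)
politics_path = classify_multilevel(["election", "vote"], node_classifiers)
print(sports_path, politics_path)
```

The last element of the returned path is the final classification result, and the preceding elements record the higher-level categories the item passed through.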

Description

Technical field

[0001] The invention relates to the technical field of file classification, in particular to a multi-level classification system and method based on news text information.

Background

[0002] With the development of the Internet age, network resources are becoming more and more abundant, and there are more and more types. In order to effectively retrieve and utilize various resources on the network, it is particularly important to classify the above-mentioned network resources accurately and comprehensively. With the emergence and development of machine learning algorithms, more and more people have applied machine learning algorithms to news text information classification methods.

[0003] However, during the process of implementing the present invention, the inventors found at least the following problems in the prior art: in many specific application scenarios, due to various reasons, the distribution of sample data may be unbalanced. When enc...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/35
CPC: G06F16/35
Inventor: 赵毅强
Owner: 北京时间有限公司