An automatic news text classification system based on the fastText algorithm

An automatic text classification technology, applied in text database clustering/classification, text database querying, unstructured text data retrieval, and similar fields. It addresses problems such as weak feature expressiveness, manual feature engineering, and high dimensionality, and achieves the effect of fast browsing.

Pending Publication Date: 2019-05-17
DONGHUA UNIV
4 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

The main problem with traditional text classification methods is that the text representation is too sparse and high-dimensional, and its feature expressiveness is weak. In addition, feature engineering must be done manually, which is very costly.
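To make the sparsity point concrete, here is a minimal sketch of a bag-of-words representation. The three toy documents are hypothetical and are not taken from the patent; the sketch only shows how each document fills a small fraction of a vector whose length grows with the vocabulary.

```python
from collections import Counter

# Hypothetical toy corpus, already tokenized.
docs = [
    ["stock", "market", "rises"],
    ["team", "wins", "match"],
    ["market", "falls"],
]

# One dimension per distinct word in the corpus.
vocab = sorted({word for doc in docs for word in doc})
index = {word: i for i, word in enumerate(vocab)}

def bag_of_words(tokens):
    """Return a count vector with one entry per vocabulary word."""
    vec = [0] * len(vocab)
    for word, count in Counter(tokens).items():
        vec[index[word]] = count
    return vec

vectors = [bag_of_words(doc) for doc in docs]

# Fraction of entries that are zero across all document vectors.
nonzero = sum(1 for vec in vectors for x in vec if x)
total = len(vectors) * len(vocab)
sparsity = 1 - nonzero / total
```

With a realistic news vocabulary of tens of thousands of words, nearly every entry of such a vector would be zero, which is the sparsity and dimensionality problem described above.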

Examples

Embodiment Construction

[0015] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments serve only to illustrate the present invention and are not intended to limit its scope. It should further be understood that, after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalent forms likewise fall within the scope defined by the claims appended to the present application.

[0016] An embodiment of the present invention relates to an automatic news text classification system based on the fastText algorithm. As shown in Figure 1, it includes: a news text preprocessing module, which filters and cleans news texts collected by crawlers; and a Chinese word segmentation and stop-word removal module, which segments the text data into words and uses a stop-word table to remove words that are meaningless for te...
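The first two modules can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names, the tiny stop-word table, and the minimum-length filter are all hypothetical, and the tokenizer is a simple regular-expression stand-in, whereas Chinese text would need a dedicated word segmenter.

```python
import re

# Hypothetical stop-word table; a real system would load a much larger list.
STOP_WORDS = {"the", "a", "of", "in", "on", "and", "is"}

def clean(raw):
    """Strip HTML tags and collapse whitespace in crawled text."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()

def is_usable(text, min_chars=10):
    """Filter step (assumed rule): drop texts too short to classify."""
    return len(text) >= min_chars

def segment(text):
    """Stand-in tokenizer: lowercase and split on non-word characters.
    Chinese news text would require a real word segmenter here."""
    return [t for t in re.split(r"\W+", text.lower()) if t]

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word table."""
    return [t for t in tokens if t not in STOP_WORDS]

def preprocess(raw):
    """Clean, filter, segment, and remove stop words; None if filtered out."""
    text = clean(raw)
    if not is_usable(text):
        return None
    return remove_stop_words(segment(text))

raw = "<p>The  central bank raised rates in   March</p>"
tokens = preprocess(raw)
```

Returning `None` for filtered-out texts keeps the filtering decision visible to the caller; a production pipeline might instead log and skip such records.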

Abstract

The invention relates to an automatic news text classification system based on the fastText algorithm. The system comprises: a news text preprocessing module, which filters and cleans news texts collected by crawlers; a Chinese word segmentation and stop-word removal module, which segments the text data into words and uses a stop-word table to remove words that are meaningless for text classification; a digital feature extraction module, which converts text features into numerical features; and a fastText classifier module, which builds a multi-class classification model with the fastText algorithm and predicts the category of each piece of news text. With this system, news texts can be classified automatically.
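As a rough illustration of the digital feature extraction step, the sketch below maps tokens to integer word ids plus hashed bigram ids and formats labelled examples as fastText supervised-training lines (a `__label__` prefix followed by the tokens). The abstract does not say how features are encoded, so the function names, the bucket count, the toy corpus, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib

def build_vocab(token_lists):
    """Assign each distinct word an integer id, in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def to_features(tokens, vocab, buckets=1000):
    """Map tokens to word ids plus hashed bigram ids.

    Bigram ids are offset by len(vocab) so they never collide with word ids.
    CRC32 is a deterministic stand-in hash, not fastText's own function.
    """
    word_ids = [vocab[t] for t in tokens if t in vocab]
    bigram_ids = [
        len(vocab) + zlib.crc32(f"{a} {b}".encode("utf-8")) % buckets
        for a, b in zip(tokens, tokens[1:])
    ]
    return word_ids + bigram_ids

def to_training_line(label, tokens):
    """Format one example in fastText's supervised input format."""
    return f"__label__{label} " + " ".join(tokens)

# Hypothetical labelled examples, already segmented and stop-word filtered.
corpus = [
    ("finance", ["central", "bank", "raised", "rates"]),
    ("sports", ["team", "wins", "final", "match"]),
]
vocab = build_vocab(tokens for _, tokens in corpus)
features = [to_features(tokens, vocab) for _, tokens in corpus]
lines = [to_training_line(label, tokens) for label, tokens in corpus]
```

Lines in this format, written to a file, are what the `fasttext` Python package accepts in `fasttext.train_supervised(input=...)`, after which `model.predict(...)` returns the predicted label for a new text. The hyperparameters the patent uses are not given in this excerpt.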

Description

Technical field

[0001] The invention relates to the technical field of automatic news text classification, and in particular to an automatic news text classification system based on the fastText algorithm.

Background technique

[0002] With the rapid development of network information technology and the gradual shift from traditional paper media to digital media, more and more information accumulates on the network; paperless news in particular makes people more inclined to search for information online. Most of this information exists in text form. Text classification can effectively address this problem, but traditional text classification relies mainly on manual classification, which has many disadvantages: first, it consumes a great deal of manpower and material resources; second, it is prone to inconsistencies with requirements. The inefficient manual approach faces growing difficulties, and it is even harder to know where to begin in the face of big data. In order to improve the...

Application Information

IPC(8): G06F16/33, G06F16/35
Inventor 程徐, 韩芳, 孔维健
Owner DONGHUA UNIV