Text classification method, electronic device and computer storage medium

A text classification technology, applied in text database clustering/classification, computing, and unstructured text data retrieval, that addresses problems such as poor performance, high manual feature engineering cost, and low levels of automation and intelligence.

Active Publication Date: 2021-09-17
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0006] These methods are mainly used in scenarios that lack annotation. They require substantial manual feature engineering, rely on external knowledge bases, expert knowledge, synonym resources, etc., and offer a relatively low level of automation and intelligence. They perform poorly on multiple evaluation indicators such as ...

Examples

Embodiment Construction

[0040] In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail below with reference to the accompanying drawings. The illustrative embodiments of the present invention and their descriptions are intended to explain the invention, not to limit it.

[0041] It should be noted in advance that features described for one embodiment or example below may be used in the same or a similar way in one or more other embodiments or examples, and may be combined with or replace features of other embodiments or examples to form possible embodiments. In addition, the term "comprising" as used herein refers to the presence of features, elements, steps, or components, but does not exclude the presence of one or more other features, elements, steps, or components.

[0042] Traditional algorithms require a large number of high-quality labels as a learning corpus, while high-pe...

Abstract

The present invention provides a text classification method, an electronic device, and a computer storage medium. The method includes: acquiring multiple topic categories, each with multiple corresponding words and multiple documents; for each word of each topic category, counting a first document number (the number of documents within that topic category that contain the word) and a second document number (the number of documents across all topic categories that contain the word); calculating the ratio of the first document number to the second document number as the word-to-topic coverage of that word; if the word-to-topic coverage of a word is greater than a set threshold, selecting the word as a feature word of the corresponding topic category, thereby obtaining a feature word bag for each category; performing word segmentation on the document to be classified to obtain its bag-of-words model; calculating the similarity between the bag-of-words model of the document to be classified and the feature word bag of each category; and determining the category of the document to be classified according to these similarities. With this scheme, the text classification task can be completed with fewer annotations.
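The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the toy corpus, the whitespace-based `segment` helper, the default threshold of 0.5, and the choice of Jaccard similarity are all assumptions made for illustration, since the abstract does not specify a segmenter, a threshold value, or a similarity measure.

```python
def segment(text):
    # Assumed placeholder for word segmentation: lowercase and split on
    # whitespace. A real system would use a proper segmenter.
    return text.lower().split()


def word_topic_coverage(word, topic, doc_sets):
    # First document number: documents of this topic that contain the word.
    first = sum(word in d for d in doc_sets[topic])
    # Second document number: documents of all topics that contain the word.
    second = sum(word in d for docs in doc_sets.values() for d in docs)
    return first / second if second else 0.0


def build_feature_bags(docs_by_topic, words_by_topic, threshold=0.5):
    # Represent every document as a set of tokens so that each document
    # counts at most once per word.
    doc_sets = {
        topic: [set(segment(d)) for d in docs]
        for topic, docs in docs_by_topic.items()
    }
    # Keep a word as a feature word only if its coverage exceeds the threshold.
    return {
        topic: {w for w in words
                if word_topic_coverage(w, topic, doc_sets) > threshold}
        for topic, words in words_by_topic.items()
    }


def jaccard(a, b):
    # Assumed similarity measure between two word sets.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(text, feature_bags):
    # Bag-of-words model of the document to be classified (as a word set).
    doc_bag = set(segment(text))
    scores = {t: jaccard(doc_bag, bag) for t, bag in feature_bags.items()}
    # Pick the topic with the highest similarity (first topic wins ties).
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus: two topics, two documents each.
docs_by_topic = {
    "sports": ["team match goal", "match coach score"],
    "finance": ["stock market score", "bank market loan"],
}
words_by_topic = {
    "sports": ["match", "score", "goal"],
    "finance": ["market", "score", "bank"],
}

bags = build_feature_bags(docs_by_topic, words_by_topic)
label, scores = classify("the match ended with a late goal", bags)
print(bags)
print(label)
```

In this toy corpus the word "score" appears once in each topic, so its coverage is exactly 0.5 for both and it is dropped from both feature word bags; words that are concentrated in a single topic survive the threshold.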

Description

Technical field

[0001] The present invention relates to the field of natural language processing, and more particularly to a text classification method, an electronic device, and a computer storage medium.

Background

[0002] Text classification is a classic natural language processing problem with substantial practical value; it is a supporting technology for tasks such as spam filtering, automatic news archiving, and text data mining. The classic text classification problem is the process of assigning reasonable category labels to new text after learning from high-quality annotations. Traditional text classification methods focus on statistical and shallow semantic features of text, such as TF-IDF, N-gram models, and word embeddings, build text feature vectors from them, and combine these with classification algorithms to perform text classification. The classification algorithms include naive Bayes, logistic regression, suppo...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F40/289, G06F40/35
CPC: G06F16/35, G06F40/289, G06F40/35
Inventor: 杜军平, 喻博文, 邵蓥侠, 徐欣, 李昂
Owner: BEIJING UNIV OF POSTS & TELECOMM