Text classification method, text classification device and computer readable storage medium

A text classification and computer program technology, applied in the fields of text database clustering/classification, computer components, and computing, that addresses problems such as the low efficiency of manual feature selection and the resulting need to improve classification efficiency.

Pending Publication Date: 2019-09-06
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

Due to the low efficiency of manually selecting features, the efficiency of traditional text classification methods based on machine learning needs to be improved.

Examples

Embodiment 1

[0069] This embodiment of the present application provides a text classification method. Referring to Figure 1-a, the text classification method in the embodiment of the present application includes:

[0070] Step 101: perform text representation on the text to be classified, so as to obtain the sentence set of the text to be classified and the word set of each sentence;

[0071] Here, the sentence set is composed of the word set of each sentence in the text to be classified, and each word set is composed of the word vectors of the words contained in the corresponding sentence.

[0072] In the embodiment of the present application, text representation of the text to be classified may include two processes: preprocessing and vector representation. Preprocessing refers to processing the text to be classified into a structured representation, so as to obtain the sentence set of the text to be classified and the word set of each sentence, and the abo...
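
The two processes named in [0072] can be illustrated with a minimal sketch. It assumes English text, punctuation-based sentence splitting, and a toy hash-derived embedding; `split_sentences`, `tokenize`, `word_vector`, `represent_text`, and `EMBEDDING_DIM` are hypothetical names introduced here, not taken from the application, and a real system would use a trained word-vector table instead of a hash.

```python
import hashlib
import re

EMBEDDING_DIM = 4  # hypothetical size; trained word vectors are far larger


def split_sentences(text):
    """Preprocessing: break raw text into sentences on '.', '!' or '?'."""
    parts = re.split(r"[.!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Preprocessing: lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def word_vector(word, dim=EMBEDDING_DIM):
    """Vector representation: a deterministic toy embedding built from a hash.

    Assumption: this only stands in for a trained embedding table and
    carries no semantic information.
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i] / 255.0 - 0.5 for i in range(dim)]


def represent_text(text):
    """Return the sentence set: one word set (list of word vectors) per sentence."""
    sentence_set = []
    for sentence in split_sentences(text):
        word_set = [word_vector(w) for w in tokenize(sentence)]
        if word_set:  # skip sentences that contain no word tokens
            sentence_set.append(word_set)
    return sentence_set


sentence_set = represent_text("The cat sat. Dogs bark loudly!")
```

The nested list mirrors the structure described in [0071]: the outer list is the sentence set, each inner list is the word set of one sentence, and each innermost list is one word vector.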

Embodiment 2

[0152] An embodiment of the present application provides a text classification device. As shown in Figure 2-a, the text classification device in the embodiment of the present application includes:

[0153] The text representation unit 201 is configured to perform text representation on the text to be classified, so as to obtain a sentence set of the text to be classified, wherein the sentence set is composed of the word set of each sentence in the text to be classified, and each word set is composed of the word vectors of the words contained in the corresponding sentence;

[0154] The feature extraction unit 202 is configured to obtain the feature vector of the sentence set based on the neural network, the attention mechanism and the word set;

[0155] The classification unit 203 is configured to input the feature vector of the sentence set into a text classification model to obtain a classification result of the text to be classified, wherein the text classification mo...
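
As a rough illustration of what units 202 and 203 compute, the sketch below pools word vectors into sentence vectors and sentence vectors into one feature vector using simple dot-product attention, then applies a linear softmax classifier. This is an assumption-laden simplification: it omits the neural network encoder that the text mentions, the excerpt does not specify the attention form, and the context vectors, classifier weights, and labels are hypothetical hand-set values rather than trained parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, context):
    """Weight each vector by softmax(vector . context); return the weighted sum."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def extract_features(sentence_set, word_context, sentence_context):
    """Pool words into sentence vectors, then sentences into one feature vector."""
    sentence_vectors = [attention_pool(ws, word_context) for ws in sentence_set]
    return attention_pool(sentence_vectors, sentence_context)


def classify(feature, weights, biases, labels):
    """Linear layer plus softmax; returns (most probable label, probabilities)."""
    logits = [dot(row, feature) + b for row, b in zip(weights, biases)]
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Toy input: two sentences with 2-dimensional word vectors (hypothetical values).
sentence_set = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
feature = extract_features(sentence_set, [1.0, 0.0], [0.0, 1.0])
label, probs = classify(
    feature,
    [[1.0, 0.0], [0.0, 1.0]],  # hypothetical classifier weights, one row per class
    [0.0, 0.0],                # hypothetical biases
    ["sports", "finance"],     # hypothetical category labels
)
```

Applying the same pooling step at the word level and again at the sentence level keeps the feature vector a fixed size regardless of how many sentences or words the text contains, which is what lets a single classification model accept texts of any length.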

Embodiment 3

[0176] This embodiment of the present application provides a text classification device. Referring to Figure 3, the text classification device in the embodiment of the present application includes: a memory 301, one or more processors 302 (only one is shown in Figure 3), and a computer program stored in the memory 301 and executable on the processor 302. The memory 301 is used to store software programs and modules, and the processor 302 executes various functional applications and data processing by running the software programs and modules stored in the memory 301. Specifically, the processor 302 implements the following steps by running the above-mentioned computer program stored in the memory 301:

[0177] Performing text representation on the text to be classified to obtain a sentence set of the text to be classified, wherein the sentence set is composed of the word set of each sentence in the text to be classified, and the word set is composed of the words...

Abstract

The invention provides a text classification method, a text classification device and a computer-readable storage medium. The text classification method comprises the following steps: text representation is performed on a to-be-classified text to obtain a sentence set of the to-be-classified text, where the sentence set is composed of the word sets of all sentences in the to-be-classified text, and the word sets are composed of the word vectors of the words contained in the corresponding sentences; a feature vector of the sentence set is obtained based on a neural network, an attention mechanism and the word sets; and the feature vector of the sentence set is input into a text classification model to obtain a classification result of the to-be-classified text. According to this technical scheme, text classification efficiency can be improved.

Description

Technical field

[0001] The present application belongs to the technical field of text classification, and in particular relates to a text classification method, a text classification device and a computer-readable storage medium.

Background

[0002] With the rapid development of the information age, a large amount of text information has accumulated on the Internet. To manage and use this massive, distributed information effectively, content-based information retrieval and data mining have gradually become areas of active interest.

[0003] Text classification technology is an important basis for information retrieval and text mining. Its main task is to assign a text to categories from a predefined set of category labels according to the content of the text. Text classification technology has a wide range of applications in natural language processing and understanding, information organization and management, content information filtering and other fields. ...

Application Information

IPC(8): G06F16/35, G06K9/62, G06F17/27
CPC: G06F40/205, G06F16/35, G06F18/211, G06F18/2411, Y02D10/00
Inventor: 王煦祥
Owner: TENCENT TECH (SHENZHEN) CO LTD