Text classification method, apparatus, computer device, and storage medium

A text classification technique applied in the field of artificial intelligence. It addresses the problems that existing models ignore the sequence relationship of word vectors, fail to capture semantic expression features, and therefore lose classification accuracy. The effect is improved classification accuracy and enhanced feature expressiveness.

Inactive Publication Date: 2019-02-01
深圳市牛鼎丰科技有限公司

AI Technical Summary

Problems solved by technology

[0003] At present, the neural network + Word-Embedding models commonly used in deep learning have the following main problems in text classification. The fully connected MLP neural network completely ignores the order of the word vectors when extracting text features and treats the entire text as merely a collection of words. As a result, many features of semantic expression cannot be captured, which reduces classification accuracy.
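The order problem described above can be illustrated with a minimal sketch. It assumes the MLP input is built by averaging the word vectors, which the source does not specify; the `EMBEDDINGS` table, its values, and the `mean_pool` helper are invented for illustration only.

```python
# Toy 2-dimensional embeddings; the values are invented for illustration.
EMBEDDINGS = {
    "dog": [1.0, 0.0],
    "bites": [0.0, 1.0],
    "man": [0.5, 0.5],
}


def mean_pool(tokens):
    """Average the word vectors of a text, discarding word order."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    for tok in tokens:
        for i, value in enumerate(EMBEDDINGS[tok]):
            total[i] += value
    return [t / len(tokens) for t in total]


# Two texts with opposite meanings but the same set of words.
a = mean_pool(["dog", "bites", "man"])
b = mean_pool(["man", "bites", "dog"])

# Both texts collapse to the same input vector, so a model fed this
# representation cannot tell them apart.
print(a == b)
```

Under this assumption, any classifier placed on top of the pooled vector sees identical inputs for the two sentences, which is the loss of sequence information the paragraph refers to.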

Examples

Embodiment Construction

[0030] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some of the embodiments of the present invention, but not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0031] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "comprises" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

[0032] It should also be understood that the terminology used ...

Abstract

The embodiment of the invention discloses a text classification method, an apparatus, a computer device and a storage medium. The method comprises the following steps: performing word segmentation and part-of-speech tagging on the input text to obtain a word segmentation list, wherein the word segmentation list comprises each word obtained by segmenting the input text and the part of speech of that word; obtaining a word vector of each word and a word vector of the part of speech of each word in the word segmentation list; obtaining a word vector matrix composed of the spliced word vector of each word in the word segmentation list, wherein the spliced word vector is obtained by splicing (concatenating) the word vector of a word with the word vector of the part of speech of that word; inputting the word vector matrix into a Bi-LSTM to obtain a text feature vector of each word in the word segmentation list; and obtaining a text classification result for the input text according to the text feature vector of each word in the word segmentation list. Implementing the method of the embodiment of the invention can improve the accuracy of text classification.
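The steps in the abstract can be sketched in plain Python as follows. This is an illustrative sketch, not the patented implementation: the vocabulary, the tag names, the dimensions, the random untrained weights, and the final mean-pooling plus softmax layer are all assumptions, since the abstract does not say how the per-word feature vectors are turned into a classification result.

```python
import math
import random

random.seed(0)
WORD_DIM, POS_DIM, HIDDEN, N_CLASSES = 4, 2, 3, 2  # assumed sizes


def rand_vec(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]


# Hypothetical lookup tables; a real system would load trained vectors.
word_vecs = {w: rand_vec(WORD_DIM) for w in ["I", "like", "apples"]}
pos_vecs = {p: rand_vec(POS_DIM) for p in ["PRON", "VERB", "NOUN"]}


def splice(seg_list):
    """Concatenate each word vector with the vector of its part of speech."""
    return [word_vecs[w] + pos_vecs[p] for w, p in seg_list]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LSTMCell:
    def __init__(self, in_dim, hidden):
        self.hidden = hidden
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: rand_mat(hidden, in_dim + hidden) for g in "ifoc"}
        self.b = {g: [0.0] * hidden for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # input concatenated with the previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        h, c = [0.0] * self.hidden, [0.0] * self.hidden
        outs = []
        for x in xs:
            h, c = self.step(x, h, c)
            outs.append(h)
        return outs


def bi_lstm(matrix, fwd, bwd):
    """Per-word features: forward state joined with the backward state."""
    forward = fwd.run(matrix)
    backward = bwd.run(matrix[::-1])[::-1]  # read right-to-left, then realign
    return [f + b for f, b in zip(forward, backward)]


def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, w_out):
    # Assumption: mean-pool the per-word features, then linear layer + softmax.
    pooled = [sum(col) / len(features) for col in zip(*features)]
    return softmax(matvec(w_out, pooled))


# Word segmentation list: (word, part of speech) pairs, assumed already tagged.
seg_list = [("I", "PRON"), ("like", "VERB"), ("apples", "NOUN")]
matrix = splice(seg_list)
fwd = LSTMCell(WORD_DIM + POS_DIM, HIDDEN)
bwd = LSTMCell(WORD_DIM + POS_DIM, HIDDEN)
features = bi_lstm(matrix, fwd, bwd)
probs = classify(features, rand_mat(N_CLASSES, 2 * HIDDEN))
print(len(matrix), len(matrix[0]), len(features[0]), probs)
```

Each row of `matrix` has `WORD_DIM + POS_DIM` entries, and each per-word feature vector has `2 * HIDDEN` entries because the forward and backward states are joined. With untrained random weights the printed probabilities carry no meaning; in practice the weights would be learned, for example by gradient descent as the background section mentions.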

Description

Technical field

[0001] The present invention relates to the technical field of artificial intelligence, in particular to a text classification method, apparatus, computer device and storage medium.

Background

[0002] At present, the neural network models commonly used in text classification include CNN+Word-Embedding, RNN+Word-Embedding, MLP+Word-Embedding and other structures. The usual way to use this type of structure is to segment the text into words, map the words into a real-valued space, and combine the results into a floating-point matrix or vector that the neural network can accept as input. The neural network model then computes a probability distribution over the classes. During training, gradient descent or an improved variant of it is used to optimize the model until convergence.

[0003] At present, the neural network + Word-Embedding model commonly used in deep learning mainly has the follow...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06N3/02
CPC: G06N3/02
Inventor: 陶恺
Owner: 深圳市牛鼎丰科技有限公司