Target-oriented short text classification method

A target-oriented classification method applied in the field of short text classification. It addresses the poor classification performance of models such as TextCNN on short texts that carry little information.

Pending Publication Date: 2021-06-25
大有秦鼎(北京)科技有限公司

AI Technical Summary

Problems solved by technology

[0006] Because the texts are short and because of where they come from, much of their content is useless, so little valuable information remains. Moreover, some key information is expressed mainly as numbers. As a result, ordinary text classifiers such as TextCNN perform poorly.

Examples

Embodiment Construction

[0023] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0024] As shown in Figure 1, a target-oriented short text classification method includes the following steps:

[0025] 1. Annotate the texts with class labels as required; for example, to judge whether a user is wealthy, the texts can be divided into two classes (see Figure 2);

[0026] 2. As required, label the positions and attributes of the named entities in the texts to be classified; for example: I (invalid), N (number), K (keyword) (see i...
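The two annotation steps above can be pictured as one record per text: a class label from step 1 plus one I/N/K tag per token from step 2. The following is only a minimal sketch under assumptions: the `Sample` class, the `informative_tokens` helper, the token-level granularity, and the example sentence are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List

# Tag set from step 2: I = invalid, N = number, K = keyword.
TAGS = {"I", "N", "K"}


@dataclass
class Sample:
    """One annotated short text (hypothetical layout, not from the patent)."""
    tokens: List[str]  # the short text, split into tokens
    tags: List[str]    # one I/N/K tag per token (step 2)
    label: int         # class label (step 1), e.g. 1 = wealthy, 0 = not

    def __post_init__(self) -> None:
        # Every token must carry exactly one tag from the known tag set.
        if len(self.tokens) != len(self.tags):
            raise ValueError("each token needs exactly one tag")
        unknown = set(self.tags) - TAGS
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")


def informative_tokens(sample: Sample) -> List[str]:
    """Return the tokens that were not tagged as invalid."""
    return [tok for tok, tag in zip(sample.tokens, sample.tags) if tag != "I"]


# Invented example sentence for the two-class "is the user wealthy" task.
example = Sample(
    tokens=["just", "paid", "30000", "for", "a", "watch"],
    tags=["I", "I", "N", "I", "I", "K"],
    label=1,
)
```

Here `informative_tokens(example)` keeps only the number and the keyword, which is the kind of signal the entity tags are meant to point the classifier toward.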

Abstract

The invention discloses a target-oriented short text classification method comprising the following steps: 1. annotating the texts with class labels as required; 2. as required, labeling the positions and attributes of the named entities in the texts to be classified; 3. building a deep learning network whose input is the text, whose classification output corresponds to the class labels from step 1, and whose named entity recognition output corresponds to the attributes from step 2; 4. setting the loss function used during training to the weighted sum of the CRF loss function and the text classification loss function, with the two weights adjusted through tests so that the text classification performance is optimal. Training therefore requires a text classification data set that is also labeled with named entity recognition results. According to the method, useful information can be extracted accurately even when the texts are short and effective samples are few.
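The weighted loss in step 4 can be sketched in plain Python. This is an illustration under stated assumptions, not the patent's implementation: the CRF here is a simplified linear-chain CRF without start or end transitions, the classification loss is assumed to be softmax cross-entropy, and the names `crf_nll`, `cross_entropy`, `joint_loss`, `w_crf`, and `w_cls` are hypothetical. In a real network the emission scores and class logits would come from the model built in step 3.

```python
import math
from typing import List, Sequence


def logsumexp(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def crf_nll(emissions: List[List[float]],
            transitions: List[List[float]],
            tags: List[int]) -> float:
    """Negative log-likelihood of a tag sequence under a linear-chain CRF.

    emissions[t][k]   : score of tag k at position t
    transitions[i][j] : score of moving from tag i to tag j
    (Assumption: no start/end transition scores, to keep the sketch short.)
    """
    n_tags = len(transitions)
    # Score of the annotated (gold) tag path.
    gold = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        gold += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    # Forward algorithm: log of the summed scores of all possible paths.
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(n_tags)])
            + emissions[t][j]
            for j in range(n_tags)
        ]
    return logsumexp(alpha) - gold


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one example (assumed classification loss)."""
    return logsumexp(logits) - logits[label]


def joint_loss(emissions, transitions, tags, class_logits, label,
               w_crf: float = 1.0, w_cls: float = 1.0) -> float:
    """Weighted sum of the CRF loss and the text classification loss."""
    return (w_crf * crf_nll(emissions, transitions, tags)
            + w_cls * cross_entropy(class_logits, label))
```

The two weights are ordinary hyperparameters in this sketch. One way to "adjust them through tests", as the abstract puts it, would be to train with a few candidate weight pairs and keep the pair that gives the best classification result on held-out data.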

Description

Technical field

[0001] The invention relates to the technical field of short text classification, and in particular to a target-oriented short text classification method.

Background

[0002] Text classification is a technology in the field of NLP whose goal is to assign texts to predefined categories. It has many applications, and the best-performing current approaches are based on deep learning.

[0003] Named entity recognition is also a technology in the field of NLP. Its goal is to identify names of specific categories in text, such as the names of people in an article. At present, LSTM+CRF is generally used for named entity recognition.

[0004] CopyNet is a text generation model that also has an automatic target guidance mechanism.

[0005] User portraits are the basis of Internet marketing today. When creating user portraits, much of the information is unstructured text, such as forum posts and reviews. This type of informa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06N3/04, G06N3/08
CPC: G06F40/295, G06N3/08, G06N3/045
Inventor: 孙俊
Owner: 大有秦鼎(北京)科技有限公司