
A short text classification method based on semantic enhancement and multi-level label embedding

A short text classification technology, applied in text database clustering/classification, semantic tool creation, unstructured text data retrieval, etc. It addresses problems such as poor text classification performance and achieves fast and accurate classification.

Active Publication Date: 2021-09-03
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve the problem of poor text classification performance in the prior art by providing a short text classification method based on semantic enhancement and multi-level label embedding.




Embodiment Construction

[0068] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them, and are not intended to limit the scope of the present invention. In the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts disclosed in the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, shall fall within the protection scope of the present invention.

[0069] Various structural schematic diagrams according to the disclosed embodiments of the p...



Abstract

The invention discloses a short text classification method based on semantic enhancement and multi-level label embedding. First, starting from the character-level embedding representations produced by a pre-trained multi-layer language model, a traditional word embedding method is used to embed word semantics into the character-level text representation. Second, the local and sequence information of the text is used as a multi-dimensional feature representation of the sentence. Finally, a multi-level label embedding is proposed, and fast and accurate classification of short text data is realized through the Softmax function. The invention uses a traditional text representation method to expand the text encoding information of the pre-trained model, which addresses the insufficient semantic expression of the word embedding module. Multi-scale CNN and bidirectional GRU modules enhance the high-level, deep semantic representation of the text and strengthen the encoding of short texts. The traditional one-hot label representation is replaced by vectorized classification labels, and the semantic information they contain is used to filter the text representation at the word level and to assist the classification decision at the sentence level, improving the performance of short text classification.
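The following is a minimal sketch of the label-embedding idea summarized in the abstract, not the patented implementation. It assumes that word semantics are fused into character-level vectors by simple concatenation, that the word-level filter is a Softmax over each token's best cosine match to any label vector, and that the sentence-level decision is a dot product against the label vectors followed by Softmax. The multi-scale CNN and bidirectional GRU stages are omitted, and all function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def enhance(char_vecs, word_vecs, word_of_char):
    """Assumed fusion step: append to each character vector the vector of
    the word that contains that character."""
    return [c + word_vecs[w] for c, w in zip(char_vecs, word_of_char)]

def word_level_filter(token_vecs, label_vecs):
    """Weight each token by its best cosine match to any label vector,
    then pool the tokens into a single sentence vector."""
    relevance = [max(cosine(t, l) for l in label_vecs) for t in token_vecs]
    weights = softmax(relevance)
    dim = len(token_vecs[0])
    sentence = [sum(w * t[d] for w, t in zip(weights, token_vecs))
                for d in range(dim)]
    return sentence, weights

def sentence_level_classify(sentence_vec, label_vecs):
    """Score the sentence vector against every label vector and normalize
    the scores into class probabilities with Softmax."""
    return softmax([dot(sentence_vec, l) for l in label_vecs])

# Toy data (hypothetical): 3 characters, 2 words, 2 labels.
char_vecs = [[0.9], [0.5], [0.8]]   # stand-in for pre-trained character embeddings
word_vecs = [[0.1], [0.0]]          # stand-in for traditional word embeddings
word_of_char = [0, 0, 1]            # index of the word containing each character
label_vecs = [[1.0, 0.0], [0.0, 1.0]]  # one vector per class instead of one-hot ids

fused = enhance(char_vecs, word_vecs, word_of_char)
sentence, weights = word_level_filter(fused, label_vecs)
probs = sentence_level_classify(sentence, label_vecs)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

In this sketch the label vectors play two roles: they decide how much each token contributes to the pooled sentence vector, and they act as the classifier weights at the sentence level. A real system would learn the label vectors jointly with the encoder rather than fixing them by hand.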

Description

【Technical field】

[0001] The invention belongs to the technical field of machine learning and data mining, and relates to a short text classification method based on semantic enhancement and multi-level label embedding.

【Background technique】

[0002] With the rapid development of social media and the rapid growth of network user groups, the volume of comments posted by netizens is growing explosively. Because of the limits on text input in social media, most of these comments take the form of short texts, such as product reviews, questions raised by users in Q&A systems, and updates posted by users on Weibo. Quickly extracting valuable information from such massive data requires a basic and effective form of data management, namely short text classification. In addition, short text classification has become a foundational technology for many fields, such as automatic question answering, text retrieval, topic tracking, and search engines ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/36, G06F40/289
CPC: G06F16/35, G06F16/374
Inventor: 饶元, 祁江楠
Owner: XI AN JIAOTONG UNIV