Short text labeling method, system and device for large-scale classification system

This classification technology, applied to unstructured text data retrieval, text database clustering/classification, and special data processing applications, addresses the low stability of short text labeling systems, with the effects of reducing model complexity and improving stability.

Active Publication Date: 2019-07-26
INST OF AUTOMATION CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0005] To solve the above problem in the prior art, namely the low stability of short text labeling systems for large-scale classification systems when data is limited, the first aspect of the present invention proposes an automatic short text labeling method for large-scale classification systems. The method comprises the following steps:



Examples


Embodiment Construction

[0032] To make the purpose, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the protection scope of the present invention.

[0033] The application is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here only explain the related invention and do not limit it. It should also be noted that, for convenience of description, only the parts related to the related invention...



Abstract

The invention belongs to the field of text classification and relates in particular to a short text labeling method, system and device for a large-scale classification system. It aims to solve the problem that short text labeling systems for large-scale classification systems have low stability when data is limited. The method comprises: acquiring a first set of short text information to be classified and preprocessing it with forward maximum matching word segmentation and word2vec word vector representation to obtain a second set of short text information; performing binary classification on the second set with a rule-based classification method and a supervised neural network classification method, then filtering the short texts; assigning first-level and second-level classification labels to each short text with the same classification methods; and assigning third-level and fourth-level classification labels to each short text with a label propagation method based on semi-supervised learning. The method ensures the stability of the short text labeling system for large-scale classification systems when data is limited.
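The forward maximum matching segmentation named in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: the function name `forward_max_match`, the `max_len` parameter, and the toy dictionary `VOCAB` are all assumptions made for illustration. The sketch scans the text left to right and, at each position, takes the longest dictionary word that starts there, falling back to a single character when nothing matches.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy left-to-right segmentation.

    At each position, take the longest word from `vocab` that starts there.
    `vocab` is a set of known words; `max_len` caps the window size and
    defaults to the length of the longest word in `vocab`.
    """
    if max_len is None:
        max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the widest window first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall
            # on characters that are missing from the dictionary.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
VOCAB = {"短文本", "文本", "分类", "分类体系", "体系", "标签"}

if __name__ == "__main__":
    print(forward_max_match("短文本分类体系标签", VOCAB))
```

With this toy dictionary the greedy scan prefers "分类体系" over the shorter "分类", which is the defining behavior of forward maximum matching. In the pipeline the abstract describes, the resulting tokens would then be mapped to word2vec vectors, a step this sketch does not cover.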

Description

Technical field

[0001] The invention belongs to the field of text classification, and in particular relates to a short text labeling method, system and device for a large-scale classification system.

Background

[0002] With the widespread use of new Internet platforms such as official online media, WeChat public accounts, self-media (We Media), Weibo, and Tieba, automatically labeling the short texts published on these platforms is of great significance. As text information expands rapidly, organizing and managing it effectively, and finding the information users need quickly, accurately and comprehensively, is a major challenge in information science and technology. As a key technology for processing and organizing large amounts of text data, automatic short text labeling can largely solve the problem of information clutter, making it possible to locate the required information conveniently and accurately and to route it. As ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/35
CPC: G06F16/353
Inventor: 孔庆超, 王磊, 闫鹏, 张丽, 郎佳奇, 王帅, 潘进, 毛文吉, 王钲淇, 段运强
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI