Text classification method for open network questions in specific field

A text classification technology for open network questions, applied in text database clustering/classification, biological neural network models, and unstructured text data retrieval. It addresses problems such as high cost, the need for a large number of labeled training samples, and large data volume, and achieves the effect of accelerated training.

Active Publication Date: 2020-04-21
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

However, most current neural-network text classification methods have the following problems: 1) training a neural network classifier requires a large amount of data, in particular a large number of labeled training samples, which is relatively costly.
Indeed, deep convolutional models for text are derived from deep models originally developed for image processing, although new deep architectures for text processing may challenge this conclusion in the near future.

Examples

Embodiment 1

[0080] The purpose of the present invention is to propose a semi-supervised hierarchical classification method for open network question texts in a specific field. The method makes up for the above-mentioned deficiencies of existing techniques and, when few labeled samples are available, can use additional knowledge to train the classification model while maintaining decent classification accuracy.

[0081] The specific approach by which the present invention achieves the above purpose is:

[0082] 1. Considering the different classification difficulties among various categories, set coarse-grained and fine-grained classification levels, and set the categories in each level;

[0083] 2. Preprocess the domain-specific network questions, question-and-answer texts, and written texts of the domain;

[0084] 3. Generate word embedding representations of the domain-specific corpus through representation learning;

[0085] 4. Divide a small part of the balanced trai...
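
The excerpt above describes the method as semi-supervised but is cut off before the procedure is spelled out. As a rough illustration only, the sketch below uses self-training (pseudo-labeling), which is a swapped-in stand-in and not necessarily the patent's procedure. The bag-of-words nearest-centroid classifier, the function names, the `threshold` value, the labels, and the toy questions are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Bag-of-words term counts (a stand-in for learned word embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroids(labeled):
    """Sum the vectors of each class into one centroid per label."""
    cents = {}
    for text, label in labeled:
        cents.setdefault(label, Counter()).update(vectorize(text))
    return cents


def classify(text, cents):
    """Return the closest label and its similarity score."""
    vec = vectorize(text)
    scores = {label: cosine(vec, c) for label, c in cents.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def self_train(labeled, unlabeled, threshold=0.5, max_rounds=5):
    """Repeatedly add confidently pseudo-labeled texts to the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, remaining = [], []
        for text in pool:
            label, score = classify(text, cents)
            (confident if score >= threshold else remaining).append((text, label))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = [text for text, _ in remaining]
    return centroids(labeled), labeled


# Hypothetical toy data: two labeled questions, three unlabeled ones.
labeled = [
    ("how to install the engine", "how_to"),
    ("what is the price of the engine", "price"),
]
unlabeled = [
    "how to install the pump",
    "what is the price of the pump",
    "how to repair the pump",
]
cents, final_labeled = self_train(labeled, unlabeled)
```

In this sketch the confidence threshold controls how aggressively unlabeled questions are absorbed; a higher value trades coverage for fewer mislabeled additions.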

Abstract

The invention belongs to the technical field of text classification processing, and particularly relates to a text classification method for open network questions in a specific field. The method addresses the problems that arise when classifying open network text in certain specific fields: a lack of sufficient category-labeled corpora, the low information content of network text, and high noise. It provides a new method for the hierarchical classification of open network questions in such fields. The method uses open network questions and written texts in a specific domain so that the word embedding representations better conform to the domain's knowledge features. A semi-supervised method accelerates classification model training and reduces the number of labeled samples required. In addition, classification at multiple levels of granularity is realized in combination with conditional probability. The method can assist the extraction, discrimination, and construction of data in fields such as question answering systems, sentiment analysis, and domain knowledge bases.
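
One way to read "multi-granularity classification in combination with conditional probability" is the chain rule P(coarse, fine | x) = P(coarse | x) · P(fine | coarse, x). The sketch below illustrates that reading under stated assumptions: the abstract does not give the exact formula, and the function names, category labels, and probability values here are hypothetical.

```python
from typing import Dict, Tuple


def joint_fine_probabilities(
    p_coarse: Dict[str, float],
    p_fine_given_coarse: Dict[str, Dict[str, float]],
) -> Dict[Tuple[str, str], float]:
    """Chain rule: P(coarse, fine | x) = P(coarse | x) * P(fine | coarse, x)."""
    joint = {}
    for coarse, pc in p_coarse.items():
        for fine, pf in p_fine_given_coarse.get(coarse, {}).items():
            joint[(coarse, fine)] = pc * pf
    return joint


def predict(p_coarse, p_fine_given_coarse):
    """Pick the (coarse, fine) pair with the highest joint probability."""
    joint = joint_fine_probabilities(p_coarse, p_fine_given_coarse)
    return max(joint, key=joint.get)


# Hypothetical classifier outputs for a single question.
p_coarse = {"equipment": 0.6, "policy": 0.4}
p_fine_given_coarse = {
    "equipment": {"engine": 0.5, "hull": 0.5},
    "policy": {"regulation": 0.9, "insurance": 0.1},
}
joint = joint_fine_probabilities(p_coarse, p_fine_given_coarse)
best = predict(p_coarse, p_fine_given_coarse)
```

With these made-up numbers, the joint score favors ("policy", "regulation") even though "equipment" is the more likely coarse class on its own, which is the kind of case where combining the two levels differs from choosing greedily level by level.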

Description

Technical field

[0001] The invention belongs to the technical field of text classification processing, and in particular relates to a text classification method for open network questions in specific fields.

Background technique

[0002] The multiple intelligences of human beings are closely related to language. Human logical thinking takes the form of language, and most human knowledge is also recorded and handed down in the form of language. Language processing is therefore an important, even core, part of artificial intelligence. Since the beginning of artificial intelligence research, people have been looking for ways for machines to understand the world. Text classification is a widely used task in the field of natural language processing: the computer automatically classifies and labels a set of texts according to a given classification system or standard. Since the Internet developed at an asto...

Application Information

IPC(8): G06F16/35, G06K9/62, G06N3/04
CPC: G06F16/35, G06N3/045, G06F18/2155, G06F18/24155
Inventor 黄少滨, 余日昌, 刘汪洋, 杨辉, 李熔盛, 申林山, 李轶, 张柏嘉
Owner HARBIN ENG UNIV