
Text classification method and device applying outlier detection algorithm LOF model

A technology for outlier detection and text testing, applied in text database clustering/classification, neural network learning methods, text database querying, and related areas. It addresses the problems of reduced classification accuracy and the inability to correctly identify texts outside the known categories, thereby improving classification accuracy.

Pending Publication Date: 2022-02-11
ZHONGKE DINGFU BEIJING TECH DEV

AI Technical Summary

Problems solved by technology

[0004] Current text classification tasks implemented with deep learning models therefore cannot correctly identify texts outside the known categories. Such texts are instead assigned to one of the known categories, which lowers classification accuracy.



Examples


Embodiment Construction

[0026] Text classification is one of the basic tasks in natural language processing and has many real-world applications. For example, public opinion monitoring, news classification, and sentiment classification are all implemented as text classification tasks.

[0027] A text classification task trains a classification model on training texts from several known categories so that the model can assign an unknown text to one of those categories. Current classification models are usually deep learning models, which can only judge an input text against the known categories they were trained on. For an input text from any other category, the deep learning model still outputs the known category with the highest probability, resulting in the input text being classified into the wrong catego...
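The closed-set behaviour described in [0027] can be illustrated with a minimal sketch. The label names and logit values below are hypothetical and do not come from the patent; the point is only that taking the highest softmax probability always returns one of the known categories, however weak the evidence.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def closed_set_predict(logits, labels):
    """Return the most probable known label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical label set and logits, for illustration only.
KNOWN_LABELS = ["sports", "finance", "politics"]

# Nearly flat logits: the model has no real preference, yet a known
# label is still returned because there is no "none of the above" option.
label, prob = closed_set_predict([0.2, 0.1, 0.3], KNOWN_LABELS)
print(label, round(prob, 3))
```

Here the winning probability is only slightly above one third, but the text would still be filed under a known category.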



Abstract

The embodiment of the invention provides a text classification method and device applying an outlier detection algorithm LOF model. The method comprises the following steps: acquiring a training data set comprising training texts and corresponding category labels, wherein the training texts comprise training texts with known labels and training texts with other labels, which are configured according to a preset proportion; training a classification model by using the training data set; inputting the training text with the known label into the classification model to obtain an embedded representation, an intermediate result and a final representation vector of the training text with the known label; training a plurality of LOF models according to the embedded representation of the training text with the known label, the intermediate result and the final representation vector; and judging whether the test text is of an unknown category according to the classification model and the LOF models. According to the technical scheme provided by the invention, the text classification model can identify an unknown category from the test text, the situation that the text of the unknown category is allocated to a known category label is avoided, and the accuracy of text classification is improved.
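The detection stage described in the abstract might be sketched as follows. This is not the patented implementation: `SimpleLOF` is a small pure-Python version of the standard Local Outlier Factor score, `represent` is a hypothetical stand-in for the classifier's three outputs (embedded representation, intermediate result, final representation vector), and the threshold of 1.5 and the "at least `min_votes` detectors" rule are assumptions, since the abstract does not say how the LOF results are combined.

```python
import math

class SimpleLOF:
    """Minimal Local Outlier Factor scorer for points unseen in training."""

    def __init__(self, k=3):
        self.k = k

    def _neighbors(self, point, exclude=None):
        # k nearest training points as (distance, index) pairs.
        dists = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(self.train)
            if j != exclude
        )
        return dists[: self.k]

    def _lrd(self, nbrs):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(self.k_dist[j], d) for d, j in nbrs]
        return len(reach) / max(sum(reach), 1e-12)

    def fit(self, train):
        self.train = [tuple(p) for p in train]
        knn = [self._neighbors(p, exclude=i) for i, p in enumerate(self.train)]
        self.k_dist = [nbrs[-1][0] for nbrs in knn]
        self.lrd = [self._lrd(nbrs) for nbrs in knn]
        return self

    def score(self, point):
        # A score near 1 means "as dense as its neighbors"; larger means outlier.
        nbrs = self._neighbors(point)
        mean_neighbor_lrd = sum(self.lrd[j] for _, j in nbrs) / len(nbrs)
        return mean_neighbor_lrd / self._lrd(nbrs)

def represent(x, y):
    """Toy stand-in for the classifier's three per-text outputs (hypothetical)."""
    return {
        "embedding": (x, y),
        "intermediate": (x + y, x - y),
        "final": (3 * x, 3 * y),
    }

def fit_detectors(known_points, k=3):
    """Train one LOF scorer per representation level, on known-label texts only."""
    reps = [represent(x, y) for x, y in known_points]
    return {level: SimpleLOF(k).fit([r[level] for r in reps]) for level in reps[0]}

def is_unknown(point, detectors, threshold=1.5, min_votes=1):
    """Assumed rule: unknown if at least min_votes detectors exceed threshold."""
    rep = represent(*point)
    votes = sum(detectors[level].score(rep[level]) > threshold for level in detectors)
    return votes >= min_votes

# Two tight clusters stand in for two known categories (toy data).
known = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 10)]
         for dx in range(3) for dy in range(3)]
detectors = fit_detectors(known)
print(is_unknown((1.0, 0.5), detectors))  # point inside a known cluster
print(is_unknown((5.0, 5.0), detectors))  # point far from both clusters
```

In a real system the three vectors would come from the trained classification model rather than from `represent`, and the threshold and voting rule would need to be tuned on held-out data.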

Description

Technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a text classification method and device applying an outlier detection algorithm LOF model.

Background

[0002] Text classification is one of the basic tasks in natural language processing and has many real-world applications. For example, public opinion monitoring, news classification, and sentiment classification are all implemented as text classification tasks.

[0003] A text classification task trains a classification model on training texts from several fixed categories so that the model can recognize texts of those categories among unknown texts. Current classification models are usually deep learning models, and the deep learning model can only give the category judgment of the input text...


Application Information

IPC(8): G06F16/33, G06F16/35, G06F40/126, G06F40/279, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/3344, G06F16/355, G06F40/126, G06F40/279, G06F40/30, G06N3/08, G06N3/045
Inventor: 胡加明, 李健铨, 刘小康
Owner: ZHONGKE DINGFU BEIJING TECH DEV