Linguistic data classifying method and device, as well as terminal

A corpus classification method and device, applied in the field of intelligent interaction, that address the low quality of training samples and the resulting inaccurate classification results, and that improve classification accuracy by using training samples with good category characteristics and quality.

Pending Publication Date: 2017-03-15
SHANGHAI XIAOI ROBOT TECH CO LTD
1 Cites, 16 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the quality of training samples in the prior art is usually not high, resulting in inaccurate classification results; and when classifying, the classifier model divides the text into the categor



Examples


Embodiment Construction

[0038] As mentioned in the background section, the quality of training samples in the prior art is usually not high, which makes classification results inaccurate. In addition, when classifying, the classifier model assigns the text to the category with the maximum probability among the probabilities that the text belongs to each predefined category; when that maximum probability is itself small, the classification result will be inaccurate.

[0039] In the embodiment of the present invention, after the classifier is trained with the feature words of each category and the training corpus, the probability threshold in the classifier is obtained. When the classifier is then used to classify a corpus to be classified, the corpus is assigned to a category only when its probability for that category satisfies the probability threshold, which avoids the situation in the prio...
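The thresholded decision described in [0039] can be illustrated with a minimal Python sketch. The function name `classify_with_threshold`, the reading of "satisfies the threshold" as "greater than or equal to", and the choice to return only the single most probable category are all assumptions made for illustration; the excerpt does not specify them.

```python
from typing import Dict, Optional


def classify_with_threshold(probs: Dict[str, float],
                            threshold: float) -> Optional[str]:
    """Return the most probable category, or None when even the best
    category does not reach the threshold (hypothetical helper)."""
    if not probs:
        return None
    # Plain arg-max classification would stop here and always return `best`.
    best = max(probs, key=probs.get)
    # Assumption: "satisfies the threshold" means probability >= threshold.
    return best if probs[best] >= threshold else None


if __name__ == "__main__":
    confident = {"weather": 0.90, "sports": 0.10}
    uncertain = {"weather": 0.40, "sports": 0.35, "music": 0.25}
    print(classify_with_threshold(confident, threshold=0.5))
    print(classify_with_threshold(uncertain, threshold=0.5))
```

In the second call the largest probability is small, so the sketch declines to assign a category instead of forcing the text into the arg-max category, which is the failure mode described in [0038].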


Abstract

The invention relates to a corpus classification method, a corpus classification device, and a terminal. The corpus classification method comprises the following steps: performing word segmentation on the training corpus and the corpus to be classified; extracting the feature words of each category from the word segmentation result of the training corpus; training a classifier with the feature words of each category and the training corpus, and determining a probability threshold in the classifier; and classifying the corpus to be classified with the trained classifier to obtain the classification result. According to the technical scheme of the invention, the accuracy of corpus classification is improved.
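The first steps of the abstract (segmentation, feature word extraction, and training) might be sketched as follows. This is only an illustration under stated assumptions: whitespace splitting stands in for a real word segmenter, feature words are chosen by raw frequency per category, and the classifier is a small naive Bayes model restricted to the feature words. None of these choices, nor the names `segment`, `extract_feature_words`, `train`, and `predict_proba`, come from the patent. How the probability threshold is determined is not stated in this excerpt, so that step is left out.

```python
import math
from collections import Counter, defaultdict


def segment(text):
    # Assumption: whitespace split as a stand-in for real word segmentation.
    return text.lower().split()


def extract_feature_words(samples, top_k=2):
    """samples: list of (text, category). Keep the top_k most frequent
    words of each category as its feature words (illustrative criterion)."""
    counts = defaultdict(Counter)
    for text, cat in samples:
        counts[cat].update(segment(text))
    return {cat: {w for w, _ in c.most_common(top_k)}
            for cat, c in counts.items()}


def train(samples, features):
    """Count feature-word occurrences and documents per category."""
    vocab = set().union(*features.values())
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, cat in samples:
        doc_counts[cat] += 1
        word_counts[cat].update(w for w in segment(text) if w in vocab)
    return {"vocab": vocab, "word_counts": word_counts,
            "doc_counts": doc_counts}


def predict_proba(model, text):
    """Naive Bayes posterior over categories with add-one smoothing."""
    vocab = model["vocab"]
    total_docs = sum(model["doc_counts"].values())
    log_scores = {}
    for cat, n_docs in model["doc_counts"].items():
        wc = model["word_counts"][cat]
        denom = sum(wc.values()) + len(vocab)
        score = math.log(n_docs / total_docs)
        for w in segment(text):
            if w in vocab:  # words outside the feature set are ignored
                score += math.log((wc[w] + 1) / denom)
        log_scores[cat] = score
    # Normalize in a numerically stable way so the values sum to 1.
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


if __name__ == "__main__":
    training = [("rain wind storm", "weather"), ("rain cloud", "weather"),
                ("goal match team", "sports"), ("team score goal", "sports")]
    feats = extract_feature_words(training)
    model = train(training, feats)
    print(predict_proba(model, "rain storm"))
```

The per-category probabilities returned by `predict_proba` are the quantities that a probability threshold would then be compared against during classification.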

Description

Technical field

[0001] The present invention relates to the field of intelligent interaction technology, in particular to a corpus classification method, device and terminal.

Background technique

[0002] Text classification refers to determining a category for each document in a document collection according to predefined categories. Text classification is an important part of text mining. As a basic research topic, text classification has applications in many fields, such as information retrieval, automatic classification of Web documents, digital libraries, automatic summarization, newsgroup classification, text filtering, semantic analysis of words, and the organization and management of documents.

[0003] In the prior art, statistical methods and machine learning methods are usually used for automatic text classification. In text classification, the quality of the training sample data and the trained classifier model directly determine the accuracy of classification results. ...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F40/289
Inventors: 张昊, 谢瑜, 朱频频
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD