
Text classification method and device

A text classification technology, applied in the fields of text database clustering/classification and unstructured text data retrieval, among others.

Status: Inactive; Publication Date: 2021-06-22
HITACHI LTD
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] However, such a classification method treats the words in a text as independent of one another and ignores the correlations and mutual influence between words.
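The independence problem can be illustrated with a minimal sketch. It assumes a plain bag-of-words count as the word-independent representation; the source does not name the specific prior method, and the function name and sample sentences below are hypothetical.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count whitespace-separated tokens, discarding word order.

    Illustrative stand-in for a representation that treats words as
    independent of one another (an assumption, not the patent's method).
    """
    return Counter(text.lower().split())


# Two sentences with opposite meanings but the same words.
a = "the dog bit the man"
b = "the man bit the dog"

# Because order and word-to-word relations are ignored, both sentences
# map to exactly the same representation.
same_representation = bag_of_words(a) == bag_of_words(b)
print(same_representation)
```

Any classifier that sees only these counts cannot tell the two sentences apart, which is the limitation paragraph [0004] describes.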




Embodiment Construction

[0029] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings and specific embodiments.

[0030] In the following description of the present invention, one or several sentences or short sentences are used as an example of a text. This is done only for convenience in describing the embodiment and should not be taken as the actual processing situation. In actual application, it is preferable to treat a paragraph or an article as a text.

[0031] The text classification method provided according to the embodiment of the present invention can divide texts into ordinary text and valuable text according to the value (effective information amount) of the text, where ordinary text is considered to have a smaller value (effective information amount), that is, texts of ...


Abstract

The invention discloses a text classification method, comprising: establishing a training text set and generating a first text classifier and a second text classifier; preprocessing the text to be classified by replacing text noise with a replacement character string; computing the probability of the replacement character string; when the probability is greater than or equal to the filtering threshold of the first text classifier, classifying the text to be classified as ordinary text; when the probability is less than the filtering threshold, segmenting the preprocessed text to be classified; establishing a first text representation, a second text representation, and a third text representation of the text to be classified; computing, by a feature-representation-based method, a first text feature representation of the first text representation, a second text feature representation of the second text representation, and a third text feature representation of the third text representation; and classifying the text to be classified with the second text classifier based on the first, second, and third text feature representations. A text classification device is also disclosed.
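The first filtering stage described in the abstract might look roughly like the sketch below. The abstract does not define what counts as text noise, how the probability is computed, or the threshold value, so all of these are assumptions here: noise is taken to be URLs and runs of two or more symbol characters, the probability is the share of tokens equal to the replacement string, the threshold is 0.5, and whitespace splitting stands in for word segmentation. All names are hypothetical.

```python
import re

NOISE_TOKEN = "<NOISE>"  # hypothetical replacement character string

# Assumed noise definition: URLs, or runs of two or more characters that
# are neither word characters nor whitespace. The patent may define noise
# differently.
NOISE_PATTERN = re.compile(r"https?://\S+|[^\w\s]{2,}")


def replace_noise(text: str) -> str:
    """Preprocess: replace each noise span with the replacement string."""
    return NOISE_PATTERN.sub(f" {NOISE_TOKEN} ", text)


def noise_probability(preprocessed: str) -> float:
    """Assumed estimate: fraction of tokens that are the replacement string."""
    tokens = preprocessed.split()
    if not tokens:
        return 0.0
    return tokens.count(NOISE_TOKEN) / len(tokens)


def first_stage(text: str, filtering_threshold: float = 0.5):
    """Return ("ordinary", None) if the text is filtered out, otherwise
    ("undecided", tokens) so that a second classifier can take over."""
    preprocessed = replace_noise(text)
    if noise_probability(preprocessed) >= filtering_threshold:
        return "ordinary", None
    # Whitespace splitting is only a stand-in for real word segmentation.
    return "undecided", preprocessed.split()


print(first_stage("!!! ??? http://x.y ###"))
print(first_stage("plain words only"))
```

Texts that pass this stage would then go through the three text representations and the second classifier, which the abstract does not specify in enough detail to sketch.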

Description

Technical field

[0001] The invention relates to a text classification method and device.

Background

[0002] With the continuous development of information technology, the amount of text information that people face grows day by day, and there are more and more channels for obtaining it, for example browsing the web, retrieving information with search engines, and receiving emails. However, the value (effective information amount) of the massive text information available to users is uneven. Therefore, classifying text information according to the value (effective information amount) it contains is an effective means of organizing and managing text information. Such classification facilitates the further processing and utilization of text information with higher value and reduces the waste caused by the processing of text information with lower valu...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35
Inventor: 周樟俊张学
Owner: HITACHI LTD