Text clustering method and device

A text clustering technology, applied in the field of semantic analysis, that addresses the inability of existing methods to cluster texts at the level of dependency syntax.

Active Publication Date: 2016-12-07
SHANGHAI XIAOI ROBOT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, the embodiment of the present invention provides a text clustering method and device, which solve the problem that prior-art text clustering methods cannot cluster texts at the syntactic level.


Embodiment Construction

[0104] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0105] Figure 1 is a schematic flowchart of a text clustering method provided by an embodiment of the present invention. As shown in Figure 1, the text clustering method includes:

[0106] Step 101: Identify the dependency syntactic relationships between the words in each text to be clustered in the text library.

[0107] Specifically, each text to be clustered is composed of words, and there are certain dependency syntactic relationships betwe...
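Step 101 can be pictured as producing, for every text in the library, a list of head/relation/dependent triples. The following is a minimal sketch only: the `Dependency` type, the `identify_dependencies` function, and the hand-annotated `TOY_PARSES` table (which uses Universal Dependencies style labels) are assumptions introduced for illustration and do not come from the patent, which does not name a parser here. In practice the `parser` argument would be backed by a real dependency parser.

```python
from typing import Callable, Dict, List, NamedTuple


class Dependency(NamedTuple):
    """One dependency relation between two words (hypothetical structure)."""
    head: str
    relation: str
    dependent: str


# A parser is any callable mapping a text to its dependency relations.
Parser = Callable[[str], List[Dependency]]

# Hand-annotated parse used as a stand-in for a real parser (assumption).
TOY_PARSES: Dict[str, List[Dependency]] = {
    "How old are you?": [
        Dependency("ROOT", "root", "old"),
        Dependency("old", "advmod", "How"),
        Dependency("old", "cop", "are"),
        Dependency("old", "nsubj", "you"),
    ],
}


def toy_parser(text: str) -> List[Dependency]:
    """Look the parse up in the hand-annotated table."""
    return TOY_PARSES[text]


def identify_dependencies(
    library: List[str], parser: Parser
) -> Dict[str, List[Dependency]]:
    """Step 101 (sketch): parse every text to be clustered in the library."""
    return {text: parser(text) for text in library}


relations = identify_dependencies(["How old are you?"], toy_parser)
for dep in relations["How old are you?"]:
    print(f"{dep.relation}({dep.head}, {dep.dependent})")
```

Keeping the parser as a plain callable lets the later steps be exercised on hand-written parses before a real parser is wired in.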


Abstract

The embodiment of the invention provides a text clustering method and device, which solve the problem that prior-art text clustering approaches cannot cluster texts at the level of dependency syntax. The text clustering method comprises the following steps: identifying the dependency syntactic relationships among the words in each text to be clustered in a text library; converting the dependency syntactic relationships in each text to be clustered into a syntactic encoding; calculating the similarity between the syntactic encodings of the texts to be clustered in the text library; and clustering the texts to be clustered in the text library according to the similarity calculation results.
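The encoding, similarity, and clustering steps named in the abstract can be sketched as follows. The abstract does not say how the syntactic encoding is built, which similarity measure is used, or which clustering algorithm is applied, so every choice below is an assumption made for illustration: the encoding is a multiset of relation labels, the similarity is a multiset Jaccard score, and the clustering is a greedy single pass with a hypothetical `threshold` parameter. The input relation labels are hand-written, not parser output.

```python
from collections import Counter
from typing import List


def encode(relation_labels: List[str]) -> Counter:
    """Hypothetical syntactic encoding: a multiset of relation labels."""
    return Counter(relation_labels)


def similarity(a: Counter, b: Counter) -> float:
    """Multiset Jaccard similarity between two encodings (an assumption)."""
    intersection = sum((a & b).values())
    union = sum((a | b).values())
    return intersection / union if union else 1.0


def cluster(encodings: List[Counter], threshold: float = 0.8) -> List[List[int]]:
    """Greedy single-pass clustering: join the first cluster whose
    representative (its first member) is similar enough, else start a new one."""
    clusters: List[List[int]] = []
    for index, encoding in enumerate(encodings):
        for members in clusters:
            if similarity(encoding, encodings[members[0]]) >= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters


# Hand-written relation labels for three texts (illustrative data only).
library_relations = [
    ["root", "advmod", "cop", "nsubj"],
    ["root", "advmod", "cop", "nsubj"],
    ["root", "det", "nsubj", "obj"],
]
encodings = [encode(labels) for labels in library_relations]
clusters = cluster(encodings)
print(clusters)  # [[0, 1], [2]]
```

A label multiset discards which word governs which, so it is a deliberately coarse stand-in; an encoding that preserves tree shape would separate more structures.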

Description

Technical field

[0001] The invention relates to the technical field of semantic analysis, and in particular to a text clustering method and device.

Background

[0002] As an important means of effectively organizing, summarizing and navigating text information, text clustering has attracted increasing attention from researchers. Existing text clustering methods convert the text into a vector model and then cluster based on the literal meaning of the words in the text. However, the same literal meaning may be expressed through a variety of dependency syntactic structures, some of which are more commonly used than others. Existing text clustering methods cannot cluster texts at the level of dependency syntax.

[0003] For example, "How old are you?" and "How old are you?" have the same literal meaning, but they have different dependency syntactic structures, and the dependency syntactic structure of "How old are you?" is more commonly used. ...
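The limitation described in [0002] can be illustrated with a word-level vector model. The sketch below assumes a plain bag-of-words representation with cosine similarity, which is one common instance of such a model and is not specified by the patent; the function names and the two word sequences are hypothetical. Because the vector only counts words, two texts containing the same words in different arrangements are indistinguishable.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Word-count vector; all word order and structure is discarded."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Same words, different arrangement (illustrative inputs only).
first = bag_of_words("how old are you")
second = bag_of_words("you are how old")
score = cosine(first, second)
print(first == second, score)
```

Any clustering built on these vectors places the two texts together whatever their structure, which is why the method instead compares encodings derived from dependency relations.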


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F40/211
Inventors: 白杨, 张磊, 朱频频
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD