
Clustering method and device

A clustering method and device, applied in the field of information retrieval, that address the difficulty of generating readable cluster labels.

Inactive Publication Date: 2011-03-23
CHINA MOBILE COMM GRP CO LTD

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a clustering method and device that address a defect of the retrieval-result clustering methods in the prior art: the difficulty of generating readable clustering labels.



Examples


Embodiment Construction

[0023] The embodiment of the present invention provides a clustering scheme. Following a preset selection strategy, and using related parameters that reflect, among other things, how frequently a word string appears across all documents to be clustered and how representative the string is of a document category, the scheme selects a clustering label for each document to be clustered from among the strings that document contains. The generated clustering labels can therefore fully reflect the category of the documents to be clustered and achieve better readability.

[0024] The main principles, specific implementations, and corresponding beneficial effects of the technical solutions of the embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[0025] Embodiments of the present invention firstly provide a clustering method, the specific flowch...


Abstract

The invention discloses a clustering method that overcomes a defect of prior-art retrieval-result clustering: the difficulty of generating clustering labels with relatively good readability. The method comprises the following steps: selecting a first candidate string set from the documents to be clustered according to a preset selection policy; for each string in the first candidate string set, selecting a second candidate string from the first candidate string set according to a string-related parameter, wherein the string-related parameter comprises at least one of the total number of times the string appears in all documents to be clustered, the total number of times the string appears in a designated document, the number of characters in the string, and the number of documents to be clustered that contain the string; and determining the second candidate string as the clustering label for the documents to be clustered, and assigning the documents to be clustered to the cluster corresponding to that clustering label. The invention also discloses a clustering device.
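The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the selection policy (word n-grams that occur in at least `min_df` documents), the scoring formula, and all names (`ngrams`, `cluster_by_labels`, `min_df`, `max_words`) are assumptions made for illustration. Only the four string-related parameters come from the abstract.

```python
import math
from collections import Counter, defaultdict


def ngrams(doc, max_words=2):
    """Return every word n-gram (1..max_words words) in doc, with repeats."""
    words = doc.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    ]


def cluster_by_labels(docs, max_words=2, min_df=2):
    """Pick one label string per document and group documents by label."""
    per_doc = [Counter(ngrams(d, max_words)) for d in docs]

    total = Counter()  # total occurrences of each string in all documents
    df = Counter()     # number of documents containing each string
    for counts in per_doc:
        total.update(counts)
        df.update(counts.keys())

    # Assumed selection policy: the first candidate set holds strings
    # that appear in at least min_df documents.
    first_candidates = {s for s, n in df.items() if n >= min_df}

    def score(s, counts):
        # Assumed scoring formula combining the four parameters from the
        # abstract; the patent's actual formula is not given in this text.
        in_doc = counts[s]                  # occurrences in this document
        overall = math.log(1 + total[s])    # occurrences in all documents
        length = len(s)                     # number of characters
        spread = math.log(len(docs) / df[s]) + 1.0  # document count term
        return in_doc * overall * length * spread

    clusters = defaultdict(list)
    for idx, counts in enumerate(per_doc):
        options = [s for s in counts if s in first_candidates]
        if options:
            # Second candidate string: the best-scoring candidate in this
            # document; ties are broken alphabetically for determinism.
            label = max(options, key=lambda s: (score(s, counts), s))
        else:
            label = "(unlabeled)"
        clusters[label].append(idx)
    return dict(clusters)


if __name__ == "__main__":
    docs = [
        "apple pie recipe",
        "easy apple pie",
        "python tutorial basics",
        "python tutorial advanced",
    ]
    print(cluster_by_labels(docs))
```

In this toy scoring, longer shared strings win over their shorter substrings when all other parameters are equal, which is one simple way to favor labels that read as phrases rather than single words.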

Description

Technical field

[0001] The invention relates to the field of information retrieval, in particular to a clustering method and device.

Background art

[0002] Retrieval result clustering refers to the process of aggregating similar search results from a search engine into clusters. A cluster is a set of similar search results: the results within one cluster are similar to each other, and the results in different clusters tend to differ from each other. Clustering retrieval results helps users make better use of search engines; for example, it can help them locate the required information more quickly or obtain more comprehensive information.

[0003] In the prior art, retrieval result clustering methods fall mainly into two categories: document-based methods and label-based methods. The so-called document-based method refers to first clustering documents into multiple categorie...


Application Information

IPC(8): G06F17/30
Inventor: 孙宏伟, 胡珉, 罗治国
Owner: CHINA MOBILE COMM GRP CO LTD