
Text clustering method and device

A text clustering technology, applied in the computer field, that addresses the poor quality of large-scale text clustering by reducing the number of texts per cluster and increasing clarity.

Inactive Publication Date: 2017-05-31
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a method and device for text clustering that can solve the problem of poor results in large-scale text clustering.



Embodiment Construction

[0021] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0022] This embodiment provides a text clustering method. Figure 1 shows a flowchart of the method. As shown in Figure 1, the text clustering method may include the following steps:

[0023] 101. Perform primary clustering on a text collection according to a predetermined number k of text clusters to obtain k first-level text clusters, where k may be a positive ...


Abstract

The invention discloses a text clustering method and device and relates to the technical field of computers. The method and device address the problem of poor clustering results on large-scale text collections. The text clustering method comprises the following steps: primary clustering is performed on a text set according to a predetermined number k of text clusters to obtain k first-level text clusters, where k is a positive integer greater than 1; a target first-level text cluster is obtained, where the number of texts in the target first-level text cluster is greater than k; and secondary clustering is performed on the target first-level text cluster according to k. The method and device are mainly applied to clustering large-scale text sets.
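The two-stage procedure in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the bag-of-words features, the cosine similarity measure, the simple k-means loop seeded with the first k texts, and all function names (`vectorize`, `kmeans`, `two_level_clustering`) are hypothetical choices made for the example.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts (an assumed, deliberately simple feature)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(vectors):
    """Component-wise mean of sparse vectors, used as a cluster centroid."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}


def kmeans(texts, k, iterations=10):
    """Plain k-means over texts; returns the non-empty clusters."""
    vectors = [vectorize(t) for t in texts]
    # Assumption: seed centroids with the first k texts for determinism.
    centroids = [dict(v) for v in vectors[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign every text to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:
                centroids[j] = mean_vector(members)
    clusters = [
        [t for t, a in zip(texts, assignment) if a == j]
        for j in range(len(centroids))
    ]
    return [c for c in clusters if c]


def two_level_clustering(texts, k):
    """Primary clustering with k, then secondary clustering (same k)
    of every first-level cluster that holds more than k texts."""
    if k <= 1:
        raise ValueError("k must be an integer greater than 1")
    result = []
    for cluster in kmeans(texts, k):
        if len(cluster) > k:
            result.append(kmeans(cluster, k))  # target cluster: split again
        else:
            result.append([cluster])           # small cluster: keep as is
    return result
```

Calling `two_level_clustering(texts, 2)` returns one entry per first-level cluster, each holding the sub-clusters produced by the secondary pass. Reusing the same k at both levels follows the abstract; everything else here is illustrative.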

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a text clustering method and device.

Background

[0002] Text clustering refers to dividing the texts in a text collection into multiple text clusters, such that texts in the same cluster are highly similar and texts in different clusters are dissimilar. Unlike classification, where topics or labels are given in advance, the basis for grouping in clustering is obtained by randomly selecting text features or by calculating the mean value of all text features. This basis is also called the centroid or center object. During clustering, texts with the same or similar text features are assigned to the same text cluster. Usually a text cluster corresponds to one centroid, and the centroids of different text clusters differ from each other.

[0003] The existing text clustering process needs to manually set the ...
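As a small illustration of the two ways of obtaining a centroid mentioned in the background (random selection versus the mean of the features), consider the sketch below. The numeric feature vectors, the Euclidean distance, and the helper names `mean_centroid`, `random_centroid`, and `nearest_centroid` are assumptions for the example; the source does not specify a feature representation or distance measure.

```python
import math
import random


def mean_centroid(vectors):
    """Centroid as the component-wise mean of all feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def random_centroid(vectors, rng):
    """Centroid chosen by randomly picking one existing feature vector."""
    return list(rng.choice(vectors))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to `vector` (Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: math.dist(vector, centroids[j]),
    )


# Hypothetical two-dimensional text features for four texts.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
centroid_a = mean_centroid(features[:2])
centroid_b = mean_centroid(features[2:])

# A new text is assigned to the cluster whose centroid is nearest.
cluster_index = nearest_centroid([0.9, 0.3], [centroid_a, centroid_b])
```

Here each cluster has exactly one centroid, and a text joins the cluster whose centroid it most resembles, which matches the description of clustering in paragraph [0002].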


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 林漫鹏
Owner: BEIJING GRIDSUM TECH CO LTD