Method and device for clustering sentences

A sentence clustering technique in the computer field that addresses the inflexibility, numerous hyperparameters, and long computation time of existing clustering algorithms.

Pending Publication Date: 2020-10-30
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] Existing sentence clustering algorithms usually fall into two types. The first type, such as k-means, relies on a preset number of cluster centers and preselected initial centers; because the result depends heavily on the initialization, this type of algorithm is not flexible enough.
The second type is density-based clustering, such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise); this type of algorithm has many hyperparameters and a long computation time.
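The initialization dependence described in [0003] can be illustrated with a minimal sketch. This is not code from the patent: the `kmeans` helper, the four toy points, and both sets of initial centers are assumptions chosen for illustration. The same data and the same number of centers produce different partitions depending only on where the centers start.

```python
from math import dist


def kmeans(points, centers, iters=20):
    """Plain Lloyd-style k-means; `centers` is the caller-chosen initialization."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest current center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels


# Hypothetical toy data: two tight pairs, far apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Initial centers placed one per pair: recovers the left/right grouping.
good = kmeans(points, centers=[(0.0, 0.0), (4.0, 0.0)])

# Initial centers both placed in the left pair: converges to a top/bottom split.
bad = kmeans(points, centers=[(0.0, 0.0), (0.0, 1.0)])

print(good)
print(bad)
```

With the second initialization the algorithm settles into a stable but poor partition that mixes the two pairs, which is the kind of sensitivity the background section attributes to center-based methods.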

Detailed Description of Embodiments

[0028] The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for convenience of description, the drawings show only the parts related to the relevant invention.

[0029] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0030] Figure 1 shows an exemplary system architecture 100 to which embodiments of the method for clustering sentences or the apparatus for clustering sentences of the present application can be applied.

[0031] As shown in Figure 1, the system architecture 100 may i...

Abstract

An embodiment of the invention discloses a method and device for clustering sentences. In one specific embodiment, the method comprises: determining the set of semantic vectors corresponding to the sentences in a set of sentences to be clustered as a semantic vector set; performing a density calculation operation for each semantic vector in the semantic vector set; performing a cluster division operation for each semantic vector in the semantic vector set; for each established cluster, determining the semantic vector with the maximum density among the semantic vectors assigned to that cluster as the cluster-center semantic vector of the cluster; and determining the sentences to be clustered that correspond to the determined cluster-center semantic vectors as a cluster-center sentence set. This embodiment improves the accuracy of sentence clustering.
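The steps listed in the abstract can be sketched roughly as follows. The abstract does not spell out the density calculation or the cluster division operation, so both are assumptions here: density is taken as the number of vectors whose cosine similarity to a given vector reaches a hypothetical `sim_threshold`, and cluster division visits vectors in descending density and joins the first cluster whose seed is similar enough. Only the final two steps (maximum-density vector as cluster center, then mapping centers back to sentences) follow the abstract directly. The function names, threshold, sentences, and 2-D vectors are all illustrative.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def cluster_center_sentences(sentences, vectors, sim_threshold=0.9):
    n = len(vectors)

    # ASSUMPTION (not from the patent): density of a vector is the number of
    # vectors, itself included, whose similarity to it reaches the threshold.
    density = [
        sum(cosine(vectors[i], vectors[j]) >= sim_threshold for j in range(n))
        for i in range(n)
    ]

    # ASSUMPTION (not from the patent): cluster division visits vectors from
    # densest to sparsest; a vector joins the first cluster whose seed
    # (first member) is similar enough, otherwise it establishes a new cluster.
    clusters = []
    for i in sorted(range(n), key=lambda k: -density[k]):
        for members in clusters:
            if cosine(vectors[i], vectors[members[0]]) >= sim_threshold:
                members.append(i)
                break
        else:
            clusters.append([i])

    # From the abstract: the maximum-density vector in each cluster is its
    # center, and the corresponding sentences form the center sentence set.
    centers = [max(members, key=lambda k: density[k]) for members in clusters]
    return [sentences[k] for k in centers]


# Hypothetical sentences with hand-made 2-D "semantic vectors".
sentences = [
    "I forgot my password",
    "How do I reset my password?",
    "Password reset is not working",
    "Where is my order?",
    "How can I track my order?",
]
vectors = [(1.0, 0.4), (1.0, 0.0), (1.0, -0.4), (0.0, 1.0), (0.2, 1.0)]

center_sentences = cluster_center_sentences(sentences, vectors)
print(center_sentences)
```

In a real system the vectors would come from a sentence encoder; that step is left out because the abstract does not say which one is used.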

Description

Technical field

[0001] The embodiments of the present application relate to the field of computer technology, and specifically to a method and device for clustering sentences.

Background

[0002] Sentence clustering divides multiple sentences into different categories according to their semantics. It is currently used in many scenarios. For example, in a self-service dialogue system, users' question sentences can be clustered to analyze the overall distribution of user intents and to extract corresponding standard question sentences and answer sentences for online responses.

[0003] Existing sentence clustering algorithms usually fall into two types. The first type, such as k-means, relies on a preset number of cluster centers and preselected initial centers; because the result depends heavily on the initialization, this type of algorithm is not flexible enough. The second type is density-based clustering, such as DBSCA...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/30
Inventor: 黄强, 甘露, 卜建辉, 刘剑, 吴伟佳, 谢炜坚
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD