Improved K-means service clustering method based on topic modeling

A clustering method based on topic modeling, applied in text database clustering and classification, character and pattern recognition, semantic analysis, and related fields. It addresses the dependence of the clustering result on the choice of cut-off distance and the difficulty of picking out suitable cluster center points when the amount of data is large.

Active Publication Date: 2020-07-31
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

Although the idea behind the DPC algorithm is concise and efficient, it still has problems in practical applications: (1) the clustering result depends heavily on the choice of cut-off distance; (2) when the amount of data is large, it may be difficult to pick out appropriate cluster center points.
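
The sensitivity to the cut-off distance can be illustrated with a minimal sketch. It assumes the common DPC-style definition in which a point's local density is the number of other points closer than a cut-off distance d_c; the function name `local_density` and the sample points are hypothetical and do not come from the patent text.

```python
import math

def local_density(points, d_c):
    """For each point, count how many other points lie closer than d_c."""
    densities = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) < d_c
        )
        densities.append(count)
    return densities

# Hypothetical 2-D points: a tight group of three, a pair, and an outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]

small = local_density(points, d_c=0.05)   # cut-off too small: no point has neighbours
medium = local_density(points, d_c=0.5)   # cut-off separates the two groups
large = local_density(points, d_c=10.0)   # cut-off too large: every point sees all others
```

With a cut-off that is too small or too large, every point receives the same density and no density peak stands out, which is one way to read problem (1).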

Examples

Embodiment Construction

[0077] The present invention will be further described below in conjunction with the accompanying drawings.

[0078] Referring to Figure 1 and Figure 2, an improved K-means service clustering method based on topic modeling comprises the following steps:

[0079] The first step is to preprocess all Mashup service data that requires feature representation;

[0080] The second step is to extract functional nouns based on the preprocessed Mashup service data;

[0081] The third step is to use a topic model to represent the Mashup feature vector for the functional noun set FS of each Mashup service. The process is as follows:

[0082] An LDA model is constructed using the Mashup service information as the corpus, which yields the topic distribution of each Mashup service; this distribution is used as the feature vector of the Mashup service. Given in the form, the probability distribution of topics in the text is simplifi...
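
The third step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes LDA is fitted by collapsed Gibbs sampling, and the function name `lda_topic_vectors`, the hyperparameters `alpha` and `beta`, and the sample noun sets are all hypothetical.

```python
import random

def lda_topic_vectors(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one topic distribution per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                  # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]    # times word w is assigned to topic k
    topic_total = [0] * num_topics                                # total words assigned to topic k
    assignments = []

    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    # Resample each assignment from its conditional distribution.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][n]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed document-topic distributions serve as the feature vectors.
    vectors = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        vectors.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return vectors

# Hypothetical functional noun sets FS, one per Mashup service.
noun_sets = [
    ["map", "location", "route", "map"],
    ["photo", "image", "gallery", "photo"],
    ["map", "route", "traffic"],
    ["image", "photo", "video"],
]
feature_vectors = lda_topic_vectors(noun_sets, num_topics=2)
```

Each entry of `feature_vectors` is a probability distribution over the topics, so services can be compared in a low-dimensional topic space rather than by raw word overlap.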

Abstract

The invention discloses an improved K-means service clustering method based on topic modeling. The method comprises the following steps: step 1, preprocessing the Mashup service data that require feature representation; step 2, extracting functional nouns from the preprocessed Mashup service data; step 3, for the functional noun set FS of each Mashup service, using a topic model to represent the Mashup feature vector; step 4, calculating density information for all Mashup feature vectors participating in clustering; step 5, based on the density information calculated in step 4, screening out cluster center candidate points from all Mashup feature vectors; and step 6, from the candidate points obtained in step 5, further screening out the K most appropriate initial cluster centers and performing K-means clustering. The method improves the final effect of Mashup service clustering.
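
Steps 4 to 6 can be sketched as follows. The text above does not give the exact density measure or screening rules, so this sketch makes three labeled assumptions: density is the DPC-style neighbour count within a cut-off distance `d_c`, candidates are the points whose density is at least the mean, and the K initial centers are the candidates with the largest product of density and distance to the nearest denser point. The function name `density_seeded_kmeans` and the sample vectors are hypothetical.

```python
import math

def density_seeded_kmeans(vectors, k, d_c, max_iter=100):
    n = len(vectors)
    dist = [[math.dist(a, b) for b in vectors] for a in vectors]

    # Step 4 (assumed form): local density = number of neighbours closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Distance to the nearest point of higher density (ties broken by index);
    # the densest point receives its largest distance to any other point.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for pos in range(1, n):
        i = order[pos]
        delta[i] = min(dist[i][j] for j in order[:pos])

    # Step 5 (assumed rule): candidates are points with at least mean density.
    mean_rho = sum(rho) / n
    candidates = [i for i in range(n) if rho[i] >= mean_rho]
    if len(candidates) < k:
        candidates = list(range(n))  # fall back to all points

    # Step 6 (assumed rule): take the K candidates with the largest rho * delta.
    candidates.sort(key=lambda i: (-rho[i] * delta[i], i))
    centers = [tuple(vectors[i]) for i in candidates[:k]]

    # Standard K-means (Lloyd) iterations from the selected initial centers.
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centers[c])) for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [vectors[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical two-topic feature vectors for six Mashup services.
vectors = [(0.9, 0.1), (0.85, 0.15), (0.8, 0.2), (0.1, 0.9), (0.15, 0.85), (0.2, 0.8)]
labels, centers = density_seeded_kmeans(vectors, k=2, d_c=0.1)
```

Seeding from density peaks makes the initialization deterministic, which avoids the run-to-run variation of randomly initialized K-means.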

Description

Technical field

[0001] The invention relates to the field of Mashup service data clustering in the Web environment, and in particular to an improved K-means clustering method based on topic modeling.

Background art

[0002] As one of the core technologies of the Web 2.0 era, Mashup technology integrates heterogeneous resources by combining Web API services with different functions. Since its emergence, this convenient and efficient development technology has been favored by software developers, and many organizations have released their own Mashup services and data resources on the Internet for users to call. However, with the continuous growth of Mashup service resources on the Internet, helping users quickly locate Mashup services that meet their needs has become an urgent problem. In addition, because most of the current Mashup services lack normative WSDL documents and related service attribute descript...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/33, G06F16/35, G06F40/289, G06F40/30, G06K9/62
CPC: G06F16/3335, G06F16/3344, G06F16/35, G06F18/23213
Inventor: 陆佳炜, 马超治, 吴涵, 程振波, 徐俊, 肖刚
Owner: ZHEJIANG UNIV OF TECH