An Improved K-means Service Clustering Method Based on Topic Modeling

A clustering method and topic modeling technology, applied in text database clustering/classification, character and pattern recognition, instruments, etc. It addresses two problems: the clustering effect depends on the choice of cut-off distance, and when the amount of data is large it is hard to pick out appropriate cluster center points. The stated effect is an improved final clustering result.

Active Publication Date: 2022-04-05
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

Although the idea of the DPC algorithm is concise and efficient, problems remain in practical applications: (1) the clustering effect depends heavily on the choice of cut-off distance; (2) when the amount of data is large, it may not be easy to pick out appropriate cluster center points.
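The dependence on the cut-off distance can be illustrated with a small sketch. It assumes the usual DPC quantities (a local density `rho` that counts neighbours within the cut-off distance `d_c`, and a distance `delta` to the nearest point of higher density); the function name `dpc_scores` and the toy points are hypothetical and not taken from the patent.

```python
import math

def dpc_scores(points, d_c):
    """Return (rho, delta) per point, using the common cut-off-kernel DPC definitions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points closer than the cut-off distance d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its largest distance to any point
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Two toy clusters, each with one central point and three satellites (made-up data)
points = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (4.9, 5.0), (5.0, 5.1)]

for d_c in (0.05, 0.12, 10.0):
    rho, delta = dpc_scores(points, d_c)
    print(d_c, rho)
```

With a cut-off that is too small or too large, every point receives the same density and the central points no longer stand out; only the intermediate value separates them.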

Embodiment Construction

[0077] The present invention will be further described below in conjunction with the accompanying drawings.

[0078] Referring to Figure 1 and Figure 2, an improved K-means service clustering method based on topic modeling comprises the following steps:

[0079] In the first step, all Mashup service data that requires feature representation is preprocessed;

[0080] In the second step, functional nouns are extracted from the preprocessed Mashup service data;

[0081] In the third step, for the functional noun set FS of each Mashup service, the topic model is used to represent the Mashup feature vector. The process is as follows:

[0082] An LDA model is constructed using the Mashup service information as the corpus. This yields the topic distribution of each Mashup service, which is used as the feature vector of that service. Given in the form, the probability distribution of topics in the text is simplifi...
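As an illustration of paragraph [0082], the following minimal sketch fits LDA with a collapsed Gibbs sampler and returns one topic distribution per document. The sampler, the hyperparameters `alpha` and `beta`, the function name `lda_topic_vectors`, and the toy noun sets are all assumptions made for illustration; the source does not say how the LDA model is trained.

```python
import random

def lda_topic_vectors(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic distribution per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    doc_topic = [[0] * K for _ in docs]        # count of words in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(K)]   # count of word w assigned to topic k
    topic_total = [0] * K                      # total words assigned to topic k
    assign = []
    # Random initial topic assignment for every word occurrence
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(K)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], assign[d][i]
                # Remove the current assignment before resampling
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    # Smoothed document-topic distribution, used here as the feature vector
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + K * alpha) for t in range(K)]
        for d, doc in enumerate(docs)
    ]

# Hypothetical functional noun sets FS for four Mashup services
noun_sets = [
    ["map", "location", "route"],
    ["photo", "image", "gallery"],
    ["map", "route", "traffic"],
    ["image", "photo", "filter"],
]
vectors = lda_topic_vectors(noun_sets, num_topics=2)
for v in vectors:
    print([round(x, 3) for x in v])
```

Each returned vector has one entry per topic and sums to 1, so it can be passed directly to a distance-based clustering step.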

Abstract

An improved K-means service clustering method based on topic modeling, comprising the following steps: in the first step, all Mashup service data that requires feature representation is preprocessed; in the second step, functional nouns are extracted from the preprocessed Mashup service data; in the third step, for the functional noun set FS of each Mashup service, the topic model is used to represent the Mashup feature vector; in the fourth step, density information is calculated for all Mashup feature vectors participating in the clustering; in the fifth step, based on the density information calculated in the fourth step, candidate cluster center points are selected from all the Mashup feature vectors; in the sixth step, the K most suitable initial cluster centers are screened out of the candidates obtained in the fifth step and used for K-means clustering. The present invention improves the final effect of Mashup service clustering.
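Steps four to six can be sketched as follows. The abstract does not state the selection rules, so this sketch makes two labeled assumptions: candidates are points whose local density is at least the mean density, and the K initial centers are picked from the candidates by starting at the densest one and repeatedly adding the candidate farthest from those already chosen. The function name `density_seeded_kmeans` and the parameter `d_c` are hypothetical.

```python
import math

def density_seeded_kmeans(points, k, d_c, iters=100):
    """K-means whose initial centers come from density-based candidate points."""
    n = len(points)
    # Step 4: local density = number of other points within distance d_c
    rho = [
        sum(1 for j in range(n) if j != i and math.dist(points[i], points[j]) < d_c)
        for i in range(n)
    ]
    # Step 5 (assumed rule): keep points whose density is at least the mean density
    mean_rho = sum(rho) / n
    candidates = [i for i in range(n) if rho[i] >= mean_rho]
    if len(candidates) < k:
        raise ValueError("fewer candidate points than clusters; adjust d_c or k")
    # Step 6 (assumed rule): densest candidate first, then farthest-first among candidates
    chosen = [max(candidates, key=lambda i: rho[i])]
    while len(chosen) < k:
        rest = [i for i in candidates if i not in chosen]
        chosen.append(max(
            rest, key=lambda i: min(math.dist(points[i], points[c]) for c in chosen)))
    centers = [tuple(points[i]) for i in chosen]

    def assign(cs):
        return [min(range(k), key=lambda c: math.dist(p, cs[c])) for p in points]

    # Standard Lloyd iterations starting from the selected centers
    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return assign(centers), centers

# Toy 2-D stand-ins for Mashup feature vectors (made-up data)
points = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (4.9, 5.0), (5.0, 5.1)]
labels, centers = density_seeded_kmeans(points, k=2, d_c=0.12)
print(labels)
print(centers)
```

In practice the inputs would be the topic-distribution vectors from the third step rather than 2-D points; the code only relies on `math.dist`, which accepts coordinate sequences of any equal length.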

Description

Technical field

[0001] The invention relates to the field of Mashup service data clustering in the Web environment, and in particular to an improved K-means clustering method based on topic modeling.

Background technique

[0002] As one of the core technologies of the Web 2.0 era, Mashup technology integrates heterogeneous resources by combining Web API services with different functions. Since its introduction, this convenient and efficient development technology has been favored by software developers, and many organizations have released their own Mashup services and data resources on the Internet for users to call. However, as Mashup service resources on the Internet continue to grow, helping users quickly locate Mashup services that meet their needs has become an urgent problem. In addition, because most of the current Mashup services lack normative WSDL documents and related service attribute descript...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F16/35, G06F40/289, G06F40/30, G06K9/62
CPC: G06F16/3335, G06F16/3344, G06F16/35, G06F18/23213
Inventor: 陆佳炜, 马超治, 吴涵, 程振波, 徐俊, 肖刚
Owner: ZHEJIANG UNIV OF TECH