Improved high-frequency occupational skill life curve clustering method based on K-Means algorithm

A skill clustering method and technology, applied in computing, computer parts, instruments, etc., that addresses the difficulty of selecting center points when clustering vocational skill life curves.

Pending Publication Date: 2020-03-17
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address two difficulties: describing the life curve of a vocational skill, and selecting center points when clustering such life curves. It provides a model for describing the life curves of high-frequency vocational skills and an efficient, stable clustering method.



Examples


Embodiment 1

[0037] As shown in Figure 1, the improved high-frequency vocational skill life curve clustering method based on the K-Means algorithm proposed by the present invention comprises the following steps:

[0038] 1) Use the WebMagic crawler system to crawl user information and form a user document library:

[0039] A total of 64,442 user profiles were randomly collected from the LinkedIn website, of which 13,536 list professional skills; these 13,536 profiles form the user document library. The crawler's collection principle is shown in Figure 2. It consists of four main modules: a URL management module, a web page download module, a data parsing module, and a data persistence module. The main steps of user information collection are as follows: 1. Add the initial user information URL to the URL queue in the URL management module. 2. The downloader obtains a URL from the URL queue, and d...
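The four modules named above can be pictured with a small sketch. It is written in Python for brevity and is not WebMagic code (WebMagic is a Java framework); the function names `crawl`, `fake_download`, and `fake_parse` and the toy `FAKE_SITE` data are all hypothetical. Only the first two collection steps are visible in the text, so the rest of the loop is a generic assumption about how such a crawler usually proceeds.

```python
from collections import deque

def crawl(seed_urls, download, parse, max_pages=100):
    """Generic crawl loop: URL queue -> download -> parse -> persist."""
    queue = deque(seed_urls)   # URL management module: pending URLs
    seen = set(seed_urls)      # URLs already queued, to avoid revisiting
    store = {}                 # data persistence module (in-memory stand-in)
    while queue and len(store) < max_pages:
        url = queue.popleft()
        page = download(url)   # web page download module
        if page is None:
            continue
        record, links = parse(page)   # data parsing module
        store[url] = record
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return store

# Hypothetical stand-in for a profile site, so the sketch runs offline.
FAKE_SITE = {
    "u/1": {"skills": ["Java", "SQL"], "links": ["u/2", "u/3"]},
    "u/2": {"skills": [], "links": ["u/1"]},
    "u/3": {"skills": ["Python"], "links": []},
}

def fake_download(url):
    return FAKE_SITE.get(url)

def fake_parse(page):
    return {"skills": page["skills"]}, page["links"]

profiles = crawl(["u/1"], fake_download, fake_parse)
# Keep only profiles that list at least one skill, mirroring the
# 13,536-of-64,442 filtering described in the text.
library = {url: p for url, p in profiles.items() if p["skills"]}
```

In this toy run all three profiles are visited, and the profile without skills is dropped when the document library is formed.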



Abstract

The invention discloses an improved high-frequency occupational skill life curve clustering method based on the K-Means algorithm. The method comprises the following steps: 1) crawling user information to form a user document library; 2) mining high-frequency occupational skills; 3) constructing a life curve for each high-frequency occupational skill; 4) selecting K life curve clustering centers using a density peak algorithm; 5) clustering the occupational skill life curves using the K-Means algorithm; 6) evaluating the clustering result using the Davies-Bouldin index; if the result is unqualified, increasing the number K of clustering centers and returning to step 4; if it is qualified, proceeding to the next step; 7) obtaining the final high-frequency occupational skill life curve clustering result. By defining the high-frequency occupational skill life curve, using the density peak algorithm to select appropriate clustering center points, and then clustering with the K-Means algorithm, the method stably and effectively finds occupational skills with similar life curves.
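Steps 4 to 6 can be illustrated with a minimal NumPy sketch. It rests on several assumptions that are not stated in the text: each life curve is a fixed-length numeric vector, the density peak step ranks points by the product of local density and distance to the nearest denser point, the evaluation index in step 6 is the Davies-Bouldin index, and "qualified" means the index falls below a chosen threshold. The function names, the cutoff `dc`, the threshold, and the toy curves are all hypothetical.

```python
import numpy as np

def density_peak_centers(X, k, dc):
    """Pick k initial centers: the points with the largest rho * delta."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    rho = (dist < dc).sum(axis=1) - 1            # neighbours within dc, excluding self
    order = np.argsort(-rho, kind="stable")      # densest first, ties broken by index
    delta = np.zeros(len(X))
    delta[order[0]] = dist[order[0]].max()
    for pos in range(1, len(X)):
        i = order[pos]
        delta[i] = dist[i, order[:pos]].min()    # distance to nearest denser point
    return X[np.argsort(rho * delta)[::-1][:k]]

def kmeans(X, centers, n_iter=100):
    """Plain K-Means started from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
                        for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def davies_bouldin(X, labels, centers):
    """Davies-Bouldin index: lower means tighter, better separated clusters."""
    k = len(centers)
    s = np.array([np.linalg.norm(X[labels == j] - centers[j], axis=1).mean()
                  for j in range(k)])
    worst = [max((s[i] + s[j]) / np.linalg.norm(centers[i] - centers[j])
                 for j in range(k) if j != i) for i in range(k)]
    return float(np.mean(worst))

def cluster_curves(X, k_start=2, k_max=10, dc=1.0, threshold=0.3):
    """Increase K until the index passes the (assumed) threshold."""
    for k in range(k_start, k_max + 1):
        labels, centers = kmeans(X, density_peak_centers(X, k, dc))
        score = davies_bouldin(X, labels, centers)
        if score <= threshold:
            break
    return k, labels, score

# Toy life curves over six hypothetical age bins: rising, falling, peaked.
rng = np.random.default_rng(0)
prototypes = np.array([[0, 2, 4, 6, 8, 10],
                       [10, 8, 6, 4, 2, 0],
                       [0, 5, 10, 10, 5, 0]], dtype=float)
X = np.vstack([p + rng.normal(scale=0.3, size=(20, 6)) for p in prototypes])
k, labels, score = cluster_curves(X)
```

Starting K-Means from density peaks rather than random points is what makes repeated runs give the same result, which matches the stability the abstract claims. The sketch does not guard against duplicate centers or empty clusters, which a fuller implementation would need to handle.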

Description

Technical field

[0001] The invention relates to the field of data mining, in particular to an improved high-frequency vocational skill life curve clustering method based on the K-Means algorithm.

Background art

[0002] The life curve of a professional skill is affected by many factors, such as the difficulty of mastering the skill, the social demand for it, its physical fitness requirements, and its degree of social recognition. It is therefore very difficult to draw the life curve of a professional skill. However, with the development of the Internet, social networking platforms have become an indispensable part of it. Professional social networking sites contain a large amount of users' professional information. For example, the skills mastered by the user, the number of likes of the skill mastered by other use...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06Q50/00
CPC: G06Q50/01, G06F18/23213
Inventors: 陈冲, 司华友, 万健, 吴浩鹏, 张伟
Owner: HANGZHOU DIANZI UNIV