A clustering analysis method for software development activities based on event logs

A software development and cluster analysis technology, applicable to text database clustering/classification, unstructured text data retrieval, source code creation/generation, and related areas. It addresses the insufficient attention paid to event log data, and achieves process optimization, a reduced vector space dimension, and improved training efficiency.

Active Publication Date: 2022-03-25
YUNNAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, because event log data receives insufficient attention and the raw data is complex and difficult to understand, event logs from the software development process have not been used to their full value.

Examples

Embodiment 1

[0062] Example 1: As shown in Figure 1, this embodiment of the event-log-based cluster analysis method takes the development log file of the open-source software ArgoUML as an example and performs cluster analysis of software development activities on it. The software has been downloaded more than 100,000 times, and its log files record 17,795 events in detail from 1998 to 2015, which makes it highly representative. The process comprises four steps: extracting event log feature words (Step1), vectorizing the event log (Step2), clustering the event log (Step3), and associating software development process events with software development activities according to the clustering results (Step4).

[0063] The specific steps of the software development activity cluster analysis method are as follows (the implementation uses Python):

[0064] Step1. Event log feature word extraction....
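The source text for Step1 is truncated, so the following is only a minimal sketch of what feature word extraction from a commit message could look like. The function name `extract_feature_words`, the tiny stop-word list, and the sample message are all hypothetical and are not taken from the patent.

```python
import re

# Hypothetical stop-word list for illustration; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "for", "and", "on", "is", "was", "with"}


def extract_feature_words(message):
    """Lowercase a log message, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", message.lower())
    # Single-letter tokens carry little meaning, so they are dropped as well.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


# Made-up commit message, used only to exercise the function.
words = extract_feature_words("Fixed a bug in the UML diagram export for issue 4021")
print(words)
```

Digits such as issue numbers are discarded here because the regular expression matches letters only; whether the patented method keeps or drops them is not stated in the available text.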

Abstract

The invention relates to an event-log-based method for cluster analysis of software development activities, belonging to the technical fields of software engineering and process mining. First, natural language processing is used to analyze the text of event log data from the version control system of the software development process and to extract feature words, and the event logs are vectorized with word2vec. Next, the K-means clustering algorithm is applied to the vectorized event logs to cluster the events of software development activities, with the optimal number of clusters determined by the silhouette coefficient method. Finally, the software development activities and the relationship between events and activities are obtained. The invention can make software development event logs easier to understand, reveal the information contained in event log data, facilitate the discovery of software development activities, guide and standardize software development behavior, and provide technical support for software development.
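The clustering stage described in the abstract (K-means, with the number of clusters chosen by the silhouette coefficient) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the 2-D tuples in `events` are made-up stand-ins for word2vec-derived event vectors, all function names are hypothetical, and a deterministic farthest-point initialization is swapped in for the usual random one so the sketch is reproducible.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean_vector(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    if len(set(labels)) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def best_k(points, candidates):
    """Return the candidate k whose clustering has the highest silhouette."""
    return max(candidates, key=lambda k: silhouette(points, kmeans(points, k)))


# Toy stand-ins for event vectors: three well-separated groups of four points.
events = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
chosen_k = best_k(events, range(2, 6))
activity_labels = kmeans(events, chosen_k)
```

Each entry of `activity_labels` assigns one event to a cluster, which is the kind of event-to-activity association the abstract describes as the final result. For a log the size of the ArgoUML example, a library implementation would be the more practical choice, since this silhouette computation is quadratic in the number of events.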

Description

Technical field

[0001] The invention relates to a method for cluster analysis of software development activities based on event logs, belonging to the technical fields of software engineering and process mining.

Background

[0002] Software development involves a series of development activities, which generate event log data. This growing body of event log data records the specific details of how software development activities are carried out, and is of great significance for avoiding project risks, improving project maturity and control, and ensuring the quality of software products. However, because event log data receives insufficient attention and the raw data is complex and difficult to understand, event logs from the software development process have not been used to their full value. Using process mining to mine and analyze software development activities will help re...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F8/30, G06F16/35, G06F40/289
CPC: G06F8/30, G06F40/289
Inventors: 唐明靖, 文斌, 王俊, 陈建兵, 邹伟
Owner: YUNNAN NORMAL UNIV