Software project knowledge graph automatic construction method and system

This technology applies to the automatic construction of software project knowledge graphs. It addresses isolated information islands, software resources that cannot be organized effectively, and the effort wasted by people who reuse software, and it offers strong extensibility and a wide range of application.

Status: Inactive
Publication Date: 2018-06-22
PEKING UNIV

AI Technical Summary

Problems solved by technology

Multiple types of data coexist, which creates isolated islands of information; the hidden correlations among software resource data still need to be explored;
[0004] 2) Much software resource data exists in natural-language form, and machines cannot understand its semantic inf



Examples


Embodiment 1

[0052] In this embodiment, the user wants to build a software project knowledge graph that contains the vertices and associations related to source code and question-and-answer documents. The software project is the open-source project Apache Lucene. The specific implementation steps are as follows:

[0053] 1) Prepare the raw data required to generate the knowledge graph. Generating this knowledge graph requires the complete source code of one version of Apache Lucene and the StackOverflow archive files;

[0054] 2) Insert the data parsing plug-ins. This step parses the source code and the question-and-answer documents, using a source code parsing plug-in and a question-and-answer document parsing plug-in respectively.

[0055] The source code parsing plug-in parses the source code written in Java to obtain the code structure graph shown in Figure 2, and stores the entities and association relationships in that structure graph into the grap...
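As a rough illustration of what a source code parsing plug-in does, the sketch below extracts a type and its methods from Java text and stores them as vertices and edges. It is not the patent's parser: the `Graph` class is a hypothetical in-memory stand-in for a graph database, the regular expressions are a deliberately crude substitute for a real Java parser, and the labels `Class`, `Method` and `HAS_METHOD` are assumed names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical in-memory stand-in for a graph database."""
    vertices: dict = field(default_factory=dict)  # vertex id -> properties
    edges: list = field(default_factory=list)     # (source id, relation, target id)

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, **props}

    def add_edge(self, src, relation, dst):
        self.edges.append((src, relation, dst))


# Crude patterns (assumption): a real plug-in would use a proper Java parser.
TYPE_RE = re.compile(r"\b(class|interface)\s+(\w+)")
METHOD_RE = re.compile(
    r"\b(?:public|protected|private)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*\("
)


def parse_java_source(source, graph):
    """Store the first declared type and its methods as vertices and edges."""
    type_match = TYPE_RE.search(source)
    if type_match is None:
        return
    kind, type_name = type_match.group(1), type_match.group(2)
    graph.add_vertex(type_name, kind.capitalize(), name=type_name)
    for method_match in METHOD_RE.finditer(source):
        method_name = method_match.group(1)
        method_id = f"{type_name}.{method_name}"
        graph.add_vertex(method_id, "Method", name=method_name)
        graph.add_edge(type_name, "HAS_METHOD", method_id)


# Illustrative input only; not taken from the Lucene code base.
SAMPLE = """
public class IndexWriter {
    public long addDocument(Document doc) { return 0; }
    private void flush() {}
}
"""

graph = Graph()
parse_java_source(SAMPLE, graph)
```

Keeping the storage interface down to `add_vertex` and `add_edge` means the same plug-in could in principle target a real graph database by swapping out the `Graph` class.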

Embodiment 2

[0069] In this embodiment, the user wants to extract new "code to question-and-answer document association" knowledge from a basic knowledge graph that already contains source code vertices and question-and-answer document vertices, thereby constructing a new software project knowledge graph. The specific steps are as follows:

[0070] 1) Prepare the basic knowledge graph for knowledge refinement. This embodiment uses the knowledge graph generated in Embodiment 1.

[0071] 2) Insert the knowledge extraction plug-in. This embodiment requires the plug-in that traces associations to code elements in question-and-answer documents.

[0072] The plug-in works as follows:

[0073] 2-1) Use the name attribute of each code element (class, method, interface) in the source code to build an index over the code element vertices in the graph database;

[0074] 2-2) Traverse the question-and-answer document vertices in the graph database; if the text content of a vertex contains an indexable code el...
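Steps 2-1 and 2-2 can be sketched roughly as follows. Because the text of step 2-2 is cut off, the matching rule (whole identifier tokens) and the resulting `MENTIONS` edge are assumptions made for illustration, as are the function names and the dictionary layout of the vertices.

```python
import re


def build_name_index(code_vertices):
    """Step 2-1: map each code element name to the ids of the vertices bearing it."""
    index = {}
    for vertex_id, props in code_vertices.items():
        index.setdefault(props["name"], set()).add(vertex_id)
    return index


def link_documents(doc_vertices, index):
    """Step 2-2 (assumed rule): link a document to every code element whose
    name appears in the document text as a whole identifier token."""
    edges = []
    for doc_id, props in doc_vertices.items():
        tokens = set(re.findall(r"[A-Za-z_]\w*", props["text"]))
        for name in sorted(tokens.intersection(index)):
            for code_id in sorted(index[name]):
                edges.append((doc_id, "MENTIONS", code_id))
    return edges


# Illustrative data only.
code_vertices = {
    "c1": {"name": "IndexWriter"},
    "m1": {"name": "addDocument"},
}
doc_vertices = {
    "q1": {"text": "How do I call IndexWriter.addDocument safely?"},
    "q2": {"text": "Unrelated question about Maven"},
}

index = build_name_index(code_vertices)
edges = link_documents(doc_vertices, index)
```

Matching on whole tokens rather than substrings avoids linking a short name such as `get` to every word that merely contains it. Common names will still produce many links, so a real plug-in would likely need further filtering.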


Abstract

The invention relates to a method and system for automatically constructing a software project knowledge graph. The method comprises the steps of: 1) parsing the original software resource data to obtain the basic knowledge entities and entity relationships of a software project, and storing them in a graph database as vertices and edges; 2) based on the existing basic knowledge entities and entity relationships, establishing new relationships between entities by a knowledge abstraction method, and/or adding new basic knowledge entities and entity relationships to the knowledge graph, and storing them in the graph database as vertices and edges; and 3) selecting some or all of the basic knowledge entities and entity relationships to form the knowledge graph of the software project. Each software resource data parsing method and each knowledge abstraction method exists in the form of a plug-in; the required plug-ins are selected and run to generate the knowledge graph of the software project. The invention solves the problem of extracting and organizing domain-specific knowledge from multi-source heterogeneous software resources, has a wide range of application, and is highly extensible.
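The plug-in organization described in the abstract can be pictured with the minimal sketch below. It assumes each plug-in is a plain function that mutates a shared store; the names `build_knowledge_graph`, `REGISTRY` and `keep_labels` are hypothetical, and the dictionary store stands in for a graph database.

```python
def parse_code(store):
    """Parsing plug-in (step 1): adds a code vertex. Illustrative data only."""
    store["vertices"]["code:IndexWriter"] = {"label": "Class", "name": "IndexWriter"}


def parse_qa(store):
    """Parsing plug-in (step 1): adds a question vertex. Illustrative data only."""
    store["vertices"]["qa:1"] = {
        "label": "Question",
        "text": "How do I close an IndexWriter?",
    }


def link_code_to_qa(store):
    """Knowledge abstraction plug-in (step 2): derives new edges from existing vertices."""
    items = list(store["vertices"].items())
    classes = [(vid, v) for vid, v in items if v["label"] == "Class"]
    for doc_id, doc in items:
        if doc["label"] != "Question":
            continue
        for code_id, code in classes:
            if code["name"] in doc["text"]:
                store["edges"].append((doc_id, "MENTIONS", code_id))


REGISTRY = {
    "parse_code": parse_code,
    "parse_qa": parse_qa,
    "link_code_to_qa": link_code_to_qa,
}


def build_knowledge_graph(plugin_names, registry, keep_labels=None):
    """Run the selected plug-ins in order, then keep some or all vertices (step 3)."""
    store = {"vertices": {}, "edges": []}
    for name in plugin_names:
        registry[name](store)
    if keep_labels is None:
        return store
    kept = {vid: v for vid, v in store["vertices"].items() if v["label"] in keep_labels}
    # Drop edges whose endpoints were not both retained.
    edges = [e for e in store["edges"] if e[0] in kept and e[2] in kept]
    return {"vertices": kept, "edges": edges}


selected = ["parse_code", "parse_qa", "link_code_to_qa"]
full = build_knowledge_graph(selected, REGISTRY)
code_only = build_knowledge_graph(selected, REGISTRY, keep_labels={"Class"})
```

In this sketch the order of `plugin_names` matters: an abstraction plug-in only sees vertices that earlier plug-ins have already stored.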

Description

Technical field

[0001] The invention belongs to the technical field of computer software and relates to a technology for automatically constructing knowledge graphs of software projects, in particular to a method and system for automatically constructing knowledge graphs of software projects that contain multi-source heterogeneous resources.

Background

[0002] A software project usually includes multiple types of software resources, such as source code, Q&A documents, requirements and design documents, defect reports, and mailing lists. During software reuse, helping users obtain the software resources they need quickly and effectively requires mining and exploiting rich domain-specific knowledge. At present, acquiring domain-specific knowledge in software projects is a time-consuming and labor-intensive process for reusers, and it involves the following two main difficulties:

[0003] 1) Software resource data is usually multi-sourc...


Application Information

IPC(8): G06F8/74
CPC: G06F8/74
Inventors: 谢冰, 沈琦, 林泽琦, 邹艳珍, 赵俊峰
Owner: PEKING UNIV