Method and system for extracting a knowledge graph from software project data and answering questions over the knowledge graph

A software project knowledge graph technology, applied in the field of computer software, that addresses the multi-source heterogeneity, missing associations, and difficulty of analysis and mining of software project data, and provides friendly, easy-to-use automatic question-answering support with good query results.

Active Publication Date: 2018-12-07
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0004] Current software project data is multi-source and heterogeneous, lacks associations, and is difficult to analyze and mine. To address these problems, the purpose of the present invention is to provide a method and system for extracting knowledge graphs from software project data and answering questions.


Examples


Embodiment

[0033] In this embodiment, the user needs to extract a knowledge graph from the data of the open source software project Apache Lucene. The data is of several different types, including:

[0034] 82.4MB source code data;

[0035] 368MB git repository data;

[0036] 1.98GB defect report data;

[0037] 1.08GB mail data;

[0038] 171MB StackOverflow Q&A document data.

[0039] Through module 1 and module 2, the present invention automatically extracts the corresponding entities and association relationships from these data and stores them in a Neo4j graph database. The following are some examples of extracted entities and relationships:

[0040] The class IndexReader is an entity, and the method maxDoc is also an entity. The former has an edge of type "declaration method" pointing to the latter;

[0041] The class AutomaticReader is an entity, and there is an edge of type "inheritance" pointing to the class IndexReader;

[0042] A developer entity named Alex can be ...
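The entities and edges in the examples above can be pictured as a small property graph. The following is a minimal in-memory sketch, not the patent's implementation and not the Neo4j API: the `Graph` class, its method names, and the label strings `Class`, `Method`, `declares_method`, and `inherits` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Tiny in-memory property graph (hypothetical helper, not Neo4j)."""
    nodes: dict = field(default_factory=dict)  # entity name -> label
    edges: set = field(default_factory=set)    # (source, edge_type, target)

    def add_entity(self, name, label):
        self.nodes[name] = label

    def add_edge(self, source, edge_type, target):
        # Both endpoints must already exist as entities.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("unknown entity")
        self.edges.add((source, edge_type, target))

    def targets(self, source, edge_type):
        """Names reachable from `source` over one edge of `edge_type`."""
        return sorted(t for s, e, t in self.edges
                      if s == source and e == edge_type)


def build_example_graph():
    """Encode the entities and edges named in paragraphs [0040]-[0041]."""
    g = Graph()
    g.add_entity("IndexReader", "Class")
    g.add_entity("maxDoc", "Method")
    g.add_entity("AutomaticReader", "Class")
    g.add_edge("IndexReader", "declares_method", "maxDoc")
    g.add_edge("AutomaticReader", "inherits", "IndexReader")
    return g
```

Calling `build_example_graph().targets("IndexReader", "declares_method")` lists the methods that the class declares. In a real deployment the same triples would be written to the graph database instead of a Python set.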


Abstract

The invention discloses a method and system for extracting a knowledge graph from software project data and answering questions over the knowledge graph. The method comprises the following steps. For each type of software project data in a software project database, entities and the association relationships between them are extracted from that type of data and stored in a corresponding graph database. Based on traceability-link recovery techniques for software data, the data in the graph databases is correlated to obtain the association relationships between entities from different types of software project data. Corresponding edges are then added to the graph databases according to these cross-type association relationships, so that entities from different sources are connected and the knowledge graph of the software project data is generated. For an input natural language query, a matching connected subgraph is retrieved from the knowledge graph and returned as the answer. The method and system address the missing data associations, severe information isolation, and difficulty of joined query and analysis in software project data.
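The last two steps of the abstract, connecting entities from different sources and answering a query with a connected subgraph, can be sketched as follows. This is only an illustration under stated assumptions: the patent text here does not specify its traceability technique or its query matching, so the sketch substitutes a simple whole-word name-mention heuristic for linking and a breadth-first shortest-path search for the answer subgraph. The function names `link_by_mention` and `connecting_subgraph` and the `mentions` edge type are hypothetical.

```python
import re
from collections import deque


def link_by_mention(code_names, documents):
    """Assumed traceability heuristic: a document mentions a code entity
    when the entity name occurs in the document text as a whole word."""
    edges = set()
    for doc_id, text in documents.items():
        words = set(re.findall(r"[A-Za-z_]\w*", text))
        for name in code_names:
            if name in words:
                edges.add((doc_id, "mentions", name))
    return edges


def connecting_subgraph(edges, query):
    """Return the nodes on shortest paths joining the entities named in
    the query, treating every edge as undirected."""
    neighbours = {}
    for source, _, target in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    # Assumed query matching: entity names that appear verbatim in the query.
    words = set(re.findall(r"[A-Za-z_]\w*", query))
    matched = sorted(n for n in neighbours if n in words)
    if not matched:
        return set()

    # Breadth-first search from the first matched entity.
    start = matched[0]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    # Walk back from every other matched entity that was reached.
    result = {start}
    for goal in matched[1:]:
        node = goal if goal in parent else None
        while node is not None:
            result.add(node)
            node = parent[node]
    return result
```

Given code edges for `IndexReader`, `maxDoc`, and `AutomaticReader` plus `mentions` edges produced from mail or Q&A text, a query that names two entities yields the chain of nodes linking them, which is one simple reading of "a matching connected subgraph".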

Description

Technical field

[0001] The invention relates to a method and system for extracting a knowledge graph from software project data and answering questions over it, belonging to the technical field of computer software.

Background

[0002] Reuse of existing large-scale software projects is an important way for software enterprises to improve productivity and software quality. Successful reuse requires that the reuser can quickly and correctly learn and understand a large amount of relevant knowledge about the software project, such as domain concepts, system architecture, interface design, and change history. This knowledge is contained in the multi-source heterogeneous data generated during the entire life cycle of a software project, such as source code, requirements documents, design documents, version repositories, defect databases, email records, forum discussions, and technical blogs.

[0003] At present, a large number of researchers in t...


Application Information

IPC(8): G06F17/30, G06F8/75
CPC: G06F8/75
Inventors: 谢冰, 林泽琦, 邹艳珍, 赵俊峰
Owner: PEKING UNIV