Semantic association mining method for defect report and mail list

A defect report and mailing list technology, applied in semantic analysis, database retrieval, and network data retrieval. It addresses the problems that defect reports and mailing lists are complicated in content, that the associations between them are complex, and that sorting them manually is time-consuming and labor-intensive.

Active Publication Date: 2017-05-10
北京大学(天津滨海)新一代信息技术研究院

AI Technical Summary

Problems solved by technology

[0006] 1. The number of defect reports and mailing list messages is huge.
[0007] 2. The content of defect reports and mailing lists is complicated.
[0008] 3. The relationship between defect reports and mailing lists is complicated.
[0010] To sum up, defect reports and mailing lists contain a wealth of software project information, but the volume of information is huge, the content is complicated, and the associations are complex. Sorting them manually takes a great deal of time and effort, and locating the information developers care about across documents is very difficult.




Embodiment Construction

[0111] In this example, the semantic associations between the defect reports and the mailing list of the Lucene project are mined, and the graph database Neo4j is used to store and query the associations in order to verify the effect of the method.

[0112] As described earlier, the method runs in the following order: acquire resources → mine explicit semantic associations → mine implicit semantic associations → query documents.

[0113] The execution steps of the document acquisition and parsing unit are as follows:

[0114] Step 1: Construct the file address, such as "http://mail-archives.apache.org/mod_mbox/lucene-general/201509.mbox", and crawl the two types of documents according to the address.
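
The address construction in Step 1 can be sketched as follows. This is a minimal illustration that assumes every monthly archive follows the pattern of the example address above (list name plus a `YYYYMM` stamp). The names `build_mbox_url` and `monthly_urls` are hypothetical and do not come from the patent, and the download itself is left out.

```python
from typing import Iterator, Tuple

# Base address taken from the example in Step 1.
BASE_URL = "http://mail-archives.apache.org/mod_mbox"


def build_mbox_url(list_name: str, year: int, month: int) -> str:
    """Build the archive address for one mailing list and one month."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return f"{BASE_URL}/{list_name}/{year:04d}{month:02d}.mbox"


def monthly_urls(
    list_name: str, start: Tuple[int, int], end: Tuple[int, int]
) -> Iterator[str]:
    """Yield one address per month from start to end, both inclusive.

    start and end are (year, month) pairs.
    """
    year, month = start
    while (year, month) <= end:
        yield build_mbox_url(list_name, year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

Calling `build_mbox_url("lucene-general", 2015, 9)` reproduces the example address from Step 1.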

[0115] Step 2: Parse the project defect reports as JSON, and store the parsed text in the Neo4j database according to its type.
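
A minimal sketch of the parsing in Step 2, which separates a report into the three kinds of text named in the abstract (stack information, code fragments, body text). Several things here are assumptions and not part of the patent: the JSON field names `key`, `summary`, and `description`, the use of `{code}` markers to delimit code fragments, and the rule that a Java stack frame is a line starting with `at`. Storing the result in Neo4j is omitted.

```python
import json
import re

# Assumed shape of a Java stack frame line: "at pkg.Class.method(File.java:10)".
STACK_LINE = re.compile(r"^\s*at\s+[\w.$<>]+\(.*\)\s*$")
# Assumed code delimiter: {code} ... {code}, optionally {code:java}.
CODE_BLOCK = re.compile(r"\{code(?::[^}]*)?\}(.*?)\{code\}", re.DOTALL)


def parse_defect_report(raw: str) -> dict:
    """Split one JSON defect report into stack, code, and body text."""
    report = json.loads(raw)
    description = report.get("description") or ""

    # Pull out code fragments first so they do not pollute the body text.
    code_fragments = [m.strip() for m in CODE_BLOCK.findall(description)]
    remaining = CODE_BLOCK.sub("", description)

    stack, body = [], []
    for line in remaining.splitlines():
        target = stack if STACK_LINE.match(line) else body
        target.append(line.strip())

    return {
        "key": report.get("key"),
        "summary": report.get("summary", ""),
        "stack": stack,
        "code": code_fragments,
        "body": "\n".join(line for line in body if line),
    }
```

Keeping the three text types apart at parse time lets later steps treat code elements and natural-language text differently.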

[0116] Step 3: Parse the project mailing list using MIME4J, and store the parsed text in the Neo4j database according to ...
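
As a rough illustration of Step 3, the sketch below parses one raw message with Python's standard `email` package. This is a stand-in for MIME4J, which is a Java library, and it is not the implementation described in the patent. The function names and the choice of extracted headers are assumptions.

```python
from email import message_from_string
from email.message import Message


def _plain_text(msg: Message) -> str:
    """Concatenate all text/plain parts of a (possibly multipart) message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, or None for containers
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts).strip()


def parse_mail(raw: str) -> dict:
    """Extract the headers and body text needed for association mining."""
    msg = message_from_string(raw)
    return {
        "message_id": msg.get("Message-ID"),
        "in_reply_to": msg.get("In-Reply-To"),  # None for a thread's first mail
        "subject": msg.get("Subject", ""),
        "body": _plain_text(msg),
    }
```

The `In-Reply-To` header is kept because it allows messages to be grouped into threads before they are linked to defect reports.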


Abstract

The invention discloses a semantic association mining method for defect reports and mailing lists. The method comprises the following steps: 1, the defect reports and the mailing list of an acquired target project are parsed to obtain the stack information, code fragments, and body text of the defect reports and the stack information, code fragments, and body text of the mailing list; 2, an explicit document semantic association mining unit identifies explicit semantic associations between the defect reports and the mailing list according to the parsing result, wherein the explicit semantic associations comprise citation associations and common code element associations; 3, an implicit document semantic association mining unit identifies implicit semantic associations between the defect reports and the mailing list according to the parsing result, wherein the implicit semantic associations comprise similarity associations and latent semantic associations. With this method, related defect reports and mailing list messages can be located efficiently, helping developers make better reuse of software resources.
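
The association types named in the abstract can be illustrated with a small sketch. The detection rules below are assumptions made for illustration only, since this text does not state how the patent detects each association: a citation is taken to be a mention of an issue key such as `LUCENE-1234`, a common code element is taken to be a CamelCase identifier that appears in both documents, and similarity is taken to be the cosine of raw term-count vectors. Latent semantic association is not covered.

```python
import math
import re
from collections import Counter

ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")              # e.g. LUCENE-1234
CODE_ELEMENT = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b")  # CamelCase
WORD = re.compile(r"[a-z]+")


def cites(mail_text: str, report_key: str) -> bool:
    """Explicit citation: the mail mentions the report's issue key."""
    return report_key in ISSUE_KEY.findall(mail_text)


def shared_code_elements(report_text: str, mail_text: str) -> set:
    """Explicit common code elements: identifiers present in both texts."""
    return set(CODE_ELEMENT.findall(report_text)) & set(
        CODE_ELEMENT.findall(mail_text)
    )


def cosine_similarity(a: str, b: str) -> float:
    """Implicit similarity: cosine of the two term-count vectors."""
    va = Counter(WORD.findall(a.lower()))
    vb = Counter(WORD.findall(b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0
```

Matching whole issue keys, not substrings, keeps a mail that mentions `LUCENE-1234` from being linked to `LUCENE-12`.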

Description

Technical field

[0001] The invention is used in the process of software reuse to mine the semantic associations between defect reports and mailing lists, reducing the searching, reading, and learning burden on developers.

Background

[0002] In the software development process, software reuse aims to make full use of the knowledge and experience accumulated in past application development, avoid duplicated work, and focus development on the components unique to the application, thereby improving the efficiency and quality of software development.

[0003] In recent years, a large number of open source project hosting sites have emerged on the Internet, and more and more high-quality software has appeared. These projects not only provide rich code resources; some excellent and mature open source projects have also produced a large number of documents in various forms. Defect reports and mailing lists are two of...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/951, G06F40/30
Inventors: 赵俊峰, 陈秀招, 曹英魁
Owner: 北京大学(天津滨海)新一代信息技术研究院