Code searching method based on semantics

A semantic-based code search method. It addresses shortcomings of existing code search, such as the inability to accurately locate class names, method names, and variables, reliance on keywords alone, and inaccurate search results, and thereby improves search efficiency and precision while giving richer search results.

Inactive Publication Date: 2011-05-18
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Cites: 4; Cited by: 32

AI Technical Summary

Problems solved by technology

[0002] Current code search engines such as Google Code Search and Koders mainly retrieve files from open source projects on the Internet and ignore code fragments posted on large blogs and forums, which leaves their search scope relatively narrow.
They mainly use full-text indexing to index publicly released code files so that the searched code can be located quickly, but they do not identify the structural information of the code and cannot accurately locate descriptive information such as class names, method names, and variables, which makes search results inaccurate.
[0003] Some current mainstream search engines have the following defects: 1. The code search range is small, because only the files of some open source projects on the network are retrieved; 2. The search results are inaccurate, because retrieval is a full-text search and cannot use the structural information of the code (such as class names, method names, and variable names).
In short, existing search technology relies only on keywords and has a limited search scope.
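The gap between full-text matching and structure-aware matching can be illustrated with a small sketch. It uses Python's built-in `ast` module purely as a stand-in for the Java AST tooling the patent relies on; the sample source and the function names `full_text_match` and `structural_index` are assumptions made for illustration, not part of the patent.

```python
import ast

# Hypothetical sample: "parser" occurs only as a local variable name.
SOURCE = '''
class ParserCache:
    def load(self, path):
        parser = open(path).read()
        return parser
'''

def full_text_match(source: str, keyword: str) -> bool:
    """Keyword search: true if the keyword appears anywhere in the text."""
    return keyword.lower() in source.lower()

def structural_index(source: str) -> dict[str, set[str]]:
    """Collect class, method, and assigned variable names from the syntax tree."""
    index = {"class": set(), "method": set(), "variable": set()}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            index["class"].add(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index["method"].add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            index["variable"].add(node.id)
    return index

index = structural_index(SOURCE)
```

Here `full_text_match(SOURCE, "parser")` is true, yet the structural index shows that `parser` is a variable and not a method or class name, which is the distinction a plain keyword index cannot make.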


Embodiment Construction

[0021] The system adopted in the present invention is divided into four modules: a data collection module (Data Collection), a data analysis module (Data Analysis), a data sorting module (Data Sort), and a user interaction module (User Interaction).
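As a rough sketch of how the four modules might be chained, the following wires one function per module into a pipeline. Everything here is assumed for illustration: the `Snippet` type, the in-memory `CORPUS`, the regular expression standing in for real syntax analysis, and the one-factor score are placeholders, not the patent's implementation.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Snippet:
    url: str
    code: str
    names: set[str] = field(default_factory=set)
    score: float = 0.0

# Hypothetical corpus standing in for search-engine and crawler output.
CORPUS = [
    Snippet("http://example.com/a", "class HttpClient { void send() {} }"),
    Snippet("http://example.com/b", "// how to send mail\nclass Mailer { }"),
]

def collect(query: str) -> list[Snippet]:
    """Data Collection: keep snippets whose text mentions the query."""
    return [s for s in CORPUS if query.lower() in s.code.lower()]

def analyze(snippets: list[Snippet]) -> list[Snippet]:
    """Data Analysis: placeholder name extraction (a regex, not a real AST)."""
    for s in snippets:
        s.names = set(re.findall(r"\b(?:class|void)\s+(\w+)", s.code))
    return snippets

def sort_results(snippets: list[Snippet], query: str) -> list[Snippet]:
    """Data Sort: score 1.0 when the query is a declared name, else 0.0."""
    for s in snippets:
        s.score = 1.0 if query in s.names else 0.0
    return sorted(snippets, key=lambda s: -s.score)

def present(snippets: list[Snippet]) -> list[tuple[str, float]]:
    """User Interaction: return (url, score) pairs, best first."""
    return [(s.url, s.score) for s in snippets]

def search(query: str) -> list[tuple[str, float]]:
    return present(sort_results(analyze(collect(query)), query))
```

Calling `search("send")` ranks the snippet that declares a `send` method above the one that only mentions the word in a comment.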

[0022] The implementation of each module is described below:

[0023] 1. Data collection module

[0024] There are two ways to obtain ICS data sources. The first is to call Google Code Search and Koders, the mainstream code search engines on the market, and obtain the first 10 pages of search results for the keywords entered by the user (to improve search efficiency with little impact on precision, the system keeps only the first ten pages as search results). The second is to use the crawler tool JoBo, with some website addresses preset in the configuration file, such as CSDN, CVS repositories, Subversion repositories, etc. The crawler will then automatically search for the code source under the we...
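The two collection paths could be combined along the following lines. This is a minimal sketch under stated assumptions: `PageFetcher` is a hypothetical callable supplied by the caller (no real search-engine or JoBo API is used), and merging by URL with de-duplication is an assumed detail that the text does not describe.

```python
from typing import Callable, Iterable

MAX_PAGES = 10  # the text keeps only the first ten result pages

# Hypothetical signature: (query, page number) -> list of result URLs.
PageFetcher = Callable[[str, int], list[str]]

def first_pages(fetch: PageFetcher, query: str,
                max_pages: int = MAX_PAGES) -> list[str]:
    """Collect results page by page, stopping after max_pages."""
    urls: list[str] = []
    for page in range(1, max_pages + 1):
        results = fetch(query, page)
        if not results:  # the engine ran out of results early
            break
        urls.extend(results)
    return urls

def merge_sources(sources: Iterable[list[str]]) -> list[str]:
    """Merge engine and crawler results, dropping duplicate URLs (assumed step)."""
    seen: set[str] = set()
    merged: list[str] = []
    for urls in sources:
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

def fake_engine(query: str, page: int) -> list[str]:
    """Stand-in engine with 12 pages of 2 results each."""
    if page > 12:
        return []
    return [f"https://engine.example/{query}/{page}/{i}" for i in range(2)]

engine_urls = first_pages(fake_engine, "sort")
crawler_urls = ["https://forum.example/post/1", engine_urls[0]]
all_urls = merge_sources([engine_urls, crawler_urls])
```

Although the stand-in engine offers 12 pages, only the first ten are read, and the URL found by both sources appears once in the merged list.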

Abstract

The invention discloses a semantic-based code search method. The system used by the method mainly comprises a data collection module, a data analysis module, a data sorting module, and a user interaction module. In the data collection module, the crawler tool JoBo lets certain website addresses be preset in a configuration file, so that code sources are fetched from the preset forums and blogs as quickly and efficiently as possible. The code sources fetched by JoBo undergo semantic analysis using the abstract syntax tree (AST) framework of the open-source tool JDT (Java Development Tools). The data sorting module matches against the keywords entered by the user; once the corresponding search results have been obtained through analysis, five factors are considered together to score them, and the results are displayed to the user in order from the highest score to the lowest. Building on the best existing search engines, the method adds semantic information identification and sorting, defines the search range by configuring a crawler, improves search efficiency and precision, and takes user preferences into account.
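The abstract does not name the five factors or say how they are combined. One common way to sort by several factors is a weighted sum, sketched below; the factor names, weights, and sample values are all hypothetical placeholders and should not be read as the patent's scoring formula.

```python
# Hypothetical factor names and weights; the source only says "five factors".
WEIGHTS = {
    "keyword_match": 0.30,
    "structure_match": 0.30,
    "source_quality": 0.15,
    "freshness": 0.10,
    "user_preference": 0.15,
}

def score(factors: dict[str, float]) -> float:
    """Weighted sum of factor values in [0, 1]; missing factors count as 0."""
    return sum(weight * factors.get(name, 0.0) for name, weight in WEIGHTS.items())

def rank(results: list[tuple[str, dict[str, float]]]) -> list[tuple[str, float]]:
    """Return (result id, score) pairs ordered from highest score to lowest."""
    scored = [(result_id, score(factors)) for result_id, factors in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranked = rank([
    ("comment-only hit", {"keyword_match": 1.0, "source_quality": 0.5}),
    ("method-name hit", {"keyword_match": 1.0, "structure_match": 1.0,
                         "user_preference": 0.5}),
])
```

In this sample the result that also matches on code structure outranks the one that matches on keywords alone, which reflects the ordering from higher to lower score that the abstract describes.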

Description

Technical field

[0001] The invention relates to a code search method, in particular to a semantic-based code search method.

Background technique

[0002] Current code search engines such as Google Code Search and Koders mainly search files from open source projects on the network and ignore code fragments posted on large blogs and forums, which leaves their search scope relatively narrow. They mainly use full-text indexing to index publicly released code files so that the searched code can be located quickly, but they do not identify the structural information of the code and cannot accurately locate descriptive information such as class names, method names, and variables, which makes the search results inaccurate.

[0003] Some current mainstream search engines have the following defects: 1. The code search range is small, because only the files of some open source projects on the network are retrieved; 2. The search results are inaccurate, because the retrieval method is...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 钱巨, 黄志球, 刘通, 洪宏
Owner NANJING UNIV OF AERONAUTICS & ASTRONAUTICS