Open source project personalized retrieval recommendation method based on GitHub software warehouse data set

A recommendation method built on software warehouse data, applied in digital data information retrieval, unstructured text data retrieval, electronic digital data processing, and related fields. It addresses the difficulty developers have in finding open source software, and aims to judge semantic similarity accurately, improve the accuracy of search results, and make discovery more efficient.

Pending Publication Date: 2021-03-26
SHANGHAI MARITIME UNIVERSITY

AI Technical Summary

Problems solved by technology

As the number of open source software projects grows, it becomes difficult for developers to find high-quality open source software quickly using traditional retrieval methods.


Embodiment Construction

[0040] The open source project personalized retrieval and recommendation method, electronic device, and readable storage medium based on the GitHub software warehouse data set provided by the present invention are described in further detail below with reference to Figures 1 to 3 and the specific embodiments. The advantages and features of the present invention will become clearer from the following description. Note that the drawings are highly simplified and not drawn to precise scale; they serve only to illustrate the embodiments of the present invention conveniently and clearly. To make the objects, features, and advantages of the present invention easier to understand, please refer to the accompanying drawings. It should be noted that the structures, proportions, sizes, etc. shown in the drawings attached to this specification are only used to match the content disclosed in the specification, for those who are famili...

Abstract

The invention provides an open source project personalized retrieval and recommendation method based on a GitHub software warehouse data set. The method comprises the following steps: preprocessing a GitHub activity data set to form a "title-description-URL" data set and a "title-Star-watch-fork" data set; building a keyword search engine on the Milvus search engine combined with a BERT pre-trained model, with the "title-description-URL" data set as the search data source; receiving a query keyword entered by a user and using the keyword search engine to locate software resources, yielding an open source project candidate set; scoring the quality of each candidate in the candidate set according to the "title-Star-watch-fork" data set; and recommending the Top-N candidates to the user according to the quality scores. Because the invention evaluates the quality of open source software projects, the quality of the search results improves and the retrieved projects are more useful as references.
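The last two steps of the abstract (quality scoring and Top-N recommendation) can be sketched roughly as follows. The abstract does not give the scoring formula, so the weighted sum of log-scaled Star, watch, and fork counts below is purely an assumption for illustration, as are the names `Candidate`, `quality_score`, `recommend_top_n`, and the weights.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One row of a hypothetical "title-Star-watch-fork" data set."""
    title: str
    stars: int
    watches: int
    forks: int


def quality_score(c, weights=(0.5, 0.2, 0.3)):
    # Assumed formula: weighted sum of log-scaled counts.
    # log1p dampens very large counts and is defined for zero.
    w_star, w_watch, w_fork = weights
    return (w_star * math.log1p(c.stars)
            + w_watch * math.log1p(c.watches)
            + w_fork * math.log1p(c.forks))


def recommend_top_n(candidates, n):
    # Rank the candidate set by quality score, highest first, and keep N.
    return sorted(candidates, key=quality_score, reverse=True)[:n]


# Toy candidate set standing in for the search engine's output.
candidates = [
    Candidate("repo-a", stars=1200, watches=80, forks=300),
    Candidate("repo-b", stars=15, watches=3, forks=1),
    Candidate("repo-c", stars=400, watches=40, forks=90),
]
top = recommend_top_n(candidates, 2)
print([c.title for c in top])
```

Any monotone combination of the three counts would fit the abstract equally well; the log scaling is only one way to keep a single very popular project from dominating the ranking.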

Description

Technical field

[0001] The invention belongs to the technical field of personalized recommendation for open source project retrieval, and in particular relates to a method for personalized retrieval and recommendation of open source projects based on a GitHub software warehouse data set, an electronic device, and a readable storage medium.

Background technique

[0002] BERT is a pre-trained language representation method released by Google. It trains a general-purpose language understanding model on a large text corpus and then uses that model for downstream NLP tasks such as question answering and sentiment analysis. Unlike the earlier Word2Vec and ELMo methods, it is the first unsupervised, deeply bidirectional system for NLP pre-training, so BERT performs far better than previous methods on downstream NLP tasks. In experiments, BERT set new best results on 11 natural language understanding tasks.

[0003] Milvus is an open source vector...
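The retrieval idea behind pairing a language model with a vector search engine can be illustrated with a minimal sketch: texts are mapped to vectors, and the query is matched to the nearest stored vectors. The sketch below does not call BERT or Milvus; the three-dimensional vectors are made-up stand-ins for embeddings, cosine similarity is an assumed metric, and `cosine`, `search`, and the project titles are hypothetical.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query_vec, index, k):
    # Brute-force nearest-neighbour search: score every stored vector,
    # then return the titles of the k most similar entries.
    scored = [(cosine(query_vec, vec), title) for title, vec in index.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]


# Made-up "embeddings" of project descriptions (assumption, not real model output).
index = {
    "image-resizer": [0.9, 0.1, 0.0],
    "http-client": [0.0, 0.2, 0.9],
    "photo-filters": [0.8, 0.3, 0.1],
}
hits = search([1.0, 0.2, 0.0], index, k=2)
print(hits)
```

A dedicated vector engine replaces the brute-force loop with an index so that the search scales to large collections; the ranking principle stays the same.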


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/9535, G06K9/62
CPC: G06F16/9535, G06F16/3344, G06F18/214
Inventor: 傅栩萌, 任洪敏
Owner: SHANGHAI MARITIME UNIVERSITY