Automatic annotating method for subjects of open source software
A technique for automatically labeling open source software with topics, applied in fields such as special data processing applications, instruments, and electrical digital data processing. It addresses problems such as the inapplicability of LabeledLDA and the shortness of project description text.
Embodiment Construction
[0027] The technical solution of the present invention will be specifically described below in conjunction with the embodiments.
[0028] Step 1: Crawl open source communities to obtain open source project data, including each project's name, tags, and description, then preprocess the project descriptions and project tags. The preprocessing comprises: converting each project tag to its root (stem) and merging tags that share the same root; deleting projects with fewer than three tags; and converting each project description into a bag of words through word segmentation, stop-word removal, and stemming.
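The preprocessing in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `stem` function is a crude suffix stripper standing in for a real stemmer, the stop-word list is a tiny hypothetical sample, and the record layout (`name`, `tags`, `description`) and all function names are invented for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "is", "with"}


def stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_tags(tags):
    """Lowercase and stem each tag; the set merges tags that share a root."""
    return sorted({stem(tag.lower()) for tag in tags})


def bag_of_words(description):
    """Segment into words, drop stop words, stem, and count occurrences."""
    tokens = re.findall(r"[a-z]+", description.lower())
    return Counter(stem(tok) for tok in tokens if tok not in STOP_WORDS)


def preprocess(projects, min_tags=3):
    """Keep only projects that still have at least `min_tags` tags after merging."""
    cleaned = []
    for project in projects:
        tags = normalize_tags(project["tags"])
        if len(tags) < min_tags:
            continue  # projects with fewer than three tags are deleted
        cleaned.append({
            "name": project["name"],
            "tags": tags,
            "bag": bag_of_words(project["description"]),
        })
    return cleaned


# Made-up sample records for illustration only.
projects = [
    {"name": "demo", "tags": ["Parsers", "parser", "XML", "tools"],
     "description": "A fast parser for parsing XML files"},
    {"name": "tiny", "tags": ["games", "arcade"],
     "description": "A game"},
]
cleaned = preprocess(projects)
```

In this sample, "Parsers" and "parser" collapse to a single root, so the first project keeps three distinct tags and survives the filter, while the second project has only two tags and is dropped. The tag count is checked after merging, following the order given in the step.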
[0029] In the embodiment, crawler technology and web page extraction technology are used to obtain the names, tags, and project descriptions of a large number (>100K) of open source projects from open source communities (such as Ohloh and SourceForge). For example, use crawler technology and web page extracti...