Transformer-based code programming language classification method
A Transformer-based method for classifying the programming language of code fragments, applied in the computer field. It addresses problems such as performance bottlenecks and poor classification results, and achieves improved accuracy and classification effect while being easy to implement.
Examples
Embodiment 1
[0041] As shown in Figure 1, the present invention provides a Transformer-based code programming language classification method, which specifically includes the following steps:
[0042] (1) Collect the content of question-and-answer posts on Stack Overflow and organize the data set into pairs, each consisting of a code fragment and its corresponding language type; the data set contains 224445 such pairs;
[0043] (2) Use the BPE algorithm to segment each code fragment as text: split the words and symbols in the code fragment into character sequences and add the suffix "" at the end, which avoids producing more "[UNK]" symbols in the training set. By segmenting code fragments in this way, the BPE algorithm effectively addresses the OOV (Out-Of-Vocabulary) problem when the model is evaluated on the test set;
[0044] (3) Divide the data set into a training set and a validation set at a ratio of 4:1, giving 179556 samples in the training set and 44889 samples in the validation set; according to the identification of language t...
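
Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the patent's implementation, and it rests on several assumptions: tokens are obtained by whitespace splitting, the end-of-token marker `END_MARKER` is a hypothetical placeholder (the actual suffix string is not legible in the text above), the data is shuffled with a fixed seed before splitting, and all function names (`learn_bpe`, `segment`, `split_dataset`) are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical placeholder: the suffix used in the source is not legible.
END_MARKER = "<end>"


def to_symbols(token):
    """Split one token into characters and append the end-of-token marker."""
    return tuple(token) + (END_MARKER,)


def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(snippets, num_merges):
    """Learn up to `num_merges` merge rules (assumes whitespace-separated tokens)."""
    vocab = Counter(to_symbols(tok) for s in snippets for tok in s.split())
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by token frequency.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every token is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(token, merges):
    """Apply the learned merges in order; an unseen token falls back to
    smaller subword units instead of a single unknown symbol."""
    symbols = to_symbols(token)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def split_dataset(pairs, train_parts=4, valid_parts=1, seed=0):
    """Shuffle the pairs (an assumption) and split them train:validation = 4:1."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + valid_parts)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    corpus = ["int x = 0 ;", "int y = 1 ;", "print ( x )"]
    merges = learn_bpe(corpus, num_merges=20)
    print(segment("printf", merges))
    train, valid = split_dataset(range(224445))
    print(len(train), len(valid))
```

Because `segment` starts from individual characters, a token never seen during training is still represented by known subword symbols, which is the property step (2) relies on to limit "[UNK]" symbols. With the integer arithmetic used in `split_dataset`, a data set of 224445 pairs yields the 179556 and 44889 sizes stated in step (3).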