Open source community development evaluation method and system based on deep learning

By using deep learning and combining multiple data sources to build an open-source community evaluation system, the invention addresses the problem of inaccurate evaluation of open-source community development. The system enables reasonable evaluation of communities and prediction of future trends, and provides optimization strategies to improve the efficiency of community operations.

CN115293534B · Active · Publication Date: 2026-06-26 · SHANDONG INSPUR SCI RES INST CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SHANDONG INSPUR SCI RES INST CO LTD
Filing Date: 2022-07-22
Publication Date: 2026-06-26

Abstract

The application discloses an open source community development evaluation method and system based on deep learning, belongs to the technical field of open source community evaluation, and aims to solve the technical problem of how to effectively utilize deep learning technology, based on open source community operation data and in combination with external internet related data such as social media and search engines, to more reasonably and accurately evaluate the development of the open source community and provide guidance suggestions. The method comprises the following steps: collecting historical evaluation data from multiple data sources based on time, and labeling the development of the open source community; and performing model training on an open source community feature generator, an open source community portrait model, an open source community analyzer, an open source community development evaluation portrait model, an open source community development predictor and an open source community development optimizer through a gradient descent optimization algorithm; and collecting evaluation data from multiple data sources based on a set time point, taking the evaluation data as input, and evaluating and guiding the open source community through the trained model.

Description

Technical Field

[0001] This invention relates to the field of open source community evaluation technology, specifically to a deep learning-based method and system for evaluating the development of open source communities.

Background Technology

[0002] With the rapid development of deep learning technology and the support of massive data and efficient computing power in the Internet and cloud computing era, deep learning technologies, represented by convolutional neural networks (CNN) and recurrent neural networks (RNN), have achieved breakthroughs in computer vision, speech recognition, natural language understanding and other fields by training and building large-scale neural networks similar to the structure of the human brain. They are bringing disruptive changes to the whole of society.

[0003] In recent years, new-generation information technology has developed rapidly, and the value of open source has become increasingly prominent. Whether in operating systems, databases, cloud computing, big data, or artificial intelligence, much commercial software is built on open source. Open source software has become the source of innovation and the "standard parts library" of the global software industry. The open source community is the biggest difference between open source software and traditional closed source software, and it is also the most critical factor in the success of open source software. As a platform for resource aggregation, the open source community promotes the development of open source software by connecting global developers, users, and partners to collaborate in the community.

[0004] Open governance has always been a focal point in the open-source field. How to better leverage the power of the community and realize the commercial and social value of open-source projects is a challenge for community operators. Accurate evaluation of the development of open-source communities is a prerequisite for improving community governance and operational efficiency. However, current evaluations of the maturity of open-source communities are mostly based on superficial data provided by code hosting platforms, such as the number of users, stars, forks, and issues. Evaluations based on these metrics often deviate significantly from the actual development of the community, especially given the difficulty in preventing community operators from maliciously inflating star and fork counts. Therefore, effectively utilizing deep learning technology, based on open-source community operational data and integrating external internet data from social media, search engines, and other sources, to conduct a more reasonable and accurate evaluation of open-source community development and provide guidance has become an urgent problem to solve.

Summary of the Invention

[0005] The technical objective of this invention is to address the above-mentioned shortcomings by providing a method and system for evaluating the development of open-source communities based on deep learning. This aims to solve the technical problem of how to effectively utilize deep learning technology, based on open-source community operation data and integrating external Internet-related data such as social media and search engines, to conduct a more reasonable and accurate evaluation of the development of open-source communities and provide guidance and suggestions.

[0006] In a first aspect, the present invention provides a method for evaluating the development of open-source communities based on deep learning, comprising the following steps:

[0007] An open-source community feature generator, OSC-GenFv, is built based on a neural network model. The OSC-GenFv is used to extract and fuse features from evaluation data from multiple data sources to obtain the feature vector OSC-Vect.

[0008] An open-source community profiling model, OSC-Snap-Profiler, is constructed. The OSC-Snap-Profiler is used to create a data profile of the open-source community based on the feature vector OSC-Vect, and outputs the data label and quantitative score OSC-Tag of the open-source community.

[0009] An open-source community analyzer, OSC-Analysis, is constructed based on a convolutional neural network model with a multi-head self-attention mechanism. The OSC-Analysis analyzer is used to analyze the open-source community based on the feature vector sequence OSC-Seq, which is composed of time-ordered feature vectors OSC-Vect.

[0010] An open-source community development evaluation profile model, OSC-Profiler, is constructed. OSC-Profiler works with the open-source community analyzer OSC-Analysis: based on the output of OSC-Analysis, it creates a data profile of the open-source community and outputs the data tags and quantitative scores OSC-Tag of the open-source community.

[0011] An open-source community development predictor, OSC-Predict, is constructed. OSC-Predict works in conjunction with the open-source community analyzer OSC-Analysis: based on the output of OSC-Analysis, it predicts the development trend of the open-source community and outputs the predicted feature vector OSC-Next-Vect for the next time point.

[0012] An open-source community development optimizer, OSC-Optimize, is constructed. OSC-Optimize is used to optimize the development of the open-source community based on the feature vector OSC-Vect and data profile OSC-Tag corresponding to the current open-source community, and outputs improvement and optimization strategies.

[0013] Historical evaluation data for evaluating open-source communities is collected from multiple data sources based on time, and the development status of open-source communities is labeled. Based on the historical evaluation data and labels, the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize are trained using the gradient descent optimization algorithm.

[0014] Evaluation data is collected from multiple data sources based on set time points. Using the evaluation data as input, the trained open-source community feature generator OSC-GenFv, open-source community profile model OSC-Snap-Profiler, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize are used to evaluate and guide the open-source community, thereby obtaining the improvement and optimization strategy for the open-source community.

[0015] Preferably, the plurality of data sources include:

[0016] The code hosting platform of the open-source community. Its evaluation data includes the general data metric Git-Index and the code hosting platform log text data Git-Log. Git-Index includes the number of stars, forks, issues, merges, contributors, documents, and dependent libraries, as well as the update frequency;

[0017] The official website of the open-source community. Its evaluation data includes website documentation data, news data, discussion group data and wiki data;

[0018] Internet search results about the open-source community. The evaluation data is obtained from multiple search engines using the keyword OSC-Search;

[0019] Media discussions about the open-source community. The corresponding evaluation data is the social media discussion data of the open-source community, OSC-Social.
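
As an illustration only, the four data sources listed above could be held in a single record per collection time point. The following is a minimal Python sketch; the class names, field names, the unit of the update frequency and the date format are assumptions made for this example and are not specified in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GitIndex:
    """General metrics from the code hosting platform, in the order listed above."""
    stars: int
    forks: int
    issues: int
    merges: int
    contributors: int
    documents: int
    dependent_libraries: int
    update_frequency: float  # assumption: updates per week; the text gives no unit


@dataclass
class EvaluationSample:
    """Evaluation data for one community at one collection time point."""
    collected_at: str  # assumption: ISO 8601 date string
    git_index: GitIndex
    git_log: List[str] = field(default_factory=list)       # Git-Log text entries
    website_text: List[str] = field(default_factory=list)  # documentation, news, discussion group, wiki text
    search_results: Dict[str, List[str]] = field(default_factory=dict)  # engine name -> snippets for OSC-Search
    social_posts: List[str] = field(default_factory=list)  # OSC-Social


def git_index_values(index: GitIndex) -> List[float]:
    """Flatten Git-Index into a fixed-order list of floats for later normalization."""
    return [
        float(index.stars), float(index.forks), float(index.issues), float(index.merges),
        float(index.contributors), float(index.documents),
        float(index.dependent_libraries), float(index.update_frequency),
    ]
```

Keeping Git-Index in a fixed field order means the normalized vector has the same layout for every community and time point.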

[0020] Preferably, the open-source community feature generator OSC-GenFv includes:

[0021] The code platform-related data feature vector generator Git-GenFv, which includes a data normalization module, a feature extractor based on a BERT semantic model and time series, and a fusion module. The data normalization module normalizes the general data metric Git-Index to obtain a feature vector. The feature extractor extracts features from the code hosting platform log text data Git-Log to obtain a feature vector. The fusion module fuses the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect.

[0022] The open-source project official website-related data feature vector generator Website-GenFv, a text recognition neural network model based on a language model. It extracts features from the evaluation data whose data source is the official website of the open-source community, and obtains the feature vector Website-Vect.

[0023] The Internet search-related data feature vector generator Search-GenFv, which extracts features from the evaluation data of Internet searches about the open-source community based on a text recognition semantic extraction model, and obtains the feature vector Search-Vect.

[0024] The social media-related data feature vector generator Social-GenFv, which extracts features from the evaluation data of media discussions about the open-source community using a neural network model based on text recognition and sentiment analysis, and obtains the feature vector Social-Vect.

[0025] The feature vector fusion engine OSC-FusFv fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer, and adds a timestamp to generate the feature vector OSC-Vect.
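
The fusion step described above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the log-based normalization, the tanh activation, and the choice to append the timestamp as the last element of OSC-Vect are all assumptions, since the text only states that the vectors are fused through a fully connected layer and that a timestamp is added.

```python
import math
from typing import List, Sequence


def log_normalize(values: Sequence[float]) -> List[float]:
    """Assumed normalization for count-like Git-Index metrics: log(1 + x)."""
    return [math.log1p(v) for v in values]


def dense(inputs: Sequence[float], weights: Sequence[Sequence[float]],
          bias: Sequence[float]) -> List[float]:
    """One fully connected layer with tanh activation (activation is an assumption)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


def fuse_osc_vect(git_vect: Sequence[float], website_vect: Sequence[float],
                  search_vect: Sequence[float], social_vect: Sequence[float],
                  weights: Sequence[Sequence[float]], bias: Sequence[float],
                  timestamp: float) -> List[float]:
    """Concatenate the four source vectors, apply the dense layer, append the timestamp."""
    concatenated = [*git_vect, *website_vect, *search_vect, *social_vect]
    if any(len(row) != len(concatenated) for row in weights):
        raise ValueError("each weight row must match the concatenated input length")
    if len(bias) != len(weights):
        raise ValueError("bias must have one entry per output unit")
    return dense(concatenated, weights, bias) + [timestamp]
```

In a trained system the `weights` and `bias` arguments would be the learned parameters of OSC-FusFv; here they are passed in explicitly so the sketch stays free of external dependencies.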

[0026] Preferably, based on the historical evaluation data and tags, the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler are optimized sequentially; the parameters of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler are optimized; the parameters of the open-source community development predictor OSC-Predict are optimized; and the parameters of the open-source community development optimizer OSC-Optimize are optimized.

[0027] The parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler are optimized, including the following steps:

[0028] The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler. Based on the acquired general data metric Git-Index, the data normalization calculation method of the data normalization module is set. Based on the existing general model of BERT, the initialization parameters of the feature extractor are set, the fusion module is initialized, the parameter values of the feature vectors Website-Vect, Search-Vect, and Social-Vect are fixed, and the gradient descent optimization algorithm is used to train the code platform-related data feature vector generator Git-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0029] The model parameters of the code platform-related data feature vector generator Git-GenFv and the parameter values of the feature vectors Search-Vect and Social-Vect are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the open-source project official website-related data feature vector generator Website-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0030] The model parameters of the code platform-related data feature vector generator Git-GenFv and the open-source project official website-related data feature vector generator Website-GenFv are fixed, and the parameter values of the feature vector Social-Vect are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the Internet search-related data feature vector generator Search-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0031] The model parameters of the code platform-related data feature vector generator Git-GenFv, the open-source project official website-related data feature vector generator Website-GenFv, and the Internet search-related data feature vector generator Search-GenFv are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the social media-related data feature vector generator Social-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0032] The open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler are connected. Based on the labeled data, the gradient descent optimization algorithm is used to train the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler to obtain the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler.
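
The staged procedure above trains one source-specific generator at a time, together with the fusion engine and the snapshot profile model, and ends with a joint pass. The following Python sketch only enumerates which components are trained, frozen, or held at fixed output values in each stage; the function name and dictionary keys are hypothetical, and the actual gradient descent steps are omitted.

```python
from typing import Dict, FrozenSet, List

# Source-specific generators, in the order in which the text trains them.
BRANCHES = ["Git-GenFv", "Website-GenFv", "Search-GenFv", "Social-GenFv"]
# Components that are trained in every stage.
SHARED = ["OSC-FusFv", "OSC-Snap-Profiler"]


def staged_schedule() -> List[Dict[str, FrozenSet[str]]]:
    """Return the five training stages as sets of component names."""
    stages: List[Dict[str, FrozenSet[str]]] = []
    for i, branch in enumerate(BRANCHES):
        stages.append({
            "train": frozenset([branch, *SHARED]),
            # Generators trained in earlier stages keep their parameters fixed.
            "frozen": frozenset(BRANCHES[:i]),
            # Generators not yet trained contribute fixed-valued feature vectors.
            "fixed_output": frozenset(BRANCHES[i + 1:]),
        })
    # Final stage: joint training of the whole generator and the profile model.
    stages.append({
        "train": frozenset([*BRANCHES, *SHARED]),
        "frozen": frozenset(),
        "fixed_output": frozenset(),
    })
    return stages
```

For example, `staged_schedule()[1]` describes the second stage: Website-GenFv is trained with OSC-FusFv and OSC-Snap-Profiler, Git-GenFv is frozen, and Search-Vect and Social-Vect are held at fixed values.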

[0033] The training of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profiling model OSC-Profiler includes the following steps:

[0034] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0035] Based on the open-source community profiling model OSC-Snap-Profiler, set the model initialization parameters of the open-source community development evaluation profiling model OSC-Profiler;

[0036] Connect the open-source community analyzer OSC-Analysis and the open-source community development evaluation and profiling model OSC-Profiler;

[0037] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0038] Using the vector sequence OSC-Seq as input, and based on the labeled data, the gradient descent optimization algorithm is used to train the open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler, resulting in the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler.
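
To make the sequence input concrete, the sketch below orders timestamped OSC-Vect vectors into OSC-Seq and applies scaled dot-product self-attention over the sequence. It is a deliberate simplification and not the analyzer described in the text: it uses a single head with identity query, key and value projections and omits the convolutional layers, whereas OSC-Analysis is described as a convolutional neural network with multi-head self-attention. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def build_osc_seq(stamped: Sequence[Tuple[float, Vector]]) -> List[Vector]:
    """Order (timestamp, OSC-Vect) pairs by time and return the vectors as OSC-Seq."""
    return [vect for _, vect in sorted(stamped, key=lambda pair: pair[0])]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq: Sequence[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(seq[0])
    attended: List[Vector] = []
    for query in seq:
        # Similarity of this time step to every time step in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in seq]
        weights = softmax(scores)
        # Weighted average of all time steps.
        attended.append([
            sum(w * value[j] for w, value in zip(weights, seq)) for j in range(dim)
        ])
    return attended
```

Each output row is a weighted average of all time steps, so every position in the sequence can draw on the community's state at other time points.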

[0039] Training the open-source community development predictor OSC-Predict involves the following steps:

[0040] By fixing the model parameters of the open-source community analyzer OSC-Analysis, the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict are connected to form a model network;

[0041] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0042] Using the vector sequence OSC-Seq as input, the feature vector OSC-Next-Vect for the next time point is generated through the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict. The error between the predicted feature vector OSC-Next-Vect and the actual feature vector OSC-Vect at that time point is calculated and backpropagated to update the model parameters of the open-source community development predictor OSC-Predict, thus obtaining the trained open-source community development predictor OSC-Predict.
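
The predictor training loop can be illustrated with a toy next-step model. The sketch below assumes a purely linear predictor trained by stochastic gradient descent on a squared error, and it treats the frozen analyzer as an identity mapping, so the predictor sees the raw OSC-Vect of the current time point. These are simplifying assumptions for illustration; the text does not specify the predictor architecture or the loss.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def predict(weights: Matrix, features: Sequence[float]) -> Vector:
    """Linear prediction of the next feature vector (stand-in for OSC-Next-Vect)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train_predictor(seq: Sequence[Vector], lr: float = 0.1, epochs: int = 200) -> Matrix:
    """Fit the predictor so that predict(seq[t]) approximates seq[t + 1]."""
    dim = len(seq[0])
    weights: Matrix = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for current, target in zip(seq, seq[1:]):
            # Error between the prediction and the actual next vector.
            error = [p - t for p, t in zip(predict(weights, current), target)]
            # Gradient of 0.5 * ||W x - t||^2 with respect to W[i][j] is error[i] * x[j].
            for i in range(dim):
                for j in range(dim):
                    weights[i][j] -= lr * error[i] * current[j]
    return weights
```

Only the predictor weights are updated here, which mirrors the step above in which the analyzer parameters stay fixed while OSC-Predict is trained.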

[0043] Training the open-source community development optimizer OSC-Optimize includes the following steps:

[0044] Based on the operation of the open-source community, various improvement and optimization strategies are set up;

[0045] Using the acquired training data as input, the feature vector OSC-Vect is generated through the trained open-source community feature generator OSC-GenFv. Based on the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, a data profile of the open-source community is created, resulting in the data tags and quantitative scores OSC-Tag of the open-source community.

[0046] Based on the data tags and quantitative scores OSC-Tag of the open-source community, the tags and quantitative scores to be optimized are set, and the corresponding improvement and optimization strategies are marked;

[0047] The gradient descent optimization algorithm is used to train the open-source community development optimizer OSC-Optimize, resulting in the trained open-source community development optimizer OSC-Optimize.
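
The labeling step that precedes optimizer training can be sketched as follows. The tag names, the strategy texts and the 0.5 threshold are hypothetical values chosen for this example; the text only states that tags and scores to be optimized are selected and that the matching strategies are marked.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Tag = Dict[str, float]

# Hypothetical tag names and strategies, for illustration only.
STRATEGIES: Dict[str, str] = {
    "activity": "triage open issues weekly and publish releases on a fixed schedule",
    "documentation": "expand the getting-started guide and API documentation",
    "reputation": "respond to community questions on social media channels",
}


def label_strategies(osc_tag: Tag, threshold: float = 0.5) -> List[str]:
    """Mark the strategy for every known tag whose score falls below the threshold."""
    return [
        STRATEGIES[name]
        for name, score in sorted(osc_tag.items())
        if score < threshold and name in STRATEGIES
    ]


def build_training_pairs(samples: Sequence[Tuple[Vector, Tag]],
                         threshold: float = 0.5) -> List[Tuple[Vector, List[str]]]:
    """Pair each OSC-Vect with the strategies marked for its OSC-Tag."""
    return [(vect, label_strategies(tag, threshold)) for vect, tag in samples]
```

The resulting pairs of feature vector and marked strategies are the kind of supervised examples on which an optimizer model could then be trained with gradient descent.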

[0048] Preferably, collecting evaluation data from multiple data sources based on set time points and, with that evaluation data as input, evaluating and guiding the open-source community through the trained open-source community feature generator OSC-GenFv, open-source community profile model OSC-Snap-Profiler, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize to obtain improvement and optimization strategies includes the following steps:

[0049] Based on the set time points, evaluation data is obtained from multiple data sources;

[0050] For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect.

[0051] Based on the set time points, the feature vector sequence OSC-Seq of the feature vectors OSC-Vect is generated;

[0052] Using the feature vector sequence OSC-Seq as input, the open-source community is profiled by the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, generating the data tags and quantitative scores OSC-Tag of the open-source community.

[0053] Using the feature vector sequence OSC-Seq as input, the trained open-source community analyzer OSC-Analysis and open-source community development predictor OSC-Predict are used to generate the feature vector OSC-Next-Vect for future moments. Using the feature vector OSC-Next-Vect as input, the trained open-source community analyzer OSC-Analysis and open-source community development evaluation and profiling model OSC-Profiler are used to continuously perform data profiling analysis on the open-source community, generating data tags and quantitative scores OSC-Tag for the open-source community.

[0054] Combining the two community profiles above (the profile of the current state and the profile of the predicted state), a series of open-source community data tags and quantitative scores OSC-Tag is output in chronological order for evaluating the current and future development of the open-source community.

[0055] Simulate open-source community development scenarios to generate various evaluation directions for the future development of open-source communities;

[0056] Based on the time-series open-source community feature vector OSC-Vect and the data labels and quantitative scores of the open-source community obtained from the community profile OSC-Tag, the trained open-source community development optimizer OSC-Optimize is used to optimize the open-source community and generate improved optimization strategies.

[0057] Evaluation data is regularly acquired from multiple data sources, and the model parameters of the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize are continuously optimized.
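
The evaluation and guidance steps above can be tied together in a short orchestration sketch. The trained models are passed in as plain callables, so the function names and signatures are hypothetical. Appending each predicted vector to the sequence before profiling it again is one possible reading of the prediction step and is an assumption of this example.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Tag = Dict[str, float]


def evaluate_community(
    samples: Sequence[object],
    gen_fv: Callable[[object], Vector],                     # stands in for OSC-GenFv
    profile: Callable[[Sequence[Vector]], Tag],             # OSC-Analysis + OSC-Profiler
    predict_next: Callable[[Sequence[Vector]], Vector],     # OSC-Analysis + OSC-Predict
    optimize: Callable[[Sequence[Vector], Sequence[Tag]], List[str]],  # OSC-Optimize
    horizon: int = 1,
) -> Dict[str, object]:
    """Run profiling, prediction and optimization over time-ordered samples."""
    osc_seq = [gen_fv(sample) for sample in samples]  # samples assumed already time ordered
    tags = [profile(osc_seq)]                         # profile of the current state
    rolling = list(osc_seq)
    for _ in range(horizon):
        rolling.append(predict_next(rolling))         # predicted OSC-Next-Vect
        tags.append(profile(rolling))                 # profile of the predicted state
    strategies = optimize(osc_seq, tags)
    return {"tags": tags, "strategies": strategies}
```

Because the models are injected, the same function can be exercised with simple stand-in callables before any trained model is available.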

[0058] In a second aspect, the present invention provides a deep learning-based open-source community development evaluation system, which evaluates and guides the development of open-source communities through the deep learning-based open-source community development evaluation method described in any implementation of the first aspect, the system comprising:

[0059] A model building module, used to: construct an open-source community feature generator OSC-GenFv based on a neural network model, where OSC-GenFv extracts and fuses features from evaluation data from multiple data sources to obtain the feature vector OSC-Vect; construct an open-source community profile model OSC-Snap-Profiler, which creates a data profile of the open-source community based on the feature vector OSC-Vect and outputs the data tags and quantitative scores OSC-Tag of the open-source community; construct an open-source community analyzer OSC-Analysis based on a convolutional neural network model with a multi-head self-attention mechanism, which analyzes the open-source community based on the feature vector sequence OSC-Seq, composed of time-ordered feature vectors OSC-Vect; construct an open-source community development evaluation profile model OSC-Profiler, which works with OSC-Analysis to create a data profile of the open-source community based on the output of OSC-Analysis and outputs the data tags and quantitative scores OSC-Tag of the open-source community; construct an open-source community development predictor OSC-Predict, which works with OSC-Analysis to predict the development trend of the open-source community based on the output of OSC-Analysis and outputs the predicted feature vector OSC-Next-Vect for the next time point; and construct an open-source community development optimizer OSC-Optimize, which optimizes the development of the open-source community based on the current feature vector OSC-Vect and the data profile OSC-Tag and outputs improvement and optimization strategies.

[0060] The model training module is used to collect historical evaluation data for evaluating open source communities from multiple data sources based on time, and to label the development status of open source communities. Based on the historical evaluation data and labels, the module is used to train the open source community feature generator OSC-GenFv, the open source community profile model OSC-Snap-Profiler, the open source community analyzer OSC-Analysis, the open source community development evaluation profile model OSC-Profiler, the open source community development predictor OSC-Predict, and the open source community development optimizer OSC-Optimize using the gradient descent optimization algorithm.

[0061] The community evaluation guidance module is used to collect evaluation data from multiple data sources based on a set time point. Using the evaluation data as input, the module evaluates and guides the open-source community through the trained open-source community feature generator OSC-GenFv, open-source community profile model OSC-Snap-Profiler, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize, thereby obtaining improvement and optimization strategies for the open-source community.

[0062] Preferably, the plurality of data sources include:

[0063] The code hosting platform of the open-source community. Its evaluation data includes the general data metric Git-Index and the code hosting platform log text data Git-Log. Git-Index includes the number of stars, forks, issues, merges, contributors, documents, and dependent libraries, as well as the update frequency;

[0064] The official website of the open-source community. Its evaluation data includes website documentation data, news data, discussion group data and wiki data;

[0065] Internet search results about the open-source community. The evaluation data is obtained from multiple search engines using the keyword OSC-Search;

[0066] Media discussions about the open-source community. The corresponding evaluation data is the social media discussion data of the open-source community, OSC-Social.

[0067] Preferably, the open-source community feature generator OSC-GenFv includes:

[0068] The code platform-related data feature vector generator Git-GenFv, which includes a data normalization module, a feature extractor based on a BERT semantic model and time series, and a fusion module. The data normalization module normalizes the general data metric Git-Index to obtain a feature vector. The feature extractor extracts features from the code hosting platform log text data Git-Log to obtain a feature vector. The fusion module fuses the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect.

[0069] The open-source project official website-related data feature vector generator Website-GenFv, a text recognition neural network model based on a language model. It extracts features from the evaluation data whose data source is the official website of the open-source community, and obtains the feature vector Website-Vect.

[0070] The Internet search-related data feature vector generator Search-GenFv, which extracts features from the evaluation data of Internet searches about the open-source community based on a text recognition semantic extraction model, and obtains the feature vector Search-Vect.

[0071] The social media-related data feature vector generator Social-GenFv, which extracts features from the evaluation data of media discussions about the open-source community using a neural network model based on text recognition and sentiment analysis, and obtains the feature vector Social-Vect.

[0072] The feature vector fusion engine OSC-FusFv fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer, and adds a timestamp to generate the feature vector OSC-Vect.

[0073] Preferably, the model training module is used to optimize the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize, based on the historical evaluation data and labels.

[0074] The model training module is used to optimize the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler through the following steps:

[0075] The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler. Based on the acquired general data metric Git-Index, the data normalization calculation method of the data normalization module is set. Based on the existing general model of BERT, the initialization parameters of the feature extractor are set, the fusion module is initialized, the parameter values of the feature vectors Website-Vect, Search-Vect, and Social-Vect are fixed, and the gradient descent optimization algorithm is used to train the code platform-related data feature vector generator Git-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0076] The model parameters of the code platform-related data feature vector generator Git-GenFv and the parameter values of the feature vectors Search-Vect and Social-Vect are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the open-source project official website-related data feature vector generator Website-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0077] The model parameters of the code platform-related data feature vector generator Git-GenFv and the open-source project official website-related data feature vector generator Website-GenFv are fixed, and the parameter values of the feature vector Social-Vect are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the Internet search-related data feature vector generator Search-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0078] The model parameters of the code platform-related data feature vector generator Git-GenFv, the open-source project official website-related data feature vector generator Website-GenFv, and the Internet search-related data feature vector generator Search-GenFv are fixed. Based on the labeled data, the gradient descent optimization algorithm is used to train the social media-related data feature vector generator Social-GenFv, the fusion engine OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler.

[0079] The open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler are connected. Based on the labeled data, the gradient descent optimization algorithm is used to train the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler to obtain the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler.
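The staged schedule of paragraphs [0075] to [0079] can be summarized as a table of which modules are trained and which are held fixed at each stage. The following is a minimal sketch for illustration only: the `Stage` dataclass, `build_stages`, and `STAGES` are hypothetical names, and the stage contents are transcribed from the paragraphs above rather than taken from any reference implementation.

```python
from dataclasses import dataclass

# The four per-source generators inside OSC-GenFv, in the order they are trained.
GENERATORS = ["Git-GenFv", "Website-GenFv", "Search-GenFv", "Social-GenFv"]

@dataclass(frozen=True)
class Stage:
    paragraph: str        # paragraph the stage is transcribed from
    trainable: frozenset  # modules updated by gradient descent in this stage
    fixed: frozenset      # generators whose parameters or output vectors are fixed

def build_stages():
    """Return the five training stages described in [0075] to [0079]."""
    shared = {"OSC-FusFv", "OSC-Snap-Profiler"}  # trained in every stage
    stages = []
    for i, gen in enumerate(GENERATORS):
        stages.append(Stage(
            paragraph=f"[{75 + i:04d}]",
            trainable=frozenset({gen} | shared),
            fixed=frozenset(g for g in GENERATORS if g != gen),
        ))
    # Final stage: everything is connected and trained jointly.
    stages.append(Stage("[0079]", frozenset(set(GENERATORS) | shared), frozenset()))
    return stages

STAGES = build_stages()
for stage in STAGES:
    print(stage.paragraph, sorted(stage.trainable), "| fixed:", sorted(stage.fixed))
```

Training one generator at a time while the others are fixed keeps each stage's optimization problem small. The final joint stage then fine-tunes all modules together.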

[0080] The model training module is used to train the open-source community analyzer OSC-Analysis and the open-source community development evaluation and profiling model OSC-Profiler through the following steps:

[0081] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0082] Based on the open-source community profiling model OSC-Snap-Profiler, set the model initialization parameters of the open-source community development evaluation profiling model OSC-Profiler;

[0083] Connect the open-source community analyzer OSC-Analysis and the open-source community development evaluation and profiling model OSC-Profiler;

[0084] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0085] Using the vector sequence OSC-Seq as input, and based on the labeled data, the gradient descent optimization algorithm is used to train the open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler, resulting in the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler.

[0086] The model training module is used to train the open-source community development predictor OSC-Predict through the following steps:

[0087] By fixing the model parameters of the open-source community analyzer OSC-Analysis, the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict are connected to form a model network;

[0088] Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0089] Using the vector sequence OSC-Seq as input, the feature vector OSC-Next-Vect for the next time point is generated by the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict. The error between the predicted feature vector OSC-Next-Vect and the actual feature vector OSC-Vect at that time point is calculated and backpropagated to update the model parameters of the open-source community development predictor OSC-Predict, thus obtaining the trained open-source community development predictor OSC-Predict.
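The predictor training of paragraphs [0087] to [0089] can be illustrated with a toy example in which the analyzer is fixed and only the predictor is updated. This sketch rests on strong assumptions that are not in the patent: the fixed analyzer is replaced by a simple mean over a window, the predictor is a single linear layer, and the error is the mean squared error. The names `analyzer`, `predict`, and `train_predictor` and the synthetic `series` are all hypothetical.

```python
import random

def analyzer(seq):
    """Stand-in for the fixed OSC-Analysis: mean of the vectors in the window."""
    d = len(seq[0])
    return [sum(v[j] for v in seq) / len(seq) for j in range(d)]

def predict(W, b, ctx):
    """Stand-in for OSC-Predict: one linear layer producing OSC-Next-Vect."""
    return [sum(W[i][j] * ctx[j] for j in range(len(ctx))) + b[i] for i in range(len(b))]

def train_predictor(series, window=3, lr=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    d = len(series[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]
    b = [0.0] * d
    # Each sample pairs a window of past vectors with the actual next vector.
    samples = [(series[t - window:t], series[t]) for t in range(window, len(series))]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for seq, target in samples:
            ctx = analyzer(seq)            # analyzer is fixed: nothing is updated here
            pred = predict(W, b, ctx)
            err = [p - y for p, y in zip(pred, target)]
            total += sum(e * e for e in err) / d
            for i in range(d):             # gradient of the MSE w.r.t. W[i] and b[i]
                g = 2.0 * err[i] / d
                for j in range(d):
                    W[i][j] -= lr * g * ctx[j]
                b[i] -= lr * g
        losses.append(total / len(samples))
    return W, b, losses

# Synthetic two-dimensional OSC-Vect series, made up for illustration.
series = [[0.1 * t, 1.0 - 0.05 * t] for t in range(12)]
W, b, losses = train_predictor(series)
print(f"first-epoch loss {losses[0]:.4f}, last-epoch loss {losses[-1]:.4f}")
```

Only `W` and `b` change during training, mirroring the statement that the error updates the parameters of OSC-Predict while OSC-Analysis stays fixed.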

[0090] The model training module is used to train the open-source community development optimizer OSC-Optimize through the following steps:

[0091] Based on the operation of the open source community, a variety of improvement and optimization strategies are set up;

[0092] Using the acquired training data as input, the feature vector OSC-Vect is generated through the trained open-source community feature extractor OSC-GenFv. Based on the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, a data profile of the open-source community is created, resulting in the data label and quantitative score OSC-Tag of the open-source community.

[0093] Based on the data tags and quantitative scores OSC-Tag from the open-source community, set the tags and quantitative scores to be optimized, and mark the corresponding improvement and optimization strategies;

[0094] The gradient descent optimization algorithm is used to train the open-source community development optimizer OSC-Optimize, resulting in the trained open-source community development optimizer OSC-Optimize.

[0095] Preferably, the community evaluation guidance module is used to guide and evaluate the open-source community through the following steps:

[0096] Based on the set time points, evaluation data is obtained from multiple data sources;

[0097] For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect.

[0098] Following the progression of the time points, generate the feature vector sequence OSC-Seq of the feature vector OSC-Vect;

[0099] Using the feature vector sequence OSC-Seq as input, the open source community is profiled by the trained open source community analyzer OSC-Analysis and the open source community development evaluation and profiling model OSC-Profiler, generating data tags and quantitative scores OSC-Tag for the open source community.

[0100] Using the feature vector sequence OSC-Seq as input, the trained open-source community analyzer OSC-Analysis and open-source community development predictor OSC-Predict are used to generate the feature vector OSC-Next-Vect for future moments. Using the feature vector OSC-Next-Vect as input, the trained open-source community analyzer OSC-Analysis and open-source community development evaluation and profiling model OSC-Profiler are used to continuously perform data profiling analysis on the open-source community, generating data tags and quantitative scores OSC-Tag for the open-source community.

[0101] Combining the two community profiles (current and predicted), a series of open source community data tags and quantitative scores OSC-Tag is output in chronological order for evaluating the current and future development of the open source community.

[0102] Simulate open-source community development scenarios to generate various evaluation directions for the future development of open-source communities;

[0103] Based on the time-series open-source community feature vector OSC-Vect and the data labels and quantitative scores of the open-source community obtained from the community profile OSC-Tag, the trained open-source community development optimizer OSC-Optimize is used to optimize the open-source community and generate improved optimization strategies.

[0104] We regularly acquire evaluation data from multiple data sources and continuously optimize the model parameters of the open-source community feature generator OSC-GenFv, the open-source community profile model, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize.

[0105] The deep learning-based open-source community development evaluation method and system of the present invention have the following advantages:

[0106] 1. Taking full account of the main factors affecting the development of open source communities, based on multiple data sources such as open source code hosting platforms, community websites, Internet search engines, and social media, deep learning technology is used to model the evaluation of open source community development. An open source community feature extractor, an open source community analyzer, an open source community development evaluation profile, and an open source community development predictor are designed. Targeted feature extraction and data fusion are carried out to uncover the deep connections between various factors affecting community development. Combined with expert knowledge in the field of community governance, an open source community profile is generated to achieve a more reasonable and accurate evaluation of open source community development.

[0107] 2. Compared with traditional expert evaluation methods, this approach makes the most of existing internet resources such as code hosting platforms, search engines, and social media, while also taking into account the temporal characteristics of community development. This approach provides a more reasonable and effective evaluation of the development status of open source communities. On the other hand, it also avoids situations where community operators maliciously inflate the number of stars and forks, and allows for a more reasonable assessment of the maturity and development potential of open source communities.

[0108] 3. By predicting future development trends of open source communities and simulating open source community development scenarios in a targeted manner, we propose corresponding community operation and governance optimization strategies to guide the improvement of community operation and governance efficiency, thereby enhancing the value and influence of open source communities;

[0109] 4. In addition, we continuously collect relevant data from the open source community and continuously optimize the evaluation model to meet the personalized development needs of the open source community.

Description of the Drawings

[0110] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0111] The invention will be further described below with reference to the accompanying drawings.

[0112] Figure 1 is a working principle block diagram of the deep learning-based open-source community development evaluation method in Example 2.

Detailed Description

[0113] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention. However, the embodiments are not intended to limit the present invention. In the absence of conflict, the embodiments of the present invention and the technical features in the embodiments can be combined with each other.

[0114] This invention provides a method and system for evaluating the development of open-source communities based on deep learning. It addresses the technical problem of how to effectively utilize deep learning technology, based on open-source community operation data and integrating external Internet-related data such as social media and search engines, to conduct a more reasonable and accurate evaluation of the development of open-source communities and provide guidance and suggestions.

[0115] Example 1:

[0116] This invention discloses a deep learning-based method for evaluating the development of open-source communities, comprising the following steps:

[0117] S100. An open-source community feature generator OSC-GenFv is constructed based on a neural network model. The open-source community feature generator OSC-GenFv is used to extract and fuse features from evaluation data from multiple data sources to obtain feature vector OSC-Vect.

[0118] An open-source community profiling model, OSC-Snap-Profiler, is constructed. The OSC-Snap-Profiler is used to create a data profile of the open-source community based on the feature vector OSC-Vect, and outputs the data label and quantitative score OSC-Tag of the open-source community.

[0119] An open-source community analyzer, OSC-Analysis, is constructed based on a convolutional neural network model with a multi-head self-attention mechanism. The OSC-Analysis analyzer is used to analyze the open-source community based on the feature vector sequence OSC-Seq, which is composed of time-ordered feature vectors OSC-Vect.

[0120] An open-source community development evaluation profile model OSC-Profiler is constructed. The open-source community development evaluation profile model OSC-Profiler works with the open-source community analyzer OSC-Analysis to create a data profile of the open-source community based on the output of the open-source community analyzer OSC-Analysis, and outputs the data tags and quantitative scores OSC-Tag of the open-source community.

[0121] An open-source community development predictor, OSC-Predict, is constructed. OSC-Predict works in conjunction with the open-source community analyzer OSC-Analysis to predict the development trend of the open-source community based on the analyzer's output, and outputs the predicted feature vector OSC-Next-Vect for the next time point.

[0122] An open-source community development optimizer, OSC-Optimize, is constructed. OSC-Optimize is used to optimize the development of the open-source community based on the feature vector OSC-Vect and data profile OSC-Tag corresponding to the current open-source community, and outputs improvement and optimization strategies.

[0123] S200. Collect historical evaluation data from multiple data sources based on time to evaluate the open source community, and label the development status of the open source community. Based on the historical evaluation data and labels, train the open source community feature generator OSC-GenFv, the open source community profile model OSC-Snap-Profiler, the open source community analyzer OSC-Analysis, the open source community development evaluation profile model OSC-Profiler, the open source community development predictor OSC-Predict, and the open source community development optimizer OSC-Optimize using the gradient descent optimization algorithm.

[0124] S300. Based on the set time points, evaluation data is collected from multiple data sources. Using the evaluation data as input, the trained open-source community feature generator OSC-GenFv, open-source community profile model, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize are used to evaluate and guide the open-source community, thereby obtaining the improvement and optimization strategy for the open-source community.

[0125] This embodiment fully considers the main factors influencing the development of open-source communities, making full use of data resources such as open-source community operation platforms, internet search engines, and social media. It employs deep learning technology to model the evaluation of open-source community development, designing an open-source community feature extractor, an open-source community analyzer, an open-source community development evaluation profile, and an open-source community development predictor. By comprehensively considering various factors affecting community development and combining expert knowledge in the field of community governance, it uncovers the deep connections between various elements influencing community development, generating an open-source community profile and achieving a more reasonable and accurate evaluation of open-source community development. By predicting future open-source community development trends and simulating targeted development scenarios, it proposes corresponding community operation and governance optimization strategies to enhance the commercial value of open-source communities.

[0126] In practice, the data sources include open-source community code hosting platforms, open-source community websites, Internet searches about the open-source community, and social media discussions about the open-source community. The evaluation data for code hosting platforms includes the general data metric Git-Index and the code hosting platform log text data Git-Log. The general data metric Git-Index includes the number of stars, forks, issues, merges, contributors, documents, and dependencies, as well as the update frequency. The evaluation data for open-source community websites includes website documentation data, news data, discussion group data, and wiki data. The evaluation data for Internet searches is the set of query results obtained from multiple search engines for the keyword OSC-Search. The evaluation data for social media discussions is the social media discussion data OSC-Social.
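The four data sources listed above can be pictured as one record per time point. The sketch below is only one possible layout: the class names, field names, and snake_case metric keys are hypothetical, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Git-Index metrics named in the text; the snake_case keys are hypothetical.
GIT_INDEX_FIELDS = (
    "stars", "forks", "issues", "merges",
    "contributors", "documents", "dependencies", "update_frequency",
)

@dataclass
class GitData:
    git_index: Dict[str, float]  # general data metric Git-Index
    git_log: str = ""            # code hosting platform log text Git-Log

@dataclass
class WebsiteData:
    documentation: List[str] = field(default_factory=list)
    news: List[str] = field(default_factory=list)
    discussion_groups: List[str] = field(default_factory=list)
    wiki: List[str] = field(default_factory=list)

@dataclass
class EvaluationSample:
    time_point: str              # e.g. an ISO date string
    git: GitData
    website: WebsiteData
    # search engine name -> query results for the keyword OSC-Search
    search_results: Dict[str, List[str]] = field(default_factory=dict)
    social_posts: List[str] = field(default_factory=list)  # OSC-Social

def missing_git_metrics(sample: EvaluationSample) -> List[str]:
    """List the Git-Index metrics that have not been collected for this sample."""
    return [k for k in GIT_INDEX_FIELDS if k not in sample.git.git_index]

sample = EvaluationSample(
    time_point="2022-07-01",
    git=GitData(git_index={"stars": 1200, "forks": 310}, git_log="fix: typo in README"),
    website=WebsiteData(news=["Release announced"]),
)
print(missing_git_metrics(sample))
```

Grouping all sources under a single time point makes it straightforward to feed the same moment's data to each feature vector generator.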

[0127] Corresponding to the above data sources, the open-source community feature generator OSC-GenFv in this embodiment includes Git-GenFv (code platform related data feature vector generator), Website-GenFv (open source project website related data feature vector generator), Search-GenFv (internet search related data feature vector generator), Social-GenFv (social media related data feature vector generator), and OSC-FusFv (feature vector fusion generator).

[0128] The Git-GenFv feature vector generator for code platform-related data includes a data normalization module, a BERT-based semantic model and time series feature extractor, and a fusion module. The data normalization module is used to normalize the general data metric Git-Index to obtain feature vectors. The feature extractor is used to extract features from the code hosting platform log text data Git-Log to obtain feature vectors. The fusion module is used to fuse the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect.
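The normalization and fusion inside Git-GenFv can be sketched as follows. The patent does not specify the normalization formula, so min-max scaling is an assumption here; the BERT-based feature extractor is not reproduced and its output is represented by a placeholder list. The function names and sample numbers are hypothetical.

```python
def min_max_normalize(history):
    """Scale each Git-Index metric to [0, 1] across the collected time points.

    history: list of dicts (metric name -> value), one dict per time point.
    Returns the sorted metric names and one normalized vector per time point.
    """
    keys = sorted(history[0])
    lo = {k: min(h[k] for h in history) for k in keys}
    hi = {k: max(h[k] for h in history) for k in keys}
    vectors = []
    for h in history:
        vectors.append([
            (h[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in keys
        ])
    return keys, vectors

def git_vect(index_features, log_features):
    """Fuse the two feature vectors by concatenation to form Git-Vect."""
    return list(index_features) + list(log_features)

# Made-up Git-Index history for three time points.
history = [
    {"stars": 100, "forks": 20},
    {"stars": 150, "forks": 20},
    {"stars": 200, "forks": 30},
]
keys, normalized = min_max_normalize(history)
log_features = [0.3, -0.2]  # placeholder for the Git-Log feature extractor output
print(keys, git_vect(normalized[1], log_features))
```

Concatenation has no trainable parameters, which is why a fusion module of this kind needs no parameter optimization during training.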

[0129] The open-source project website-related data feature vector generator Website-GenFv is a text recognition neural network model based on a language model. It extracts features from the evaluation data whose source is the open-source community website and obtains the feature vector Website-Vect.

[0130] The Search-GenFv feature vector generator for Internet search-related data is used to extract features from evaluation data of Internet searches from open-source communities based on a text recognition semantic extraction model, and obtain the feature vector Search-Vect.

[0131] The Social-GenFv feature vector generator is used to extract features from evaluation data of media discussions from open-source communities using a neural network model based on text recognition and sentiment analysis, and obtain the feature vector Social-Vect.

[0132] The feature vector fusion engine OSC-FusFv fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer, and adds a timestamp to generate the feature vector OSC-Vect.
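A fully connected fusion of this kind can be sketched in a few lines. The weights, bias, vector sizes, and the choice to append the timestamp as a final component are illustrative assumptions; in the described system the weights would be learned by gradient descent. The function names are hypothetical.

```python
def fully_connected(x, weights, bias):
    """One dense layer without activation: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def fuse(git_v, web_v, search_v, social_v, weights, bias, timestamp):
    """Sketch of OSC-FusFv: concatenate, apply the dense layer, append the timestamp."""
    x = list(git_v) + list(web_v) + list(search_v) + list(social_v)
    if any(len(row) != len(x) for row in weights):
        raise ValueError("each weight row must match the concatenated input length")
    return fully_connected(x, weights, bias) + [float(timestamp)]

# Toy one-dimensional source vectors and hand-picked weights, for illustration only.
weights = [[0.25, 0.25, 0.25, 0.25],
           [1.0, 0.0, 0.0, -1.0]]
bias = [0.0, 0.5]
osc_vect = fuse([1.0], [2.0], [3.0], [4.0], weights, bias, timestamp=7)
print(osc_vect)
```

Keeping the timestamp as a separate component lets later stages order the vectors in time without mixing the time value into the learned features.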

[0133] The open-source community profiling model OSC-Snap-Profiler uses the feature vectors of open-source communities, OSC-Vect, to create data profiles of open-source communities and output data tags and quantitative scores, OSC-Tag.

[0134] The core of the open-source community analyzer OSC-Analysis is a convolutional neural network model based on a multi-head self-attention mechanism. It processes the open-source community feature vector sequence OSC-Seq to obtain contextual features and works together with the open-source community development evaluation and profiling model OSC-Profiler and the open-source community development predictor OSC-Predict to output a data profile of the open-source community and predicted feature vectors. Specifically, the feature vector sequence OSC-Seq is composed of feature vectors OSC-Vect arranged in the time order of open-source community development. The open-source community development evaluation and profiling model OSC-Profiler takes the time-series characteristics of open-source community development into account, creates a data profile of the open-source community, and outputs data labels and quantitative scores OSC-Tag. The open-source community development predictor OSC-Predict predicts future community trends based on the current community development status and outputs the feature vector OSC-Next-Vect.
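To make the multi-head self-attention step concrete, the sketch below applies scaled dot-product self-attention to a short OSC-Seq. It is a simplification under stated assumptions: the learned query, key, value, and output projections are omitted (each head uses its slice of the input directly), and the convolutional part of the analyzer is not shown. The function names and sample sequence are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq (no learned projections)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        out.append([sum(w[t] * seq[t][j] for t in range(len(seq))) for j in range(d)])
    return out

def multi_head_self_attention(seq, num_heads):
    """Split each vector into num_heads slices, attend per slice, concatenate the results."""
    d = len(seq[0])
    if d % num_heads != 0:
        raise ValueError("vector length must be divisible by num_heads")
    hd = d // num_heads
    heads = [
        self_attention([v[h * hd:(h + 1) * hd] for v in seq])
        for h in range(num_heads)
    ]
    return [[x for head in heads for x in head[t]] for t in range(len(seq))]

# Made-up OSC-Seq with three time points and four features.
osc_seq = [[0.1, 0.9, 0.2, 0.4], [0.3, 0.7, 0.2, 0.5], [0.6, 0.4, 0.3, 0.9]]
context = multi_head_self_attention(osc_seq, num_heads=2)
print(len(context), len(context[0]))
```

Each output vector is a weighted average of the whole sequence, so every time point's contextual feature reflects the community's state at the other time points as well.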

[0135] The open-source community development optimizer OSC-Optimize generates improved optimization strategies based on the current open-source community feature vector OSC-Vect and the input open-source community data profile OSC-Tag.

[0136] Step S200, based on the historical evaluation data and labels, sequentially optimizes the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, optimizes the parameters of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, optimizes the parameters of the open-source community development predictor OSC-Predict, and optimizes the parameters of the open-source community development optimizer OSC-Optimize.

[0137] The parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler are optimized, including the following steps:

[0138] (1) Based on the time point, collect a large amount of data from various sources such as the open source community code hosting platform OSC-Git, the open source project community official website OSC-Website, the open source community Internet search results OSC-Search, and social media discussions OSC-Social, and label the community development status.

[0139] (2) Connect the open-source community feature extractor OSC-GenFv with the open-source community profile model OSC-Snap-Profiler. Set the data normalization method of the code platform-related data feature vector generator Git-GenFv according to the configured quantitative index Git-Index of the code hosting platform OSC-Git. Set the initialization parameters of the feature extraction module from an existing general-purpose BERT model, and initialize the fusion module. Fix the parameter values of the feature vectors Website-Vect, Search-Vect, and Social-Vect and, based on the labels, use the gradient descent optimization algorithm to train the generator Git-GenFv and the fusion generator OSC-FusFv of the open-source community feature extractor OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0140] (3) Fix the model parameters of the code platform-related data feature vector generator Git-GenFv and the parameter values of the feature vectors Search-Vect and Social-Vect. Based on the labels, use the gradient descent optimization algorithm to train the generator Website-GenFv and the fusion generator OSC-FusFv of the open-source community feature extractor OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0141] (4) Fix the model parameters of the code platform-related data feature vector generator Git-GenFv and the open-source project website-related data feature vector generator Website-GenFv, and fix the parameter values of the feature vector Social-Vect. Based on the labels, use the gradient descent optimization algorithm to train the generator Search-GenFv and the fusion generator OSC-FusFv of the open-source community feature extractor OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0142] (5) Fix the model parameters of the code platform-related data feature vector generator Git-GenFv, the open-source project website-related data feature vector generator Website-GenFv, and the Internet search-related data feature vector generator Search-GenFv. Based on the labels, use the gradient descent optimization algorithm to train the generator Social-GenFv and the fusion generator OSC-FusFv of the open-source community feature extractor OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0143] (6) Connect the open-source community feature extractor OSC-GenFv with the open-source community profile model OSC-Snap-Profiler. Based on the labels, use the gradient descent optimization algorithm for joint training to obtain the model parameters of the open-source community feature extractor OSC-GenFv and the open-source community profile model OSC-Snap-Profiler.
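Fixing some parameters while training others amounts to applying the gradient descent update only to the modules that are trainable in the current step. The sketch below shows one such update for step (3). It is a toy illustration: each module's parameters are a short list of numbers, the gradients are made up, and `masked_sgd_step` is a hypothetical helper rather than part of any library.

```python
def masked_sgd_step(params, grads, trainable, lr=0.1):
    """Apply one gradient descent update, leaving fixed modules untouched.

    params, grads: dict mapping module name -> list of floats.
    trainable: set of module names that may be updated in this step.
    """
    updated = {}
    for name, values in params.items():
        if name in trainable:
            updated[name] = [p - lr * g for p, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # fixed: copied unchanged
    return updated

# Toy parameters and gradients, for illustration only.
params = {
    "Git-GenFv": [1.0, 2.0],
    "Website-GenFv": [1.0],
    "OSC-FusFv": [0.5],
    "OSC-Snap-Profiler": [0.0],
}
grads = {
    "Git-GenFv": [0.5, 0.5],
    "Website-GenFv": [2.0],
    "OSC-FusFv": [1.0],
    "OSC-Snap-Profiler": [-1.0],
}
# Step (3): Git-GenFv is fixed; Website-GenFv, OSC-FusFv and OSC-Snap-Profiler are trained.
stage3_trainable = {"Website-GenFv", "OSC-FusFv", "OSC-Snap-Profiler"}
new_params = masked_sgd_step(params, grads, stage3_trainable)
print(new_params)
```

Deep learning frameworks usually achieve the same effect by excluding the fixed parameters from the optimizer or by disabling their gradients.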

[0144] The training of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profiling model OSC-Profiler includes the following steps:

[0145] (1) Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0146] (2) Set the model initialization parameters of the open source community development evaluation profile model OSC-Profiler based on the open source community profile model OSC-Snap-Profiler;

[0147] (3) Connect the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler;

[0148] (4) Using historical evaluation data as input, generate feature vector OSC-Vect based on the trained open-source community feature generator OSC-GenFv and open-source community profile model OSC-Snap-Profiler, and generate vector sequence OSC-Seq of feature vector OSC-Vect according to the time progression.

[0149] (5) Using the vector sequence OSC-Seq as input, and based on the labeled data, the gradient descent optimization algorithm is used to train the open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler to obtain the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler.
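Generating the vector sequence OSC-Seq "according to the time progression" can be read as ordering the time-stamped OSC-Vect values and, optionally, cutting the ordered list into fixed-length windows for training. The sketch below assumes ISO date strings as time points and a sliding window; neither detail is specified in the text, and the function names and values are hypothetical.

```python
def build_osc_seq(stamped_vectors):
    """Order (time_point, OSC-Vect) pairs by time and return the vectors only.

    ISO date strings such as "2022-07-01" sort correctly as plain strings.
    """
    return [vec for _, vec in sorted(stamped_vectors, key=lambda tv: tv[0])]

def sliding_windows(seq, window):
    """Cut the ordered sequence into overlapping training windows of equal length."""
    return [seq[i:i + window] for i in range(len(seq) - window + 1)]

# Made-up OSC-Vect values collected out of order.
stamped = [
    ("2022-03-01", [0.3, 0.5]),
    ("2022-01-01", [0.1, 0.4]),
    ("2022-04-01", [0.4, 0.7]),
    ("2022-02-01", [0.2, 0.6]),
]
osc_seq = build_osc_seq(stamped)
windows = sliding_windows(osc_seq, window=3)
print(osc_seq)
print(len(windows))
```

Each window can then be paired with the label of its last time point when training OSC-Analysis and OSC-Profiler.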

[0150] In the Git-GenFv feature vector generator for code platform-related data, the fusion module can fuse feature vectors by concatenation or by using a fully connected layer. In this embodiment, the fusion module fuses feature vectors by concatenation. During model training, parameter optimization of the fusion module can be disregarded.

[0151] Training the open-source community development predictor OSC-Predict involves the following steps:

[0152] (1) Fix the model parameters of the open source community analyzer OSC-Analysis, and connect the open source community analyzer OSC-Analysis and the open source community development predictor OSC-Predict to form a model network;

[0153] (2) Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0154] (3) Using the vector sequence OSC-Seq as input, the feature vector OSC-Next-Vect for the next time point is generated through the model parameters of the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict. The error between the two is calculated, and the error is backpropagated to update the model parameters of the open-source community development predictor OSC-Predict, so as to obtain the trained open-source community development predictor OSC-Predict.

[0155] Training the open-source community development optimizer OSC-Optimize includes the following steps:

[0156] (1) Based on the operation of the open source community, set up a variety of improvement and optimization strategies;

[0157] (2) Using the acquired training data as input, the feature vector OSC-Vect is generated by the trained open source community feature extractor OSC-GenFv, and the open source community is profiled based on the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler to obtain the open source community data label and quantitative score OSC-Tag.

[0158] (3) Based on the data tags and quantitative scores OSC-Tag from the open source community, set the tags and quantitative scores to be optimized, and mark the corresponding improvement and optimization strategies;

[0159] (4) The open-source community development optimizer OSC-Optimize is trained using the gradient descent optimization algorithm to obtain the trained open-source community development optimizer OSC-Optimize.

[0160] Step S300 includes the following steps:

[0161] S310. Based on the set time points, obtain evaluation data from multiple data sources;

[0162] S320. For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect.

[0163] S330. Following the progression of the time points, generate the feature vector sequence OSC-Seq of the feature vector OSC-Vect;

[0164] S340. Using the feature vector sequence OSC-Seq as input, the open source community is profiled using the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler, generating data tags and quantitative scores OSC-Tag for the open source community.

[0165] S350. Using the feature vector sequence OSC-Seq as input, the trained open-source community analyzer OSC-Analysis and open-source community development predictor OSC-Predict are used to generate a feature vector OSC-Next-Vect for future moments. Using the feature vector OSC-Next-Vect as input, the trained open-source community analyzer OSC-Analysis and open-source community development evaluation and profiling model OSC-Profiler are used to continuously perform data profiling analysis on the open-source community, generating data tags and quantitative scores OSC-Tag for the open-source community.

[0166] S360. Combining the two community profiles (current and predicted), output a series of open source community data tags and quantitative scores OSC-Tag in chronological order, which are used to evaluate the current and future development of the open source community;

[0167] S370. Simulate open-source community development scenarios and generate various evaluation directions for the future development of open-source communities;

[0168] S380. Based on the time-series open-source community feature vector OSC-Vect and the open-source community data labels and quantitative scores OSC-Tag obtained from community profiling, optimize the open-source community through the trained open-source community development optimizer OSC-Optimize to generate improvement and optimization strategies;

[0169] S390. Regularly acquire evaluation data from multiple data sources and continuously optimize the model parameters of the open-source community feature generator OSC-GenFv, the open-source community profile model, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize.
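Steps S340 to S360 describe profiling the current sequence, predicting future feature vectors, profiling those as well, and combining the results in chronological order. The sketch below shows that control flow only. The trained models are replaced by toy stand-ins (an averaging "profiler" and a linear-extrapolation "predictor"), and all names, the horizon parameter, and the sample values are hypothetical.

```python
def evaluate_and_forecast(osc_seq, profile, predict_next, horizon):
    """Profile the current sequence, then roll the predictor forward `horizon` steps.

    profile: callable taking a sequence and returning an OSC-Tag-like result.
    predict_next: callable taking a sequence and returning the next OSC-Vect.
    Returns a chronological list of (label, tag) pairs.
    """
    timeline = [("current", profile(osc_seq))]       # S340
    rolling = list(osc_seq)
    for step in range(1, horizon + 1):               # S350
        rolling.append(predict_next(rolling))
        timeline.append((f"t+{step}", profile(rolling)))
    return timeline                                  # S360: combined, in time order

def toy_profile(seq):
    """Stand-in for OSC-Analysis + OSC-Profiler: score = mean of the latest vector."""
    last = seq[-1]
    return {"score": round(sum(last) / len(last), 3)}

def toy_predict_next(seq):
    """Stand-in for OSC-Analysis + OSC-Predict: linear extrapolation of the last step."""
    prev, last = seq[-2], seq[-1]
    return [b + (b - a) for a, b in zip(prev, last)]

osc_seq = [[0.2, 0.4], [0.3, 0.5]]  # made-up OSC-Vect values
timeline = evaluate_and_forecast(osc_seq, toy_profile, toy_predict_next, horizon=2)
for label, tag in timeline:
    print(label, tag)
```

Feeding each predicted vector back into the sequence is one way to read "continuously perform data profiling analysis"; the patent does not state how many future time points are generated.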

[0170] Steps S310 and S320 are specifically implemented as follows:

[0171] (1) Select an open source community, obtain the code hosting platform address and official website address, set search keywords, select multiple Internet search engines and social media websites, and set time points;

[0172] (2) Obtain access permissions to its code hosting platform interface, call the interface of the open source code hosting platform, obtain current and historical quantitative indicators Git-Index data, including the number of stars, the number of forks, the number of issues, the number of merges, the number of contributors, the number of documents, the number of dependent libraries, the update frequency, etc., and at the same time obtain the code platform log text Git-Log.

[0173] (3) Input according to the set time point order, normalize the Git-Index data through the code platform related data feature vector generator Git-GenFv, extract the feature data of the code platform log text Git-Log, perform vector concatenation and fusion, and obtain the feature vector Git-Vect.

[0174] (4) Obtain the data of the official website of the open source project community, OSC-Website, in the order of the set time points, including website documents, news, discussion groups, wiki, etc., and generate the feature vector Website-Vect through the aforementioned open source project official website related data feature vector generator Website-GenFv;

[0175] (5) In the order of the set time points, query multiple search engines with the set keyword data OSC-Search, obtain the query results, select and filter the results according to the characteristics of each search engine, and combine the results of the multiple search engines to generate the feature vector Search-Vect through the Internet search related data feature vector generator Search-GenFv.

[0176] (6) In the order of the set time points, use the social media related data feature vector generator Social-GenFv to generate the feature vector Social-Vect from the events and comments in the open-source community related social media discussions OSC-Social, fully considering the sentiment characteristics of the comments at each time point.

[0177] (7) Using the feature vector fusion unit OSC-FusFv, fuse the feature vectors Git-Vect, Website-Vect, Search-Vect and Social-Vect of the same time point, and add a timestamp component to generate the feature vector OSC-Vect.
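Steps (3) to (7) above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `SourceVectors`, the function names, and the toy values are hypothetical, and plain concatenation stands in for the trained fusion unit OSC-FusFv.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceVectors:
    """Per-source feature vectors for one time point (placeholder values)."""
    timestamp: float            # assumed encoding, e.g. days since the first time point
    git_vect: List[float]
    website_vect: List[float]
    search_vect: List[float]
    social_vect: List[float]


def fuse(sv: SourceVectors) -> List[float]:
    """Stand-in for OSC-FusFv: concatenate the four vectors, then append the timestamp."""
    return sv.git_vect + sv.website_vect + sv.search_vect + sv.social_vect + [sv.timestamp]


def build_osc_vects(points: List[SourceVectors]) -> List[List[float]]:
    """Return one OSC-Vect per time point, in chronological order."""
    return [fuse(sv) for sv in sorted(points, key=lambda p: p.timestamp)]


points = [
    SourceVectors(1.0, [0.4], [0.2], [0.7], [0.1]),
    SourceVectors(0.0, [0.3], [0.2], [0.5], [0.0]),
]
osc_vects = build_osc_vects(points)
```

Sorting by timestamp before fusing keeps the output aligned with the set time point order even when the sources are collected out of order.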

[0178] This embodiment takes into account the main factors that influence the development of open-source communities. Drawing on multiple data sources, namely open-source code hosting platforms, community websites, Internet search engines, and social media, it uses deep learning to model the evaluation of open-source community development. It designs an open-source community feature generator, an open-source community analyzer, an open-source community development evaluation profile model, and an open-source community development predictor, and performs targeted feature extraction and data fusion to uncover the deep connections among the factors that influence community development. Combined with expert knowledge in the field of community governance, it generates an open-source community profile and thereby evaluates open-source community development more reasonably and accurately. Compared with traditional expert evaluation, the method makes fuller use of existing Internet resources such as code hosting platforms, search engines, and social media, and it accounts for the temporal characteristics of community development, so the development status of a community is evaluated more reasonably and effectively. Because the evaluation does not rest on star and fork counts alone, it is less affected when community operators maliciously inflate those counts, and it gives a more reasonable assessment of the maturity and development potential of a community. By predicting future development trends and simulating targeted development scenarios, the method proposes corresponding operation and governance optimization strategies that guide improvements in community operation and governance efficiency, thereby enhancing the value and influence of the open-source community. In addition, relevant data is continuously collected from the open-source community and the evaluation model is continuously optimized to meet the personalized development needs of the community.

[0179] Example 2:

[0180] The present invention discloses an open-source community development evaluation system based on deep learning, which includes a model building module, a model training module, and a community evaluation guidance module. The system can evaluate and guide the development of open-source communities through the method disclosed in Example 1.

[0181] The model building module is used to construct the following models. It constructs an open-source community feature generator OSC-GenFv based on a neural network model; OSC-GenFv extracts and fuses features from the evaluation data of multiple data sources to obtain the feature vector OSC-Vect. It constructs an open-source community profile model OSC-Snap-Profiler, which creates a data profile of the open-source community based on the feature vector OSC-Vect and outputs the data tags and quantitative scores OSC-Tag of the open-source community. It constructs an open-source community analyzer OSC-Analysis based on a convolutional neural network model with a multi-head self-attention mechanism; OSC-Analysis analyzes the open-source community based on the feature vector sequence OSC-Seq, which consists of time-ordered feature vectors OSC-Vect. It constructs an open-source community development evaluation profile model OSC-Profiler, which works in conjunction with OSC-Analysis to create a data profile of the open-source community based on the output of OSC-Analysis and outputs the data tags and quantitative scores OSC-Tag of the open-source community. It constructs an open-source community development predictor OSC-Predict, which works in conjunction with OSC-Analysis to predict the development trend of the open-source community based on the output of OSC-Analysis and outputs the predicted feature vector OSC-Next-Vect for the next time point. Finally, it constructs an open-source community development optimizer OSC-Optimize, which optimizes the development of the open-source community based on the current feature vector OSC-Vect and the data profile OSC-Tag and outputs improvement and optimization strategies.

[0182] In practice, the data sources include the code hosting platform of the open-source community, the official website of the open-source community, Internet search results for the open-source community, and media discussions of the open-source community. The evaluation data for the code hosting platform includes the general data metric Git-Index and the code hosting platform log text data Git-Log. The general data metric Git-Index includes the number of stars, forks, issues, merges, contributors, documents, and dependent libraries, and the update frequency. The evaluation data for the official website includes website documentation data, news data, discussion group data, and wiki data. The evaluation data for Internet search results is the set of query results obtained from multiple search engines based on the keyword OSC-Search. The evaluation data for media discussions is the social media discussion data OSC-Social.

[0183] Corresponding to the above data sources, the open-source community feature generator OSC-GenFv in this embodiment includes Git-GenFv (code platform related data feature vector generator), Website-GenFv (open source project official website related data feature vector generator), Search-GenFv (Internet search related data feature vector generator), Social-GenFv (social media related data feature vector generator), and OSC-FusFv (feature vector fusion unit).

[0184] The code platform related data feature vector generator Git-GenFv includes a data normalization module, a feature extractor based on a BERT semantic model and time series, and a fusion module. The data normalization module normalizes the general data metric Git-Index to obtain a feature vector. The feature extractor extracts features from the code hosting platform log text data Git-Log to obtain a feature vector. The fusion module fuses the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect.
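The structure of Git-GenFv can be illustrated with a short sketch. The assumptions here are explicit: min-max scaling is only one possible normalization (the text does not name a method), the metric names and values are invented, and `log_features` is a placeholder for the output of the BERT-based feature extractor, which is not implemented.

```python
from typing import Dict, List


def min_max_normalize(history: List[Dict[str, float]]) -> List[List[float]]:
    """Scale each Git-Index metric to [0, 1] across the collected time points."""
    keys = sorted(history[0])
    lo = {k: min(h[k] for h in history) for k in keys}
    hi = {k: max(h[k] for h in history) for k in keys}
    return [
        # A metric that never changes carries no signal, so map it to 0.0.
        [(h[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0 for k in keys]
        for h in history
    ]


def git_vect(index_vec: List[float], log_vec: List[float]) -> List[float]:
    """Fusion module, here implemented as plain concatenation."""
    return index_vec + log_vec


history = [
    {"stars": 100, "forks": 20, "contributors": 5},
    {"stars": 300, "forks": 40, "contributors": 5},
]
log_features = [[0.1, 0.9], [0.3, 0.7]]  # placeholder for Git-Log text features
normalized = min_max_normalize(history)
git_vects = [git_vect(n, f) for n, f in zip(normalized, log_features)]
```

The metrics are emitted in sorted key order so that every time point produces a vector with the same layout.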

[0185] The open source project official website related data feature vector generator Website-GenFv is a text recognition neural network model based on a language model. It extracts features from the evaluation data whose data source is the official website of the open-source community, and obtains the feature vector Website-Vect.

[0186] The Internet search related data feature vector generator Search-GenFv extracts features, based on a text recognition semantic extraction model, from the evaluation data whose data source is Internet search results for the open-source community, and obtains the feature vector Search-Vect.

[0187] The social media related data feature vector generator Social-GenFv extracts features, using a neural network model based on text recognition and sentiment analysis, from the evaluation data whose data source is media discussions of the open-source community, and obtains the feature vector Social-Vect.

[0188] The feature vector fusion engine OSC-FusFv fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer, and adds a timestamp to generate the feature vector OSC-Vect.
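A fully connected fusion of this kind can be sketched in a few lines. The layer sizes, the random initial weights, and the function names are assumptions made for illustration; in the described system the weights would be learned during training.

```python
import random
from typing import List


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer without activation: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def fuse_fc(vectors: List[List[float]], weights: List[List[float]],
            bias: List[float], timestamp: float) -> List[float]:
    """Concatenate the source vectors, project them, then append the timestamp."""
    x = [v for vec in vectors for v in vec]
    return dense(x, weights, bias) + [timestamp]


rng = random.Random(0)
in_dim, out_dim = 8, 4  # hypothetical sizes
W = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
b = [0.0] * out_dim
git, web, search, social = [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]
osc_vect = fuse_fc([git, web, search, social], W, b, timestamp=3.0)
```

Appending the timestamp after the projection keeps it as an explicit component of OSC-Vect rather than mixing it into the learned features.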

[0189] The open-source community profiling model OSC-Snap-Profiler uses the feature vectors of open-source communities, OSC-Vect, to create data profiles of open-source communities and output data tags and quantitative scores, OSC-Tag.

[0190] The core of the open-source community analyzer OSC-Analysis is a convolutional neural network model based on a multi-head self-attention mechanism. It processes the open-source community feature vector sequence OSC-Seq to obtain contextual features, and it is combined with the open-source community development evaluation profile model OSC-Profiler and the open-source community development predictor OSC-Predict to output a data profile of the open-source community and a predicted feature vector. Specifically, the feature vector sequence OSC-Seq is composed of the feature vectors OSC-Vect arranged in the time order of open-source community development. The open-source community development evaluation profile model OSC-Profiler takes the time-series characteristics of community development into account, creates a data profile of the open-source community, and outputs the data tags and quantitative scores OSC-Tag. The open-source community development predictor OSC-Predict predicts future community trends based on the current community development status and outputs the feature vector OSC-Next-Vect.
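To show how self-attention derives contextual features from OSC-Seq, the following sketch implements only the attention step. It is a simplification under stated assumptions: the query, key, and value projections are identity maps instead of learned matrices, heads are formed by splitting each vector into equal chunks, and the convolutional part of the analyzer is omitted.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(seq: List[List[float]]) -> List[List[float]]:
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    scale = math.sqrt(len(seq[0]))
    out = []
    for q in seq:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in seq])
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(len(q))])
    return out


def multi_head_self_attention(seq: List[List[float]], heads: int) -> List[List[float]]:
    """Split each vector into `heads` equal chunks, attend per chunk, re-concatenate."""
    dim = len(seq[0])
    assert dim % heads == 0, "vector size must be divisible by the number of heads"
    size = dim // heads
    per_head = [
        attention_head([v[h * size:(h + 1) * size] for v in seq]) for h in range(heads)
    ]
    return [[x for head in per_head for x in head[t]] for t in range(len(seq))]


osc_seq = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.1, 0.4, 0.3], [0.9, 0.8, 0.1, 0.0]]
context = multi_head_self_attention(osc_seq, heads=2)
```

Each output vector is a weighted average of the whole sequence, which is how a time point acquires context from the other time points.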

[0191] The open-source community development optimizer OSC-Optimize generates improved optimization strategies based on the current open-source community feature vector OSC-Vect and the input open-source community data profile OSC-Tag.

[0192] The model training module is used to collect historical evaluation data from multiple data sources over time to evaluate open-source communities, and to label the development status of open-source communities. Based on the historical evaluation data and labels, the module is used to train the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize using the gradient descent optimization algorithm.

[0193] In this embodiment, the model training module is used to optimize the following parameters in sequence, based on the historical evaluation data and labels: first the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler; then the parameters of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler; then the parameters of the open-source community development predictor OSC-Predict; and finally the parameters of the open-source community development optimizer OSC-Optimize.

[0194] As a specific embodiment, the model training module optimizes the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler through the following steps:

[0195] (1) Based on the time point, collect a large amount of data from various sources such as the open source community code hosting platform OSC-Git, the open source project community official website OSC-Website, the open source community Internet search results OSC-Search, and social media discussions OSC-Social, and label the community development status.

[0196] (2) Connect the open-source community feature generator OSC-GenFv with the open-source community profile model OSC-Snap-Profiler. Based on the set quantitative indicators Git-Index of the code hosting platform OSC-Git, set the data normalization calculation method of the code platform related data feature vector generator Git-GenFv. Based on an existing general BERT model, set the initialization parameters of the feature extractor, initialize the fusion module, and fix the values of the feature vectors Website-Vect, Search-Vect and Social-Vect. Then, based on the labels, use the gradient descent optimization algorithm to train the generator Git-GenFv and the fusion unit OSC-FusFv of the open-source community feature generator OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0197] (3) Fix the model parameters of the code platform related data feature vector generator Git-GenFv and the values of the feature vectors Search-Vect and Social-Vect. Based on the labels, use the gradient descent optimization algorithm to train the generator Website-GenFv and the fusion unit OSC-FusFv of the open-source community feature generator OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0198] (4) Fix the model parameters of the code platform related data feature vector generator Git-GenFv and the open source project official website related data feature vector generator Website-GenFv, and fix the values of the feature vector Social-Vect. Based on the labels, use the gradient descent optimization algorithm to train the generator Search-GenFv and the fusion unit OSC-FusFv of the open-source community feature generator OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0199] (5) Fix the model parameters of the code platform related data feature vector generator Git-GenFv, the open source project official website related data feature vector generator Website-GenFv, and the Internet search related data feature vector generator Search-GenFv. Based on the labels, use the gradient descent optimization algorithm to train the generator Social-GenFv and the fusion unit OSC-FusFv of the open-source community feature generator OSC-GenFv, together with the open-source community profile model OSC-Snap-Profiler.

[0200] (6) Connect the open-source community feature generator OSC-GenFv with the open-source community profile model OSC-Snap-Profiler. Based on the labels, use the gradient descent optimization algorithm to train them jointly and obtain the model parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler.
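The staged schedule in steps (2) to (6) can be summarized as a table of which modules are trainable at each stage. The sketch below only builds that table; the function names are hypothetical, and the actual gradient descent updates are left out. It also simplifies one detail: where the text fixes the values of feature vectors whose generators have not yet been trained, the sketch simply marks those generators as not trainable.

```python
from typing import Dict, List, Set

GENERATORS = ["Git-GenFv", "Website-GenFv", "Search-GenFv", "Social-GenFv"]
SHARED = ["OSC-FusFv", "OSC-Snap-Profiler"]  # trained in every stage


def staged_schedule() -> List[Set[str]]:
    """Return the set of trainable modules for each training stage."""
    stages = [{g, *SHARED} for g in GENERATORS]  # one generator at a time
    stages.append({*GENERATORS, *SHARED})        # final joint pass, step (6)
    return stages


def trainable_flags(stage: Set[str]) -> Dict[str, bool]:
    """Map every module name to whether it is updated in the given stage."""
    return {name: name in stage for name in GENERATORS + SHARED}


schedule = staged_schedule()
for i, stage in enumerate(schedule, start=1):
    frozen = [n for n, on in trainable_flags(stage).items() if not on]
    print(f"stage {i}: train {sorted(stage)}, frozen {frozen}")
```

Training one generator at a time while the others stay fixed limits how many parameters move at once, and the final joint pass then fine-tunes all modules together.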

[0201] The model training module is used to train the open-source community analyzer OSC-Analysis and the open-source community development evaluation profiling model OSC-Profiler through the following steps:

[0202] (1) Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0203] (2) Set the model initialization parameters of the open source community development evaluation profile model OSC-Profiler based on the open source community profile model OSC-Snap-Profiler;

[0204] (3) Connect the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler;

[0205] (4) Using historical evaluation data as input, generate feature vector OSC-Vect based on the trained open-source community feature generator OSC-GenFv and open-source community profile model OSC-Snap-Profiler, and generate vector sequence OSC-Seq of feature vector OSC-Vect according to the time progression.

[0206] (5) Using the vector sequence OSC-Seq as input, and based on the labeled data, the gradient descent optimization algorithm is used to train the open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler to obtain the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler.
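Building the training inputs for step (5) can be sketched as below. The text does not say how OSC-Seq is cut into training samples, so the fixed-length sliding window, the one-label-per-time-point layout, and the label strings are all assumptions made for illustration.

```python
from typing import List, Tuple


def make_sequences(
    osc_vects: List[List[float]], labels: List[str], window: int
) -> List[Tuple[List[List[float]], str]]:
    """Pair each window of consecutive OSC-Vect with the label of its last time point."""
    return [
        (osc_vects[end - window:end], labels[end - 1])
        for end in range(window, len(osc_vects) + 1)
    ]


osc_vects = [[0.1], [0.2], [0.3], [0.4]]          # already in chronological order
labels = ["seed", "seed", "growth", "growth"]     # hypothetical development labels
samples = make_sequences(osc_vects, labels, window=3)
```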

[0207] In the Git-GenFv feature vector generator for code platform-related data, the fusion module can fuse feature vectors by concatenation or by using a fully connected layer. In this embodiment, the fusion module fuses feature vectors by concatenation. During model training, parameter optimization of the fusion module can be disregarded.

[0208] The model training module is used to train the open-source community development predictor OSC-Predict through the following steps:

[0209] (1) Fix the model parameters of the open source community analyzer OSC-Analysis, and connect the open source community analyzer OSC-Analysis and the open source community development predictor OSC-Predict to form a model network;

[0210] (2) Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the vector sequence OSC-Seq of the feature vector OSC-Vect is generated according to the time progression.

[0211] (3) Using the vector sequence OSC-Seq as input, generate the predicted feature vector OSC-Next-Vect for the next time point through the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict. Calculate the error between the predicted feature vector OSC-Next-Vect and the actual feature vector OSC-Vect at that time point, and backpropagate the error to update the model parameters of the open-source community development predictor OSC-Predict, so as to obtain the trained open-source community development predictor OSC-Predict.
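The predictor training loop in steps (1) to (3) can be illustrated with a toy model. Everything model-specific here is an assumption: a time average stands in for the frozen OSC-Analysis, the class `Predictor` is a hypothetical per-dimension linear map, the error is measured as mean squared error, and the history values are invented.

```python
from typing import List


def analyzer(seq: List[List[float]]) -> List[float]:
    """Frozen stand-in for OSC-Analysis: average the sequence over time."""
    n = len(seq)
    return [sum(v[i] for v in seq) / n for i in range(len(seq[0]))]


class Predictor:
    """Hypothetical per-dimension linear predictor: next[i] = w[i] * ctx[i] + b[i]."""

    def __init__(self, dim: int):
        self.w = [1.0] * dim
        self.b = [0.0] * dim

    def __call__(self, ctx: List[float]) -> List[float]:
        return [w * c + b for w, c, b in zip(self.w, ctx, self.b)]

    def step(self, ctx: List[float], target: List[float], lr: float) -> float:
        """One gradient descent step on the mean squared error; returns the pre-update loss."""
        errs = [p - t for p, t in zip(self(ctx), target)]
        n = len(errs)
        for i, e in enumerate(errs):
            self.w[i] -= lr * 2 * e * ctx[i] / n
            self.b[i] -= lr * 2 * e / n
        return sum(e * e for e in errs) / n


history = [[0.1, 0.5], [0.2, 0.6], [0.3, 0.7], [0.4, 0.8], [0.5, 0.9]]
window = 2
# Each sample pairs a window of past vectors with the actual vector that followed it.
samples = [(history[t - window:t], history[t]) for t in range(window, len(history))]
model = Predictor(dim=2)
losses = []
for _ in range(200):
    losses.append(sum(model.step(analyzer(s), y, lr=0.1) for s, y in samples) / len(samples))
```

Only the predictor's parameters are updated; the analyzer stays fixed, matching step (1).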

[0212] The model training module is used to train the open-source community development optimizer OSC-Optimize through the following steps:

[0213] (1) Based on the operation of the open source community, set up a variety of improvement and optimization strategies;

[0214] (2) Using the acquired training data as input, the feature vector OSC-Vect is generated by the trained open source community feature extractor OSC-GenFv, and the open source community is profiled based on the trained open source community analyzer OSC-Analysis and the open source community development evaluation profile model OSC-Profiler to obtain the open source community data label and quantitative score OSC-Tag.

[0215] (3) Based on the data tags and quantitative scores OSC-Tag from the open source community, set the tags and quantitative scores to be optimized, and mark the corresponding improvement and optimization strategies;

[0216] (4) The open-source community development optimizer OSC-Optimize is trained using the gradient descent optimization algorithm to obtain the trained open-source community development optimizer OSC-Optimize.

[0217] The community evaluation guidance module is used to collect evaluation data from multiple data sources based on set time points. Using the evaluation data as input, the module evaluates and guides the open-source community through the trained open-source community feature generator OSC-GenFv, open-source community profile model, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize, thereby obtaining improvement and optimization strategies for the open-source community.

[0218] As a specific implementation, the community evaluation and guidance module is used to evaluate and guide the open-source community through the following steps:

[0219] (1) Based on the set time points, obtain evaluation data from multiple data sources;

[0220] (2) For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect;

[0221] (3) Following the progression of the time points, generate the feature vector sequence OSC-Seq from the feature vectors OSC-Vect;

[0222] (4) Using the feature vector sequence OSC-Seq as input, profile the open-source community through the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, and generate the data tags and quantitative scores OSC-Tag of the open-source community.

[0223] (5) Using the feature vector sequence OSC-Seq as input, the trained open source community analyzer OSC-Analysis and open source community development predictor OSC-Predict are used to generate the feature vector OSC-Next-Vect for the future moment. Using the feature vector OSC-Next-Vect as input, the trained open source community analyzer OSC-Analysis and open source community development evaluation and profile model OSC-Profiler are used to continuously perform data profile analysis for the open source community and generate data tags and quantitative scores OSC-Tag for the open source community.

[0224] (6) Combine the two community profiles (the one generated from the current feature vector sequence and the one generated from the predicted feature vector), and output a series of open-source community data tags and quantitative scores OSC-Tag in chronological order for evaluating the current and future development of the open-source community;

[0225] (7) Simulate open-source community development scenarios and generate multiple evaluation directions for the future development of the open-source community;

[0226] (8) Based on the time-series open-source community feature vectors OSC-Vect and the data tags and quantitative scores OSC-Tag of the open-source community obtained from the community profile, optimize the open-source community through the trained open-source community development optimizer OSC-Optimize to generate improvement and optimization strategies;

[0227] (9) Regularly obtain evaluation data from multiple data sources and continuously optimize the model parameters of the open source community feature generator OSC-GenFv, the open source community profile model, the open source community analyzer OSC-Analysis, the open source community development evaluation profile model OSC-Profiler, the open source community development predictor OSC-Predict, and the open source community development optimizer OSC-Optimize.
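Steps (4) to (6) amount to profiling the current sequence, predicting forward, and profiling again. The sketch below shows that control flow only. The callables `profile` and `predict` are placeholders for the trained models, the toy stand-ins are invented, and appending each predicted vector to the sequence before re-profiling is one possible reading of step (5), not a confirmed detail.

```python
from typing import Callable, List, Tuple

Vector = List[float]
VectorSeq = List[Vector]


def evaluate(
    osc_seq: VectorSeq,
    profile: Callable[[VectorSeq], str],     # stands in for OSC-Analysis + OSC-Profiler
    predict: Callable[[VectorSeq], Vector],  # stands in for OSC-Analysis + OSC-Predict
    horizon: int,
) -> List[Tuple[str, str]]:
    """Return (phase, tag) pairs: the current profile, then one per predicted step."""
    results = [("current", profile(osc_seq))]
    seq = list(osc_seq)  # copy so the caller's sequence is not modified
    for step in range(1, horizon + 1):
        seq.append(predict(seq))  # OSC-Next-Vect joins the sequence
        results.append((f"predicted+{step}", profile(seq)))
    return results


def toy_profile(seq: VectorSeq) -> str:
    return "growing" if seq[-1][0] > seq[0][0] else "stagnant"


def toy_predict(seq: VectorSeq) -> Vector:
    # Linear extrapolation from the last two vectors.
    return [2 * a - b for a, b in zip(seq[-1], seq[-2])]


timeline = evaluate([[0.2], [0.3]], toy_profile, toy_predict, horizon=2)
```

The returned list is already in chronological order, so it can serve directly as the combined series of tags described in step (6).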

[0228] The specific implementation of steps (1) and (2) is as follows:

[0229] (1) Select an open source community, obtain the code hosting platform address and official website address, set search keywords, select multiple Internet search engines and social media websites, and set time points;

[0230] (2) Obtain access permissions to its code hosting platform interface, call the interface of the open source code hosting platform, obtain current and historical quantitative indicators Git-Index data, including the number of stars, the number of forks, the number of issues, the number of merges, the number of contributors, the number of documents, the number of dependent libraries, the update frequency, etc., and at the same time obtain the code platform log text Git-Log.

[0231] (3) Input according to the set time point order, normalize the Git-Index data through the code platform related data feature vector generator Git-GenFv, extract the feature data of the code platform log text Git-Log, perform vector concatenation and fusion, and obtain the feature vector Git-Vect.

[0232] (4) Obtain the data of the official website of the open source project community, OSC-Website, in the order of the set time points, including website documents, news, discussion groups, wiki, etc., and generate the feature vector Website-Vect through the aforementioned open source project official website related data feature vector generator Website-GenFv;

[0233] (5) In the order of the set time points, query multiple search engines with the set keyword data OSC-Search, obtain the query results, select and filter the results according to the characteristics of each search engine, and combine the results of the multiple search engines to generate the feature vector Search-Vect through the Internet search related data feature vector generator Search-GenFv.

[0234] (6) In the order of the set time points, use the social media related data feature vector generator Social-GenFv to generate the feature vector Social-Vect from the events and comments in the open-source community related social media discussions OSC-Social, fully considering the sentiment characteristics of the comments at each time point.

[0235] (7) Using the feature vector fusion unit OSC-FusFv, fuse the feature vectors Git-Vect, Website-Vect, Search-Vect and Social-Vect of the same time point, and add a timestamp component to generate the feature vector OSC-Vect.

[0236] The present invention has been shown and described in detail above with reference to the accompanying drawings and preferred embodiments. However, the present invention is not limited to these disclosed embodiments. Based on the above embodiments, those skilled in the art will understand that further embodiments of the present invention can be obtained by combining the technical means of the different embodiments, and these embodiments also fall within the protection scope of the present invention.

Claims

1. A method for evaluating the development of open-source communities based on deep learning, characterized in that it comprises the following steps:
constructing, based on a neural network model, an open-source community feature generator OSC-GenFv, which is used to extract and fuse features from the evaluation data of multiple data sources to obtain the feature vector OSC-Vect;
constructing an open-source community profile model OSC-Snap-Profiler, which is used to create a data profile of the open-source community based on the feature vector OSC-Vect and to output the data tags and quantitative scores OSC-Tag of the open-source community;
constructing, based on a convolutional neural network model with a multi-head self-attention mechanism, an open-source community analyzer OSC-Analysis, which is used to analyze the open-source community based on the feature vector sequence OSC-Seq composed of time-ordered feature vectors OSC-Vect;
constructing an open-source community development evaluation profile model OSC-Profiler, which works with the open-source community analyzer OSC-Analysis to create a data profile of the open-source community based on the output of OSC-Analysis and to output the data tags and quantitative scores OSC-Tag of the open-source community;
constructing an open-source community development predictor OSC-Predict, which works with the open-source community analyzer OSC-Analysis to predict the development trend of the open-source community based on the output of OSC-Analysis and to output the predicted feature vector OSC-Next-Vect for the next time point;
constructing an open-source community development optimizer OSC-Optimize, which is used to optimize the development of the open-source community based on the feature vector OSC-Vect of the current open-source community and the data tags and quantitative scores OSC-Tag obtained from the community profile, and to output improvement and optimization strategies;
collecting, based on time, historical evaluation data for evaluating the open-source community from multiple data sources, and labeling the development status of the open-source community;
training, based on the historical evaluation data and labels and using the gradient descent optimization algorithm, the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize;
collecting evaluation data from multiple data sources based on set time points and, using the evaluation data as input, evaluating and guiding the open-source community through the trained open-source community feature generator OSC-GenFv, open-source community profile model, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize, so as to obtain improvement and optimization strategies for the open-source community;
wherein the multiple data sources include:
the code hosting platform of the open-source community, whose evaluation data includes the general data metric Git-Index and the code hosting platform log text data Git-Log, the general data metric Git-Index including the number of stars, the number of forks, the number of issues, the number of merges, the number of contributors, the number of documents, the number of dependent libraries, and the update frequency;
the official website of the open-source community, whose evaluation data includes website documentation data, news data, discussion group data, and wiki data;
Internet search results for the open-source community, whose evaluation data is the set of query results obtained from multiple search engines based on the keyword OSC-Search;
media discussions of the open-source community, whose evaluation data is the social media discussion data OSC-Social of the open-source community.

2. The method for evaluating the development of open-source communities based on deep learning according to claim 1, characterized in that the open-source community feature generator OSC-GenFv includes:
a code platform related data feature vector generator Git-GenFv, which includes a data normalization module, a feature extractor based on a BERT semantic model and time series, and a fusion module, wherein the data normalization module is used to normalize the general data metric Git-Index to obtain a feature vector, the feature extractor is used to extract features from the code hosting platform log text data Git-Log to obtain a feature vector, and the fusion module is used to fuse the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect;
an open source project official website related data feature vector generator Website-GenFv, which is a text recognition neural network model based on a language model and is used to extract features from the evaluation data of the official website of the open-source community to obtain the feature vector Website-Vect;
an Internet search related data feature vector generator Search-GenFv, which is used to extract features, based on a text recognition semantic extraction model, from the evaluation data of Internet search results for the open-source community to obtain the feature vector Search-Vect;
a social media related data feature vector generator Social-GenFv, which is used to extract features, using a neural network model based on text recognition and sentiment analysis, from the evaluation data of media discussions of the open-source community to obtain the feature vector Social-Vect;
a feature vector fusion unit OSC-FusFv, which fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer and adds a timestamp to generate the feature vector OSC-Vect.

3. The method for evaluating the development of open-source communities based on deep learning according to claim 2, characterized in that, based on the historical evaluation data and labels, the following are optimized in sequence: the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler; the parameters of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler; the parameters of the open-source community development predictor OSC-Predict; and the parameters of the open-source community development optimizer OSC-Optimize.
Optimizing the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler includes the following steps:
The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler;
Based on the acquired general data metric Git-Index, the normalization calculation method of the data normalization module is set; based on an existing general-purpose BERT model, the initialization parameters of the feature extractor are set; the fusion module is initialized; the values of the feature vectors Website-Vect, Search-Vect, and Social-Vect are fixed; and the gradient descent optimization algorithm is used to train the code platform data feature vector generator Git-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of the code platform data feature vector generator Git-GenFv and the values of the feature vectors Search-Vect and Social-Vect are fixed.
Based on the labeled data, the gradient descent optimization algorithm is used to train the official website data feature vector generator Website-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of the code platform data feature vector generator Git-GenFv and the official website data feature vector generator Website-GenFv are fixed, and the values of the feature vector Social-Vect are fixed; based on the labeled data, the gradient descent optimization algorithm is used to train the Internet search data feature vector generator Search-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of Git-GenFv, Website-GenFv, and Search-GenFv are fixed; based on the labeled data, the gradient descent optimization algorithm is used to train the social media data feature vector generator Social-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler; based on the labeled data, the gradient descent optimization algorithm is used to train both jointly, yielding the trained open-source community feature generator OSC-GenFv and the trained open-source community profile model OSC-Snap-Profiler.
Training the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler includes the following steps:
Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the trained open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect in chronological order;
The model initialization parameters of the open-source community development evaluation profile model OSC-Profiler are set based on the open-source community profile model OSC-Snap-Profiler;
The open-source community analyzer OSC-Analysis is connected to the open-source community development evaluation profile model OSC-Profiler;
Using the feature vector sequence OSC-Seq as input and based on the labeled data, the gradient descent optimization algorithm is used to train OSC-Analysis and OSC-Profiler, yielding the trained open-source community analyzer OSC-Analysis and the trained open-source community development evaluation profile model OSC-Profiler.
Training the open-source community development predictor OSC-Predict includes the following steps:
The model parameters of the open-source community analyzer OSC-Analysis are fixed, and OSC-Analysis and the open-source community development predictor OSC-Predict are connected to form a model network;
Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect in chronological order;
Using the feature vector sequence OSC-Seq as input, the feature vector OSC-Next-Vect for the next time point is generated through the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict; the error between the two is calculated and backpropagated to update the model parameters of OSC-Predict, yielding the trained open-source community development predictor OSC-Predict.
Training the open-source community development optimizer OSC-Optimize includes the following steps:
Based on the operation of the open-source community, various improvement and optimization strategies are set;
Using the acquired training data as input, the feature vector OSC-Vect is generated through the trained open-source community feature generator OSC-GenFv;
Based on the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, a data profile of the open-source community is created, yielding the data labels and quantitative scores OSC-Tag of the open-source community;
Based on the data labels and quantitative scores OSC-Tag, the labels and quantitative scores to be optimized are set, and the corresponding improvement and optimization strategies are marked;
The gradient descent optimization algorithm is used to train the open-source community development optimizer OSC-Optimize, yielding the trained open-source community development optimizer OSC-Optimize.
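The stage-by-stage training of OSC-GenFv described in this claim follows a regular pattern: each source generator is trained in turn together with OSC-FusFv and OSC-Snap-Profiler, earlier generators are frozen, the vectors of later sources are held fixed, and a final stage trains everything jointly. The sketch below only builds that schedule as plain data; the function name and dictionary keys are hypothetical, and no actual model training is performed.

```python
from typing import Dict, List

# Component identifiers taken from the claims.
GENERATORS = ["Git-GenFv", "Website-GenFv", "Search-GenFv", "Social-GenFv"]
SHARED = ["OSC-FusFv", "OSC-Snap-Profiler"]


def generator_schedule() -> List[Dict[str, List[str]]]:
    """Build the freeze/train plan for OSC-GenFv and OSC-Snap-Profiler."""
    stages = []
    for i, active in enumerate(GENERATORS):
        stages.append({
            # trained in this stage by gradient descent
            "train": [active] + SHARED,
            # generators trained in earlier stages keep their parameters
            "frozen": GENERATORS[:i],
            # output vectors of not-yet-trained sources are held constant
            "fixed_vectors": [g.replace("GenFv", "Vect")
                              for g in GENERATORS[i + 1:]],
        })
    # final stage: all generators, fusion unit and profile model jointly
    stages.append({"train": GENERATORS + SHARED,
                   "frozen": [], "fixed_vectors": []})
    return stages


schedule = generator_schedule()
for number, stage in enumerate(schedule, start=1):
    print(number, stage)
```

In a deep learning framework, "frozen" would typically translate to disabling gradient updates for those components, while "fixed_vectors" would be supplied as constant inputs to the fusion unit.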

4. The method for evaluating the development of open-source communities based on deep learning according to claim 3, characterized in that evaluation data is collected from multiple data sources based on set time points and, using this evaluation data as input, the open-source community is evaluated and guided through the trained open-source community feature generator OSC-GenFv, open-source community profile model OSC-Snap-Profiler, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize, yielding improvement and optimization strategies for the open-source community, including the following steps:
Based on the set time points, evaluation data is obtained from multiple data sources;
For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect;
Based on the set time points, the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect;
Using the feature vector sequence OSC-Seq as input, the open-source community is profiled by the trained open-source community analyzer OSC-Analysis and open-source community development evaluation profile model OSC-Profiler, generating the data labels and quantitative scores OSC-Tag of the open-source community;
Using the feature vector sequence OSC-Seq as input, the trained open-source community analyzer OSC-Analysis and open-source community development predictor OSC-Predict are used to generate the feature vector OSC-Next-Vect for future time points;
Using the feature vector OSC-Next-Vect as input, the trained open-source community analyzer OSC-Analysis and open-source community development evaluation profile model OSC-Profiler continue to profile the open-source community, generating the data labels and quantitative scores OSC-Tag of the open-source community;
The two community profiles are combined, and a series of data labels and quantitative scores OSC-Tag is output in chronological order for evaluating the current and future development of the open-source community;
Open-source community development scenarios are simulated to generate various evaluation directions for the future development of the open-source community;
Based on the time-ordered feature vectors OSC-Vect and the data labels and quantitative scores OSC-Tag obtained from the community profile, the trained open-source community development optimizer OSC-Optimize is used to optimize the open-source community and generate improvement and optimization strategies;
Evaluation data is acquired regularly from the multiple data sources, and the model parameters of the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize are continuously optimized.
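The evaluation steps of this claim (profile the current sequence, predict the next vector, profile again, output the results in chronological order) can be sketched as a small driver loop. Everything here is an assumption made for illustration: `profile` and `predict` are toy stand-ins for the trained OSC-Analysis/OSC-Profiler and OSC-Analysis/OSC-Predict pairs, and sliding the window forward by one step is one assumed way of feeding OSC-Next-Vect back into profiling.

```python
from typing import Callable, Dict, List

Vector = List[float]
Sequence_ = List[Vector]


def evaluate_community(seq: Sequence_,
                       profile: Callable[[Sequence_], Dict[str, float]],
                       predict: Callable[[Sequence_], Vector],
                       horizon: int = 1) -> List[Dict[str, float]]:
    """Return OSC-Tag style results for the current time point followed by
    results for `horizon` predicted future time points."""
    timeline = [profile(seq)]              # current development
    window = list(seq)
    for _ in range(horizon):
        next_vect = predict(window)        # plays the role of OSC-Next-Vect
        window = window[1:] + [next_vect]  # slide the window (assumption)
        timeline.append(profile(window))   # predicted development
    return timeline


# Toy stand-ins: score = first feature of the latest vector,
# prediction = linear extrapolation of the last two vectors.
def toy_profile(window: Sequence_) -> Dict[str, float]:
    return {"score": window[-1][0]}


def toy_predict(window: Sequence_) -> Vector:
    last, previous = window[-1], window[-2]
    return [l + (l - p) for l, p in zip(last, previous)]


timeline = evaluate_community([[1.0], [2.0], [3.0]],
                              toy_profile, toy_predict, horizon=2)
print(timeline)
```

Passing the models in as callables keeps the driver independent of how the trained networks are implemented.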

5. A deep learning-based open-source community development evaluation system, characterized in that the system includes:
A model building module, which is used to construct an open-source community feature generator OSC-GenFv based on a neural network model, the OSC-GenFv extracting and fusing features from evaluation data from multiple data sources to obtain the feature vector OSC-Vect; to construct an open-source community profile model OSC-Snap-Profiler, which creates a data profile of the open-source community based on the feature vector OSC-Vect and outputs data labels and quantitative scores OSC-Tag; to construct an open-source community analyzer OSC-Analysis based on a convolutional neural network model with a multi-head self-attention mechanism, which analyzes the open-source community based on a feature vector sequence OSC-Seq composed of time-ordered feature vectors OSC-Vect; to construct an open-source community development evaluation profile model OSC-Profiler, which works in conjunction with OSC-Analysis to create a data profile of the open-source community based on the output of OSC-Analysis and outputs data labels and quantitative scores OSC-Tag; to construct an open-source community development predictor OSC-Predict, which works in conjunction with OSC-Analysis to predict the development trend of the open-source community based on the output of OSC-Analysis and outputs the feature vector OSC-Next-Vect for the next time point; and to construct an open-source community development optimizer OSC-Optimize, which optimizes the development of the open-source community based on the current feature vector OSC-Vect and the data profile OSC-Tag and outputs improvement and optimization strategies;
A model training module, which is used to collect, based on time, historical evaluation data for evaluating the open-source community from multiple data sources and to label the development status of the open-source community, and, based on the historical evaluation data and labels, to train the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize using the gradient descent optimization algorithm;
A community evaluation guidance module, which is used to collect evaluation data from multiple data sources based on set time points and, taking the evaluation data as input, to evaluate and guide the open-source community through the trained open-source community feature generator OSC-GenFv, open-source community profile model OSC-Snap-Profiler, open-source community analyzer OSC-Analysis, open-source community development evaluation profile model OSC-Profiler, open-source community development predictor OSC-Predict, and open-source community development optimizer OSC-Optimize, thereby obtaining improvement and optimization strategies for the open-source community.
The multiple data sources include:
The code hosting platform of the open-source community, whose evaluation data includes the general data metric Git-Index and the code hosting platform log text data Git-Log, wherein the general data metric Git-Index includes the number of stars, forks, issues, merges, contributors, documents, and dependent libraries, as well as the update frequency;
The official website of the open-source community, whose evaluation data includes website documentation data, news data, discussion group data, and wiki data;
Internet search results for the open-source community, whose evaluation data is obtained from multiple search engines based on the keyword OSC-Search;
Media discussions about the open-source community, whose evaluation data is the social media discussion data OSC-Social of the open-source community.
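Claim 5 describes OSC-Analysis as a convolutional neural network with a multi-head self-attention mechanism operating on the sequence OSC-Seq. As a much simplified illustration of the attention part only, the sketch below computes single-head scaled dot-product self-attention with identity projections, so queries, keys, and values are all the input vectors themselves. The learned projection matrices, the multiple heads, and the convolutional layers of the claimed model are deliberately left out; this is not the claimed architecture.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention with Q = K = V = seq."""
    dim = len(seq[0])
    output = []
    for query in seq:
        # similarity of this time step to every time step in the sequence
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in seq]
        weights = softmax(scores)
        # weighted average of all time steps
        output.append([sum(w * value[j] for w, value in zip(weights, seq))
                       for j in range(dim)])
    return output


# Toy OSC-Seq with two time points and one feature each.
attended = self_attention([[0.0], [1.0]])
print(attended)
```

Each output vector is a weighted average of all time points, which is how attention lets the analysis of one time point draw on the rest of the community's history.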

6. The deep learning-based open-source community development evaluation system according to claim 5, characterized in that the open-source community feature generator OSC-GenFv includes:
A code platform data feature vector generator Git-GenFv, which includes a data normalization module, a feature extractor based on a BERT semantic model and time-series features, and a fusion module, wherein the data normalization module normalizes the general data metric Git-Index to obtain a feature vector, the feature extractor extracts features from the code hosting platform log text data Git-Log to obtain a feature vector, and the fusion module fuses the feature vectors output by the data normalization module and the feature extractor to obtain the final feature vector Git-Vect;
An official website data feature vector generator Website-GenFv, which is a text recognition neural network model based on a language model and extracts features from the evaluation data of the official website of the open-source community to obtain the feature vector Website-Vect;
An Internet search data feature vector generator Search-GenFv, which extracts features from the evaluation data of the Internet search results for the open-source community based on a text recognition semantic extraction model to obtain the feature vector Search-Vect;
A social media data feature vector generator Social-GenFv, which extracts features from the evaluation data of the media discussions about the open-source community using a neural network model based on text recognition and sentiment analysis to obtain the feature vector Social-Vect;
A feature vector fusion unit OSC-FusFv, which fuses the feature vectors Git-Vect, Website-Vect, Search-Vect, and Social-Vect through a fully connected layer and adds a timestamp to generate the feature vector OSC-Vect.

7. The deep learning-based open-source community development evaluation system according to claim 6, characterized in that the model training module is used to optimize, based on the historical evaluation data and labels, the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, of the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, of the open-source community development predictor OSC-Predict, and of the open-source community development optimizer OSC-Optimize.
The model training module is used to optimize the parameters of the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler through the following steps:
The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler;
Based on the acquired general data metric Git-Index, the normalization calculation method of the data normalization module is set; based on an existing general-purpose BERT model, the initialization parameters of the feature extractor are set; the fusion module is initialized; the values of the feature vectors Website-Vect, Search-Vect, and Social-Vect are fixed; and the gradient descent optimization algorithm is used to train the code platform data feature vector generator Git-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of the code platform data feature vector generator Git-GenFv and the values of the feature vectors Search-Vect and Social-Vect are fixed.
Based on the labeled data, the gradient descent optimization algorithm is used to train the official website data feature vector generator Website-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of the code platform data feature vector generator Git-GenFv and the official website data feature vector generator Website-GenFv are fixed, and the values of the feature vector Social-Vect are fixed; based on the labeled data, the gradient descent optimization algorithm is used to train the Internet search data feature vector generator Search-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The model parameters of Git-GenFv, Website-GenFv, and Search-GenFv are fixed; based on the labeled data, the gradient descent optimization algorithm is used to train the social media data feature vector generator Social-GenFv, the feature vector fusion unit OSC-FusFv, and the open-source community profile model OSC-Snap-Profiler;
The open-source community feature generator OSC-GenFv is connected to the open-source community profile model OSC-Snap-Profiler; based on the labeled data, the gradient descent optimization algorithm is used to train both jointly, yielding the trained open-source community feature generator OSC-GenFv and the trained open-source community profile model OSC-Snap-Profiler.
The model training module is used to train the open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler through the following steps:
Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect in chronological order;
The model initialization parameters of the open-source community development evaluation profile model OSC-Profiler are set based on the open-source community profile model OSC-Snap-Profiler;
The open-source community analyzer OSC-Analysis is connected to the open-source community development evaluation profile model OSC-Profiler;
Using the feature vector sequence OSC-Seq as input and based on the labeled data, the gradient descent optimization algorithm is used to train OSC-Analysis and OSC-Profiler, yielding the trained open-source community analyzer OSC-Analysis and the trained open-source community development evaluation profile model OSC-Profiler.
The model training module is used to train the open-source community development predictor OSC-Predict through the following steps:
The model parameters of the open-source community analyzer OSC-Analysis are fixed, and OSC-Analysis and the open-source community development predictor OSC-Predict are connected to form a model network;
Using historical evaluation data as input, the feature vector OSC-Vect is generated based on the open-source community feature generator OSC-GenFv and the open-source community profile model OSC-Snap-Profiler, and the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect in chronological order;
Using the feature vector sequence OSC-Seq as input, the feature vector OSC-Next-Vect for the next time point is generated through the open-source community analyzer OSC-Analysis and the open-source community development predictor OSC-Predict; the error between the two is calculated and backpropagated to update the model parameters of OSC-Predict, yielding the trained open-source community development predictor OSC-Predict.
The model training module is used to train the open-source community development optimizer OSC-Optimize through the following steps:
Based on the operation of the open-source community, various improvement and optimization strategies are set;
Using the acquired training data as input, the feature vector OSC-Vect is generated through the trained open-source community feature generator OSC-GenFv;
Based on the trained open-source community analyzer OSC-Analysis and the open-source community development evaluation profile model OSC-Profiler, a data profile of the open-source community is created, yielding the data labels and quantitative scores OSC-Tag of the open-source community;
Based on the data labels and quantitative scores OSC-Tag, the labels and quantitative scores to be optimized are set, and the corresponding improvement and optimization strategies are marked;
The gradient descent optimization algorithm is used to train the open-source community development optimizer OSC-Optimize, yielding the trained open-source community development optimizer OSC-Optimize.
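The predictor training step can be read as next-step prediction on the historical sequence: a window of past OSC-Vect vectors is the input, and the vector actually observed at the following time point is the target against which the generated OSC-Next-Vect is compared. That reading of "the error between the two", the fixed window length, and the choice of mean squared error are assumptions for this sketch; the claim does not specify them.

```python
from typing import List, Tuple

Vector = List[float]


def make_training_pairs(vectors: List[Vector],
                        window: int) -> List[Tuple[List[Vector], Vector]]:
    """Slide a fixed-length window over the time-ordered vectors and pair
    each window (an OSC-Seq) with the vector that follows it."""
    pairs = []
    for end in range(window, len(vectors)):
        pairs.append((vectors[end - window:end], vectors[end]))
    return pairs


def mean_squared_error(predicted: Vector, target: Vector) -> float:
    """Average squared difference between prediction and observed vector."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Toy history of one-dimensional OSC-Vect values.
history = [[0.0], [1.0], [2.0], [3.0]]
pairs = make_training_pairs(history, window=2)
for osc_seq, actual_next in pairs:
    guess = osc_seq[-1]  # naive baseline: repeat the latest vector
    print(osc_seq, actual_next, mean_squared_error(guess, actual_next))
```

In actual training, the error would be backpropagated only into OSC-Predict, since the claim keeps the parameters of OSC-Analysis fixed during this stage.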

8. The deep learning-based open-source community development evaluation system according to claim 7, characterized in that the community evaluation guidance module is used to evaluate and guide the open-source community through the following steps:
Based on the set time points, evaluation data is obtained from multiple data sources;
For the evaluation data obtained from the multiple data sources, feature extraction and feature fusion are performed using the trained open-source community feature generator OSC-GenFv to obtain the feature vector OSC-Vect;
Based on the set time points, the feature vector sequence OSC-Seq is formed from the feature vectors OSC-Vect;
Using the feature vector sequence OSC-Seq as input, the open-source community is profiled by the trained open-source community analyzer OSC-Analysis and open-source community development evaluation profile model OSC-Profiler, generating the data labels and quantitative scores OSC-Tag of the open-source community;
Using the feature vector sequence OSC-Seq as input, the trained open-source community analyzer OSC-Analysis and open-source community development predictor OSC-Predict are used to generate the feature vector OSC-Next-Vect for future time points;
Using the feature vector OSC-Next-Vect as input, the trained open-source community analyzer OSC-Analysis and open-source community development evaluation profile model OSC-Profiler continue to profile the open-source community, generating the data labels and quantitative scores OSC-Tag of the open-source community;
The two community profiles are combined, and a series of data labels and quantitative scores OSC-Tag is output in chronological order for evaluating the current and future development of the open-source community;
Open-source community development scenarios are simulated to generate various evaluation directions for the future development of the open-source community;
Based on the time-ordered feature vectors OSC-Vect and the data labels and quantitative scores OSC-Tag obtained from the community profile, the trained open-source community development optimizer OSC-Optimize is used to optimize the open-source community and generate improvement and optimization strategies;
Evaluation data is acquired regularly from the multiple data sources, and the model parameters of the open-source community feature generator OSC-GenFv, the open-source community profile model OSC-Snap-Profiler, the open-source community analyzer OSC-Analysis, the open-source community development evaluation profile model OSC-Profiler, the open-source community development predictor OSC-Predict, and the open-source community development optimizer OSC-Optimize are continuously optimized.