A work-study integrated teaching and research management service system based on digital intelligence empowerment
The system, empowered by digital intelligence, consolidates work-study integrated teaching data, provides full-process automated management and multi-dimensional analysis, addresses data dispersion and insufficient personalized adaptation, and improves teaching quality and management efficiency.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: HANGZHOU WOTU EDUCATION TECH CO LTD
- Filing Date: 2026-03-04
- Publication Date: 2026-06-19
AI Technical Summary
Current work-study integrated teaching management suffers from data dispersion, delayed analysis, and insufficient personalized adaptation, which breaks the teaching feedback loop and reduces teaching effectiveness.
The proposed teaching and research management service system, based on digital intelligence, generates targeted training content through data collection, cleaning, encryption, intelligent processing, and multi-dimensional analysis, achieving full-process data-driven management and personalized teaching adaptation.
The system improves teaching management efficiency, pinpoints students' learning outcomes, generates personalized training content, closes the teaching feedback loop, and improves teaching quality.
Figure CN122240703A_ABST
Description
Technical Field
[0001] This invention relates to the field of smart education technology, and specifically to a work-study integrated teaching and research management service system based on digital intelligence empowerment.
Background Technology
[0002] The integrated work-study teaching and research model is a vocational education model that takes the competency requirements of professional positions as its core, deeply integrates professional theoretical teaching with on-the-job practical training, and promotes teaching and research in synergy. It breaks down the traditional teaching barriers of "separation of theory and practice, and disconnect between teaching and research" and follows the core principles of "learning by doing, doing while learning, and promoting teaching through teaching and research". Through a curriculum system that integrates "job, course, competition and certification", a teaching scenario that integrates theory and practice, and an implementation path that links teaching and research with teaching, it achieves the simultaneous cultivation of students' professional knowledge, practical skills and job qualities, and ultimately achieves the vocational education goal of "teaching content aligning with job requirements, teaching and research results feeding back into teaching practice, and students' abilities matching job standards".
[0003] Current work-study integrated teaching management suffers from data dispersion, delayed analysis, and insufficient personalized adaptation. Traditional teaching management relies heavily on manual recording of training status and manual statistics of class completion data, which is inefficient and error-prone. Core data such as the assessment completion percentage and the number of trainees lack real-time integration and analysis, so the overall learning effectiveness of trainees cannot be grasped quickly. It is also difficult to accurately locate the root causes of trainees' learning deficiencies and to generate targeted training content, which breaks the teaching feedback loop and reduces the effectiveness of work-study integrated teaching.
Summary of the Invention
[0004] To address the aforementioned technical problems, this invention provides a work-study integrated teaching and research management service system based on digital intelligence empowerment. The system uses digital technology to achieve data-driven management of the entire teaching and research process, analyzes student learning outcomes from multiple dimensions, generates targeted training content to address shortcomings, and improves the quality and management efficiency of work-study integrated teaching. The system includes the following modules. The data acquisition module is used to collect data from the entire process of work-study integrated teaching.
[0005] The data cleaning module is used to preprocess the data from the entire process of work-study integrated teaching to obtain standard study data.
[0006] The data encryption module is used to encrypt the standard study data, thereby increasing data security.
[0007] The data processing module is used to intelligently analyze the encrypted standard study data to obtain targeted personalized recommendations.
[0008] The teaching analysis module performs multi-dimensional analysis on the standard study data and generates a teaching analysis report.
[0009] The interactive display module is used to display and play personalized recommended content and teaching analysis reports, and to allow for interactive operations.
[0010] Preferably, the data from the entire process of work-study integrated teaching includes four categories: student basic information, training process data, assessment data, and class completion data.
[0011] Preferably, student basic information includes: name, major, basic skills and / or learning goals.
[0012] Preferably, training process data includes: attendance records, classroom interaction time, practical operation time, assignment completion quality, and / or training resource access records.
[0013] Preferably, assessment data includes: assessment subjects, assessment scores, assessment pass rate, incomplete assessment items and / or distribution of incorrect answers.
[0014] Preferably, class completion data includes: the number of students who completed the class, the reasons for not completing the class, the ranking of class completion results, and / or the completion status of practical achievements.
[0015] Preferably, the preprocessing steps include: first, data deduplication; second, data completion; and third, outlier removal.
[0016] Preferably, data deduplication includes: generating a unique hash value for each collected record of work-study integrated teaching process data according to its data type, and eliminating fully duplicated entries by traversing and comparing the hash values; for data with partially duplicated fields but different core information, data fusion processing is performed instead.
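The hash-based step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the choice of SHA-256, the JSON serialization, the record fields, and the function names `record_hash` and `deduplicate` are all assumptions made for the example.

```python
import hashlib
import json


def record_hash(record: dict, data_type: str) -> str:
    """Hash a record together with its data type; key order does not matter."""
    payload = json.dumps(
        {"type": data_type, "data": record}, sort_keys=True, ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records: list[tuple[str, dict]]) -> list[tuple[str, dict]]:
    """Keep the first occurrence of each (data_type, record) pair."""
    seen: set[str] = set()
    unique = []
    for data_type, record in records:
        digest = record_hash(record, data_type)
        if digest not in seen:
            seen.add(digest)
            unique.append((data_type, record))
    return unique


# Hypothetical assessment records for one student.
records = [
    ("assessment", {"student": "S001", "subject": "CNC lathe", "score": 86}),
    # Exact duplicate with a different key order: removed.
    ("assessment", {"score": 86, "student": "S001", "subject": "CNC lathe"}),
    # Same fields but a different core value: kept, to be handled by data fusion.
    ("assessment", {"student": "S001", "subject": "CNC lathe", "score": 79}),
]
unique_records = deduplicate(records)
```

Only byte-identical records collapse here; the third record survives because its core value differs, which is the case the text routes to data fusion.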
[0017] Preferably, the data fusion processing flow includes: for data with partially duplicated fields but different core information, first extracting the effective features related to the differing core information; constructing multi-dimensional feature vectors based on the needs of work-study teaching and job practice scenarios; standardizing each dimension to remove differences in value ranges among the features; assigning weights to the feature dimensions; then matching the differing data across the multi-dimensional feature vectors under the same tracking number to obtain the candidate feature vector sets; then calculating the overall credibility of each feature vector set; and finally taking the feature vector set with the highest overall credibility as the deduplication result.
[0018] Preferably, the overall credibility is calculated by a formula [formula image not reproduced], where i and k are the dimension indices of the feature vectors; I is the total number of feature vector dimensions; k, i = 1, 2, ..., I and k ≠ i; w_i is the weight of the feature of dimension i; s_i is the data source trust level of the feature of dimension i; and d_ki is the correlation coefficient between the feature data of dimensions i and k.
[0019] Preferably, the encryption process includes the following steps. First, key generation and management: a 256-bit key is generated using a random number generator. A key splitting mechanism is used, with the key parts stored separately on a cloud encryption server and a local security module; encryption / decryption operations can only be initiated after the key parts at both ends are verified. Second, sensitive data encryption: sensitive data in the entire process of work-study integrated teaching, such as student grades, personal identity information, and assessment rankings, is encrypted before being written to the database, and only the ciphertext is stored. In addition, the transmission of sensitive data is protected with the SSL / TLS protocols to prevent data theft or tampering in transit. Third, decryption authorization management: decryption permissions are assigned by role. Only administrators and the corresponding instructors can decrypt and view sensitive data within their authorized scope after identity verification. The entire decryption process generates operation logs for auditing and traceability, ensuring that the data remains secure and controllable.
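As an illustration of the key generation and splitting step, the sketch below generates a 256-bit key and splits it into two parts. The source does not say how the key is split; the XOR two-share scheme and the names `split_key` and `recombine` are assumptions chosen for simplicity. Encrypting the data itself would additionally need a vetted cipher implementation, which is outside this sketch.

```python
import secrets

KEY_BYTES = 32  # 256 bits


def generate_key() -> bytes:
    """Generate a 256-bit key from the OS cryptographic random source."""
    return secrets.token_bytes(KEY_BYTES)


def split_key(key: bytes) -> tuple[bytes, bytes]:
    """Split a key into two XOR shares (assumed scheme); either share alone reveals nothing."""
    share_cloud = secrets.token_bytes(len(key))
    share_local = bytes(a ^ b for a, b in zip(key, share_cloud))
    return share_cloud, share_local


def recombine(share_cloud: bytes, share_local: bytes) -> bytes:
    """Rebuild the key once both shares are available."""
    if len(share_cloud) != len(share_local):
        raise ValueError("key shares must have the same length")
    return bytes(a ^ b for a, b in zip(share_cloud, share_local))


key = generate_key()
cloud_share, local_share = split_key(key)  # stored on separate systems
restored = recombine(cloud_share, local_share)
```

With this scheme neither the cloud server nor the local module can reconstruct the key alone, which matches the requirement that both ends take part before encryption or decryption starts.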
[0020] Preferably, the data processing module includes a clustering unit, a feature analysis unit, a data analysis unit, and a resource matching unit. The clustering unit clusters the encrypted standard study data to obtain clusters and extract cluster features. This is the base layer: the practical skills data of all trainees is first divided into groups, and the common features of four skill groups (excellent / good / medium / weak) are extracted to provide a group feature benchmark for the subsequent algorithms. The feature analysis unit partitions the sample space according to the clusters and obtains the data features that significantly affect the target variable. This is the analysis layer: based on the clustering results, it identifies the core factors that affect practical skill outcomes, clarifies which skill indicators play a key role in learning effectiveness, and defines the core dimensions for subsequent individual assessment. The data analysis unit integrates multiple decision trees, constructs the training set for each tree from the data features of each sample space, and outputs the final evaluation result by voting. This is the evaluation layer: using the cluster features as a reference and the core factors identified by the decision trees as weights, it builds a job competency assessment model that rates each trainee's abilities and identifies individual weaknesses, providing the individual-needs basis for resource matching. The resource matching unit, based on the final evaluation results, uses the Spark distributed computing framework combined with a cosine similarity algorithm to quickly match course resources and obtain personalized recommendations. This is the implementation layer: based on the identified individual weaknesses, it matches job-specific teaching resources, turning the analysis and evaluation results of the preceding algorithms into concrete teaching actions.
[0021] The preferred method for obtaining clusters includes the following steps: First, based on the core requirements of engineering practice, select a preset number of core feature indicators and normalize the data; Second, determine the number of clusters J; Third, randomly select J samples as initial centers; Fourth, calculate the Euclidean distance between each student sample and the J centers, assign the sample to the nearest cluster, and recalculate the mean of each cluster as the new cluster center, repeating the iteration until the cluster centers are stable; Fifth, generate J groups of students and extract the mean of each cluster feature as the clustering feature.
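The five steps above correspond to the standard k-means procedure. The following is a minimal pure-Python sketch under stated assumptions: the samples are already normalized to [0, 1], only two illustrative indicators are used instead of five, and the function name `kmeans`, the sample values, and the fixed seed are hypothetical.

```python
import math
import random


def kmeans(samples, j, max_iter=100, seed=0):
    """Cluster normalized samples into j groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # Randomly pick j samples as the initial centers.
    centers = [list(c) for c in rng.sample(samples, j)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Assign each sample to the nearest center by Euclidean distance.
        labels = [
            min(range(j), key=lambda c: math.dist(x, centers[c])) for x in samples
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(j):
            members = [x for x, lab in zip(samples, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:  # centers are stable: stop iterating
            break
        centers = new_centers
    return labels, centers


# Hypothetical normalized (attendance rate, average assessment score) pairs.
samples = [
    [0.90, 0.95], [0.92, 0.90], [0.88, 0.93],
    [0.20, 0.10], [0.25, 0.15], [0.18, 0.12],
]
labels, centers = kmeans(samples, j=2)
```

The returned `centers` are the per-cluster feature means, which is what the fifth step uses as the clustering feature.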
[0022] The preferred core feature indicators are five, including: student attendance rate, practical operation completion rate, average assessment score, practical results pass rate, and standardized use rate of tools and equipment.
[0023] Preferably, the number of clusters J is determined as follows: for different values of j, calculate the within-cluster sum of squared errors, that is, the sum of squared distances from each sample to the center of its cluster. As j increases, this sum gradually decreases. Beyond a certain value of j, the rate of decrease drops sharply; the j value at this inflection point is taken as the optimal value J.
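The quantity described here can be written out explicitly. The notation is an assumption introduced for illustration: C_c is the set of samples in cluster c, μ_c is its center, and r(j) is the relative decrease used to spot the inflection point.

```latex
\mathrm{SSE}(j) = \sum_{c=1}^{j} \sum_{x \in C_c} \lVert x - \mu_c \rVert^{2},
\qquad
\mu_c = \frac{1}{\lvert C_c \rvert} \sum_{x \in C_c} x,
\qquad
r(j) = \frac{\mathrm{SSE}(j-1) - \mathrm{SSE}(j)}{\mathrm{SSE}(j-1)}
```

Under this reading, J is the value of j after which r(j) falls off sharply.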
[0024] Preferably, the sample space partitioning process includes: first, determining the target variable and the feature variables; second, constructing a decision tree, using the Gini coefficient as the partitioning criterion, calculating the Gini gain for each feature, and taking the feature with the largest gain as the root node; third, repeating the feature selection and partitioning process for each child node until the samples within the node belong to the same category or a preset depth is reached; fourth, applying pre-pruning, which stops partitioning when the number of samples in a node is less than or equal to a preset threshold; fifth, determining the key influencing factors as data features from the decision tree paths, and outputting them in order of influence weight.
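The root-node selection in the second step can be illustrated with a small Gini gain calculation. This is a sketch only: the feature names, the fixed split threshold of 0.7, and the pass/fail labels are made-up assumptions, and a full CART implementation would search over all candidate thresholds.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(rows, labels, feature, threshold):
    """Impurity reduction from splitting on feature <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical normalized indicators for four trainees.
rows = [
    {"attendance": 0.95, "practice_completion": 0.90},
    {"attendance": 0.90, "practice_completion": 0.40},
    {"attendance": 0.60, "practice_completion": 0.85},
    {"attendance": 0.55, "practice_completion": 0.35},
]
labels = ["pass", "fail", "pass", "fail"]

gains = {
    f: gini_gain(rows, labels, f, threshold=0.7)
    for f in ("attendance", "practice_completion")
}
root_feature = max(gains, key=gains.get)  # feature with the largest gain
```

In this toy data, splitting on practical operation completion separates the classes perfectly, so it would be chosen as the root node.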
[0025] The preferred method for constructing the training set includes the following steps: First, constructing an evaluation index system and establishing dimensions for assessing student knowledge mastery; Second, dividing the dataset by selecting 80% of the student data as the training set and 20% as the test set, generating 100 training subsets using bootstrap sampling; Third, constructing a decision tree forest by building a CART decision tree based on each training subset, with random sampling used for feature selection in each tree; Fourth, training and validating the model by training 100 decision trees and validating the model accuracy using the test set, removing decision trees with an accuracy below 80%; Fifth, assessing knowledge mastery by inputting student data into the forest, with each tree outputting an evaluation score.
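The split, bootstrap, filter, and vote sequence can be sketched as below. To stay short, each "tree" is reduced to a one-feature threshold stump rather than a CART tree with random feature selection, so this shows the ensemble mechanics only. The synthetic data and the names `fit_stump` and `vote` are assumptions.

```python
import random
from collections import Counter


def bootstrap(train, rng):
    """Sample len(train) items with replacement."""
    return [rng.choice(train) for _ in train]


def fit_stump(subset):
    """Stand-in for a tree: threshold midway between the two class means."""
    neg = [x for x, y in subset if y == 0]
    pos = [x for x, y in subset if y == 1]
    if not neg or not pos:
        return 0.5  # fallback when a subset contains a single class
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2


def accuracy(threshold, data):
    return sum((x >= threshold) == bool(y) for x, y in data) / len(data)


def vote(forest, x):
    """Majority vote over all retained stumps."""
    ballots = Counter(int(x >= t) for t in forest)
    return ballots.most_common(1)[0][0]


rng = random.Random(0)
# Synthetic (score, mastered) pairs: low scores not mastered, high scores mastered.
data = [(i * 0.3 / 24, 0) for i in range(25)]
data += [(0.7 + i * 0.3 / 24, 1) for i in range(25)]
rng.shuffle(data)

split = int(0.8 * len(data))  # 80% training set, 20% test set
train, test = data[:split], data[split:]

forest = [fit_stump(bootstrap(train, rng)) for _ in range(100)]
forest = [t for t in forest if accuracy(t, test) >= 0.8]  # drop weak members
```

Calling `vote(forest, 0.9)` then returns the majority decision for a trainee with a normalized score of 0.9.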
[0026] The preferred method for rapid course resource matching includes: First, tagging the materials in the course resource library to generate resource feature vectors; Second, constructing student demand tag vectors based on the knowledge gaps and professional skill goals output by the random forest model according to the final evaluation results; Third, using the Spark framework to calculate the cosine similarity between the student demand tag vectors and resource feature vectors in parallel, setting a similarity threshold of ≥0.7 for suitable resources, and prioritizing matching resources that are compatible with existing training venues and equipment; Fourth, resource sorting and recommendation, sorting by similarity from high to low.
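The matching and sorting steps can be sketched in plain Python. Spark is deliberately not used here, so the parallel computation is replaced by a simple loop, and the venue and equipment compatibility check is omitted. The tag names, vectors, and resource titles are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(demand, resources, threshold=0.7):
    """Return (resource, similarity) pairs at or above the threshold, best first."""
    scored = [(name, cosine(demand, vec)) for name, vec in resources.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Tag order: tool setting, dimensional accuracy, safety, programming (assumed tags).
demand = [1.0, 0.8, 0.0, 0.2]  # a trainee's demand tag vector
resources = {
    "tool-setting drill": [1.0, 0.5, 0.0, 0.0],
    "safety briefing": [0.0, 0.0, 1.0, 0.0],
    "precision turning lab": [0.6, 1.0, 0.0, 0.3],
}
matches = recommend(demand, resources)
```

Here the safety briefing scores 0 against the demand vector and is filtered out, while the two remaining resources are returned in descending order of similarity.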
[0027] Preferably, the multi-dimensional analysis includes: training status analysis, class completion statistics, assessment completion rate analysis and / or trainee number statistics.
[0028] Preferably, training status analysis includes: calculating the student attendance rate, classroom interaction participation rate, practical operation completion rate, and assignment submission rate; comparing training participation across classes and professional directions; identifying weak links in the training process; and investigating the causes in light of factors such as teachers and venues.
[0029] Preferably, class completion statistical analysis includes: calculating the class completion rate, average class completion score, excellent rate, and failure rate; analyzing the main reasons why students did not complete the class; generating a class completion quality report; and comparing class completion data across training batches to evaluate the effectiveness of the teaching plan.
[0030] Preferably, assessment completion rate analysis includes: calculating the assessment completion rate by assessment subject, assessment type, and assessment stage; analyzing the distribution of students who did not complete the assessment and their reasons; and calculating accuracy by question type and assessment item to identify frequently missed questions and weak operation items.
[0031] Preferably, trainee number statistics and analysis includes: counting the total number of trainees, actual participants, and dropouts for each batch, specialty, and class; analyzing trends in participant numbers and reasons for dropping out; and, based on trainee basic information, analyzing retention rates and learning outcomes for trainees with different skill levels, to provide a basis for optimizing training programs.
[0032] The technical effects and advantages of this invention are as follows: 1. Full-process digital management: by integrating data from the entire process of work-study integrated teaching through multi-source data collection and digital processing, the system automates the management of training, assessment, and class completion and makes it more precise, greatly improving teaching management efficiency and reducing errors caused by manual intervention.
[0033] 2. Multi-dimensional and precise analysis: data is analyzed from multiple dimensions, including training status, class completion statistics, assessment completion rate, and number of trainees, to accurately identify the overall learning effectiveness and individual learning deficiencies of trainees, providing data support for teaching optimization.
[0034] 3. Personalized teaching adaptation: based on learning deficiencies, targeted training content is generated intelligently, taking into account both common problems and individual differences. This forms a closed-loop teaching system of "analysis-generation-supplementary training-feedback", which effectively improves students' learning outcomes and the quality of work-study integrated teaching.
[0035] 4. Data security and visualization: data encryption technology is used to protect sensitive information. A visual interface displays data and teaching management content intuitively, meeting the needs of different roles and improving the system's usability.
Attached Figure Description
[0036] Figure 1 is a structural block diagram of the work-study integrated teaching and research management service system based on digital intelligence empowerment proposed in this invention.
[0037] Figure 2 is a schematic diagram of the preprocessing process in the system proposed in this invention.
[0038] Figure 3 is a schematic diagram of the data fusion processing flow in the system proposed in this invention.
[0039] Figure 4 is a schematic diagram of the encryption process in the system proposed in this invention.
[0040] Figure 5 is a flowchart of the method for obtaining clusters in the system proposed in this invention.
[0041] Figure 6 is a flowchart of the sample space partitioning method in the system proposed in this invention.
[0042] Figure 7 is a flowchart of the training set construction method in the system proposed in this invention.
[0043] Figure 8 is a flowchart of the method for rapid matching of course resources in the system proposed in this invention.
Detailed Implementation
[0044] Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the invention, and should not be construed as limiting the invention. Rather, embodiments of the invention include all variations, modifications, and equivalents falling within the spirit and scope of the appended claims.
[0045] Example 1. Referring to Figure 1, this embodiment proposes a work-study integrated teaching and research management service system based on digital intelligence empowerment. Through digital technology, it achieves data-driven management of the entire teaching and research process, analyzes student learning outcomes from multiple dimensions, generates targeted training content to address shortcomings, and improves the quality and management efficiency of work-study integrated teaching. The system includes the following modules. The data acquisition module is used to collect data from the entire process of work-study integrated teaching. This data can include four main categories: student basic information, training process data, assessment data, and class completion data. A multi-source data fusion acquisition method can be used to keep the data comprehensive and up to date. Student basic information may include, but is not limited to: name, major, basic skills, and / or learning goals. Training process data may include: attendance records, classroom interaction time, practical operation time, assignment completion quality, and / or training resource access records. Assessment data may include: assessment subjects, assessment scores, assessment pass rate, incomplete assessment items, and / or distribution of incorrect answers. Class completion data may include: the number of students who completed the class, the reasons for not completing the class, the ranking of class completion results, and / or the completion status of practical achievements. The data acquisition module can collect data through multiple channels, including IoT devices (such as practical operation terminals and attendance devices), teaching platforms, assessment systems, and manual data entry interfaces, and stores the data in the database after converting it to a standardized format.
For example, a trainee checks in through an edge-based check-in device, has operational data recorded by a practical operation terminal during training, and submits theoretical exam papers and practical results through the assessment system during assessment. All of this data is uploaded to the cloud database after conversion to a standardized format, while special data (such as explanations of a trainee's special circumstances) is entered manually to supplement the record.
[0046] The data cleaning module preprocesses the data from the entire process of work-study integrated teaching to obtain standard study data. Referring to Figure 2, the preprocessing may include the following steps. First, data deduplication: duplicate data from the entire teaching process is removed or merged. A hash-based deduplication algorithm can be used to generate a unique hash value for each collected record according to its data type (e.g., student basic information, assessment scores), and fully duplicated entries are eliminated by traversing and comparing the hash values. Data with partially duplicated fields but different core information (e.g., multi-channel data collected from the same student for the same practical task, or job practice records under the same tracking number) is neither eliminated directly nor used in isolation; instead, data fusion is performed. Fusion comprises feature-level fusion and decision-level fusion. Referring to Figure 3, the specific process is as follows. Feature-level fusion: for data with partially duplicated fields and different core information, first extract the effective features related to the differing core information and, in light of the needs of work-study teaching and job practice scenarios, construct a multi-dimensional feature vector. Taking CNC lathe operation data as an example, suppose the same batch of machining tasks (tracking number: SC202601001) for the same student has two sources, data collected from the machine tool terminal and data recorded manually, in which some fields (such as processing time) are duplicated but the core information (dimensional accuracy deviation, tool wear record) differs.
In this case, the core features extracted include: dimensional accuracy deviation, tool wear amount, machining pass rate, compensation adjustment value for aging equipment, and compliance nodes of the operation process. A 5-dimensional feature vector is constructed from these features. To remove differences in value ranges among the features, each dimension is standardized by mapping its value to the interval [0, 1]. Decision-level fusion: first, weights are assigned to the feature dimensions according to the importance of each practical skill in the job (e.g., for CNC lathe operators, dimensional accuracy deviation has a weight of 0.3, machining pass rate 0.25, tool wear 0.2, compensation adjustment value for aging equipment 0.15, and compliance nodes of the operation process 0.1). Then, the differing data across the multi-dimensional feature vectors under the same tracking number is matched to obtain the candidate feature vector sets. The feature vectors in each set have the same dimension, and because the data differs, any two feature vectors differ in at least one dimension. For example, if the component for dimensional accuracy deviation collected by the terminal is 0.8 and the manually recorded component is 0.7, the matching process will necessarily produce both {0.8, ...} and {0.7, ...}. This is only a simple example; actual data will be more complex and is not elaborated here. Finally, the overall credibility of each feature vector set is calculated by a formula [formula image not reproduced], where i and k are the dimension indices of the feature vectors; I is the total number of feature vector dimensions; k, i = 1, 2, ..., I and k ≠ i; w_i is the weight of the feature of dimension i; and s_i is the data source trust level of the feature of dimension i.
For example, the trust level for data collected by IoT terminals is set to 0.9 and the trust level for manual recording is set to 0.7. The trust level can be adjusted dynamically according to the accuracy of each data collection channel, and different data sources can be assigned different values. d_ki is the correlation coefficient between the feature data of dimensions i and k. For two unrelated features, the correlation coefficient can be 1. If the two values corroborate each other, the correlation coefficient can be greater than 1, specifically in the range (1, 5). If the two values contradict each other, the correlation coefficient is less than 1, decreasing as the degree of contradiction increases, specifically in the range (0, 1). Other value settings are not excluded. For example, if the tool wear is 5, the total machining volume is 10, and the machining pass rate is 100%, the correlation coefficient can be defined as 0.5. This is only a simple example and is not elaborated further here. The feature vector set with the highest overall credibility is then used as the deduplication result. With this method, each piece of differing data is matched against related data rather than being arbitrarily discarded or used in isolation. The method does not simply evaluate each piece of data on its own; it evaluates the overall credibility of that data combined with its related data, which makes the evaluation more accurate and comprehensive. It also evaluates the correlation between the differing data and the other dimensions, taking both corroborating and contradictory relationships into account and quickly down-weighting contradictory combinations, which avoids data distortion and improves calculation accuracy.
[0047] Second, data completion, which combines rule-based completion with intelligent inference. Non-critical fields missing from the deduplication results (such as training resource access time) are filled automatically with the average value for trainees in the same batch and major. For missing critical fields (such as assessment scores and basic skills), a system pop-up reminds administrators or teachers to supply the data, and the data collection channels are checked at the same time to trace the cause of the gap. Third, outlier removal, which uses a dual mechanism of the 3σ criterion and business rule verification. The 3σ criterion first filters out data that deviates from the mean by more than 3 standard deviations (such as practical operation time outside a reasonable range, or abnormally high or low assessment scores). The data is then further verified against work-study teaching business rules (such as the requirement that sign-in time fall within the training period, and the full-mark limit on assessment scores). Confirmed anomalies are marked and removed, and their details are recorded for subsequent source tracing. Finally, standardized, high-accuracy standard study data is output. Unlike general-purpose data cleaning solutions, this data cleaning is adapted to the characteristics of work-study teaching, ensures data quality efficiently, and is linked with the other system modules. It focuses on the distinctive features of work-study integrated teaching data (on-the-job practice, assessment, and training) and uses a dual strategy of "hash deduplication + data fusion". Fully duplicated data is removed directly to improve efficiency, while data with partially duplicated fields but different core information, which occurs frequently in work-study scenarios, is not removed blindly.
Instead, core practical information is preserved through feature-level and decision-level fusion, which avoids the loss of practical data caused by the one-size-fits-all approach of general cleaning. Data completion and anomaly removal follow work-study business rules. During completion, critical and non-critical fields are distinguished: non-critical fields are filled with the average value for the same batch, major, and job (matching the grouped training typical of work-study positions), while critical fields are supplied manually and the cause of the gap is traced. Anomaly removal uses dual verification by the 3σ criterion and work-study business rules, filtering out numerical anomalies (such as practical precision data exceeding tolerance) and checking against teaching rules (such as attendance time and the full-mark limit on assessment scores), so that the cleaned data meets the requirements of both practical work and teaching management. After cleaning, the data is standardized and can be passed directly to the data processing module. The abnormal data and fusion logs produced during cleaning are retained to support learning effectiveness analysis and source tracing by the data analysis module, keeping the system's closed loop of "data collection-cleaning-analysis-content generation" running smoothly. Vectors are constructed through "feature-level fusion + decision-level fusion". Decision-level fusion assigns feature weights according to job importance and sets a credibility for each data source, which addresses the insufficient accuracy of practical data caused by the cross-job generalization of existing cleaning technologies, so that the cleaned data accurately reflects the trainees' practical skills.
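The dual check for outliers can be sketched as follows. The source does not state exactly how the two checks are combined; this sketch flags a record if it fails either one, which is one possible reading. The full mark of 100, the field names, and the sample values are assumptions.

```python
import statistics

FULL_MARK = 100  # assumed full mark for assessment scores


def three_sigma_outliers(values):
    """Indices of values more than 3 standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) > 3 * sd]


def rule_violations(records):
    """Indices of records whose score falls outside [0, FULL_MARK]."""
    return [i for i, r in enumerate(records) if not 0 <= r["score"] <= FULL_MARK]


# Hypothetical records: practical operation time in minutes and assessment score.
minutes = [58, 60, 62, 59, 61] * 4
minutes[7] = 600  # implausibly long practice session
scores = [80] * 20
scores[3] = 130   # exceeds the full mark
records = [{"practice_minutes": m, "score": s} for m, s in zip(minutes, scores)]

flagged = sorted(
    set(three_sigma_outliers([r["practice_minutes"] for r in records]))
    | set(rule_violations(records))
)
removed_log = [records[i] for i in flagged]  # kept for source tracing
cleaned = [r for i, r in enumerate(records) if i not in flagged]
```

Note that the 3σ criterion needs a reasonably large sample: with ten or fewer values, no single value can lie more than 3 population standard deviations from the mean, so the statistical check would never fire.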
[0048] The data encryption module encrypts the standard research and study data to improve data security. (See Figure 4.) The specific steps can include the following. First, key generation and management: a 256-bit key is generated using a random number generator. A key splitting mechanism is used, with the key parts stored separately on a cloud-based encryption server and a local security module; encryption and decryption operations can be initiated only after both parts of the key are verified. Second, sensitive data encryption: sensitive data such as student grades, personal identification information, and assessment rankings is encrypted before being written to the database, and only the ciphertext is stored. Data in transit is protected with the SSL/TLS protocols to prevent theft or tampering during transmission. Third, decryption authorization management: decryption permissions are assigned by role. Only administrators and the corresponding instructors can decrypt and view sensitive data within their authorized scope, after verifying their identity (account password + dynamic verification code). The entire decryption process is logged, and the operation logs support auditing and traceability, keeping the data secure and controllable.
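The key generation and splitting step can be illustrated with a minimal Python sketch. This disclosure does not state how the key is split, so the two-share XOR scheme below is an assumption, as are the function names; the cipher that would use the recombined key is not shown.

```python
import secrets

KEY_BYTES = 32  # 256 bits


def generate_key():
    """Generate a 256-bit key from a cryptographically secure random source."""
    return secrets.token_bytes(KEY_BYTES)


def split_key(key):
    """Split a key into two shares; neither share alone reveals the key.

    Assumed scheme: one share is random, the other is key XOR random.
    """
    share_cloud = secrets.token_bytes(len(key))
    share_local = bytes(a ^ b for a, b in zip(key, share_cloud))
    return share_cloud, share_local


def recombine(share_cloud, share_local):
    """Rebuild the key once both shares are available."""
    return bytes(a ^ b for a, b in zip(share_cloud, share_local))
```

In this sketch one share would be held by the cloud-based encryption server and the other by the local security module, so that an encryption or decryption operation needs both.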
[0049] The data processing module intelligently analyzes the encrypted standard research and study data to obtain targeted personalized recommendations. Specifically, the data processing module may include a clustering unit, a feature analysis unit, a data analysis unit, and a resource matching unit.
[0050] The clustering unit clusters the encrypted standard research and study data to obtain clusters and extract cluster features. Each cluster is independent; the clustering maximizes sample similarity within a cluster and minimizes sample similarity between clusters. (See Figure 5.) The specific process of obtaining clusters can include the following. First, based on the core needs of work-study practice, five core feature indicators are selected: student attendance rate, practical operation completion rate, average assessment score, practical operation qualification rate, and standardized use rate of tools and equipment. The data is normalized (indicator values are mapped to the [0, 1] interval) to eliminate the influence of differing dimensions. Second, the number of clusters J is determined (usually 2-10) by calculating, for different values of j, the within-cluster sum of squared errors, that is, the sum of squared distances from each sample to its cluster center (the mean of the multi-dimensional feature vectors). This sum gradually decreases as j increases, because the samples within each cluster become more concentrated. When j reaches a certain value, the rate of decrease drops sharply; the j value at this inflection point is the optimal value J. For example, for practical data on CNC lathe operators, the sum of squared distances is calculated for j from 2 to 6. When j increases from 3 to 4, the decrease in the sum of squared distances drops from 35% to 8%. Considering the four-tier competency classification of "excellent, good, average, and weak" in the work-study scenario, J=4 is ultimately chosen, which is both reasonable and suited to teaching evaluation needs. Third, the cluster centers are initialized by randomly selecting J samples as the initial centers. 
Fourth, iterative clustering: the Euclidean distance between each student sample and the J centers is calculated, each sample is assigned to the nearest cluster, and the mean of each cluster is recalculated as the new cluster center. This is repeated until the cluster centers stabilize (with at most 50 iterations). Fifth, the results are output: J groups of students are generated, and the mean of each feature within a cluster is extracted as the clustering feature. For example, for the electrical engineering major, clustering identified a "weak group" with the following characteristics: attendance rate ≤ 85%, wiring practice completion rate < 50%, average assessment score < 60, practical operation pass rate < 40%, and tool standard usage rate < 70%. The group's safety operation compliance rate is also generally more than 40% lower than that of other groups, which reflects the core requirements of safety and standardization in electrical engineering practice.
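The clustering procedure described above (min-max normalization, random initial centers, nearest-center assignment by Euclidean distance, mean update, a cap of 50 iterations, and the sum of squared distances used to choose J) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names and the fixed random seed are hypothetical, and an empty cluster simply keeps its previous center, a detail this disclosure does not address.

```python
import math
import random


def min_max_normalize(rows):
    """Map every feature column to the [0, 1] interval."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def assign(rows, centers):
    """Index of the nearest center (Euclidean distance) for each sample."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(row, centers[j]))
        for row in rows
    ]


def kmeans(rows, k, max_iter=50, seed=0):
    """Return (labels, centers, sum of squared distances) for k clusters."""
    rng = random.Random(seed)
    centers = [list(row) for row in rng.sample(rows, k)]
    for _ in range(max_iter):
        labels = assign(rows, centers)
        new_centers = []
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # empty cluster keeps its center
        if new_centers == centers:  # centers are stable
            break
        centers = new_centers
    labels = assign(rows, centers)
    sse = sum(math.dist(row, centers[lab]) ** 2 for row, lab in zip(rows, labels))
    return labels, centers, sse


def sse_curve(rows, k_values, seed=0):
    """Sum of squared distances for each candidate number of clusters."""
    return {k: kmeans(rows, k, seed=seed)[2] for k in k_values}
```

Inspecting the values returned by `sse_curve` for j from 2 upward shows where the decrease flattens, which is the inflection point used to pick J; the cluster centers returned by `kmeans` are the per-cluster feature means used as clustering features.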
[0051] The feature analysis unit partitions the sample space based on the clusters and obtains the data features that significantly affect the target variable (learning performance level). (See Figure 6.) The specific implementation process is as follows. First, determine the target variable and the feature variables. The target variable is the cluster, i.e., the student's learning effectiveness level (excellent, good, average, weak). The feature variables are the clustering features plus three additional work-study scenario indicators: teacher suitability, practical venue utilization rate, and phased practical assessment pass rate. Second, construct a decision tree using the Gini coefficient as the splitting criterion: calculate the Gini coefficient gain for each feature and select the feature with the largest gain as the root node. Third, split the nodes recursively, repeating the feature selection and splitting process for each child node until the samples within the node belong to the same category or a preset depth is reached (the maximum depth is set to 5 to avoid overfitting). Fourth, prune using a pre-pruning method: stop splitting when the number of samples in a node is ≤ 5, to improve the model's generalization ability. Fifth, analyze the results: determine the key influencing factors from the decision tree paths, take them as the data features, and output them sorted by influence weight.
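The root-node selection in the second step can be illustrated with a short Python sketch that computes the Gini impurity of a node and the gain of a binary split, then picks the feature and threshold with the largest gain. The function names and the binary threshold form of the split are assumptions for illustration; the recursion, depth limit, and pre-pruning described above are not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(values, labels, threshold):
    """Impurity reduction from splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def best_split(samples, labels, feature_names):
    """Return (feature name, threshold, gain) with the largest Gini gain."""
    best = (None, None, -1.0)
    for idx, name in enumerate(feature_names):
        column = [sample[idx] for sample in samples]
        # The largest value is skipped because it would leave the right side empty.
        for threshold in sorted(set(column))[:-1]:
            gain = gini_gain(column, labels, threshold)
            if gain > best[2]:
                best = (name, threshold, gain)
    return best
```

Applying `best_split` to the samples of each child node in turn, and stopping at the depth and sample-count limits, would yield the tree whose paths are read off as key influencing factors.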
[0052] The data analysis unit integrates multiple decision trees, constructs a training set for each tree based on the data features of each sample space, and uses a voting method to output the final evaluation result, improving the model's accuracy and stability. (See Figure 6.) The specific implementation process is as follows. First, construct an evaluation index system. This system can be aligned with national vocational skill level standards and selects six categories of work-study indicators: theoretical scores, practical operation scores, error repetition rate, knowledge point mastery coverage, practical operation step standardization scores, and troubleshooting time. Together these form the evaluation dimensions for student knowledge mastery (out of 100 points). Second, divide the dataset: 80% of the student data is used as the training set and 20% as the test set, and bootstrap sampling is used to generate 100 training subsets (each subset may contain duplicate samples). Third, construct the decision tree forest: a CART decision tree is built on each training subset, and for each tree the features are randomly selected (4 of the 6 features) to avoid overfitting of any single tree. Fourth, train and validate the model: 100 decision trees are trained and the model accuracy is validated on the test set (accuracy ≥ 88%). Decision trees with accuracy below 80% are removed, leaving 86 trees to form the random forest. Fifth, assess knowledge mastery: student data is input into the forest, each tree outputs an assessment score, and the average score is taken as the final mastery score (< 60 indicates weak knowledge, 60-79 moderate, 80-89 good, and ≥ 90 excellent). 
For example (Mechanical Manufacturing major, benchmarked against the intermediate CNC lathe operator standard): Student Zhang scored 75 points in the CNC programming theory subject and 68 points in the practical subject, with a 15% error repetition rate, 72% knowledge point coverage, 70 points for the standardization of practical steps, and a troubleshooting time for complex parts that exceeded the standard time by 20%. The model evaluation score was 65 points, indicating "failure to meet the intermediate-level programming practical requirements," with weaknesses in the application of complex programming instructions and rapid fault diagnosis. The final evaluation result includes not only the model evaluation score but also the specific underlying data, such as these data features.
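The bookkeeping around the forest described above (bootstrap subsets, random selection of 4 of the 6 features, removal of trees below 80% accuracy, averaging of per-tree scores, and the score bands) can be sketched in Python. The tree training itself is omitted, and all function names, the feature names in the usage note, and the seeds are hypothetical. The sketch follows the averaging described in the fifth step rather than a class vote.

```python
import random
from statistics import mean


def bootstrap_indices(n_samples, n_subsets=100, seed=0):
    """Draw n_subsets index lists with replacement, each of size n_samples."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_subsets)
    ]


def random_feature_subset(feature_names, k=4, seed=0):
    """Pick k distinct features for one tree."""
    return random.Random(seed).sample(feature_names, k)


def keep_accurate_trees(tree_accuracies, min_accuracy=0.80):
    """Indices of trees whose test accuracy is at least min_accuracy."""
    return [i for i, acc in enumerate(tree_accuracies) if acc >= min_accuracy]


def forest_score(tree_scores):
    """Final mastery score: the average of the per-tree scores."""
    return mean(tree_scores)


def mastery_level(score):
    """Map a mastery score (0-100) to the four bands given in the text."""
    if score >= 90:
        return "excellent"
    if score >= 80:
        return "good"
    if score >= 60:
        return "moderate"
    return "weak"
```

For instance, `mastery_level(65)` falls in the 60-79 band, which matches the band of Student Zhang's model evaluation score in the example above.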
[0053] The resource matching unit uses the final evaluation results, together with the Spark distributed computing framework and the cosine similarity algorithm, to match course resources quickly, while also taking into account work-study venue constraints, equipment limitations, and skill level requirements. (See Figure 8.) The specific implementation process is as follows. First, construct a resource feature library. Micro-lessons, practical videos, question banks, and virtual simulation resources in the course resource library are tagged with labels such as "Professional Direction," "Skill Level," and "Site Suitability" (e.g., electrical resources are labeled "Low-voltage Wiring," "Intermediate Technician," and "Suitable for Ordinary Training Platforms," while CNC resources are labeled "Complex Programming," "Intermediate Technician," and "Suitable for CNC Lathes"). These labels are used to generate resource feature vectors. Second, generate student demand vectors: a student demand label vector is constructed from the knowledge gaps and professional skill goals contained in the final evaluation results output by the random forest model. Third, calculate similarity: the cosine similarity between the student demand label vector and each resource feature vector is calculated in parallel using the Spark framework. Resources with a similarity ≥ 0.7 are treated as suitable, and resources compatible with the existing training sites and equipment are prioritized. Fourth, sort and recommend resources: resources are sorted from high to low similarity, and the top three are selected as personalized recommendations. Their priority is further adjusted based on resource usage rates among groups with similar skill levels. For example (Zhang, a mechanical manufacturing major): his demand vector is [CNC programming, complex instructions, troubleshooting, intermediate-level technician, CNC lathe adaptation]. 
The recommended resources are the micro-course "Practical Tutorial on Complex Instructions for Intermediate-Level CNC Technicians" (similarity 0.89), the "Complex Parts Programming Troubleshooting Simulation Question Bank" (similarity 0.82), and the "One-on-One Guidance Video on CNC Lathe Practical Troubleshooting" (similarity 0.78). The four units form a collaborative closed loop that overcomes the "incomplete analysis and poor applicability" of single algorithms in integrated work-study teaching. Together they serve the data processing module as a whole, covering the chain of data processing → algorithm analysis → capability assessment → resource empowerment. Their significance and advantages lie in three dimensions: adaptation to work-study scenarios, teaching empowerment, and technological innovation. They form the core algorithmic foundation on which the system achieves precise integrated work-study teaching. First, integrated work-study teaching becomes data-driven, precise, and intelligent: instead of the traditional model that relies on experience and manual statistical analysis, the four units working together transform students' practical work data into teaching data that can be analyzed, acted on, and optimized. This achieves "data-driven group-based differentiated instruction, directional individualized remedial training, and goal-oriented teaching optimization." The entire process, from group characteristic extraction to individual competency assessment and resource matching, revolves around job competency requirements and is aligned with vocational skill level standards and practical job requirements. 
This ensures deep integration of teaching content with job demands and skills certification, achieving the core goal of integrated work-study teaching: "student competency matching job standards." Second, a closed loop of "algorithm analysis - teaching implementation - effect feedback - algorithm optimization" is formed: after the results calculated by the four units are turned into teaching actions (differentiated instruction, personalized remedial training), the teaching effect data is fed back through the data acquisition module and used again as input to the algorithms. This enables continuous iterative optimization of the algorithm models, steadily improving the capacity of digital intelligence to empower teaching and adapting to the changing needs of work-study teaching. Third, the four units each perform their own function and support one another, seamlessly connecting the four core links of work-study teaching: "group analysis, factor identification, individual assessment, and resource matching." This avoids the problem of a single algorithm being "capable of analysis but not implementation, capable of assessment but not matching," and forms a complete algorithm-enabled system from data to teaching, from groups to individuals, and from analysis to implementation. The overall efficiency is far higher than that of simply stacking individual algorithms.
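The similarity calculation, threshold, and top-three selection performed by the resource matching unit can be illustrated with a single-machine Python sketch. It deliberately uses plain Python in place of the Spark framework named above, and it assumes that label vectors are encoded as equal-length numeric lists; the function names and the encoding are hypothetical. The site and equipment prioritization and the usage-rate adjustment are not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def recommend(demand, resources, threshold=0.7, top_n=3):
    """Return up to top_n (name, similarity) pairs at or above the threshold.

    resources maps a resource name to its feature vector.
    """
    scored = [
        (name, cosine_similarity(demand, vector))
        for name, vector in resources.items()
    ]
    suitable = [item for item in scored if item[1] >= threshold]
    return sorted(suitable, key=lambda item: item[1], reverse=True)[:top_n]
```

In a distributed setting the per-resource similarity is independent of every other resource, which is what allows the calculation to be parallelized across a resource library.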
[0054] The teaching analysis module performs multi-dimensional analysis of the standard research and study data to obtain a teaching analysis report. The analysis can cover four core dimensions: training status, class completion statistics, assessment completion rate, and trainee statistics. Training status analysis: indicators such as student attendance rate, classroom interaction participation rate, practical operation completion rate, and assignment submission rate are calculated. Participation is compared across classes and specializations, weaknesses in the training process are identified (e.g., low participation in a specific practical project, or insufficient attendance in a particular period), and their causes are determined by considering factors such as faculty and venue. Class completion statistics analysis: the completion rate, average completion score, excellent rate (top 10%), and failure rate are calculated. The main reasons for students not completing the course are analyzed (e.g., failing the assessment, inadequate practical results, or dropping out for personal reasons), and a completion quality report is generated. Completion data from different training batches is also compared to evaluate the effectiveness of the teaching plan. Assessment completion rate analysis: the completion rate of assessments is calculated by subject, type (theory / practice), and stage (interim / final assessment). The distribution of trainees who did not complete the assessments and the reasons for non-completion (e.g., insufficient time, weak grasp of knowledge, or lack of operational proficiency) are analyzed. Accuracy rates are calculated for each question type and item to identify frequently missed questions and weak operational skills. 
Trainee statistics analysis: the total number of trainees, actual participants, and dropouts are counted for each batch, major, and class, and trends in participant numbers and reasons for dropping out are analyzed. Combined with basic trainee information, differences in retention rates and learning outcomes among trainees with different skill levels are analyzed to provide a basis for optimizing the training program. Based on this multi-dimensional data, a personal learning profile is generated for each trainee, integrating data on training participation, assessment scores, and practical performance. AI algorithms are used to comprehensively evaluate each trainee's learning ability, knowledge mastery, and practical skills, identifying individual deficiencies in theoretical knowledge, practical operation, and learning habits. Together, the above data constitutes the teaching analysis report.
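As a simple illustration of the class completion statistics mentioned above, the following Python sketch computes a completion rate, an average completion score, and a failure rate for one class. The input layout (a list of final scores in which `None` marks a trainee who did not finish), the pass mark of 60, and the exact definitions of the rates are assumptions made for this sketch and are not given in this disclosure.

```python
def completion_stats(scores, pass_mark=60):
    """Summarize one class; None marks a trainee who did not finish."""
    total = len(scores)
    finished = [s for s in scores if s is not None]
    if total == 0 or not finished:
        return {"completion_rate": 0.0, "average_score": 0.0, "failure_rate": 0.0}
    failed = sum(1 for s in finished if s < pass_mark)
    return {
        # Share of enrolled trainees who finished the class.
        "completion_rate": len(finished) / total,
        # Mean score among trainees who finished.
        "average_score": sum(finished) / len(finished),
        # Share of finishers who scored below the pass mark.
        "failure_rate": failed / len(finished),
    }
```

Computing the same summary per training batch gives the figures that would be compared across batches when evaluating a teaching plan.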
[0055] The interactive display module is used to showcase and interact with personalized recommended content and teaching analysis reports. This module features a visual interface, providing different access points for teachers, administrators, and students. Teachers can view multi-dimensional data analysis reports for their classes, individual student learning profiles, and personalized remedial training plans, tracking remedial training effectiveness in real time. Administrators can view a summary of training data across the entire platform, rankings of teaching quality for each class, and resource usage, enabling comprehensive management and decision support. Students can view their individual learning progress, assessment scores, analysis of learning deficiencies, and personalized recommended content, independently arranging remedial learning, and also enjoy interactive operations such as asking questions and submitting assignments online.
[0056] It should be understood that the processes shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in this disclosure can be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed herein can be achieved; no limitation is imposed in this respect.
[0057] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.
Claims
1. A digital empowerment-based engineering-integrated teaching and research teaching management service system, characterized in that it comprises: a data acquisition module, used to collect data from the entire process of integrated work-study teaching; a data cleaning module, used to preprocess the data from the entire process of integrated work-study teaching to obtain standard research and study data; a data encryption module, used to encrypt the standard research and study data; a data processing module, used to intelligently analyze the encrypted standard research and study data to obtain targeted personalized recommendations; a teaching analysis module, used to perform multi-dimensional analysis of the standard research and study data to obtain a teaching analysis report; and an interactive display module, used to display and play personalized recommended content and teaching analysis reports and to support interactive operations.
2. The digital empowerment-based engineering-integrated teaching and research teaching management service system according to claim 1, characterized in that, The preprocessing process includes: first, data deduplication; second, data completion; and third, outlier removal.
3. The digital empowerment-based engineering-integrated teaching and research teaching management service system according to claim 2, characterized in that, Data deduplication includes: generating unique hash values for the collected integrated engineering and learning teaching process data according to data type; eliminating duplicate data entries in all fields by traversing and comparing hash values; and performing data fusion processing for data with duplicate fields but different core information.
4. The digital empowerment-based engineering-integrated teaching and research teaching management service system according to claim 3, characterized in that, The specific data fusion processing flow includes: for data with duplicate fields but different core information, first extracting the effective features related to the core information of the differing data; constructing multi-dimensional feature vectors based on the needs of work-study teaching and practical job scenarios; standardizing the vectors to account for differences in the values of each feature vector; assigning weights to the feature vectors; then performing difference data matching on the multi-dimensional feature vectors under the same tracking number to obtain each feature vector set; then calculating the comprehensive credibility of each feature vector set; and finally using the feature vector set with the highest comprehensive credibility as the deduplication result.
5. The digital empowerment-based engineering-integrated teaching and research teaching management service system according to claim 1, characterized in that, The specific steps of encryption processing include: first, key generation and management; second, encryption of sensitive data; and third, decryption authorization management.
6. The digital empowerment-based engineering-integrated teaching and research management service system according to any one of claims 1-5, characterized in that, The data processing module specifically includes a clustering unit, a feature analysis unit, a data analysis unit, and a resource matching unit; The clustering unit is used to cluster the encrypted standard research and study data to obtain clusters and extract clustering features; The feature analysis unit is used to partition the sample space according to the clusters and obtain data features that have a significant impact on the target variable; The data analysis unit integrates multiple decision trees, constructs a training set for each tree based on the data features of each sample space, and outputs the final evaluation result using a voting method; The resource matching unit, based on the final evaluation results, uses the Spark distributed computing framework in conjunction with the cosine similarity algorithm to quickly match course resources and obtain personalized recommendations.
7. The engineering-integrated teaching and research management service system based on digital intelligence empowerment according to claim 6, characterized in that, The specific process of obtaining clusters includes: First, based on the core requirements of engineering practice, select a preset number of core feature indicators and normalize the data; Second, determine the number of clusters J; Third, randomly select J samples as initial centers; Fourth, calculate the Euclidean distance between each student sample and the J centers, assign the sample to the nearest cluster, and then recalculate the mean of each cluster as the new cluster center, repeating the iteration until the cluster centers are stable; Fifth, generate J groups of students and extract the mean of each cluster feature as the clustering feature.
8. The engineering-integrated teaching and research management service system based on digital intelligence empowerment according to claim 7, characterized in that, There are five core characteristic indicators, namely: student attendance rate, practical operation completion rate, average assessment score, practical results pass rate, and standardized use rate of tools and equipment.
9. The engineering-integrated teaching and research management service system based on digital intelligence empowerment according to claim 7, characterized in that, The method for determining the number of clusters J includes: calculating the sum of squared errors within the cluster corresponding to different j values, that is, the sum of squared distances from each sample to the center of its cluster; as the value of j increases, the sum of squared distances gradually decreases; when j increases to a certain value, the rate of decrease in the sum of squared distances drops sharply, and the j value corresponding to this inflection point is the optimal value J.
10. The engineering-integrated teaching and research management service system based on digital intelligence empowerment according to claim 7, characterized in that, The specific process of sample space partitioning includes: First, determining the target variable and feature variables; Second, constructing a decision tree, using the Gini coefficient as the partitioning criterion, calculating the Gini coefficient gain of each feature, and prioritizing the feature with the largest gain as the root node; Third, repeating the feature selection and partitioning process for each child node until the samples within the node belong to the same category or reach a preset depth; Fourth, using a pre-pruning method, stopping the partitioning when the number of samples in a node is less than or equal to a preset number; Fifth, determining key influencing factors as data features through the decision tree path, and outputting them in order of influence weight.