Artificial intelligence-based vocational skill personalized learning path planning method and system

By constructing a dynamic causal skill graph and a skill mastery inference model, the method optimizes learning path planning, addresses the lack of causal logic in existing approaches, and enables personalized, scientifically grounded adjustment of the learning path.

CN122309853A · Pending · Publication Date: 2026-06-30 · HUNAN VOCATIONAL COLLEGE OF COMMERCE

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HUNAN VOCATIONAL COLLEGE OF COMMERCE
Filing Date
2026-05-12
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing technologies lack the ability to make causal inferences and cannot reveal the causal logic between skill learning, which makes it impossible for path planning to achieve true personalization and dynamic optimization.

Method used

By collecting structured event data from users, a dynamic causal skill graph is constructed. A skill mastery inference model is used to assess the user's mastery level, and the learning path is optimized through counterfactual reasoning. The path search and adjustment are combined with the strength of causal effects.

Benefits of technology

It has achieved a shift from passive recommendation to proactive causal guidance, dynamically identifies key bottleneck skills, generates optimal learning path adjustment plans, and improves the scientific nature and personalized adaptability of learning path planning.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to a method, system, device, and medium for personalized vocational skills learning path planning based on artificial intelligence. The method includes: collecting a user's structured event dataset; constructing a dynamic causal skill graph with skill points as nodes and causal effect strength as directed edges using two-stage linear regression; extracting the user's dynamic behavior sequence and inputting it into a pre-trained model to obtain a skill mastery vector; searching for a personalized causal learning path with the current mastered skill points as the starting point, the career goal as the ending point, and maximizing the cumulative causal effect as the objective; performing counterfactual reasoning on the learning path based on the structured causal model to predict the counterfactual mastery, learning time, and path under different intervention conditions; and obtaining the optimal learning path adjustment scheme by weighted fusion of the counterfactual results. This method enables personalized learning path planning and dynamic optimization with causal logic.

Description

Technical Field

[0001] This invention belongs to the field of artificial intelligence technology, and in particular relates to a method and system for planning personalized learning paths for vocational skills based on artificial intelligence.

Background Technology

[0002] With the popularization of online vocational skills learning platforms, intelligent learning path planning technology has been widely applied. Existing technologies mainly rely on collaborative filtering or static knowledge graphs for path recommendation: collaborative filtering analyzes learners' historical behavior to discover group learning patterns and uses the learning sequences of similar learners as the basis for recommendations; static knowledge graphs depend on predefined skill prerequisite relationships defined by experts and generate learning routes through shortest path search. The characteristic of these methods is their ability to utilize large-scale data to discover statistical correlations and quickly generate learning sequences that conform to popular habits.

[0003] However, these traditional methods share a core technical problem: they lack causal inference capabilities and therefore fail to reveal the causal logic between the skills being learned. Correlation-based recommendations can only discover group-wide associations in learning behavior; they cannot explain the decisive role that mastering a prerequisite skill plays in learning subsequent higher-level skills. Static knowledge graphs can only reflect predefined hierarchical relationships, making it difficult to capture the dynamic causal mechanisms that change during actual learning, such as the chain reaction in which a misunderstanding leads to subsequent failures. Without causal characterization, the system cannot identify key bottleneck skills, nor can it simulate the potential impact of adjusting the learning sequence. Path planning therefore remains at the level of passive recommendation and cannot achieve truly personalized causal guidance and dynamic optimization.

Summary of the Invention

[0004] Therefore, to address the aforementioned technical problems, it is necessary to provide an AI-based method and system for planning personalized learning paths for vocational skills, which can construct a dynamic causal skill graph based on two-stage regression, assess the user's mastery level through a skill mastery inference model, and optimize the learning path using counterfactual reasoning.

[0005] Firstly, this application provides a method for planning personalized learning paths for vocational skills based on artificial intelligence, including:

[0006] S1. Collect the user's structured event dataset; based on the structured event dataset, extract instrumental variables and user covariates, and use the instrumental variables and user covariates as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point;

[0007] S2. Based on the instrumental-variable-predicted mastery data, perform a second-stage linear regression to calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength. The dynamic causal skill graph is constructed with skill points as nodes and the causal effect strength as directed edges.

[0008] S3. Extract the user's dynamic behavior sequence from the structured event dataset, and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the user's skill mastery vector at each skill point;

[0009] S4. Taking the current set of mastered skill points corresponding to the skill mastery vector as the starting point, the target set of skill points corresponding to the user's career goal as the ending point, and maximizing the cumulative causal effect as the objective, perform a path search to obtain a personalized causal learning path.

[0010] S5. Based on the dynamic causal skill graph, the personalized causal learning path is input into the structural causal model for counterfactual inference to predict the counterfactual mastery vector, counterfactual expected learning time, and counterfactual path under different intervention conditions. The counterfactual path is the learning path replanned based on the dynamic causal skill graph and the skill mastery inference model after applying corresponding intervention conditions to the personalized causal learning path.

[0011] S6. The counterfactual mastery vector and the counterfactual expected learning time are weighted and fused to obtain the intervention causal benefit value. The counterfactual path corresponding to the highest intervention causal benefit value is taken as the optimal learning path adjustment scheme.
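As an illustration of step S6, the sketch below fuses a counterfactual mastery vector and a counterfactual expected learning time into a single benefit value and selects the best counterfactual path. The summary does not state the fusion formula, so the specific form used here (mean mastery minus a time penalty normalized by a reference duration) and all names (`Counterfactual`, `benefit`, `w_mastery`, `w_time`, `ref_hours`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Counterfactual:
    """One intervention outcome: replanned path, predicted mastery, predicted time."""
    path: List[str]
    mastery: Sequence[float]   # counterfactual mastery vector, each value in [0, 1]
    hours: float               # counterfactual expected learning time


def benefit(cf: Counterfactual, w_mastery: float = 0.7, w_time: float = 0.3,
            ref_hours: float = 100.0) -> float:
    """Assumed weighted fusion: reward mean mastery, penalize normalized time."""
    mean_mastery = sum(cf.mastery) / len(cf.mastery)
    return w_mastery * mean_mastery - w_time * (cf.hours / ref_hours)


def best_adjustment(candidates: List[Counterfactual]) -> Counterfactual:
    """Return the counterfactual with the highest intervention causal benefit value."""
    return max(candidates, key=benefit)


# Two hypothetical counterfactual outcomes for the same learner
candidates = [
    Counterfactual(["sql", "etl", "bi"], [0.9, 0.6, 0.5], hours=80.0),
    Counterfactual(["sql", "stats", "bi"], [0.9, 0.8, 0.7], hours=60.0),
]
chosen = best_adjustment(candidates)
print(chosen.path)
```

The weights trade mastery against time; a deployment would need to calibrate them, and possibly the normalization, against observed learning outcomes.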

[0012] In one embodiment, the step of using the current set of mastered skill points corresponding to the skill mastery vector as the starting point, the target set of skill points corresponding to the user's career goal as the ending point, and maximizing the cumulative causal effect as the objective to perform path search to obtain a personalized causal learning path includes:

[0013] S11. Based on the dynamic causal skill graph, select any candidate path from the starting point to the ending point, traverse all directed edges on the candidate path, obtain the causal effect strength corresponding to each directed edge, and compare the starting skill of each directed edge with all skill points in the current mastered skill point set to obtain the comparison result.

[0014] S12. Based on the comparison result, the causal effect strength is weighted and adjusted using a cumulative causal effect value function to obtain the cumulative causal effect value of the candidate path, wherein the expression of the cumulative causal effect value function is:

[0015]

[0016] where the terms of the expression are: the cumulative causal effect value of the candidate path; the causal effect strength of one skill point on another skill point; an indicator function, which takes the value 1 when the comparison result shows that the user's degree of mastery of the skill point is greater than the mastery threshold, and 0 otherwise; the updated personalized gain parameter; and the user's degree of mastery of the skill point;

[0017] S13. Perform a heuristic search on the dynamic causal skill graph to obtain the optimal candidate path, that is, the candidate path with the largest cumulative causal effect value, and use the optimal candidate path as the personalized causal learning path.
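Steps S11 to S13 can be sketched as follows. The expression in [0015] is not reproduced in this text, so the weighting in `path_value` (each edge strength scaled by `1 + gain * indicator * mastery`) is only one plausible reading of the term list in [0016], not the applicant's formula. The sketch also swaps the heuristic search for exhaustive enumeration of bounded simple paths, which is only practical on small graphs. All names and sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # graph[src][dst] = causal effect strength


def path_value(path: List[str], graph: Graph, mastery: Dict[str, float],
               threshold: float = 0.6, gain: float = 0.5) -> float:
    """Cumulative causal effect value of one candidate path (assumed weighting)."""
    total = 0.0
    for src, dst in zip(path, path[1:]):
        strength = graph[src][dst]
        m = mastery.get(src, 0.0)
        indicator = 1.0 if m > threshold else 0.0   # edge starts at a mastered skill
        total += strength * (1.0 + gain * indicator * m)
    return total


def all_paths(graph: Graph, start: str, goal: str, max_len: int = 6) -> List[List[str]]:
    """Enumerate simple paths from start to goal with at most max_len nodes."""
    found: List[List[str]] = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            found.append(path)
            continue
        if len(path) >= max_len:
            continue
        for nxt in graph.get(node, {}):
            if nxt not in path:          # skip nodes already visited to avoid loops
                stack.append(path + [nxt])
    return found


def best_path(graph: Graph, start: str, goal: str,
              mastery: Dict[str, float]) -> Tuple[List[str], float]:
    """Score every candidate path and keep the one with the largest value."""
    scored = [(p, path_value(p, graph, mastery)) for p in all_paths(graph, start, goal)]
    return max(scored, key=lambda item: item[1])


graph = {
    "excel": {"sql": 0.8, "stats": 0.4},
    "sql": {"bi": 0.7},
    "stats": {"bi": 0.9},
    "bi": {},
}
mastery = {"excel": 0.9, "sql": 0.2, "stats": 0.1}
path, value = best_path(graph, "excel", "bi", mastery)
print(path, round(value, 3))
```

Because the objective is a sum over edges, longer paths can accumulate larger values; the `max_len` bound plays the role of the path length threshold mentioned later in the description.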

[0018] In one embodiment, after performing a heuristic search on the dynamic causal skill graph to obtain the optimal candidate path corresponding to the largest cumulative causal effect value, and using the optimal candidate path as the personalized causal learning path, the method further includes:

[0019] S21. Record the actual mastery time of the completed skill points during the learning process, and compare the actual mastery time with the expected learning time of the corresponding skill points in the personalized causal learning path to obtain the user's learning efficiency coefficient.

[0020] S22. Calculate the user's transfer learning ability representation value by performing an exponentially weighted moving average based on the learning efficiency coefficient.

[0021] S23. Input the transfer learning ability representation value into the gain parameter mapping function to calculate the updated personalized gain parameter, wherein the expression of the gain parameter mapping function is:

[0022]

[0023] where the terms of the expression are: the updated personalized gain parameter; the personalized gain parameter; the transfer learning ability representation value; and the normalized reference constant.
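Steps S21 to S23 can be illustrated with a short sketch. The exponentially weighted moving average is standard; the other two pieces are assumptions. The efficiency coefficient is taken here as expected time divided by actual time, and, since the mapping function in [0022] is not reproduced in this text, `updated_gain` uses a simple proportional scaling as a stand-in. The function names and the smoothing factor `alpha` are hypothetical.

```python
from typing import Iterable, List


def efficiency_coefficients(expected_hours: Iterable[float],
                            actual_hours: Iterable[float]) -> List[float]:
    """Assumed definition: expected / actual, so values above 1 mean faster than planned."""
    return [e / a for e, a in zip(expected_hours, actual_hours)]


def ewma(values: List[float], alpha: float = 0.3) -> float:
    """Exponentially weighted moving average; later observations weigh more."""
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1.0 - alpha) * avg
    return avg


def updated_gain(gain: float, transfer_ability: float, reference: float = 1.0) -> float:
    """Hypothetical mapping: scale the gain by ability relative to a reference constant."""
    return gain * (transfer_ability / reference)


# Three completed skill points: planned hours versus hours actually needed
coeffs = efficiency_coefficients([10.0, 8.0, 6.0], [10.0, 4.0, 6.0])
ability = ewma(coeffs, alpha=0.5)
new_gain = updated_gain(0.5, ability)
print(coeffs, ability, new_gain)
```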

[0024] In one embodiment, the skill mastery inference model is trained using the following method:

[0025] S31. Extract all users' historical dynamic behavior sequences from the structured event dataset, and use the skill test scores corresponding to the historical dynamic behavior sequences as supervision labels. Construct a training sample set based on the historical dynamic behavior sequences and the supervision labels.

[0026] S32. Construct an initial skill mastery inference model, which includes an embedding layer, a multi-head attention layer, a bidirectional long short-term memory network layer, and a fully connected output layer.

[0027] S33. Input the historical dynamic behavior sequence into the embedding layer, and map the discrete behavior types into dense vectors through the embedding layer to obtain a behavior embedding vector sequence.

[0028] S34. Input the behavior embedding vector sequence into the multi-head attention layer, calculate through the multi-head attention layer the importance weights of the different behaviors in the behavior embedding vector sequence with respect to mastery, and perform weighted aggregation on the behavior embedding vector sequence according to the importance weights to obtain a weighted behavior representation vector.

[0029] S35. Input the weighted behavior representation vector into the bidirectional long short-term memory network layer, and capture the forward and backward dependencies of the weighted behavior representation vector through the bidirectional long short-term memory network layer to obtain the hidden state sequence.

[0030] S36. Input the hidden state of the last time step in the hidden state sequence into the fully connected output layer, and perform feature mapping on the hidden state of the last time step through the fully connected output layer to obtain the skill mastery prediction value.

[0031] S37. Input the predicted skill mastery value and the supervision label into the loss function to obtain the prediction error;

[0032] S38. Update the network parameters of the initial skill mastery inference model using the backpropagation algorithm based on the prediction error to obtain the updated model network parameters;

[0033] S39. Select different samples from the training sample set and repeat steps S33 to S38 to perform multiple rounds of iterative training on the initial skill mastery inference model until the prediction error is less than a preset error threshold or the number of iterations reaches a preset iteration threshold.

[0034] S310. Based on the updated model network parameters obtained in the last iteration, update the initial skill mastery inference model to obtain the skill mastery inference model.
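The iteration and stopping logic of steps S37 to S310 can be sketched without a deep learning framework. In the sketch below, a tiny linear model trained by stochastic gradient descent stands in for the embedding, multi-head attention, bidirectional LSTM, and fully connected layers; it is a substitute chosen to keep the example self-contained, not the network described above. The learning rate, thresholds, and sample data are hypothetical.

```python
import random
from typing import List, Tuple

Sample = Tuple[List[float], float]  # (feature vector, supervised mastery label)


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    """Forward pass of the stand-in linear model."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train(samples: List[Sample], lr: float = 0.1, error_threshold: float = 1e-4,
          max_iterations: int = 5000, seed: int = 0) -> Tuple[List[float], float, int, float]:
    """Train until the error falls below the threshold or the iteration limit is hit."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    error = float("inf")
    iteration = 0
    while iteration < max_iterations and error >= error_threshold:
        x, y = rng.choice(samples)                # S39: draw a training sample
        residual = predict(weights, bias, x) - y  # S36/S37: prediction and its error
        # S38: gradient step on the squared error of this sample
        weights = [w - lr * 2.0 * residual * xi for w, xi in zip(weights, x)]
        bias -= lr * 2.0 * residual
        # Mean squared error over the whole set decides when to stop
        error = sum((predict(weights, bias, xs) - ys) ** 2 for xs, ys in samples) / len(samples)
        iteration += 1
    return weights, bias, iteration, error        # S310: final parameters


# Noise-free synthetic labels: y = 0.5 * x1 + 0.3 * x2 + 0.1
samples = [([0.0, 0.0], 0.1), ([1.0, 0.0], 0.6), ([0.0, 1.0], 0.4), ([1.0, 1.0], 0.9)]
weights, bias, iterations, error = train(samples)
print(iterations, error)
```

The two exit conditions of the `while` loop correspond to the preset error threshold and the preset iteration threshold in S39.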

[0035] In one embodiment, the step of inputting the personalized causal learning path into a structural causal model for counterfactual inference based on the dynamic causal skill graph to predict the counterfactual mastery vector, expected counterfactual learning time, and counterfactual path under different intervention conditions includes:

[0036] S41. Based on the skill nodes on the personalized causal learning path, generate different intervention actions for different skill nodes through a preset set of intervention action classifications, and generate a candidate set of intervention actions.

[0037] S42. Apply a variational Bayesian inference algorithm to the user's structured event dataset to obtain the posterior probability distribution of the latent exogenous variables affecting the user's learning performance.

[0038] S43. Select a candidate intervention action from the candidate intervention action set as the current intervention action, modify the structural causal model according to the current intervention action, cut off all input edges of the skill node corresponding to the current intervention action and assign a preset intervention value to obtain the post-intervention causal model.

[0039] S44. Substitute the posterior probability distribution into the post-intervention causal model, calculate the counterfactual mastery prediction values of all non-intervened skill nodes by forward propagation through the linear regression equations in the post-intervention causal model, and combine the counterfactual mastery prediction values of all skill nodes into the counterfactual mastery vector.

[0040] S45. Using the skill points in the counterfactual mastery vector that exceed the preset mastery threshold as the counterfactual starting point and the target skill point set corresponding to the user's career goal as the ending point, perform path search calculation on the dynamic causal skill graph corresponding to the post-intervention causal model with the goal of maximizing the cumulative causal effect to obtain the counterfactual path.

[0041] S46. Estimate the time required to learn along the counterfactual path using a learning time prediction function to obtain the expected counterfactual learning time.
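Steps S42 to S44 follow the usual abduction, action, prediction pattern of counterfactual reasoning in a structural causal model. The sketch below shows that pattern on a small linear model. It makes one substitution: instead of a variational Bayesian posterior, each exogenous term is recovered as a point estimate (observed value minus the parents' contribution), which is exact only for a fully observed linear model. The graph, coefficients, and function names are hypothetical.

```python
from typing import Dict, List

Parents = Dict[str, Dict[str, float]]  # parents[node][parent] = linear coefficient


def abduct_noise(order: List[str], parents: Parents,
                 observed: Dict[str, float]) -> Dict[str, float]:
    """Abduction: exogenous term = observed value minus the parents' contribution."""
    return {
        node: observed[node] - sum(c * observed[p] for p, c in parents[node].items())
        for node in order
    }


def counterfactual(order: List[str], parents: Parents, observed: Dict[str, float],
                   intervention: Dict[str, float]) -> Dict[str, float]:
    """Action and prediction: fix intervened nodes, then propagate in topological order."""
    noise = abduct_noise(order, parents, observed)
    result: Dict[str, float] = {}
    for node in order:
        if node in intervention:
            # Incoming edges are cut: the node ignores its parents and its noise term
            result[node] = intervention[node]
        else:
            value = noise[node] + sum(c * result[p] for p, c in parents[node].items())
            result[node] = min(1.0, max(0.0, value))   # keep mastery within [0, 1]
    return result


order = ["excel", "sql", "bi"]   # topological order of the skill graph
parents = {"excel": {}, "sql": {"excel": 0.5}, "bi": {"sql": 0.8}}
observed = {"excel": 0.4, "sql": 0.3, "bi": 0.3}
cf = counterfactual(order, parents, observed, intervention={"sql": 0.9})
print(cf)
```

The resulting dictionary plays the role of the counterfactual mastery vector; skill points above the mastery threshold would then seed the path search of S45.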

[0042] Secondly, this application also provides an artificial intelligence-based personalized learning path planning system for vocational skills, used to implement the method described in the first aspect, including:

[0043] The data acquisition and regression module is used to collect the user's structured event dataset; based on the structured event dataset, it extracts instrumental variables and user covariates, and uses the instrumental variables and user covariates as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point;

[0044] The dynamic causal graph construction module is used to perform a second-stage linear regression based on the instrumental-variable-predicted mastery data, calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength. The dynamic causal skill graph is constructed with skill points as nodes and the causal effect strength as directed edges.

[0045] The skill mastery inference module is used to extract the dynamic behavior sequence of users in the structured event dataset, and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the skill mastery vector of users at each skill point;

[0046] The causal path search module is used to perform a path search, taking the current set of mastered skill points corresponding to the skill mastery vector as the starting point, the target set of skill points corresponding to the user's career goal as the ending point, and maximizing the cumulative causal effect as the objective, to obtain a personalized causal learning path.

[0047] The counterfactual reasoning module is used to input the personalized causal learning path into the structural causal model based on the dynamic causal skill graph to perform counterfactual reasoning, and predict the counterfactual mastery vector, counterfactual expected learning time and counterfactual path under different intervention conditions. The counterfactual path is the learning path replanned based on the dynamic causal skill graph and the skill mastery inference model after applying corresponding intervention conditions to the personalized causal learning path.

[0048] The benefit assessment and scheme generation module is used to perform weighted fusion of the counterfactual mastery vector and the counterfactual expected learning time to obtain the intervention causal benefit value, and the counterfactual path corresponding to the highest intervention causal benefit value is taken as the optimal learning path adjustment scheme.

[0049] Thirdly, this application also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of any of the methods in the first aspect of this application.

[0050] Fourthly, this application also provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of any of the methods in the first aspect of this application.

[0051] The aforementioned AI-based personalized learning path planning method and system for vocational skills collects users' structured event datasets and uses a two-stage regression algorithm to mine the strength of causal effects between skill points, from which it constructs a dynamic causal skill graph. This addresses the technical limitation of traditional methods, which can only discover statistical correlations but cannot reveal the causal logic between skills. It combines a skill mastery inference model to assess the user's current mastery status and searches for paths with the goal of maximizing cumulative causal effects, achieving a shift from passive recommendation to proactive causal guidance. Furthermore, by introducing a structural causal model for counterfactual reasoning, it predicts and evaluates the learning effects under different intervention conditions, thereby dynamically identifying key bottleneck skills and generating optimal learning path adjustment schemes. This improves the scientific rigor, accuracy, and personalized adaptability of learning path planning.

Description of the Drawings

[0052] To more clearly illustrate the technical solutions in the embodiments or related technologies of this application, the accompanying drawings used in the description of the embodiments or related technologies will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0053] Figure 1 is a flowchart of a personalized vocational skills learning path planning method based on artificial intelligence, according to one embodiment of the present invention.

[0054] Figure 2 is a schematic diagram of the structure of an AI-based personalized learning path planning system for vocational skills, according to one embodiment of the present invention.

Detailed Implementation

[0055] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0056] In one embodiment, as shown in Figure 1, a personalized learning path planning method for vocational skills based on artificial intelligence is provided. This embodiment illustrates the application of this method to a learning path planning terminal. It is understood that this method can also be applied to a server, or to a system including both a terminal and a server, and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

[0057] S1. Collect the user's structured event dataset; based on the structured event dataset, extract instrumental variables and user covariates, and use the instrumental variables and user covariates as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point.

[0058] Specifically, the structured event dataset can be a collection of various behavioral data related to user skill learning, obtained by the learning path planning terminal from the online vocational skills learning platform. This includes user profile information, course learning, practice exercises, online assessments, error correction, learning duration, and all other behavioral events related to skill mastery. All behavioral events are organized and archived according to a preset structured format, providing basic data support for subsequent data processing and model calculations. Instrumental variables can include user learning device type, frequency of learning reminders pushed by the platform, and the rationality of learning time allocation. Instrumental variables can reflect user learning characteristics without directly interfering with the causal relationship between skills. User covariates can include user basic characteristics, learning preference characteristics, and historical learning performance characteristics.

[0059] For example, the learning path planning terminal can convert the categorical fields in the structured event dataset into numerical types using one-hot encoding, and scale the continuous fields to a specified interval using min-max normalization to obtain a standardized structured event dataset. The terminal can also impute missing values in the standardized structured event dataset using K-nearest neighbor imputation to obtain a complete standardized structured event dataset. Based on the complete standardized structured event dataset, the terminal can extract instrumental variables and user covariates, standardize them, and input them into a first-stage linear regression equation to fit a regression model. It can then calculate the instrumental-variable-predicted mastery level of the user at each skill point. The expression for the first-stage linear regression equation can be:

[0060]

[0061] where the terms of the expression are: the user's actual degree of mastery of a given skill point; the regression intercept term of that skill point's regression model; the instrumental variable coefficient vector of that skill point's regression model; the instrumental variable vector for that skill point; the user covariate coefficient vector of that skill point's regression model; the user covariate vector in that skill point's regression model; and the normally distributed random error term of that skill point's regression model. The learning path planning terminal can use the least squares method to fit the regression model and substitute the instrumental variable and user covariate vectors into the fitted equation to calculate the instrumental-variable-predicted mastery level of the user for each skill point.
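The expression in [0060] is not reproduced in this text. Based on the term list in [0061], one plausible form of the first-stage equation and of the resulting prediction is sketched below. Every symbol is notation chosen here for illustration (user $u$, skill point $i$, mastery $M$, instruments $\mathbf{Z}$, covariates $\mathbf{X}$), not the applicant's own.

```latex
% First-stage regression for skill point i (assumed notation)
M_{u,i} = \alpha_i + \boldsymbol{\beta}_i^{\top} \mathbf{Z}_{u,i}
        + \boldsymbol{\gamma}_i^{\top} \mathbf{X}_{u} + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)

% Instrumental-variable-predicted mastery from the least-squares fit
\hat{M}_{u,i} = \hat{\alpha}_i + \hat{\boldsymbol{\beta}}_i^{\top} \mathbf{Z}_{u,i}
              + \hat{\boldsymbol{\gamma}}_i^{\top} \mathbf{X}_{u}
```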

[0062] S2. Based on the instrumental variable prediction of mastery data, perform a second-stage linear regression to calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength. The dynamic causal skill graph is constructed with skill points as nodes and causal effect strength as directed edges.

[0063] For example, the learning path planning terminal can use the instrumental variable prediction of mastery level data obtained in the first stage as explanatory variables to construct a second-stage linear regression equation for any two different skill points. The expression of the second-stage linear regression equation can be:

[0064]

[0065] where the terms of the expression are: the instrumental-variable-predicted mastery data of one skill point; the instrumental-variable-predicted mastery data of the other skill point; the intercept term; the causal effect strength coefficient of one skill point on the other; the user covariate coefficient vector; the user covariate vector; and a random error term that follows a normal distribution. The learning path planning terminal can use the least squares method to fit each skill point pair individually, combine cross-validation to optimize the parameters, obtain the causal effect strength, and standardize the causal effect strength to a specified interval. The learning path planning terminal can then construct the dynamic causal skill graph in a specified graph database, where nodes are skill points and directed edges are causal relationships between skills.
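The two regression stages can be sketched in plain Python as follows. This is a minimal sketch under several assumptions: one instrument and one covariate per user, least squares solved through the normal equations, and no cross-validation or rescaling of the resulting coefficient. Following the description above, the second stage regresses the predicted mastery of one skill on the predicted mastery of the other plus the covariate. All function names and the synthetic data are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows: Matrix, y: List[float]) -> List[float]:
    """Least squares with an intercept; returns [intercept, coef_1, ...]."""
    design = [[1.0] + row for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def fitted(rows: Matrix, coef: List[float]) -> List[float]:
    """Predicted values of a fitted regression."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in rows]


def causal_strength(z_i: List[float], z_j: List[float], x: List[float],
                    m_i: List[float], m_j: List[float]) -> float:
    """Estimated effect of skill i on skill j; every argument is a per-user list."""
    rows_i = [[zi, xi] for zi, xi in zip(z_i, x)]
    rows_j = [[zj, xi] for zj, xi in zip(z_j, x)]
    m_i_hat = fitted(rows_i, ols(rows_i, m_i))   # first stage for skill i
    m_j_hat = fitted(rows_j, ols(rows_j, m_j))   # first stage for skill j
    stage2 = [[mi, xi] for mi, xi in zip(m_i_hat, x)]
    return ols(stage2, m_j_hat)[1]               # coefficient on predicted mastery of i


# Noise-free synthetic data in which skill i raises skill j with strength 0.6
z = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
x = [0.1, 0.0, 0.2, 0.1, 0.3, 0.0]
m_i = [0.2 + 0.5 * zi + 0.1 * xi for zi, xi in zip(z, x)]
m_j = [0.1 + 0.6 * mi + 0.2 * xi for mi, xi in zip(m_i, x)]
strength = causal_strength(z, z, x, m_i, m_j)
print(round(strength, 6))
```

Running `causal_strength` for every ordered pair of skill points would give the edge weights of the graph; real data would need more users than parameters and instruments that actually vary across users.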

[0066] S3. Extract the dynamic behavior sequence of users from the structured event dataset, and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the skill mastery vector of users at each skill point.

[0067] Specifically, the dynamic behavior sequence can be a time-series sequence formed by arranging the continuous learning behavior records of users for each skill point in descending order of timestamps within a preset time window. The dynamic behavior sequence can include the entire learning process, such as course learning, exercise practice, online assessment, and error correction, and can completely restore the user's learning trajectory and behavioral patterns. The skill mastery inference model can be a time-series feature inference model constructed by combining a bidirectional long short-term memory network with an attention mechanism. It can capture the long-distance temporal dependencies of the dynamic behavior sequence, highlight the behavioral events that have a significant impact on skill mastery through the attention mechanism, and output a skill mastery vector that matches the dimension of the total number of skill points. The value range of each element in the vector is a specified interval, corresponding to the user's actual mastery of each skill point.

[0068] Specifically, the learning path planning terminal can construct training and validation sets based on historical data from multiple users. It uses dynamic behavioral sequence features as input and the actual skill mastery, calibrated by formula, as labels; it adopts mean squared error as the loss function, trains iteratively with a specified optimizer, and applies an early stopping strategy to avoid overfitting, thereby obtaining a pre-trained skill mastery inference model.

[0069] For example, the learning path planning terminal can extract the dynamic behavior sequence of users from the structured event dataset, and preprocess and encode the dynamic behavior sequence. The learning path planning terminal can convert the event type into a numerical vector of a specified dimension using one-hot encoding, and represent the unique identifier of the skill point with a pre-trained embedding vector. Continuous features such as behavior result, behavior time, learning duration, and time difference are standardized to the [0,1] interval to obtain the initial feature vector of a single behavior record. The learning path planning terminal can extract the behavior effect feature values of user behavior completion rate and accuracy from the initial feature vector, and input the behavior effect feature values and the dynamic behavior sequence into a pre-trained skill mastery inference model to obtain a single skill mastery vector. The calculation expression of a single skill mastery vector can be:

[0070]

[0071] where the terms of the expression are: the user's comprehensive skill mastery of the target skill point; the user's learning efficiency coefficient; the total length of the dynamic behavior sequence; the behavioral time-series weight of the skill point at a given moment; the behavioral effect feature value of the skill point at that moment; the total number of preceding causal parent nodes of the target skill point; the set of all preceding causal parent nodes of the target skill point; the standardized causal effect strength of a parent node skill on the target skill point; and the user's degree of mastery of the parent node skill. The learning path planning terminal can calculate the comprehensive skill mastery of multiple skill points and integrate all comprehensive skill mastery values according to a preset sorting rule to obtain a skill mastery vector.

[0072] S4. Starting from the current set of mastered skill points corresponding to the skill mastery vector, and ending with the target set of skill points corresponding to the user's career goal, a path search is performed with the goal of maximizing the cumulative causal effect to obtain a personalized causal learning path.

[0073] Specifically, a personalized causal learning path is a unique learning guide generated by the learning path planning terminal based on individual user differences and the causal logic of skills. Personalization is reflected in the fact that the starting point of the path matches the user's current skill level and the ending point matches the user's career goals, adapting to the user's learning foundation and needs. Causality is reflected in the fact that adjacent skill points in the path are constructed based on a dynamic causal skill graph, with the core of maximizing the cumulative causal effect, ensuring that the preceding skills have a significant promoting effect on the subsequent skills, which can improve learning efficiency and shorten the skill mastery cycle.

[0074] For example, the learning path planning terminal can set a mastery threshold based on the skill mastery vector and use the currently mastered skill points that have reached the mastery threshold as the starting point. The learning path planning terminal can use a preset career skill mapping table to obtain the target skill point set corresponding to the user's career goal, and use the target skill point set as the endpoint. The learning path planning terminal can use an optimized heuristic function to calculate an evaluation estimate that measures the estimated total cost from the current node to the target node; this estimate is used to select the next node to expand and to guide the direction of the path search. The expression for calculating the evaluation estimate can be:

[0075]

[0076] where the terms are: the evaluation function value; and the cumulative causal effect strength from the starting point to the current node. For the heuristic function, the learning path planning terminal can take the sum of the maximum causal effect strengths from the current node to any node in the target skill point set, and maintain... The learning path planning terminal can set a path length threshold to avoid loops and excessively long paths. Once a target skill point is reached, the learning path planning terminal can trace back through the predecessor nodes to obtain the personalized causal learning path.

[0077] S5. Based on the dynamic causal skill graph, the personalized causal learning path is input into the structural causal model for counterfactual inference, and the counterfactual mastery vector, counterfactual expected learning time and counterfactual path under different intervention conditions are predicted. The counterfactual path is the learning path that is replanned based on the dynamic causal skill graph and skill mastery inference model after applying the corresponding intervention conditions to the personalized causal learning path.

[0078] Specifically, structural causal models can be mathematical frameworks used to model and infer causal relationships between variables. By defining causal functional relationships between variables and corresponding directed acyclic graphs, they can support intervention analysis and counterfactual reasoning, thereby simulating changes in the behavior of a system under specific intervention conditions and inferring the possible outcomes of taking different actions.

[0079] For example, the learning path planning terminal can fix other variables in the structural causal model and only change the value of the intervention variable. By inputting the mastery set of the preceding causal parent node and the exogenous variable corresponding to the skill point into the causal relationship function, the counterfactual mastery of the skill point can be calculated. The expression of the causal relationship function can be:

[0080]

[0081] where the terms are: the causal relationship function; the counterfactual mastery of the skill point; the set of mastery values of all preceding causal parent nodes of the skill point; the exogenous variable corresponding to the skill point; the intercept term, obtained by fitting historical user data; the causal effect strength coefficient of each parent node skill on the skill point; and the mastery of each parent node skill. The learning path planning terminal can calculate the counterfactual mastery of each skill point from the mastery set of its preceding causal parent nodes, and integrate all counterfactual mastery values to obtain a counterfactual mastery vector. Based on this counterfactual mastery vector, the terminal can adjust the causal effect strength of the relevant skill points in the dynamic causal skill graph, take the skill points whose counterfactual mastery meets the mastery threshold as a new starting point, re-search the path, and select the counterfactual path according to the principle of maximizing the cumulative causal effect. The learning path planning terminal can substitute the obtained counterfactual path into a learning time prediction function constructed by multiple linear regression and, combined with relevant parameters such as the counterfactual mastery of each skill point in the path and the causal effect strength between skills, calculate the counterfactual expected learning time corresponding to the counterfactual path. The learning path planning terminal can verify the validity of the counterfactual reasoning results by calculating confidence intervals through a specified sampling method and excluding abnormal intervention conditions.

[0082] S6. Weighted fusion of the counterfactual knowledge vector and the counterfactual expected learning time is performed to obtain the intervention causal benefit value. The counterfactual path corresponding to the highest intervention causal benefit value is taken as the optimal learning path adjustment scheme.

[0083] Specifically, the intervention causal benefit value can be used to comprehensively measure the effectiveness of intervention measures. The intervention causal benefit value can take into account both the improvement of users' skill mastery and the reduction of learning time after intervention. The weight of the two is dynamically determined by the learning path planning terminal according to the user's learning preferences. The learning path planning terminal provides multiple weight modes for users to choose from, and also supports user-defined weights.

[0084] For example, the learning path planning terminal can normalize the counterfactual mastery vector to obtain a normalized mastery vector, wherein the expression for calculating the normalized mastery vector can be:

[0085]

[0086] where the terms are: the normalized mastery of a given skill point; the corresponding element of the counterfactual mastery vector, that is, the counterfactual mastery of that skill point; and the maximum value in the counterfactual mastery vector. The learning path planning terminal can likewise normalize the counterfactual expected learning time to obtain the normalized counterfactual expected learning time. The learning path planning terminal can then input the normalized counterfactual mastery vector and the normalized counterfactual expected learning time into a weighted fusion formula to calculate the intervention causal benefit value. The expression of the weighted fusion formula can be:

[0087]

[0088] where the terms are: the intervention causal benefit value; the total number of skill points; the normalized counterfactual expected learning time; the weighting coefficient for skill mastery; and the weighting coefficient for counterfactual expected learning time. The learning path planning terminal can compare the intervention causal benefit values corresponding to each intervention condition to find the highest one, and can use the counterfactual path corresponding to the highest intervention causal benefit value as the optimal learning path adjustment scheme.
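
The normalization and weighted fusion described in S6 can be illustrated with a minimal sketch. The exact fusion expression is not reproduced here, so the form below is an assumption: the benefit is taken as the mastery weight times the mean normalized mastery plus the time weight times one minus the normalized learning time, so that shorter paths score higher. The min-max scaling of learning time, the default weights, and all function names are likewise hypothetical.

```python
from typing import Sequence


def normalize_mastery(mastery: Sequence[float]) -> list[float]:
    """Divide each counterfactual mastery value by the maximum of the vector."""
    peak = max(mastery)
    if peak <= 0:
        return [0.0 for _ in mastery]
    return [m / peak for m in mastery]


def normalize_time(time_value: float, all_times: Sequence[float]) -> float:
    """Min-max scale one expected learning time across all interventions.

    Assumption: the text does not state how learning time is normalized.
    """
    low, high = min(all_times), max(all_times)
    if high == low:
        return 0.0
    return (time_value - low) / (high - low)


def benefit_value(mastery: Sequence[float], norm_time: float,
                  w_mastery: float = 0.6, w_time: float = 0.4) -> float:
    """Assumed fusion: reward high mean mastery and short learning time."""
    norm = normalize_mastery(mastery)
    mean_mastery = sum(norm) / len(norm)
    return w_mastery * mean_mastery + w_time * (1.0 - norm_time)


def best_intervention(candidates: dict[str, tuple[list[float], float]]) -> str:
    """Return the intervention whose counterfactual result scores highest.

    candidates maps an intervention name to
    (counterfactual mastery vector, counterfactual expected learning time).
    """
    times = [time for _, time in candidates.values()]
    scores = {
        name: benefit_value(mastery, normalize_time(time, times))
        for name, (mastery, time) in candidates.items()
    }
    return max(scores, key=lambda name: scores[name])
```

In this sketch the two weights play the role of the user-selectable weight modes: raising `w_time` favors faster paths, while raising `w_mastery` favors deeper mastery.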

[0089] The aforementioned AI-based personalized learning path planning method and system for vocational skills collects users' structured event datasets and uses a two-stage regression algorithm to mine the causal effect strength between skill points, thereby constructing a dynamic causal skill graph. This addresses the technical limitation of traditional methods, which can only discover statistical correlations and cannot reveal the causal logic between skills. The method combines a skill mastery inference model to assess the user's current mastery status and searches for paths with the goal of maximizing the cumulative causal effect, achieving a shift from passive recommendation to proactive causal guidance. Furthermore, by introducing a structural causal model for counterfactual reasoning, it predicts and evaluates the learning effects under different intervention conditions, thereby dynamically identifying key bottleneck skills and generating optimal learning path adjustment schemes. This improves the scientific rigor, accuracy, and personalized adaptability of learning path planning.

[0090] In one embodiment of the present invention, a path search is performed using the current set of mastered skill points corresponding to the skill mastery vector as the starting point, the target set of skill points corresponding to the user's career goal as the ending point, and the goal of maximizing the cumulative causal effect, to obtain a personalized causal learning path, including:

[0091] S11. Based on the dynamic causal skill graph, select any candidate path from the starting point to the ending point, traverse all directed edges on the candidate path, obtain the causal effect intensity corresponding to each directed edge, and compare the starting skill of the directed edge with all skill points in the current mastered skill point set to obtain the comparison result.

[0092] For example, the learning path planning terminal can, based on a dynamic causal skill graph, take the current set of mastered skill points corresponding to the skill mastery vector as the starting point and the target set of skill points corresponding to the user's career goal as the ending point. It can randomly select any candidate path from all possible paths from the starting point to the ending point. All directed edges between adjacent skill points on the candidate path must have positive causal relationships, and invalid paths containing negative causal effects are eliminated. The learning path planning terminal traverses all directed edges on the candidate path, calls the interface of the dynamic causal skill graph through a specified query language, and obtains the causal effect strength corresponding to each directed edge. The learning path planning terminal compares the starting skill point of each directed edge with all skill points in the current set of mastered skill points, determines whether the skill point belongs to the current set of mastered skill points, and determines whether the user's mastery of the skill point is greater than a preset mastery threshold. The comparison results of the starting skill point of each directed edge are recorded one by one to form a complete comparison result.

[0093] S12. Based on the comparison results, the cumulative causal effect strength is adjusted by weighting the cumulative causal effect value function to obtain the cumulative causal effect value of the candidate path.

[0094] Specifically, the expression for the cumulative causal effect value function is:

[0095]

[0096] where the terms are: the cumulative causal effect value of the candidate path (the larger the value, the better the overall effect of the candidate path); the causal effect strength of one skill point on the next skill point along the path; an indicator function that, based on the comparison results, takes the value 1 if the user's mastery of the skill point is greater than the mastery threshold and 0 otherwise; the updated personalized gain parameter; and the user's mastery of the skill point.

[0097] For example, the learning path planning terminal can, based on the comparison results, apply a personalized weighted adjustment to the causal effect strength of each directed edge on the candidate path using the cumulative causal effect value function, and then sum all the weighted causal effect strengths to obtain the cumulative causal effect value of the candidate path.
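
As a minimal sketch of S12, the function below sums personalized edge weights along one candidate path. The way the indicator, the gain parameter, and the mastery are combined is an assumption, since the expression itself is not reproduced here: each edge strength is multiplied by one plus gain times indicator times the mastery of the edge's starting skill. The threshold of 0.6, the gain of 0.5, and all names are hypothetical.

```python
def cumulative_causal_effect(
    path: list[str],
    strength: dict[tuple[str, str], float],
    mastery: dict[str, float],
    threshold: float = 0.6,
    gain: float = 0.5,
) -> float:
    """Sum the personalized causal effect strengths along a candidate path.

    Assumed weighting per directed edge (src -> dst):
        strength * (1 + gain * indicator * mastery[src])
    where indicator is 1 if mastery[src] > threshold, else 0.
    """
    total = 0.0
    for src, dst in zip(path, path[1:]):
        edge_strength = strength[(src, dst)]
        src_mastery = mastery.get(src, 0.0)
        indicator = 1.0 if src_mastery > threshold else 0.0
        total += edge_strength * (1.0 + gain * indicator * src_mastery)
    return total
```

Under this assumed form, an edge that starts from a skill the user has already mastered contributes more than the same edge starting from an unmastered skill, which is how the comparison results personalize the path score.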

[0098] S13. Perform a heuristic search on the dynamic causal skill graph to obtain the best candidate path corresponding to the cumulative causal effect value with the largest value, and use the best candidate path as the personalized causal learning path.

[0099] For example, the learning path planning terminal can employ a heuristic search algorithm in linkage with the ... algorithm to search the candidate paths from the starting point to the ending point in the dynamic causal skill graph. During the search, the terminal can store candidate paths still to be searched in an open list and candidate paths that have already been calculated in a closed list, and record the cumulative causal effect value corresponding to each candidate path in real time. The terminal can then sort the cumulative causal effect values of all candidate paths and select the best candidate path, which corresponds to the largest cumulative causal effect value. Finally, the terminal can perform a secondary verification on the best candidate path to confirm that all directed edges on the path are positive causal relationships.
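
The open-list and closed-list search of S13 can be sketched as follows. This is a simplified best-first enumeration of loop-free paths with a length cap, not the specific algorithm of the embodiment; the graph layout (a dictionary of successor strengths), the unweighted edge sum, the default length cap, and all names are assumptions made for illustration.

```python
import heapq
from typing import Optional


def search_best_path(
    graph: dict[str, dict[str, float]],
    starts: list[str],
    targets: set[str],
    max_len: int = 6,
) -> Optional[tuple[float, list[str]]]:
    """Return (cumulative effect, path) for the best loop-free path found.

    graph maps a skill to {successor skill: causal effect strength}.
    """
    # Open list: partial paths still to expand, highest value first
    # (heapq is a min-heap, so the accumulated value is stored negated).
    open_list = [(0.0, [start]) for start in starts]
    heapq.heapify(open_list)
    # Closed list: complete paths whose value has been calculated.
    closed_list: list[tuple[float, list[str]]] = []

    while open_list:
        neg_value, path = heapq.heappop(open_list)
        node = path[-1]
        if node in targets:
            closed_list.append((-neg_value, path))
            continue
        if len(path) >= max_len:
            continue  # path length threshold: drop overly long paths
        for successor, strength in graph.get(node, {}).items():
            # Skip negative causal effects and nodes already on the path.
            if strength <= 0 or successor in path:
                continue
            heapq.heappush(open_list, (neg_value - strength, path + [successor]))

    if not closed_list:
        return None
    return max(closed_list, key=lambda item: item[0])
```

Because only positive edges are ever expanded, the secondary verification that every edge on the returned path is a positive causal relationship holds by construction in this sketch.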

[0100] This application provides an AI-based method for personalized vocational skills learning path planning. By combining a dynamic causal skill graph with the user's current skill mastery, and through candidate path screening, personalized weighted adjustment of causal effects, and heuristic search, it achieves personalized causal learning path screening that maximizes cumulative causal effects. This effectively improves the personalization and search efficiency of the path, ensuring that the path aligns with the user's skill base and conforms to the causal logic between skills, providing a reliable foundation for subsequent counterfactual reasoning and optimal path adjustment.

[0101] Based on the above embodiments, a heuristic search is performed on the dynamic causal skill graph to obtain the optimal candidate path corresponding to the largest cumulative causal effect value. After using the optimal candidate path as the personalized causal learning path, the method further includes:

[0102] S21. Record the actual time it takes for the user to master the completed skill points during the learning process, compare the actual mastery time with the expected learning time of the corresponding skill points in the personalized causal learning path, and obtain the user's learning efficiency coefficient.

[0103] Specifically, the actual mastery time can be the actual time a user spends from starting to learn a skill point to passing the platform assessment and reaching the preset mastery threshold, accurate to the minute and linked to the time data of the skill point identifier.

[0104] For example, the learning path planning terminal can record in real time the actual mastery time of each completed skill point while the user follows the personalized causal learning path, and can obtain the expected learning time of the corresponding skill points in that path. The learning path planning terminal can then compare the actual mastery time with the expected learning time for each completed skill point to obtain the user's learning efficiency coefficient. The calculation expression involves the expected learning time of the corresponding skill point in the personalized causal learning path, the actual time the user takes to master the skill point, and the user learning efficiency coefficient. Depending on the value of the coefficient, the user's actual learning efficiency is higher than expected, lower than expected, or consistent with expectations.

[0105] S22. Calculate the user's transfer learning ability representation value by performing an exponentially weighted moving average based on the learning efficiency coefficient.

[0106] Specifically, the transfer learning ability representation value can be used to quantify a user's ability to transfer existing skills to the learning of new skills. The range of values for the transfer learning ability representation value can be matched with that of the learning efficiency coefficient. The larger the value, the stronger the user's skill transfer ability, the better the adaptability to learning new skills, and the faster the user can improve the efficiency of mastering new skills by leveraging existing skill foundations.

[0107] For example, the learning path planning terminal can select the learning efficiency coefficients corresponding to a preset number of skill points recently completed by the user, and use an exponential weighted moving average algorithm to smooth them out in order to eliminate the random fluctuations in the learning efficiency of a single skill point, highlight the long-term trend of the user's learning efficiency, and calculate the user's transfer learning ability representation value.
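
Steps S21 and S22 can be sketched as below. The ratio of expected time to actual time is an assumption about the efficiency coefficient, chosen so that values above 1 mean the user learned faster than expected; the smoothing factor, the window size, and the function names are also hypothetical.

```python
from typing import Sequence


def efficiency_coefficient(expected_minutes: float, actual_minutes: float) -> float:
    """Assumed form: expected time divided by actual mastery time."""
    return expected_minutes / actual_minutes


def transfer_ability(
    coefficients: Sequence[float], alpha: float = 0.3, window: int = 5
) -> float:
    """Exponentially weighted moving average over the most recent coefficients.

    coefficients are ordered oldest to newest; alpha is the weight given to
    each newer observation, so recent skill points count more.
    """
    recent = list(coefficients)[-window:]
    ewma = recent[0]
    for coefficient in recent[1:]:
        ewma = alpha * coefficient + (1.0 - alpha) * ewma
    return ewma
```

A single unusually slow or fast skill point moves the result only by a fraction `alpha` of its deviation, which is the smoothing effect the paragraph above describes.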

[0108] S23. Input the transfer learning ability representation value into the gain parameter mapping function to calculate the updated personalized gain parameter.

[0109] Specifically, the expression for the gain parameter mapping function is:

[0110]

[0111] where the terms are: the updated personalized gain parameter; the personalized gain parameter before the update; the transfer learning ability representation value; and the normalized reference constant.

[0112] For example, the learning path planning terminal can input the transfer learning ability representation value into the gain parameter mapping function, and dynamically adjust the personalized gain parameter through the gain parameter mapping function to obtain the updated personalized gain parameter, which provides support for subsequent learning path iterative optimization and causal effect strength weighted calculation.

[0113] This application provides a personalized learning path planning method for vocational skills based on artificial intelligence. By comparing the actual learning achievement time with the expected learning time, the learning efficiency coefficient is calculated. The transfer learning ability representation value is obtained by exponential weighted moving average smoothing, thereby driving the dynamic adjustment of personalized gain parameters. This enables the gain parameters to accurately adapt to the user's current learning status and provides efficient support for iterative optimization of the learning path and weighted calculation of causal effect strength.

[0114] In one embodiment of the present invention, the skill mastery inference model is trained using the following method:

[0115] S31. Extract the historical dynamic behavior sequences of all users from the structured event dataset, and use the skill test scores corresponding to the historical dynamic behavior sequences as supervision labels. Construct a training sample set based on the historical dynamic behavior sequences and supervision labels.

[0116] For example, the learning path planning terminal can extract the historical dynamic behavior sequence of all users from the structured event dataset according to the continuous learning behavior records within a preset time window, sorted by timestamp in descending order, and use the skill test scores corresponding to the historical dynamic behavior sequence as supervision labels; the learning path planning terminal can divide the historical dynamic behavior sequence and supervision labels into training set, validation set and test set in a ratio of 7:2:1 to construct a complete training sample set.
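
The 7:2:1 division can be sketched as follows. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the text only fixes the ratio. The function name and the sample representation are hypothetical.

```python
import random
from typing import Sequence, TypeVar

Sample = TypeVar("Sample")


def split_samples(
    samples: Sequence[Sample],
    ratios: tuple[int, int, int] = (7, 2, 1),
    seed: int = 0,
) -> tuple[list[Sample], list[Sample], list[Sample]]:
    """Split (sequence, label) samples into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: shuffle whole sequences
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, validation, test
```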

[0117] S32. Construct an initial skill mastery inference model, which includes an embedding layer, a multi-head attention layer, a bidirectional long short-term memory network layer, and a fully connected output layer.

[0118] Specifically, the initial skill mastery inference model can include an embedding layer, a multi-head attention layer, a bidirectional long short-term memory network layer, and a fully connected output layer. These layers work together to extract features from historical dynamic behavior sequences and predict skill mastery. The embedding layer maps discrete behavior features into dense vectors, the multi-head attention layer mines the differences in importance between different behaviors in the behavior sequence, the bidirectional long short-term memory network layer captures the temporal dependencies of the behavior sequence, and the fully connected output layer outputs the predicted skill mastery value.

[0119] S33. Input the historical dynamic behavior sequence into the embedding layer, and map the discrete behavior types into dense vectors through the embedding layer to obtain the behavior embedding vector sequence.

[0120] For example, the learning path planning terminal can input historical dynamic behavior sequences from the training sample set, in batches, into the embedding layer of the initial skill mastery inference model. For the discrete behavior types in each historical dynamic behavior sequence, the embedding layer can use one-hot encoding combined with pre-trained embedding vectors to map them into dense vectors of fixed dimension. At the same time, the continuous features in the sequence are standardized and incorporated into the vectors, yielding a behavior embedding vector sequence of consistent length for each historical dynamic behavior sequence.

[0121] S34. Input the behavior embedding vector sequence into the multi-head attention layer, calculate the importance weight of different behaviors in the behavior embedding vector sequence to the mastery through the multi-head attention layer, and perform weighted aggregation of the behavior embedding vector sequence according to the importance weight to obtain the weighted behavior representation vector.

[0122] For example, the learning path planning terminal can input the behavior embedding vector sequence output by the embedding layer into the multi-head attention layer. The multi-head attention layer can calculate the importance weight of each behavior vector in the sequence relative to all other behavior vectors. The learning path planning terminal can normalize the importance weights using the softmax function, and then aggregate the behavior embedding vectors according to the normalized importance weights to obtain a weighted behavior representation vector that takes into account the importance of each behavior.
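
The weighting and aggregation in S34 can be illustrated with a deliberately simplified sketch: a single attention head with scaled dot-product scores and no learned projection matrices. A real multi-head layer would add learned query, key, and value projections for each head; those are omitted here, and all names are hypothetical.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax: weights are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(embeddings: list[list[float]]) -> list[list[float]]:
    """Single-head self-attention over a behavior embedding sequence.

    For each position, score it against every position, normalize the
    scores with softmax, and return the weighted sum of all embeddings.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in embeddings:
        weights = softmax([dot(query, key) / scale for key in embeddings])
        outputs.append([
            sum(weight * vector[d] for weight, vector in zip(weights, embeddings))
            for d in range(dim)
        ])
    return outputs
```

The output keeps one vector per time step, so it can be passed on to a recurrent layer that expects a sequence.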

[0123] S35. Input the weighted behavior representation vector into the bidirectional long short-term memory network layer. The bidirectional long short-term memory network layer captures the forward and backward dependencies of the weighted behavior representation vector to obtain the hidden state sequence.

[0124] For example, the learning path planning terminal can input the weighted behavior representation vector into the bidirectional long short-term memory network layer, capture the forward temporal dependencies of the weighted behavior representation vector through the forward LSTM in the bidirectional long short-term memory network layer, and capture the backward temporal dependencies through the backward LSTM in the bidirectional long short-term memory network layer to obtain the hidden state sequence.

[0125] S36. Input the hidden state of the last time step in the hidden state sequence into the fully connected output layer. The fully connected output layer performs feature mapping on this hidden state to obtain the predicted value of skill mastery.

[0126] For example, the learning path planning terminal can extract the hidden state of the last time step in the hidden state sequence and input it into the fully connected output layer. The fully connected output layer maps the hidden state to a skill mastery prediction value in the interval [0,1] through two linear transformations and the Sigmoid function.

[0127] S37. Input the predicted value of skill mastery and the supervision label into the loss function to calculate the prediction error.

[0128] For example, the learning path planning terminal can input the predicted skill mastery value obtained from the fully connected output layer and the supervised labels in the training sample set into the loss function to calculate the prediction error between the two, where the expression of the loss function can be:

[0129]

[0130] where the terms are: the single prediction error of a sample; the number of samples in the current training batch; the supervision label of each sample; and the predicted skill mastery value of each sample. The learning path planning terminal can calculate the average of the single prediction errors of all samples in the batch to obtain the prediction error.

[0131] S38. Based on the prediction error, update the network parameters of the inference model using the backpropagation algorithm to obtain the updated model network parameters.

[0132] For example, the learning path planning terminal can use the backpropagation algorithm to update the network parameters of the bidirectional long short-term memory network layer, the multi-head attention layer, and the embedding layer in reverse order, starting from the fully connected output layer, based on the prediction error of the current training batch. During the update process, the learning path planning terminal can use the Adam optimizer and adaptively adjust the learning rate to accelerate the model convergence speed while avoiding model overfitting, thus obtaining the updated model network parameters.

[0133] S39. Select different samples from the training sample set and repeat steps S33 to S38 to perform multiple rounds of iterative training on the initial skill mastery inference model until the prediction error is less than the preset error threshold or the number of iterations reaches the preset iteration threshold.

[0134] For example, the learning path planning terminal can randomly select different batches of training samples from the training sample set and repeatedly execute steps S33 to S38 to perform multiple rounds of iterative training on the initial skill mastery inference model. After each round of training, the learning path planning terminal can use a validation set to verify the performance of the skill mastery inference model. By calculating the prediction error of the validation set, the convergence status of the skill mastery inference model is recorded. The learning path planning terminal can compare the calculated prediction error of the validation set with a preset error threshold. When the prediction error is less than the error threshold or the number of iterations reaches the preset iteration threshold, the learning path planning terminal can stop the iterative training of the skill mastery inference model.
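
The stopping rule of S39 can be sketched independently of the model. The two callbacks stand in for one round of training (steps S33 to S38) and for the validation-set error calculation; both callbacks, the default thresholds, and the function name are hypothetical placeholders.

```python
from typing import Callable


def train_until_converged(
    train_step: Callable[[], None],
    validate: Callable[[], float],
    error_threshold: float = 0.01,
    max_iterations: int = 100,
) -> tuple[int, list[float]]:
    """Repeat training rounds until the validation error is small enough
    or the iteration cap is reached.

    Returns the number of rounds run and the recorded validation errors.
    """
    history: list[float] = []
    iterations = 0
    while iterations < max_iterations:
        train_step()                  # one round over a randomly chosen batch
        iterations += 1
        validation_error = validate() # prediction error on the validation set
        history.append(validation_error)
        if validation_error < error_threshold:
            break                     # converged: stop iterative training
    return iterations, history
```

The recorded history corresponds to the convergence status that the terminal tracks after each round.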

[0135] S310. Based on the updated model network parameters obtained in the last iteration, update the initial skill mastery inference model to obtain the skill mastery inference model.

[0136] For example, the learning path planning terminal can extract the updated model network parameters obtained from the last iteration of training, and replace the original network parameters of the initial skill mastery inference model with the updated model network parameters to obtain the skill mastery inference model.

[0137] This application provides an AI-based method for personalized vocational skills learning path planning. By constructing an adapted training sample set, an initial model containing an embedding layer, a multi-head attention layer, and a bidirectional long short-term memory network layer is built. After multiple rounds of iterative training, error optimization, and performance verification, a high-precision skill mastery inference model is obtained. This model can accurately capture the temporal characteristics of user learning behavior and accurately infer the mastery of each skill point, providing reliable model and data support for subsequent personalized learning path planning and related steps.

[0138] In one embodiment of the present invention, based on a dynamic causal skill graph, personalized causal learning paths are input into a structural causal model for counterfactual inference to predict counterfactual mastery vectors, expected counterfactual learning times, and counterfactual paths under different intervention conditions, including:

[0139] S41. Based on the skill nodes in the personalized causal learning path, generate different intervention actions for different skill nodes through a preset set of intervention action classifications, and generate a set of candidate intervention actions.

[0140] For example, the learning path planning terminal can extract the data attributes of all skill nodes on a personalized causal learning path, combine them with a preset set of intervention action categories, and generate exclusive and suitable intervention actions for skill nodes with different attributes through feature cosine similarity matching. Nodes with weak points in the user's mastery are matched with targeted interventions to fill the gaps, while nodes with strong causal relationships are matched with linked interventions, which are then integrated to form a set of candidate intervention actions.
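
The cosine similarity matching in S41 can be sketched as follows. The two-dimensional feature layout (mastery gap, causal strength) and the action names in the usage note are hypothetical; the text only states that node attributes are matched to intervention categories by feature cosine similarity.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match_interventions(
    node_features: dict[str, list[float]],
    action_features: dict[str, list[float]],
) -> dict[str, str]:
    """Assign each skill node the intervention action with the most similar features."""
    return {
        node: max(
            action_features,
            key=lambda action: cosine_similarity(features, action_features[action]),
        )
        for node, features in node_features.items()
    }
```

For instance, with node features `{"sql": [0.9, 0.1], "bi": [0.1, 0.9]}` and action features `{"remedial_practice": [1.0, 0.0], "linked_project": [0.0, 1.0]}`, the weakly mastered node is paired with the gap-filling action and the strongly causal node with the linked action.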

[0141] S42. The variational Bayesian inference algorithm is used to calculate the posterior probability distribution of the latent exogenous variables that affect the user's learning performance on the user's structured event dataset.

[0142] For example, the learning path planning terminal can use a variational Bayesian inference algorithm, taking the temporal features of learning behavior and the fluctuation data of skill mastery in the user's structured event dataset as observed variables. The learning path planning terminal can set the prior distribution of the latent exogenous variables to a normal distribution with a mean of 0 and a variance of 1, and iteratively optimize the variational distribution parameters by maximizing the evidence lower bound (ELBO) objective function with an adaptive gradient descent method. During the iteration process, the user learning efficiency coefficient is introduced to dynamically adjust the convergence threshold, so as to obtain a posterior probability distribution of the latent exogenous variables that fits the individual characteristics of the user.

[0143] S43. Select a candidate intervention action from the candidate intervention action set as the current intervention action, modify the structural causal model according to the current intervention action, cut off all input edges of the skill node corresponding to the current intervention action and assign a preset intervention value to obtain the post-intervention causal model.

[0144] For example, the learning path planning terminal can select the candidate intervention action with the highest comprehensive suitability from the set of candidate intervention actions, combined with the predicted value of the intervention effect and the user's current learning state, as the current intervention action; the learning path planning terminal can modify the structural causal model according to the current intervention action, not only cutting off all input edges of the skill node corresponding to the intervention action, but also removing interference terms in the original causal relationship of the node, and dynamically assigning a preset intervention value according to the user's historical intervention effect data, to obtain the post-intervention causal model.

[0145] S44. Substitute the posterior probability distribution into the post-intervention causal model, and calculate the counterfactual mastery prediction values of all non-intervened skill nodes by forward propagation through the linear regression equations in the post-intervention causal model. Combine the counterfactual mastery prediction values of all skill nodes into a counterfactual mastery vector.

[0146] For example, the learning path planning terminal can substitute the posterior probability distribution of the latent exogenous variables into the post-intervention causal model, and calculate the counterfactual mastery prediction values of all non-intervened skill nodes by forward propagation through the linear regression equations within the model. The learning path planning terminal can introduce the skill mastery vector output by the skill mastery inference model for calibration, correct the prediction bias, and sort and combine the values into a counterfactual mastery vector according to the skill node identifier.
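
Steps S43 and S44 can be illustrated with a minimal linear structural causal model. The sketch assumes that each skill's mastery equals an intercept plus the coefficient-weighted mastery of its parents plus an exogenous term, that a point estimate (for example the posterior mean) stands in for the full posterior distribution, and that values are clipped to the interval [0, 1]. The calibration against the inference model's output is omitted, and all names are hypothetical.

```python
def counterfactual_mastery(
    order: list[str],
    parents: dict[str, dict[str, float]],
    intercept: dict[str, float],
    exogenous: dict[str, float],
    intervention: dict[str, float],
) -> dict[str, float]:
    """Forward-propagate mastery through a linear causal model after do().

    order        : skill nodes in topological order (parents before children)
    parents      : node -> {parent node: causal effect strength coefficient}
    intercept    : node -> intercept term fitted from historical data
    exogenous    : node -> point estimate of the latent exogenous variable
    intervention : node -> preset intervention value for intervened nodes
    """
    mastery: dict[str, float] = {}
    for node in order:
        if node in intervention:
            # do(node = value): incoming edges are cut, parents are ignored.
            mastery[node] = intervention[node]
            continue
        value = intercept.get(node, 0.0) + exogenous.get(node, 0.0)
        for parent, coefficient in parents.get(node, {}).items():
            value += coefficient * mastery[parent]
        mastery[node] = min(1.0, max(0.0, value))  # assumed clipping to [0, 1]
    return mastery
```

Calling the function once with an empty `intervention` and once with, say, `{"b": 1.0}` shows how raising one skill propagates to its descendants while leaving its ancestors unchanged.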

[0147] S45. Using the skill points in the counterfactual mastery vector that exceed the preset mastery threshold as the counterfactual starting point and the target skill point set corresponding to the user's career goal as the endpoint, perform path search calculation on the dynamic causal skill graph corresponding to the post-intervention causal model with the goal of maximizing the cumulative causal effect to obtain the counterfactual path.

[0148] For example, the learning path planning terminal can use the skill points in the counterfactual mastery vector that exceed a preset mastery threshold as the counterfactual starting point, and the target skill point set corresponding to the user's career goal as the ending point. On the dynamic causal skill graph corresponding to the post-intervention causal model, an improved search algorithm can be adopted to search for the path that maximizes the cumulative causal effect, while filtering out candidate paths whose length exceeds a preset range or whose causal effect fluctuates too much, to obtain the optimal counterfactual path.
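The search objective can be illustrated on a small graph. The sketch below is not the improved algorithm mentioned in the text; it uses plain depth-first enumeration of simple paths, a hypothetical `max_len` bound standing in for the preset path-length range, and made-up skill names and edge strengths.

```python
def best_causal_path(graph, starts, targets, max_len=4):
    """Return (path, total) maximizing the summed causal effect strength.

    graph: dict mapping node -> {successor: causal effect strength}.
    Paths start at a node in `starts`, end at a node in `targets`,
    and contain at most `max_len` edges.
    """
    best_path, best_total = None, float("-inf")

    def dfs(node, path, total):
        nonlocal best_path, best_total
        if node in targets and len(path) > 1 and total > best_total:
            best_path, best_total = list(path), total
        if len(path) - 1 >= max_len:          # path-length filter
            return
        for nxt, w in graph.get(node, {}).items():
            if nxt not in path:               # keep paths simple (no revisits)
                path.append(nxt)
                dfs(nxt, path, total + w)
                path.pop()

    for s in starts:
        dfs(s, [s], 0.0)
    return best_path, best_total

# Hypothetical skill graph with illustrative causal effect strengths
graph = {
    "excel": {"sql": 0.6, "reporting": 0.3},
    "sql": {"reporting": 0.5, "bi": 0.7},
    "reporting": {"bi": 0.4},
}
path, total = best_causal_path(graph, starts={"excel"}, targets={"bi"})
short_path, short_total = best_causal_path(graph, {"excel"}, {"bi"}, max_len=2)
```

With the default bound the longest route wins because it accumulates the most causal effect; tightening `max_len` to 2 excludes it and the search falls back to the two-edge route.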

[0149] S46. Estimate the time required to learn along the counterfactual path using the learning time prediction function to obtain the counterfactual expected learning time.

[0150] For example, the learning path planning terminal can substitute the counterfactual path into a preset learning time prediction function to calculate the counterfactual expected learning time, wherein the expression of the learning time prediction function can be:

[0151]

[0152] where the terms of the expression denote, in order: the counterfactual expected learning time; the basic learning time of the i-th skill node; the duration gain coefficient of the intervention action on the i-th skill node; the user's transfer learning ability representation value; and a correction value for the deviation in the user's historical learning time.
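Since the expression itself is not reproduced here, the following is only an illustration of how the listed quantities could interact, not the function used in this application. It assumes, hypothetically, that each node's basic learning time is reduced by the intervention's duration gain coefficient, that the sum is divided by the transfer learning ability value, and that the historical deviation correction is added at the end.

```python
def counterfactual_learning_time(base_times, gain_coeffs, transfer_ability, correction):
    """Hypothetical form: sum_i t_i * (1 - g_i) / a + c.

    base_times:       basic learning time of each skill node on the path
    gain_coeffs:      duration gain coefficient of the intervention per node
    transfer_ability: user's transfer learning ability representation value
    correction:       correction for the user's historical learning-time deviation
    The functional form is an assumption made for illustration only.
    """
    if transfer_ability <= 0:
        raise ValueError("transfer_ability must be positive")
    adjusted = sum(t * (1.0 - g) for t, g in zip(base_times, gain_coeffs))
    return adjusted / transfer_ability + correction

hours = counterfactual_learning_time(
    base_times=[10.0, 20.0],   # illustrative values, in hours
    gain_coeffs=[0.0, 0.5],    # the intervention halves the time of the second node
    transfer_ability=1.25,
    correction=2.0,
)
```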

[0153] This application provides an AI-based method for planning personalized learning paths for vocational skills. By generating differentiated candidate intervention actions tailored to the user's skill nodes, and by using variational Bayesian inference to obtain the posterior distribution of the latent exogenous variables, the method reconstructs the post-intervention causal model and predicts the counterfactual mastery vector, the counterfactual path, and the counterfactual expected learning time. This enables efficient counterfactual reasoning under personalized interventions and provides data support and a logical basis for the subsequent adjustment of the optimal learning path.

[0154] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0155] Based on the same inventive concept, this application also provides an AI-based personalized learning path planning system for vocational skills, used to implement the AI-based personalized learning path planning method for vocational skills described above. The solution provided by this system is similar to the implementation scheme described in the above method. Therefore, the specific limitations of one or more AI-based personalized learning path planning system embodiments provided below can be found in the limitations of the AI-based personalized learning path planning method for vocational skills described above, and will not be repeated here.

[0156] In one exemplary embodiment, as shown in Figure 2, an artificial intelligence-based personalized learning path planning system 500 for vocational skills is provided to implement the methods in the above-described method embodiments, including:

[0157] The data acquisition and regression module 501 is used to collect the user's structured event dataset; based on the structured event dataset, it extracts instrumental variables and user covariates, and uses them as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point;

[0158] The dynamic causal graph construction module 502 is used to perform a second-stage linear regression based on the instrumental-variable-predicted mastery levels, calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength. The dynamic causal skill graph has skill points as nodes and causal effect strengths as directed edges.

[0159] The skill mastery inference module 503 is used to extract the dynamic behavior sequence of users in the structured event dataset and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the skill mastery vector of users at each skill point.

[0160] The causal path search module 504 is used to perform a path search with the current mastered skill point set corresponding to the skill mastery vector as the starting point, the target skill point set corresponding to the user's career goal as the ending point, and maximization of the cumulative causal effect as the objective, so as to obtain a personalized causal learning path.

[0161] The counterfactual reasoning module 505 is used to input personalized causal learning paths into the structural causal model for counterfactual reasoning based on the dynamic causal skill graph, and to predict the counterfactual mastery vector, counterfactual expected learning time and counterfactual path under different intervention conditions. The counterfactual path is the learning path that is replanned based on the dynamic causal skill graph and skill mastery inference model after applying corresponding intervention conditions to the personalized causal learning path.

[0162] The benefit assessment and scheme generation module 506 is used to perform weighted fusion of the counterfactual mastery vector and the counterfactual expected learning time to obtain the intervention causal benefit value, and the counterfactual path corresponding to the highest intervention causal benefit value is taken as the optimal learning path adjustment scheme.
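The two-stage regression carried out by modules 501 and 502 corresponds to two-stage least squares. The following is a minimal sketch on synthetic data for a single pair of skill points, assuming one instrument `z` (for example, an externally assigned course recommendation), one covariate `c`, and an unobserved confounder `h`. All variable names and the data-generating process are illustrative and do not come from this application.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)     # instrument: influences skill A only
c = rng.normal(size=n)     # observed user covariate
h = rng.normal(size=n)     # unobserved confounder of A and B
a = 0.8 * z + 0.5 * c + h + rng.normal(size=n)   # mastery of skill A
b = 0.6 * a + 0.3 * c - h + rng.normal(size=n)   # mastery of skill B (true effect 0.6)

def ols(X, y):
    """Least-squares coefficients of y on X (X already contains an intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)

# Stage 1: regress mastery of A on the instrument and the covariate.
X1 = np.column_stack([ones, z, c])
a_hat = X1 @ ols(X1, a)    # instrumental-variable-predicted mastery of A

# Stage 2: regress mastery of B on the predicted mastery of A and the covariate.
X2 = np.column_stack([ones, a_hat, c])
effect_iv = ols(X2, b)[1]  # estimated causal effect strength of A -> B

# Ordinary regression for comparison; it is biased by the confounder h.
effect_naive = ols(np.column_stack([ones, a, c]), b)[1]
```

In this synthetic setup the two-stage estimate lands close to the true effect of 0.6, whereas the ordinary regression is pulled away from it by the confounder. Repeating the procedure for each ordered pair of skill points would yield the edge weights of the graph.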

[0163] In one embodiment, the causal path search module 504 can also be used for:

[0164] S11. Based on the dynamic causal skill graph, select any candidate path from the starting point to the ending point, traverse all directed edges on the candidate path, obtain the causal effect strength corresponding to each directed edge, and compare the starting skill point of each directed edge with the skill points in the current mastered skill point set to obtain a comparison result.

[0165] S12. Based on the comparison result, weight and adjust the causal effect strengths using the cumulative causal effect value function to obtain the cumulative causal effect value of the candidate path.

[0166] S13. Perform a heuristic search on the dynamic causal skill graph to obtain the candidate path with the largest cumulative causal effect value, and use this best candidate path as the personalized causal learning path.
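Steps S11 to S13 can be sketched as follows. The weighting used here is an assumption made for illustration: an edge's strength is boosted by `gain * mastery` of its starting skill when that skill is already mastered. It should not be read as the cumulative causal effect value function of this application. For brevity the sketch scores a given list of candidate paths instead of running a heuristic search.

```python
def cumulative_causal_effect(path, strength, mastery, gain=0.5, threshold=0.6):
    """Score a candidate path (assumed weighting, for illustration only).

    strength: dict {(i, j): causal effect strength of the edge i -> j}
    mastery:  dict {skill: user's mastery level in [0, 1]}
    """
    total = 0.0
    for i, j in zip(path, path[1:]):
        m = mastery.get(i, 0.0)
        mastered = 1.0 if m > threshold else 0.0   # indicator from the comparison result
        total += strength[(i, j)] * (1.0 + gain * mastered * m)
    return total

# Hypothetical edges and mastery levels
strength = {("a", "b"): 0.6, ("b", "d"): 0.7, ("a", "c"): 0.9, ("c", "d"): 0.5}
mastery = {"a": 0.8, "b": 0.7, "c": 0.2}
candidates = [["a", "b", "d"], ["a", "c", "d"]]

scores = {tuple(p): cumulative_causal_effect(p, strength, mastery) for p in candidates}
best = max(candidates, key=lambda p: scores[tuple(p)])
```

In this example the unweighted edge sums favor the route through `c`, but the mastery-based weighting favors the route through `b`, which the user has already mastered.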

[0167] In one embodiment, the causal path search module 504 can also be used for:

[0168] S21. Record the actual mastery time of the completed skill points during the learning process, compare the actual mastery time with the expected learning time of the corresponding skill points in the personalized causal learning path, and obtain the user's learning efficiency coefficient.

[0169] S22. Calculate the user's transfer learning ability representation value by performing an exponentially weighted moving average based on the learning efficiency coefficient.

[0170] S23. Input the transfer learning ability representation value into the gain parameter mapping function to calculate the updated personalized gain parameter.
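Steps S21 to S23 can be sketched as follows. Three choices are assumptions made for illustration: the efficiency coefficient is taken as expected time divided by actual time, the moving average uses a smoothing factor `alpha`, and the gain parameter mapping is a simple scaling by the ability value relative to a reference constant.

```python
def efficiency_coefficients(expected, actual):
    """Assumed definition: expected / actual time (above 1 means faster than planned)."""
    return [e / a for e, a in zip(expected, actual)]

def ewma(values, alpha=0.5):
    """Exponentially weighted moving average, seeded with the first value."""
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1.0 - alpha) * avg
    return avg

def updated_gain(gain, transfer_ability, reference=1.0):
    """Hypothetical mapping: scale the gain by ability relative to a reference constant."""
    return gain * transfer_ability / reference

# Illustrative learning times in hours for three completed skill points
coeffs = efficiency_coefficients(expected=[10.0, 8.0, 6.0], actual=[10.0, 4.0, 6.0])
ability = ewma(coeffs)                 # transfer learning ability representation value
new_gain = updated_gain(0.4, ability)  # updated personalized gain parameter
```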

[0171] In one embodiment, the skill mastery inference model is trained using the following method:

[0172] S31. Extract the historical dynamic behavior sequences of all users from the structured event dataset, and use the skill test scores corresponding to the historical dynamic behavior sequences as supervision labels. Construct a training sample set based on the historical dynamic behavior sequences and supervision labels.

[0173] S32. Construct an initial skill mastery inference model, which includes an embedding layer, a multi-head attention layer, a bidirectional long short-term memory network layer, and a fully connected output layer.

[0174] S33. Input the historical dynamic behavior sequence into the embedding layer, and map the discrete behavior types into dense vectors through the embedding layer to obtain the behavior embedding vector sequence.

[0175] S34. Input the behavior embedding vector sequence into the multi-head attention layer, calculate the importance weight of different behaviors in the behavior embedding vector sequence to the mastery through the multi-head attention layer, and perform weighted aggregation of the behavior embedding vector sequence according to the importance weight to obtain the weighted behavior representation vector.

[0176] S35. Input the weighted behavior representation vector into the bidirectional long short-term memory network layer, and capture the forward and backward dependencies of the weighted behavior representation vector through the bidirectional long short-term memory network layer to obtain the hidden state sequence.

[0177] S36. Input the hidden state of the last time step in the hidden state sequence into the fully connected output layer, and perform feature mapping on the hidden state of the last time step through the fully connected output layer to obtain the predicted value of skill mastery.

[0178] S37. Input the predicted value of skill mastery and the supervision label into the loss function to calculate the prediction error;

[0179] S38. Based on the prediction error, update the network parameters of the inference model using the backpropagation algorithm to obtain the updated model network parameters;

[0180] S39. Select different samples from the training sample set and repeat steps S33 to S38 to perform multiple rounds of iterative training on the initial skill mastery inference model until the prediction error is less than the preset error threshold or the number of iterations reaches the preset iteration threshold.

[0181] S310. Based on the updated model network parameters obtained in the last iteration, update the initial skill mastery inference model to obtain the skill mastery inference model.
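The layer stack and training loop of steps S32 to S39 can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the model of this application: the layer sizes, the sigmoid output (which presumes test scores normalized to [0, 1]), the mean squared error loss, the Adam optimizer, and the synthetic data are all choices made here. For brevity it trains on the full sample set at every iteration instead of selecting different samples.

```python
import torch
from torch import nn

class SkillMasteryModel(nn.Module):
    def __init__(self, n_behaviors, n_skills, dim=32, heads=4, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(n_behaviors, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_skills)

    def forward(self, behaviors):                  # behaviors: (batch, time) int64
        x = self.embed(behaviors)                  # S33: (batch, time, dim)
        x, _ = self.attn(x, x, x)                  # S34: attention-weighted representation
        h, _ = self.lstm(x)                        # S35: (batch, time, 2 * hidden)
        return torch.sigmoid(self.out(h[:, -1]))   # S36: last time step -> mastery

torch.manual_seed(0)
model = SkillMasteryModel(n_behaviors=6, n_skills=3)
seqs = torch.randint(0, 6, (64, 10))               # synthetic behavior sequences
labels = torch.rand(64, 3)                         # synthetic scores in [0, 1]
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for step in range(200):                            # S39: iteration threshold
    loss = loss_fn(model(seqs), labels)            # S37: prediction error
    opt.zero_grad()
    loss.backward()                                # S38: backpropagation
    opt.step()
    losses.append(loss.item())
    if losses[-1] < 1e-3:                          # S39: error threshold
        break
```

After training, `model(seqs)` returns one mastery value per skill point for each user sequence, which plays the role of the skill mastery vector.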

[0182] In one embodiment, the counterfactual reasoning module 505 can also be used for:

[0183] S41. Based on the skill nodes in the personalized causal learning path, generate different intervention actions for different skill nodes through a preset set of intervention action classifications, and generate a set of candidate intervention actions.

[0184] S42. Apply the variational Bayesian inference algorithm to the user's structured event dataset to obtain the posterior probability distribution of the latent exogenous variables affecting the user's learning performance.

[0185] S43. Select a candidate intervention action from the candidate intervention action set as the current intervention action, modify the structural causal model according to the current intervention action, cut off all input edges of the skill node corresponding to the current intervention action and assign a preset intervention value to obtain the post-intervention causal model.

[0186] S44. Substitute the posterior probability distribution into the post-intervention causal model, and calculate the counterfactual mastery prediction values of all non-intervened skill nodes by forward propagation through the linear regression equations in the post-intervention causal model. Combine the counterfactual mastery prediction values of all skill nodes into a counterfactual mastery vector.

[0187] S45. Using the skill points in the counterfactual mastery vector that exceed the preset mastery threshold as the counterfactual starting point and the target skill point set corresponding to the user's career goal as the endpoint, perform path search calculation on the dynamic causal skill graph corresponding to the post-intervention causal model with the goal of maximizing the cumulative causal effect to obtain the counterfactual path.

[0188] S46. Estimate the time required to learn along the counterfactual path using the learning time prediction function to obtain the counterfactual expected learning time.

[0189] In one embodiment, this application provides a computer device including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above-described method embodiments.

[0190] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps in the above method embodiments.

[0191] For the device embodiments, since they basically correspond to the method embodiments, the relevant parts can be referred to in the description of the method embodiments. The device embodiments described above are merely illustrative. The components described as separate parts may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this disclosure according to actual needs. Those skilled in the art can understand and implement this without creative effort.

[0192] The above-described embodiments are merely illustrative of several implementation methods of the embodiments of this application, and their descriptions are relatively specific and detailed. However, they should not be construed as limiting the scope of the patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the embodiments of this application, and these modifications and improvements all fall within the protection scope of the embodiments of this application.

Claims

1. An artificial intelligence-based vocational skill personalized learning path planning method, characterized in that the method includes: S1. Collect the user's structured event dataset; based on the structured event dataset, extract instrumental variables and user covariates, and use the instrumental variables and user covariates as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point; S2. Based on the instrumental-variable-predicted mastery levels, perform a second-stage linear regression to calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength, the dynamic causal skill graph having skill points as nodes and the causal effect strength as directed edges; S3. Extract the user's dynamic behavior sequence from the structured event dataset, and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the user's skill mastery vector at each skill point; S4. Perform a path search with the current mastered skill point set corresponding to the skill mastery vector as the starting point, the target skill point set corresponding to the user's career goal as the ending point, and maximization of the cumulative causal effect as the objective, to obtain a personalized causal learning path; S5. Based on the dynamic causal skill graph, input the personalized causal learning path into the structural causal model for counterfactual reasoning to predict the counterfactual mastery vector, counterfactual expected learning time, and counterfactual path under different intervention conditions, wherein the counterfactual path is the learning path replanned based on the dynamic causal skill graph and the skill mastery inference model after applying corresponding intervention conditions to the personalized causal learning path; S6. Perform weighted fusion of the counterfactual mastery vector and the counterfactual expected learning time to obtain the intervention causal benefit value, and take the counterfactual path corresponding to the highest intervention causal benefit value as the optimal learning path adjustment scheme.

2. The method of claim 1, wherein performing the path search with the current mastered skill point set corresponding to the skill mastery vector as the starting point, the target skill point set corresponding to the user's career goal as the ending point, and maximization of the cumulative causal effect as the objective, to obtain the personalized causal learning path, includes: S11. Based on the dynamic causal skill graph, select any candidate path from the starting point to the ending point, traverse all directed edges on the candidate path, obtain the causal effect strength corresponding to each directed edge, and compare the starting skill point of each directed edge with the skill points in the current mastered skill point set to obtain a comparison result; S12. Based on the comparison result, weight and adjust the causal effect strengths using a cumulative causal effect value function to obtain the cumulative causal effect value of the candidate path, wherein the expression of the cumulative causal effect value function is: ; where the terms of the expression denote, in order: the cumulative causal effect value of the candidate path; the causal effect strength from one skill point to another skill point; an indicator function whose value is 1 when the comparison result determines that the user's mastery of the skill point is greater than the mastery threshold, and 0 otherwise; the updated personalized gain parameter; and the user's mastery of the skill point; S13. Perform a heuristic search on the dynamic causal skill graph to obtain the candidate path with the largest cumulative causal effect value, and use this best candidate path as the personalized causal learning path.

3. The method of claim 2, wherein performing the heuristic search on the dynamic causal skill graph to obtain the candidate path with the largest cumulative causal effect value, and using this best candidate path as the personalized causal learning path, further includes: S21. Record the actual mastery time of the completed skill points during the learning process, and compare the actual mastery time with the expected learning time of the corresponding skill points in the personalized causal learning path to obtain the user's learning efficiency coefficient; S22. Calculate the user's transfer learning ability representation value by performing an exponentially weighted moving average based on the learning efficiency coefficient; S23. Input the transfer learning ability representation value into the gain parameter mapping function to calculate the updated personalized gain parameter, wherein the expression of the gain parameter mapping function is: ; where the terms of the expression denote, in order: the updated personalized gain parameter; the personalized gain parameter; the transfer learning ability representation value; and a normalization reference constant.

4. The method of claim 1, wherein the skill mastery inference model is trained using the following method: S31. Extract the historical dynamic behavior sequences of all users from the structured event dataset, use the skill test scores corresponding to the historical dynamic behavior sequences as supervision labels, and construct a training sample set based on the historical dynamic behavior sequences and the supervision labels; S32. Construct an initial skill mastery inference model, which includes an embedding layer, a multi-head attention layer, a bidirectional long short-term memory network layer, and a fully connected output layer; S33. Input the historical dynamic behavior sequence into the embedding layer, and map the discrete behavior types into dense vectors through the embedding layer to obtain a behavior embedding vector sequence; S34. Input the behavior embedding vector sequence into the multi-head attention layer, calculate the importance weights of different behaviors in the behavior embedding vector sequence with respect to mastery through the multi-head attention layer, and perform weighted aggregation on the behavior embedding vector sequence according to the importance weights to obtain a weighted behavior representation vector; S35. Input the weighted behavior representation vector into the bidirectional long short-term memory network layer, and capture the forward and backward dependencies of the weighted behavior representation vector through the bidirectional long short-term memory network layer to obtain a hidden state sequence; S36. Input the hidden state of the last time step in the hidden state sequence into the fully connected output layer, and perform feature mapping on the hidden state of the last time step through the fully connected output layer to obtain the skill mastery prediction value; S37. Input the skill mastery prediction value and the supervision label into the loss function to calculate the prediction error; S38. Based on the prediction error, update the network parameters of the initial skill mastery inference model using the backpropagation algorithm to obtain the updated model network parameters; S39. Select different samples from the training sample set and repeat steps S33 to S38 to perform multiple rounds of iterative training on the initial skill mastery inference model until the prediction error is less than a preset error threshold or the number of iterations reaches a preset iteration threshold; S310. Based on the updated model network parameters obtained in the last iteration, update the initial skill mastery inference model to obtain the skill mastery inference model.

5. The method of claim 1, wherein inputting the personalized causal learning path into the structural causal model for counterfactual reasoning based on the dynamic causal skill graph, to predict the counterfactual mastery vector, counterfactual expected learning time, and counterfactual path under different intervention conditions, includes: S41. Based on the skill nodes on the personalized causal learning path, generate different intervention actions for different skill nodes through a preset set of intervention action classifications, and generate a candidate intervention action set; S42. Apply the variational Bayesian inference algorithm to the user's structured event dataset to obtain the posterior probability distribution of the latent exogenous variables affecting the user's learning performance; S43. Select a candidate intervention action from the candidate intervention action set as the current intervention action, modify the structural causal model according to the current intervention action, cut off all input edges of the skill node corresponding to the current intervention action, and assign a preset intervention value to obtain the post-intervention causal model; S44. Substitute the posterior probability distribution into the post-intervention causal model, calculate the counterfactual mastery prediction values of all non-intervened skill nodes by forward propagation through the linear regression equations in the post-intervention causal model, and combine the counterfactual mastery prediction values of all skill nodes into the counterfactual mastery vector; S45. Using the skill points in the counterfactual mastery vector that exceed the preset mastery threshold as the counterfactual starting point and the target skill point set corresponding to the user's career goal as the ending point, perform a path search on the dynamic causal skill graph corresponding to the post-intervention causal model with the goal of maximizing the cumulative causal effect to obtain the counterfactual path; S46. Estimate the time required to learn along the counterfactual path using a learning time prediction function to obtain the counterfactual expected learning time.

6. An artificial intelligence-based vocational skill personalized learning path planning system for implementing the method of any one of claims 1 to 5, characterized in that the system includes: a data acquisition and regression module, used to collect the user's structured event dataset, extract instrumental variables and user covariates based on the structured event dataset, and use the instrumental variables and user covariates as inputs to perform a first-stage linear regression on all skill points to calculate the user's instrumental-variable-predicted mastery level for each skill point; a dynamic causal graph construction module, used to perform a second-stage linear regression based on the instrumental-variable-predicted mastery levels, calculate the causal effect strength between skill points, and construct a dynamic causal skill graph based on the causal effect strength, the dynamic causal skill graph having skill points as nodes and the causal effect strength as directed edges; a skill mastery inference module, used to extract the user's dynamic behavior sequence from the structured event dataset, and input the dynamic behavior sequence into the pre-trained skill mastery inference model to obtain the user's skill mastery vector at each skill point; a causal path search module, used to perform a path search with the current mastered skill point set corresponding to the skill mastery vector as the starting point, the target skill point set corresponding to the user's career goal as the ending point, and maximization of the cumulative causal effect as the objective, to obtain a personalized causal learning path; a counterfactual reasoning module, used to input the personalized causal learning path into the structural causal model for counterfactual reasoning based on the dynamic causal skill graph, and predict the counterfactual mastery vector, counterfactual expected learning time, and counterfactual path under different intervention conditions, wherein the counterfactual path is the learning path replanned based on the dynamic causal skill graph and the skill mastery inference model after applying corresponding intervention conditions to the personalized causal learning path; and a benefit assessment and scheme generation module, used to perform weighted fusion of the counterfactual mastery vector and the counterfactual expected learning time to obtain the intervention causal benefit value, and take the counterfactual path corresponding to the highest intervention causal benefit value as the optimal learning path adjustment scheme.

7. A computer device, comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 5.

8. A computer-readable storage medium having stored thereon a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 5.