AI-based learning path planning method, system, device and readable medium

By constructing learner profiles through multimodal data collection and artificial intelligence algorithms, implicit dependencies of knowledge points are discovered, and learning paths are optimized in real time. This solves the problems of personalization and universality in learning path planning, and improves learning efficiency and interest.

CN122243243A · Pending · Publication Date: 2026-06-19 · BEIJING ZHIDAKE INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING ZHIDAKE INFORMATION TECH CO LTD
Filing Date
2026-03-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies lack comprehensive learner profiles, making it difficult to accurately match learners' learning pace and weaknesses. Knowledge association mining is not deep enough, and the ability to dynamically adjust is limited, resulting in low learning efficiency and insufficient learning interest. Furthermore, path planning lacks flexibility and universality.

Method used

Learning data is acquired using a multimodal acquisition method. Learner profiles are constructed by combining an improved CNN convolutional neural network and a KT knowledge tracing model. Implicit dependencies between knowledge points are mined through graph neural networks. Personalized learning paths are generated using an artificial intelligence hybrid planning algorithm and iteratively optimized in real time to form a closed-loop learning path planning system.

Benefits of technology

It enables precise quantitative support for learning paths, adapts to learners' learning status in real time, improves the logic and adaptability of learning paths, fits various educational scenarios, and continuously improves learning outcomes.

✦ Generated by Eureka AI based on patent content.

Figure CN122243243A_ABST

Abstract

This invention discloses an AI-based learning path planning method, system, device, and readable medium, belonging to the interdisciplinary field of artificial intelligence and educational technology. The method includes the following steps: S1, data acquisition; S2, learner model construction; S3, knowledge model construction; S4, initial path generation; S5, dynamic path optimization; and S6, effect evaluation and closed-loop optimization. This invention employs the aforementioned AI-based learning path planning method, system, device, and readable medium. Through multimodal data acquisition combined with improved models, it constructs a precise and quantifiable learner profile, iterates and optimizes the learning path in real time, leverages the discovery of implicit dependencies in knowledge points and expert calibration to ensure path logic, and uses hybrid algorithms to adapt to various educational scenarios, forming a complete closed loop to continuously improve the adaptability of the learning path and learning effectiveness.

Description

Technical Field

[0001] This invention belongs to the interdisciplinary field of artificial intelligence and educational technology, and specifically relates to an AI-based learning path planning method, system, device, and readable medium.

Background Technology

[0002] With the rapid development of intelligent education, the traditional "one-size-fits-all" learning model can no longer meet the personalized needs of different learners. There are significant differences in learners' knowledge base, learning pace, and learning preferences. Fixed learning paths can easily lead to problems such as low learning efficiency and insufficient learning interest.

[0003] However, existing technologies have the following shortcomings: First, path planning lacks comprehensive learner profile support, relying only on partial learning data (such as test scores) for planning, which cannot accurately adapt to learners' learning pace and weaknesses; second, knowledge association mining is not deep enough, relying heavily on manually preset knowledge point relationships, making it difficult to uncover implicit dependencies between knowledge points, resulting in insufficient logicality in path planning; third, dynamic adjustment capabilities are limited, mostly adopting offline training and online application modes, which cannot respond to changes in learners' learning status in real time, leading to delayed optimization; fourth, some specialized planning schemes are only suitable for specific scenarios (such as college entrance examination preparation), have poor versatility, and are prone to problems such as single algorithms and lack of flexibility in path generation.

[0004] Furthermore, some existing path planning schemes are driven by a single RAG knowledge base, bind multiple agents to that RAG knowledge base, or rely on one specific optimization algorithm (such as an improved whale optimization algorithm that includes exam-driven offset terms). These schemes suffer from overlapping protection scope, insufficient innovation, and limited scenario adaptability, and they cannot balance versatility with personalization or real-time performance with reliability.

[0005] Therefore, a new method is urgently needed.

Summary of the Invention

[0006] The purpose of this invention is to provide an AI-based learning path planning method, system, device, and readable medium. This method constructs a precise and quantitative learner profile by combining multimodal data acquisition with improved models, iterates and optimizes the learning path in real time, and ensures the path logic by mining implicit dependencies of knowledge points and calibrating them with experts. It adapts to various educational scenarios with hybrid algorithms, forming a complete closed loop to continuously improve the adaptability of the learning path and the learning effect.

[0007] To achieve the above objective, this invention provides an AI-based learning path planning method, system, device, and readable medium, the method comprising the following steps:
S1. Acquire the learner's multi-dimensional learning data using a multimodal acquisition method, encrypt and store the multi-dimensional learning data, and then output it.
S2. Taking the multi-dimensional learning data output by S1 as input, perform feature extraction and analysis based on artificial intelligence algorithms. An improved CNN (convolutional neural network) that introduces a temporal convolution module and an attention mechanism extracts the learner's learning efficiency features and focus features. An improved KT (knowledge tracing) model with personalized probability adjustment, multi-knowledge-point association tracing, and forgetting probability correction tracks the learner's mastery of each knowledge point in real time. The learning efficiency features, focus features, knowledge point mastery features, learning preference features, and goal achievement progress features are integrated to construct a learner ability profile in the form of a feature vector, and the learner ability profile is output.
S3. Organize the knowledge point system of the target learning domain, and use an improved GNN (graph neural network) that introduces knowledge point priority weights, answer association strength, an L1 regularization term, and a node sampling strategy to mine the implicit dependencies between knowledge points in the knowledge point system. Calibrate the priority relationships of the knowledge points using the experience of domain experts, integrate the explicit dependencies, implicit dependencies, and priority relationships between knowledge points, construct a structured knowledge graph, and output it.
S4. Taking the learner ability profile output by S2, the structured knowledge graph output by S3, and the learner's preset learning target data as joint inputs, apply an artificial intelligence hybrid planning algorithm that combines a reinforcement learning algorithm with a particle swarm optimization algorithm. A fusion mechanism between the RAG subject vector knowledge base and the structured knowledge graph is introduced to generate a personalized initial learning path, which is then output.
S5. While the learner follows the initial learning path output by S4, collect real-time feedback data. Optimization trigger thresholds are preset. When the feedback data meets any of the optimization trigger thresholds, the path is optimized immediately through a multi-agent collaborative reasoning mechanism, and the optimized learning path is generated and output. At the same time, the learner ability profile constructed in S2 and the knowledge point association weights of the structured knowledge graph constructed in S3 are updated in turn based on the feedback data.
S6. Taking the actual execution data of the optimized learning path output by S5 as the evaluation input, perform a multi-dimensional quantitative evaluation of the learner's learning effect at a preset period, and calculate the average score of the evaluation indicators against a preset evaluation threshold. If the average score is lower than the evaluation threshold, trigger full-process iterative path optimization, adjust the core parameters of path planning, and feed the optimized parameters back to S4. Based on the updated learner ability profile, the updated structured knowledge graph, and the optimized parameters, S4 iteratively generates a new personalized learning path, forming a closed-loop learning path planning system of "collection-planning-execution-evaluation-optimization".

[0008] Preferably, in S1, the multi-dimensional learning data includes knowledge mastery status data, learning behavior data, learning preference data, and learning goal data. The knowledge mastery status data is collected through online quizzes, offline tests, and knowledge point retests, and includes the correct answer rate, types of wrong answers, and answering time for each knowledge point. The learning behavior data is collected through the learning platform and includes video learning duration, number of pauses / fast-forwards / rewinds, participation in interactive quizzes, and note-taking activity. The learning preference data is derived from learner questionnaires and learning behavior, and includes preferred learning formats, learning time periods, and learning pace. The learning goal data is set by the learners themselves and includes short-term goals, long-term goals, and goal difficulty levels.

[0009] Preferably, in S2, the improvement to the CNN is the introduction of a temporal convolution module and an attention mechanism. The improved convolutional layer formula is as follows: [formula not reproduced]; in the formula, the terms denote, respectively: the attention mechanism function; the temporal feature vector of the input data; the output feature value at a given position of a given convolutional layer; the activation function; the convolution kernel size; the weight of that layer's convolution kernel at a given kernel position; the pixel value of that layer's input feature map at the corresponding location; and the bias term of that convolutional layer. Improvements to the KT knowledge tracing model include dynamically adjusting the personalized learning probability and error probability based on the learner's learning efficiency feature, achieving multi-knowledge-point association tracing by using the knowledge point dependency weights from the knowledge graph, introducing a forgetting probability to correct the state transition probability, and incorporating answering time into the observation probability calculation. The feature vector of the learner's ability profile is [mean knowledge point mastery, learning efficiency, learning preference weights, goal achievement progress].
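
The knowledge tracing step described above can be illustrated with a short sketch. The patent's improved formulas are not reproduced in this text, so the code below uses the textbook Bayesian Knowledge Tracing update instead, extended with a forgetting term and clamped output. The function name `bkt_step`, the parameter names, the clamp bounds 0.05 and 0.95, and the sample probabilities are all assumptions chosen for illustration; only the default forgetting probability of 0.02 comes from the description. "Slip" here plays the role of the error probability.

```python
def bkt_step(p_mastery, correct, p_learn, p_guess, p_slip,
             p_forget=0.02, lower=0.05, upper=0.95):
    """One textbook BKT update with a forgetting term and clamped output.

    This is NOT the patent's improved formula (which is not available here);
    it only illustrates the general shape of a mastery update.
    """
    if correct:
        # Bayes rule: the learner knew it and did not slip, or guessed correctly.
        num = p_mastery * (1.0 - p_slip)
        den = num + (1.0 - p_mastery) * p_guess
    else:
        # Bayes rule: the learner knew it but slipped, or did not know and failed.
        num = p_mastery * p_slip
        den = num + (1.0 - p_mastery) * (1.0 - p_guess)
    posterior = num / den if den > 0 else p_mastery

    # State transition: mastered knowledge may be forgotten,
    # unmastered knowledge may be learned.
    p_next = posterior * (1.0 - p_forget) + (1.0 - posterior) * p_learn

    # Clamp so a single answer cannot push mastery to 0 or 1 (bounds are assumed).
    return min(upper, max(lower, p_next))


p = 0.5
for answer in (True, True, False, True):
    p = bkt_step(p, answer, p_learn=0.3, p_guess=0.2, p_slip=0.1)
    print(round(p, 3))
```

The clamp mirrors the stated goal of avoiding extreme swings after a single wrong answer or lucky guess; personalization by learning efficiency and prerequisite weighting would be layered on top of this basic update.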

[0010] Preferably, in S3, the improvement of the improved GNN graph neural network further includes feature fusion processing of the initial node feature matrix to incorporate the average answer accuracy and average mastery data of knowledge points; The node sampling strategy is to sample only the top-K neighboring nodes of each knowledge point for feature aggregation, with the default value of K being 10. The improved GNN graph neural network adopts the ELU activation function, and an L1 regularization term is added to the model loss function, with a regularization coefficient λ=0.002; The structured knowledge graph is stored in a separate knowledge model module, which supports real-time access, node updates, and association weight adjustments to the structured knowledge graph.
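
As a rough illustration of the node sampling strategy and the L1 term described above, the sketch below keeps only the K strongest neighbours of each knowledge point and computes an L1 penalty with λ = 0.002. The text does not say how the "top" neighbours are ranked, so ranking by edge weight is an assumption, as are the function names `top_k_neighbors` and `l1_penalty` and the sample matrices.

```python
import heapq


def top_k_neighbors(adjacency, k=10):
    """Keep only the k highest-weight neighbours in each row of a weighted adjacency matrix.

    Ranking neighbours by edge weight is an assumption; the source only says "top-K".
    """
    sampled = []
    for i, row in enumerate(adjacency):
        # Candidate neighbours: every other node with a positive edge weight.
        candidates = [(w, j) for j, w in enumerate(row) if j != i and w > 0]
        keep = {j for _, j in heapq.nlargest(k, candidates)}
        sampled.append([w if j in keep else 0.0 for j, w in enumerate(row)])
    return sampled


def l1_penalty(weights, lam=0.002):
    """L1 regularization term: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for row in weights for w in row)


adj = [
    [0.0, 0.9, 0.2, 0.5],
    [0.9, 0.0, 0.7, 0.1],
    [0.2, 0.7, 0.0, 0.4],
    [0.5, 0.1, 0.4, 0.0],
]
print(top_k_neighbors(adj, k=2))
print(l1_penalty([[0.5, -1.0], [0.0, 2.5]]))
```

With K fixed, the aggregation cost per knowledge point no longer grows with the total number of knowledge points, which is the motivation the description gives for sampling.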

[0011] Preferably, in S4, the reinforcement learning algorithm uses a dual reward function based on the learner's ability improvement speed and goal achievement rate. The reward function formula is as follows: [formula not reproduced]; in the formula, the terms denote, respectively: the total reinforcement learning reward value; the two weighting coefficients; the ability improvement speed; and the goal achievement rate. The fusion mechanism between the RAG subject vector knowledge base and the knowledge graph associates knowledge points in the knowledge graph with learning resources in the RAG knowledge base, uses cosine similarity to calculate the correlation between knowledge points, and prioritizes recommending basic knowledge points that have a correlation of ≥0.8 with the learner's weak points and that satisfy the knowledge dependency relationships.
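
The two mechanisms in this paragraph can be sketched as follows. The reward formula itself is not reproduced in this text, so a plain weighted sum is assumed, with hypothetical weights of 0.5 each. The recommendation filter applies the stated cosine-similarity threshold of 0.8 and a prerequisite check; the embeddings, the knowledge point names, and the function names are invented for illustration.

```python
import math


def dual_reward(improvement_speed, goal_rate, w_speed=0.5, w_goal=0.5):
    """Assumed form of the dual reward: a weighted sum of the two signals."""
    return w_speed * improvement_speed + w_goal * goal_rate


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(weak_vec, candidates, mastered, threshold=0.8):
    """Return candidate knowledge points, most similar first.

    candidates maps a name to (embedding, prerequisite names). A candidate is kept
    only if it is similar enough to the weak point and all prerequisites are mastered.
    """
    picks = []
    for name, (vec, prereqs) in candidates.items():
        sim = cosine_similarity(weak_vec, vec)
        if sim >= threshold and set(prereqs) <= mastered:
            picks.append((sim, name))
    return [name for _, name in sorted(picks, reverse=True)]


weak_point = [1.0, 0.0]  # hypothetical embedding of the learner's weak point
candidates = {
    "linear functions": ([0.9, 0.1], []),
    "quadratic functions": ([0.95, 0.3], ["linear functions"]),
    "probability": ([0.1, 0.9], []),
}
print(recommend(weak_point, candidates, mastered=set()))
print(recommend(weak_point, candidates, mastered={"linear functions"}))
print(dual_reward(0.6, 0.4))
```

In this toy data, "quadratic functions" is similar enough to the weak point but is held back until its prerequisite is mastered, which is the dependency constraint the paragraph describes.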

[0012] Preferably, in S5, the feedback data includes the correct answer rate, learning time deviation, knowledge point mastery retest data, and learner's proactive adjustment requests; The optimization trigger thresholds are: a correct answer rate of less than 70%, a learning time deviation of more than ±30 minutes, and a retest correct answer rate of less than 80%. The multi-agent collaborative reasoning mechanism includes a feedback data evaluation agent, a path optimization planning agent, and an optimization result push feedback agent. The three agents work together to complete the identification of weaknesses, path adjustment, and optimization result push.
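
A minimal sketch of the trigger check in S5, using the three thresholds stated above. Treating a learner's proactive adjustment request as a fourth trigger is an assumption (the text lists it as feedback data, not as a threshold), and the function and argument names are hypothetical.

```python
def optimization_triggers(accuracy, time_deviation_min, retest_accuracy,
                          learner_request=False):
    """Return the list of reasons why path optimization should run (empty if none)."""
    reasons = []
    if accuracy < 0.70:
        reasons.append("correct answer rate below 70%")
    if abs(time_deviation_min) > 30:
        reasons.append("learning time deviation beyond ±30 minutes")
    if retest_accuracy < 0.80:
        reasons.append("retest correct answer rate below 80%")
    if learner_request:
        # Assumption: an explicit learner request also starts optimization.
        reasons.append("learner requested an adjustment")
    return reasons


print(optimization_triggers(0.65, 10, 0.85))
print(optimization_triggers(0.90, -45, 0.75))
print(optimization_triggers(0.90, 5, 0.90))
```

Any non-empty result would hand control to the three cooperating agents; returning the reasons rather than a bare boolean gives the evaluation agent something to work from.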

[0013] Preferably, in S6, the preset evaluation cycle is once a week; The multi-dimensional quantitative evaluation indicators include knowledge mastery assessment, learning efficiency assessment, goal achievement rate assessment, and learning experience assessment, with each of the four indicators quantified from 0 to 100 points. The evaluation threshold is 80 points. The full-process path iterative optimization includes adjusting the weight coefficient of the hybrid algorithm, the allocation of knowledge point learning time, and the weight of knowledge graph association.
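
The weekly check in S6 can be sketched as below. The text does not make clear whether the threshold applies to each indicator or to their overall mean; this sketch assumes the overall mean of the four 0-100 scores is compared with the threshold of 80. The indicator keys and the function name are hypothetical.

```python
from statistics import mean

INDICATORS = (
    "knowledge_mastery",
    "learning_efficiency",
    "goal_achievement",
    "learning_experience",
)


def weekly_evaluation(scores, threshold=80.0):
    """Return (average score, whether full-process re-optimization is triggered)."""
    for name in INDICATORS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    # Assumption: the four indicators are weighted equally.
    average = mean(scores[name] for name in INDICATORS)
    return average, average < threshold


week = {
    "knowledge_mastery": 85,
    "learning_efficiency": 70,
    "goal_achievement": 75,
    "learning_experience": 80,
}
print(weekly_evaluation(week))
```

When the second element is true, the description calls for adjusting the hybrid algorithm's weight coefficients, the time allocated to each knowledge point, and the knowledge graph association weights before S4 regenerates the path.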

[0014] This invention also provides an AI-based learning path planning system, comprising:
The data acquisition module, which executes step S1: it uses a multimodal acquisition method to collect the learner's multi-dimensional learning data, encrypts and stores the data, and outputs it to the learner model building module.
The learner model building module, connected to the data acquisition module, which executes step S2: it processes the multi-dimensional learning data through the improved CNN and the KT knowledge tracing model, then builds the learner ability profile and outputs it to the initial path generation module.
The knowledge model building module, which executes step S3: it builds a structured knowledge graph of the target learning domain and outputs the knowledge graph to the initial path generation module and the dynamic path optimization module.
The initial path generation module, connected to the learner model building module and the knowledge model building module, which executes step S4: it combines the learner ability profile, the structured knowledge graph, and the learning target data to generate an initial learning path through the artificial intelligence hybrid planning algorithm and outputs it to the dynamic path optimization module.
The dynamic path optimization module, connected to the initial path generation module, the learner model building module, and the knowledge model building module, which executes step S5: it collects learning feedback data and performs dynamic path optimization, and at the same time updates the ability profile in the learner model building module and the knowledge graph association weights in the knowledge model building module.
The effect evaluation and closed-loop optimization module, connected to the dynamic path optimization module and the initial path generation module, which executes step S6: it evaluates the learning effect in multiple dimensions, triggers closed-loop optimization based on the evaluation results, and adjusts the planning parameters of the initial path generation module.

[0015] Therefore, by employing the above AI-based learning path planning method, system, device, and readable medium, the technical solution of this invention has the following beneficial effects compared with the prior art:
(1) A multimodal acquisition method acquires four types of full-dimensional learning data: knowledge mastery status, learning behavior, learning preferences, and learning goals. An improved CNN with temporal convolution and an attention mechanism is combined with an improved KT knowledge tracing model with personalized probability adjustment and multi-knowledge-point association tracing. This enables accurate extraction of learning efficiency and focus features and real-time dynamic tracking of knowledge point mastery, provides accurate, quantitative core support for path planning, and overcomes the limitations of traditional label-based profiling.
(2) A real-time iterative optimization strategy collects feedback data in real time and adjusts the learning path, which addresses the delayed optimization of existing technologies and keeps the path adapted to the learner's current learning status.
(3) Implicit dependencies between knowledge points are mined with a graph neural network and calibrated with the experience of domain experts, which improves the accuracy of the knowledge graph and keeps the learning path logical and well founded.
(4) An artificial intelligence hybrid planning algorithm supports flexible combination of multiple algorithms to adapt to various educational scenarios, while differentiated technical constraints enhance the uniqueness of the technical solution.
(5) A complete closed loop of "collection-planning-execution-evaluation-optimization" is formed and combined with multi-dimensional evaluation and manual adjustment functions to continuously improve the adaptability of the learning path and the learning effect.

[0016] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.

Description of the Drawings

[0017] Figure 1 is a flowchart of an embodiment of the AI-based learning path planning method, system, device, and readable medium of the present invention.

Detailed Description

[0018] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention. Unless otherwise defined, the technical or scientific terms used in the present invention should have the ordinary meaning understood by those skilled in the art.

[0019] Example 1. As shown in Figure 1, this embodiment provides an AI-based learning path planning method, system, device, and readable medium. It should be understood that the specific parameters, models, and protocols mentioned in this embodiment are merely examples to help those skilled in the art understand the present invention, and are not intended to limit the present invention.

[0020] The present invention provides an AI-based learning path planning method, system, device, and readable medium, the method comprising the following steps:
S1. Using a multimodal acquisition method, comprehensively collect the learner's multi-dimensional learning data, specifically including:
Knowledge mastery status data: collected through online quizzes, offline tests, and knowledge point retests, including the correct answer rate, types of wrong answers, and answering time for each knowledge point;
Learning behavior data: collected through the learning platform, including video learning duration, number of pauses / fast-forwards / rewinds, participation in interactive quizzes, note-taking activity, and so on;
Learning preference data: derived from learner-completed questionnaires and system analysis of learning behavior, including preferred learning formats, learning time periods, and learning pace;
Learning objective data: set by the learner, including short-term goals, long-term goals, and goal difficulty level.
All collected data is stored in the data storage module and encrypted to protect the privacy and security of learner data. The multi-dimensional learning data is then output.
S2. Based on artificial intelligence algorithms, perform feature extraction and analysis on the multi-dimensional learning data from S1 to construct a learner ability profile. The specific process is as follows:
A deep learning feature extraction algorithm (such as a CNN) extracts features from the learning behavior data and physiological state data to obtain learning efficiency features and focus features. Because learning behavior data is multi-dimensional time-series data and physiological state data is continuous time-series data, the basic CNN is improved by introducing a temporal convolution module and an attention mechanism.
The improved convolutional layer formula is as follows: [formula not reproduced]; in the formula, the terms denote, respectively: the attention mechanism function, which assigns weights to the extracted features to highlight key features related to learning efficiency and focus; the temporal feature vector of the input data, which has the same dimension as the convolutional intermediate features (adjusted adaptively to the dimension of the input data) and captures how learning behavior and physiological state change over time; the output feature value at a given position of a given convolutional layer; the activation function; the convolution kernel size, which is 3×3; the weight of that layer's convolution kernel at a given kernel position [value range not reproduced]; the pixel value of that layer's input feature map at the corresponding location; and the bias term of that convolutional layer [value not reproduced].
The formula for calculating the attention weights is: [formula not reproduced]; in the formula, the terms denote: the feature vector extracted by convolution; the simplified notation for the temporal feature vector; the feature dimension; the value of each dimension of the feature vector; and the natural exponential function.
The pooling layer formula is: [formula not reproduced]; in the formula, the terms denote: the output value at a given position of a given pooling layer; the pooling kernel size (2×2 is selected); and the feature values of the corresponding convolutional layer within the pooling region.
The activation function formula is: [formula not reproduced]; in the formula, the terms denote the activation function hyperparameter [value not reproduced] and the input value of the activation function.
Machine learning classification algorithms classify the knowledge mastery status data to obtain knowledge point mastery features. The input learning behavior and physiological state data are standardized to remove the influence of units and keep the temporal features compatible with the convolutional features.
The KT (Knowledge Tracing) model is used to track
changes in the mastery status of each knowledge point in real time, based on the learner's answer data, and to update the mastery features of each knowledge point. The KT knowledge tracing model is improved by combining the learner ability profile (learning efficiency, knowledge gaps) with the knowledge graph (knowledge point dependencies). The improvements are as follows:
Personalized learning probability and error probability. These are adjusted dynamically using the learner's learning efficiency feature (a value from 0 to 1 extracted by the CNN, where a higher value indicates higher learning efficiency): [formulas not reproduced]; in the formulas, the terms denote: the learner's personalized learning probability; the learner's learning efficiency feature value; the basic learning probability of the basic KT model; the learner's personalized error probability; and the basic error probability of the basic KT model. This gives learners with high learning efficiency a higher learning probability and a lower error probability, which fits individual differences.
Multi-knowledge-point association tracing. Using the dependency weights of knowledge points in the knowledge graph (the weight with which one knowledge point depends on another, with a value of 0-1), the posterior mastery probability of a knowledge point is updated: [formula not reproduced]; in the formula, the terms denote: the learner's posterior mastery probability of the knowledge point after answering at a given moment; the observation probability of answering correctly at that moment when the learner has mastered the knowledge point; the learner's prior mastery probability of the knowledge point before answering; the current mastery probability of a prerequisite knowledge point; the set of prerequisite knowledge points of the knowledge point; the dependency weight between the knowledge point and the prerequisite knowledge point; the observation probability of answering correctly at that moment when the learner has not mastered the knowledge point; and the probability that the learner has not mastered the knowledge point before answering. This keeps mastery tracing aligned with knowledge dependencies and improves tracing accuracy.
Parameter estimation. The EM (Expectation-Maximization) algorithm iteratively updates, in real time and based on the learner's historical answer data (the last 30 answer records), the initial mastery probability, the basic learning probability, the basic guess probability, and the basic error probability. The iterative formula is as follows: [formula not reproduced]; in the formula, the terms denote: the number of historical answers; the joint posterior probability, given the previous answer results, of not having mastered the knowledge point at one moment and having mastered it at the next; the posterior probability, given the previous answer results, of not having mastered the knowledge point at a given moment; and the previous answer results. Iteration fits the parameters to the learner's actual answer performance and avoids the tracing bias caused by fixed parameters.
Forgetting probability. A forgetting probability (default value 0.02, time-dependent) is introduced to adjust the state transition probability for long-term learning scenarios: [formula not reproduced]; in the formula, the terms denote: the state transition probability that a knowledge point mastered at one moment is still mastered at the next; and the time interval between two answers, in days;
The basic forgetting probability; the longer the time interval, the higher the probability of forgetting, which matches forgetting patterns in actual learning.
Upper and lower limits are set for the posterior mastery probability [values not reproduced]. This prevents a single incorrect answer or a lucky guess from causing extreme shifts in the mastery probability (for example, a single incorrect answer dropping it to 0, or a single correct answer raising it to 1), which improves tracing stability.
Answering time is incorporated into the observation probability calculation. When the answering time is excessively long (more than twice the average answering time for that knowledge point), the observation probability of a correct answer is reduced, to avoid misjudging mastery because of a correct guess: [formula not reproduced]; in the formula, the terms denote: the average answering time for the knowledge point; the observation probability of answering correctly at a given moment when the learner has not mastered the knowledge point; the basic guess probability of the basic KT model [value not reproduced]; and the learner's actual answering time for the knowledge point.
The updated knowledge point mastery features (that is, the posterior mastery probabilities), the learning efficiency features extracted by the CNN, the learning preference weights derived from the learning preference data, and the goal achievement progress calculated from the learning goal data are integrated into the feature vector of the learner's ability profile. The learner's ability profile is ultimately stored as a feature vector, specifically [mean knowledge point mastery, learning efficiency, learning preference weights, goal achievement progress].
An example is shown below: the feature vector of a learner's ability profile is [0.85, 0.7, 0.6 / 0.3 / 0.1, 0.4], where 0.85 is the mean mastery of all knowledge points, 0.7 is the quantified learning efficiency, 0.6 / 0.3 / 0.1 are the preference weights for video learning, document learning, and live learning respectively, and 0.4 is the goal achievement progress. The learner ability profile is then output.
S3. Construct a knowledge graph for the target learning domain. The specific process is as follows:
The knowledge point system of the target domain is obtained by analyzing high school mathematics textbooks and examination syllabi. It includes core modules such as functions, geometry, and probability and statistics, with each module containing specific knowledge points.
A graph neural network (GNN) algorithm analyzes the relationships between knowledge points and uncovers implicit dependencies. The GNN is improved by combining the knowledge point priorities of the knowledge graph with features of the answer association data. The improvements are as follows:
Knowledge point priority weights are introduced and the adjacency matrix normalization is optimized, so that the associations and influence of core knowledge points are highlighted: [formula not reproduced]; in the formula, the terms denote: the improved adjacency matrix of knowledge points; the basic adjacency matrix of explicit associations between knowledge points; the total number of knowledge points in the knowledge graph; the identity matrix; and the diagonal matrix of knowledge point priority weights (the matrix dimensions are not reproduced), with diagonal elements representing the priority weights of each knowledge point.
The remaining terms denote the priority weight vector of the knowledge points and the priority weight of each individual knowledge point (values range from 0.1 to 1.0: 0.8 to 1.0 for core knowledge points and 0.1 to 0.3 for extended knowledge points). These weights give the relationships between core knowledge points more influence in feature aggregation, which makes implicit dependency mining more targeted.
The answer association strength is incorporated and the node feature update formula is optimized, which strengthens the mining of implicit dependencies between weakly associated knowledge points: [formula not reproduced]; in the formula, the terms denote: the updated node feature matrix; the initial node feature matrix; the weight matrix; the knowledge point answer association strength matrix, each element of which represents the answer association strength between two knowledge points (calculated from how often they appear together in questions and how often they are answered incorrectly together, with a value of 0-1); the association strength weighting coefficient (default value 0.4), which balances the influence of explicit associations against the implicit associations found in answers and avoids missing weak implicit dependencies; and the node feature dimension (the matrix dimensions are not reproduced).
An L1 regularization term is introduced: [formula not reproduced]; in the formula, the terms denote: the regularization coefficient [value not reproduced]; the L1 regularization term; and the element of the weight matrix at a given position. Adding this term to the model loss function suppresses overfitting of the weight matrix, preserves the generalization ability of implicit dependency mining, and avoids excessive reliance on association patterns in the training data.
Feature fusion is performed on the initial node feature matrix, incorporating data such as the correct answer rate for each knowledge point and the average learner mastery level.
The formula is as follows: ; In the formula, For enhanced node feature matrix; For feature splicing operations; This is the vector representing the average correct answer rate for each knowledge point. The average mastery vector of knowledge points; This is a diagonal matrix representing the average correct answer rate for each knowledge point. The dimension is defined by the diagonal elements, which represent the average correct answer rate for each knowledge point, and the remaining elements are 0. This is a diagonal matrix representing the average mastery level of each knowledge point. Dimension 1, all other elements are 0; The diagonal elements represent the average mastery level of each knowledge point; this enhances the richness of node features and facilitates the discovery of implicit dependencies. A sampling strategy is adopted, sampling only the Top-K neighboring nodes (K defaults to 10) for each knowledge point for feature aggregation, replacing the calculation of the full adjacency matrix. The formula is adjusted as follows: ; In the formula, This is the sampled adjacency matrix. dimension; This is the sampled question-response correlation strength matrix. dimension; The number of adjacent nodes sampled for each knowledge point; Reduce model computation and adapt to knowledge graph scenarios with multiple knowledge points (such as thousands or more); The activation function is the ELU function, and the formula is: ; In the formula, It is a natural exponential function; Based on the experience of experts in the field of high school mathematics, the priority relationship of knowledge points is adjusted. For example, the priority of core knowledge points such as functions and trigonometric functions is set to high, the priority of extended knowledge points is set to medium, and the priority of interesting extended knowledge points is set to low. 
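The Top-K neighbor sampling, the ELU activation and the L1 regularization term described for the improved GNN can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the dense list-of-lists adjacency format, the choice to rank neighbors by edge weight, and the ELU scale `alpha=1.0` are all assumptions made for illustration. The coefficient 0.002 is the regularization coefficient stated in claim 4.

```python
import math


def elu(x, alpha=1.0):
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise.

    alpha=1.0 is an assumed default; the source does not state it.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def top_k_neighbors(adjacency, k=10):
    """Keep only the k strongest neighbors in each row of a dense adjacency matrix.

    Ranking neighbors by edge weight is an assumption; the source only says
    that the Top-K neighboring nodes of each knowledge point are sampled.
    """
    sampled = []
    for i, row in enumerate(adjacency):
        # Candidate neighbors: every other node with a positive edge weight.
        ranked = sorted(
            (j for j in range(len(row)) if j != i and row[j] > 0),
            key=lambda j: row[j],
            reverse=True,
        )
        keep = set(ranked[:k])
        sampled.append([float(w) if j in keep else 0.0 for j, w in enumerate(row)])
    return sampled


def l1_penalty(weights, lam=0.002):
    """L1 regularization term: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for row in weights for w in row)


# Toy graph of four knowledge points (illustrative weights only).
adjacency = [
    [0.0, 0.9, 0.2, 0.5],
    [0.9, 0.0, 0.4, 0.1],
    [0.2, 0.4, 0.0, 0.7],
    [0.5, 0.1, 0.7, 0.0],
]
sampled = top_k_neighbors(adjacency, k=2)
```

With `k=2`, each row of `sampled` keeps its two largest off-diagonal weights and zeroes the rest, so feature aggregation touches at most K neighbors per knowledge point instead of the full row.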
The knowledge points, their explicit and implicit dependencies, and their priority relationships are integrated into a structured knowledge graph, which is stored in the knowledge model module and supports real-time retrieval and updates. The structured knowledge graph is output.

S4. Combining the ability profile from S2, the structured knowledge graph from S3, and the learning target data, an initial personalized learning path is generated using an artificial intelligence hybrid planning algorithm. The specific process is as follows:

Reinforcement learning and particle swarm optimization are used in combination. The reinforcement learning algorithm uses a dual reward function based on the learner's ability improvement rate and the goal achievement rate (formula not reproduced). The quantities in this formula are the total reinforcement learning reward, the two weighting coefficients, the ability improvement rate, and the goal achievement rate. The particle swarm optimization algorithm optimizes the learning order and the learning time allocation of the knowledge points, with 50 particles, 30 iterations, and an inertia weight of 0.7.

A fusion mechanism between the RAG subject vector knowledge base and the knowledge graph is introduced. It associates knowledge points in the knowledge graph with relevant learning resources in the RAG subject vector knowledge base (such as knowledge point explanation documents and exercises) and calculates the degree of association between knowledge points using cosine similarity, with values from 0 to 1. Based on the learner's ability profile, priority is given to recommending basic knowledge points that are highly relevant to the learner's weaknesses (relevance ≥ 0.8) and that conform to the knowledge point dependencies.
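The dual reward and the relevance filter described for S4 can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented algorithm: the reward is assumed to be a weighted sum, the weights 0.5 / 0.5 are placeholders because the source does not give their values, and the names `total_reward`, `cosine_similarity`, `recommend` and the toy embedding vectors are hypothetical. Only the cosine similarity measure and the 0.8 relevance threshold come from the text.

```python
import math


def total_reward(ability_gain, goal_rate, w_ability=0.5, w_goal=0.5):
    """Assumed weighted-sum form of the dual reward (weights are placeholders)."""
    return w_ability * ability_gain + w_goal * goal_rate


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(weak_vec, candidates, mastered, threshold=0.8):
    """Return (name, relevance) pairs that pass both conditions from the text:
    relevance to the weak point >= threshold, and all prerequisites mastered.

    candidates maps a knowledge point name to (embedding, prerequisite names).
    """
    picks = []
    for name, (vec, prereqs) in candidates.items():
        relevance = cosine_similarity(weak_vec, vec)
        if relevance >= threshold and all(p in mastered for p in prereqs):
            picks.append((name, relevance))
    # Most relevant knowledge points first.
    return sorted(picks, key=lambda item: item[1], reverse=True)


# Toy data: 2-dimensional embeddings invented for illustration.
weak_point = [1.0, 0.0]
candidates = {
    "linear functions": ([0.9, 0.1], []),
    "quadratic functions": ([0.8, 0.2], ["linear functions"]),
    "probability": ([0.1, 0.9], []),
}
picks = recommend(weak_point, candidates, mastered=set())
names = [name for name, _ in picks]
```

In this toy run only "linear functions" is recommended: "quadratic functions" is relevant but its prerequisite is not yet mastered, and "probability" falls below the 0.8 threshold.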
Appropriate learning time is allocated (e.g., 2 hours for linear functions and 3 hours for quadratic functions), suitable learning methods are recommended, and an initial personalized learning path is generated, for example: Day 1: linear functions (1.5 hours of video learning + 30 minutes of practice questions); Day 2: retest of linear functions (30 minutes) and basics of quadratic functions (1.5 hours of video learning); Day 3: advanced quadratic functions (1.5 hours of video learning + 1 hour of practice problems), and so on.

S5. Learner feedback data is collected in real time during execution of the initial personalized learning path, and the path is dynamically optimized based on the feedback data. The specific process is as follows:

The feedback data collected in real time includes answer accuracy, learning time deviation, knowledge point mastery retest data, and learners' proactive adjustment requests. The system has preset optimization trigger thresholds: an answer accuracy below 70%, a learning time deviation exceeding ±30 minutes, and a retest accuracy below 80%. When the feedback data meets any of these thresholds, optimization is triggered immediately. For example, an accuracy of 65% on quadratic function practice questions (below 70%) triggers optimization.

A multi-agent collaborative reasoning mechanism is adopted, in which three agents work together. The feedback data evaluation agent analyzes the feedback data and identifies the weak point, in this example the advanced knowledge points of quadratic functions. The path optimization planning agent, based on the evaluation results, increases the learning time for advanced quadratic functions from 1.5 hours to 2 hours and adds one set of practice questions. The optimization result push feedback agent synchronizes the optimized path to the learner and reminds the learner to pay particular attention to the advanced content of quadratic functions.

Based on the feedback data, the learner ability profile is dynamically adjusted (e.g., the mastery of quadratic functions is adjusted from 0.5 to 0.55), and the association weights of knowledge points in the knowledge graph are adjusted at the same time (e.g., the association weight between quadratic and linear functions is increased from 0.8 to 0.85), so that subsequent path planning is more accurate.

S6. Learners' learning outcomes are assessed regularly in multiple dimensions to form a closed-loop optimization process, as follows:

The evaluation cycle is once a week. The multi-dimensional evaluation indicators, each quantified as a score from 0 to 100, are: knowledge point mastery (number of knowledge points mastered this week / number of knowledge points planned this week), learning efficiency (number of knowledge points mastered this week / cumulative learning time this week), goal achievement rate (current cumulative number of knowledge points mastered / total target number of knowledge points), and learning experience (learner questionnaire rating). The average score of the four indicators is calculated and compared with the preset evaluation threshold of 80 points. If the average score is ≥ 80 points, the current path planning strategy is maintained; if the average score is < 80 points, full-process iterative path optimization is triggered. For example, a weekly average score of 75 points is below 80 points, so the path needs to be optimized.
Based on the evaluation results, the path planning parameters are adjusted, such as optimizing the weight coefficients of the hybrid algorithm, adjusting the learning time allocation of knowledge points, and updating the association weights of the knowledge graph. At the same time, combined with the changes in learner ability profiles, new learning paths are iteratively generated to form a complete closed loop of "collection-planning-execution-evaluation-optimization".
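
The S5 trigger thresholds and the S6 weekly evaluation rule can be summarized in a short Python sketch. The thresholds (70%, ±30 minutes, 80%, and an average of 80 points) are taken from the description; the function names, argument units (accuracy as a fraction, deviation in minutes) and the returned message strings are assumptions made for illustration.

```python
def triggered_rules(accuracy, time_deviation_min, retest_accuracy):
    """Return the S5 trigger conditions met by one batch of feedback data.

    Any non-empty result means immediate path optimization is triggered.
    """
    reasons = []
    if accuracy < 0.70:
        reasons.append("answer accuracy below 70%")
    if abs(time_deviation_min) > 30:
        reasons.append("learning time deviation over 30 minutes")
    if retest_accuracy < 0.80:
        reasons.append("retest accuracy below 80%")
    return reasons


def weekly_evaluation(mastery, efficiency, goal_rate, experience, threshold=80.0):
    """Average the four 0-100 indicator scores used in S6.

    Returns (average, needs_optimization); full-process iterative optimization
    is needed only when the average is strictly below the threshold.
    """
    average = (mastery + efficiency + goal_rate + experience) / 4
    return average, average < threshold


# Example from the description: 65% accuracy on quadratic function practice.
s5_result = triggered_rules(accuracy=0.65, time_deviation_min=10, retest_accuracy=0.90)

# Example from the description: a weekly average of 75 points.
s6_result = weekly_evaluation(70, 75, 80, 75)
```

Here `s5_result` lists the single accuracy rule, and `s6_result` reports an average of 75 with optimization required, matching the two worked examples in the text.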

[0021] Therefore, the present invention adopts the above-mentioned AI-based learning path planning method, system, device and readable medium. The method constructs a precise and quantitative learner profile by combining multimodal acquisition with improved models, iterates and optimizes the learning path in real time, and ensures the path logic by mining the implicit dependencies of knowledge points and calibrating with experts. It adapts to various educational scenarios with hybrid algorithms, forming a complete closed loop to continuously improve the adaptability of the learning path and the learning effect.

[0022] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0023] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the technical solutions of the present invention, and these modifications or equivalent substitutions cannot cause the modified technical solutions to deviate from the spirit and scope of the technical solutions of the present invention.

Claims

1. An AI-based learning path planning method, characterized in that it comprises the following steps:
S1. Acquire multi-dimensional learning data of learners using a multimodal acquisition method, encrypt and store the multi-dimensional learning data, and then output the multi-dimensional learning data.
S2. Using the multi-dimensional learning data output from S1 as input, perform feature extraction and analysis based on artificial intelligence algorithms. An improved convolutional neural network (CNN) that introduces a temporal convolution module and an attention mechanism is used to extract the learner's learning efficiency features and focus features. An improved knowledge tracing (KT) model, with personalized probability adjustment, multi-knowledge-point association tracing, and forgetting probability correction, is used to track the learner's mastery of each knowledge point in real time. The learning efficiency features, focus features, knowledge point mastery features, learning preference features, and goal achievement progress features are integrated to construct a learner ability profile in the form of a feature vector, and the learner ability profile is output.
S3. Organize the knowledge point system of the target learning domain, and use an improved graph neural network (GNN) that introduces knowledge point priority weights, answer association strength, an L1 regularization term, and a node sampling strategy to mine the implicit dependencies between knowledge points in the knowledge point system. Calibrate the priority relationships of the knowledge points using domain expert experience; integrate the knowledge points, explicit dependencies, implicit dependencies, and priority relationships to construct a structured knowledge graph; and output the structured knowledge graph.
S4. Using the learner ability profile output by S2, the structured knowledge graph output by S3, and the learner's preset learning target data as joint inputs, apply an artificial intelligence hybrid planning algorithm that combines a reinforcement learning algorithm with a particle swarm optimization algorithm. A fusion mechanism between the RAG subject vector knowledge base and the structured knowledge graph is introduced to generate a personalized initial learning path, and the initial learning path is output.
S5. Based on the initial learning path output by S4, collect feedback data in real time while the learner executes the initial learning path. Optimization trigger thresholds are preset. When the feedback data meets any of the optimization trigger thresholds, immediate path optimization is performed through a multi-agent collaborative reasoning mechanism to generate and output the optimized learning path. At the same time, the learner ability profile constructed in S2 and the knowledge point association weights of the structured knowledge graph constructed in S3 are updated in reverse based on the feedback data.
S6. Using the actual execution data of the optimized learning path output in S5 as the evaluation input, conduct a multi-dimensional quantitative evaluation of the learner's learning effect at a preset period, calculate the average score of the evaluation indicators, and compare it with a preset evaluation threshold. If the average score is lower than the evaluation threshold, trigger full-process iterative path optimization, adjust the core parameters of the path planning, and feed the optimized parameters back to S4. Based on the updated learner ability profile, the updated structured knowledge graph, and the optimized parameters, S4 iteratively generates a new personalized learning path, forming a closed-loop learning path planning system of "collection-planning-execution-evaluation-optimization".

2. The AI-based learning path planning method according to claim 1, characterized in that, in S1, the multi-dimensional learning data includes knowledge mastery status data, learning behavior data, learning preference data, and learning goal data; the knowledge mastery status data is collected through online quizzes, offline tests, and knowledge point retests, and includes the correct answer rate, types of wrong questions, and time spent answering questions for each knowledge point; the learning behavior data is collected through the learning platform, and includes video learning duration, number of pauses / fast forwards / rewinds, participation in interactive quizzes, and note-taking activities; the learning preference data is derived from learner questionnaires and learning behaviors, and includes preferred learning formats, learning time periods, and learning pace; the learning goal data is set by the learners themselves, and includes short-term goals, long-term goals, and difficulty levels.

3. The AI-based learning path planning method according to claim 2, characterized in that, in S2, the improvement to the convolutional neural network (CNN) is the introduction of a temporal convolution module and an attention mechanism. The improved convolutional layer formula (formula not reproduced) is defined in terms of: the attention mechanism function; the temporal feature vector of the input data; the output feature value at a given position of a given convolutional layer; the activation function; the convolution kernel size; the weight of that layer's convolution kernel at a given position; the value of that layer's input feature map at a given position; and the bias term of that convolutional layer. The improvements to the knowledge tracing (KT) model include dynamically adjusting the personalized learning probability and error probability based on the learner's learning efficiency features, achieving multi-knowledge-point association tracing by incorporating the knowledge point dependency weights from the knowledge graph, introducing a forgetting probability to correct the state transition probabilities, and incorporating answering time into the observation probability calculation. The feature vector of the learner ability profile is [mean knowledge mastery, learning efficiency, learning preference weights, goal achievement progress].

4. The AI-based learning path planning method according to claim 3, characterized in that, in S3, the improved graph neural network (GNN) further includes feature fusion processing of the initial node feature matrix, incorporating the average answer accuracy and average mastery data of the knowledge points; the node sampling strategy samples only the Top-K neighboring nodes of each knowledge point for feature aggregation, with K defaulting to 10; the improved GNN adopts the ELU activation function, and an L1 regularization term is added to the model loss function, with a regularization coefficient λ = 0.002; the structured knowledge graph is stored in a separate knowledge model module, which supports real-time access, node updates, and association weight adjustments to the structured knowledge graph.

5. The AI-based learning path planning method according to claim 4, characterized in that, in S4, the reinforcement learning algorithm uses a dual reward function based on the learner's ability improvement rate and the goal achievement rate (formula not reproduced), the quantities in which are the total reinforcement learning reward, the two weighting coefficients, the ability improvement rate, and the goal achievement rate; the fusion mechanism between the RAG subject vector knowledge base and the knowledge graph associates knowledge points in the knowledge graph with learning resources in the RAG subject vector knowledge base, uses cosine similarity to calculate the degree of association between knowledge points, and gives priority to recommending basic knowledge points that have a relevance of ≥ 0.8 to the learner's weak points and that conform to the knowledge point dependencies.

6. The AI-based learning path planning method according to claim 5, characterized in that, in S5, the feedback data includes the correct answer rate, learning time deviation, knowledge point mastery retest data, and the learner's proactive adjustment requests; the optimization trigger thresholds are a correct answer rate of less than 70%, a learning time deviation of more than ±30 minutes, and a retest correct answer rate of less than 80%; the multi-agent collaborative reasoning mechanism includes a feedback data evaluation agent, a path optimization planning agent, and an optimization result push feedback agent, and the three agents work together to complete the identification of weaknesses, the path adjustment, and the pushing of optimization results.

7. The AI-based learning path planning method according to claim 6, characterized in that, in S6, the preset evaluation cycle is once a week; the multi-dimensional quantitative evaluation indicators include knowledge mastery assessment, learning efficiency assessment, goal achievement rate assessment, and learning experience assessment, each quantified from 0 to 100 points; the evaluation threshold is 80 points; the full-process iterative path optimization includes adjusting the weight coefficients of the hybrid algorithm, the allocation of knowledge point learning time, and the association weights of the knowledge graph.

8. An AI-based learning path planning system, applied to the AI-based learning path planning method according to any one of claims 1-7, characterized in that it comprises:
a data acquisition module, used to execute step S1, which uses a multimodal acquisition method to collect multi-dimensional learning data from learners, encrypts and stores the data, and outputs it to the learner model building module;
a learner model building module, connected to the data acquisition module and used to execute step S2, which processes the multi-dimensional learning data through an improved convolutional neural network (CNN) and a knowledge tracing (KT) model, and builds and outputs the learner ability profile to the initial path generation module;
a knowledge model building module, used to execute step S3, which builds a structured knowledge graph of the target learning domain and outputs the knowledge graph to the initial path generation module and the dynamic path optimization module;
an initial path generation module, connected to the learner model building module and the knowledge model building module and used to execute step S4, which combines the learner ability profile, the structured knowledge graph, and the learning target data to generate an initial learning path through an artificial intelligence hybrid planning algorithm and outputs it to the dynamic path optimization module;
a dynamic path optimization module, connected to the initial path generation module, the learner model building module, and the knowledge model building module and used to execute step S5, which collects learning feedback data, performs dynamic path optimization, and at the same time updates in reverse the ability profile in the learner model building module and the knowledge graph association weights in the knowledge model building module;
an effect evaluation and closed-loop optimization module, connected to the dynamic path optimization module and the initial path generation module and used to execute step S6, which evaluates the learning effect in multiple dimensions, triggers closed-loop optimization based on the evaluation results, and adjusts the planning parameters of the initial path generation module.

9. A computer device, characterized in that it comprises: a processor configured to be coupled to a memory and to read and execute instructions and / or program code in the memory to perform the method as described in any one of claims 1-7.

10. A computer-readable medium, characterized in that the computer-readable medium stores computer program code that, when executed on a computer, causes the computer to perform the method as described in any one of claims 1-7.