A learning plan intelligent generation method based on big data analysis

By combining multi-source learning data analysis with a double-dueling PER-D3QN model, the method constructs a learning task graph and enables intelligent generation and dynamic adjustment of learning plans. This addresses the insufficient personalization and lack of dynamic adjustment in existing technologies, and improves the adaptability and execution effectiveness of learning plans.

CN122198554A · Pending · Publication Date: 2026-06-12 · SHENZHEN CHENZHAN EDUCATION TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: SHENZHEN CHENZHAN EDUCATION TECH CO LTD
Filing Date: 2026-05-12
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing learning plan development methods are ineffective in reflecting individual differences when dealing with multiple learners, resulting in low planning efficiency, insufficient personalization, and a lack of dynamic adjustment capabilities.

Method used

Multi-source learning data is analyzed to construct learner feature vectors and a learning task graph. A double-dueling PER-D3QN model then decides the time allocation and ordering of learning tasks to form an intelligent learning plan, which is dynamically adjusted during execution.

Benefits of technology

It improves the personalization and effectiveness of learning plans, ensures the rationality and adaptability of the plan's timing, and enables timely responses to changes in the learning process.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a method for intelligently generating learning plans based on big data analysis, comprising the following steps: collecting and preprocessing multi-source learning data to obtain a standardized learning dataset; constructing a learner feature vector; acquiring course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships, and constructing a learning task graph; inputting the learner feature vector into a learning status assessment model to obtain a learning status assessment result; obtaining and filtering a candidate learning task set; inputting the filtered candidate learning task set into a double-dueling PER-D3QN model to form an initial learning plan; continuously collecting new learning behavior data and execution feedback data for the learner, and monitoring the execution of the initial learning plan to form an execution monitoring result; and updating the learner feature vector and the learning status assessment result to generate an adjusted learning plan, thereby realizing intelligent generation and dynamic adjustment of the learning plan.

Description

Technical Field

[0001] This invention relates to the field of big data analytics and intelligent education, and in particular to a method for intelligently generating learning plans based on big data analytics.

Background Technology

[0002] With the rapid development of internet, cloud computing, and big data analytics, educational informatization is gradually evolving from a resource digitization stage to a data-driven, intelligent stage. Currently, online learning platforms, smart education platforms, course management platforms, and mobile learning terminals are widely used in teaching activities. Learners continuously generate a large amount of learning data during the learning process, including course access records, knowledge point practice records, resource browsing records, learning duration records, interaction behavior records, and stage assessment records. This learning data is characterized by diverse sources, complex structures, frequent updates, and strong temporal sequence, providing a data foundation for learning status analysis, learning ability assessment, and learning plan generation. Against this backdrop, how to utilize big data analytics to finely depict the learning process of learners and further realize the intelligent generation of learning plans has become an important research direction in the field of intelligent education technology.

[0003] In existing technologies, learning plans are mainly developed through three methods: manual development, rule-template development, and semi-automatic development based on a single metric. In manual development, teachers, tutors, or the learners themselves arrange learning tasks based on experience. While this method offers some flexibility, it suffers from low efficiency, insufficient individual adaptability, and delayed adjustments when dealing with a large number of learners, diverse learning goals, and complex relationships between knowledge points. Rule-template development presets fixed learning paths, task sequences, and time allocation rules, then assigns each learner to the corresponding template. This method can improve plan generation efficiency to some extent, but because it relies heavily on static rules, it struggles to reflect differences in knowledge mastery, learning ability, learning preferences, and learning pace among learners. Semi-automatic development based on a single metric recommends learning tasks from a limited number of indicators such as test scores, course completion progress, or historical task completion rates. While this method offers stronger data support than manual development, the limited dimensionality of its input features often yields only locally optimal outcomes, making it difficult to create a complete, continuous, and adaptable learning plan.

Summary of the Invention

[0004] One objective of this invention is to propose a method for intelligently generating learning plans based on big data analysis. The invention combines multi-source learning data analysis, learning feature modeling, learning task graph construction, and the double-dueling PER-D3QN model. Its advantages are accurate learning state recognition, a clearly organized learning task structure, a highly intelligent plan generation process, and strong dynamic adjustment capability, which together improve the personalization and execution effectiveness of learning plans.

[0005] A method for intelligently generating learning plans based on big data analysis according to an embodiment of the present invention includes the following steps: collecting and preprocessing multi-source learning data to obtain a standardized learning dataset; extracting multidimensional learning features of the learner from the standardized learning dataset to construct a learner feature vector; obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships to construct a learning task graph; inputting the learner feature vector into the learning status assessment model to identify the learner's current learning status and obtain a learning status assessment result; obtaining a candidate learning task set based on the learning status assessment result and the learning task graph, and filtering it; inputting the filtered candidate learning task set into the double-dueling PER-D3QN model, which outputs a time allocation decision and a sequence arrangement decision for each candidate learning task to form an initial learning plan; during execution of the initial learning plan, continuously collecting new learning behavior data and execution feedback data for the learner, and monitoring the execution of the initial learning plan to form an execution monitoring result; and, when the execution monitoring result meets the preset adjustment threshold, updating the learner feature vector and the learning status assessment result and generating an adjusted learning plan.

[0006] Optionally, the multi-source learning data includes basic information data, historical performance data, course learning record data, knowledge point practice data, learning duration data, resource access data, interaction behavior data, and stage assessment data. The preprocessing includes data cleaning, outlier removal, missing value completion, timestamp alignment, data deduplication, format conversion, standardization processing, and unified encoding processing.

[0007] Optionally, constructing the learner feature vector specifically includes: obtaining, from the standardized learning dataset, the basic information data, knowledge point practice data, course learning record data, resource access data, learning duration data, interaction behavior data, and stage assessment data corresponding to the learner; extracting basic learning features from the basic information data to obtain a basic learning feature vector; extracting knowledge mastery features from the knowledge point practice data and the stage assessment data to obtain a knowledge mastery feature vector; extracting learning ability features from the course learning record data, knowledge point practice data, and stage assessment data to obtain a learning ability feature value; extracting learning preference features from the resource access data to obtain a learning preference feature vector; extracting learning activity features from the learning duration data and interaction behavior data to obtain a learning activity feature value; extracting learning stability features from the course learning record data and learning duration data to obtain a learning stability feature value; and concatenating the basic learning feature vector, knowledge mastery feature vector, learning ability feature value, learning preference feature vector, learning activity feature value, and learning stability feature value to obtain the learner feature vector.

[0008] Optionally, constructing the learning task graph specifically includes: obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships; determining the course unit set from the course structure information, the knowledge point set from the knowledge point association relationships, and the learning task set from the teaching resource mapping relationships; constructing a course unit node set from the course unit set, a knowledge point node set from the knowledge point set, and a learning task node set from the learning task set, which together form the node set; establishing the inclusion relationships between course unit nodes from the course structure information, the sequential constraint relationships between knowledge point nodes from the knowledge point association relationships, the teaching resource mapping relationships between knowledge point nodes and learning task nodes from the teaching resource mapping relationships, and the learning objective constraint relationships between learning task nodes and learning objectives from the learning objective constraint relationships, which together form the edge set; converting the sequential constraint relationships between knowledge point nodes into a sequential constraint matrix; converting the teaching resource mapping relationships between knowledge point nodes and learning task nodes into a teaching resource mapping matrix; converting the learning objective constraint relationships between learning task nodes and learning objectives into a learning objective constraint vector; and constructing the learning task graph from the node set, edge set, sequential constraint matrix, teaching resource mapping matrix, and learning objective constraint vector.
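As an illustration only, the node set, edge set, and the matrix and vector encodings described above could be assembled along the following lines. This is a minimal sketch, not the patent's implementation: the function name `build_task_graph`, the edge labels, the 0/1 encoding, and the sample course content are all assumptions made for the example.

```python
def build_task_graph(units, unit_hierarchy, knowledge_points, precedence,
                     tasks, resource_map, target_tasks):
    """Assemble a learning task graph (hypothetical structure, for illustration)."""
    # Node set: course unit nodes, knowledge point nodes, learning task nodes.
    nodes = ([("unit", u) for u in units]
             + [("kp", k) for k in knowledge_points]
             + [("task", t) for t in tasks])

    kp_idx = {k: i for i, k in enumerate(knowledge_points)}
    task_idx = {t: j for j, t in enumerate(tasks)}

    # Edge set: inclusion, sequential constraint, resource mapping, objective constraint.
    edges = [("contains", parent, child) for parent, child in unit_hierarchy]
    edges += [("precedes", a, b) for a, b in precedence]
    edges += [("maps_to", k, t) for k, t in resource_map]
    edges += [("targets", t, "objective") for t in tasks if t in target_tasks]

    # Sequential constraint matrix: P[i][j] = 1 if knowledge point i precedes j.
    P = [[0] * len(knowledge_points) for _ in knowledge_points]
    for a, b in precedence:
        P[kp_idx[a]][kp_idx[b]] = 1

    # Teaching resource mapping matrix: M[i][j] = 1 if task j serves knowledge point i.
    M = [[0] * len(tasks) for _ in knowledge_points]
    for k, t in resource_map:
        M[kp_idx[k]][task_idx[t]] = 1

    # Learning objective constraint vector: 1 for tasks required by the objective.
    g = [1 if t in target_tasks else 0 for t in tasks]

    return {"nodes": nodes, "edges": edges, "P": P, "M": M, "g": g}


graph = build_task_graph(
    units=["ch1", "ch1.1"],
    unit_hierarchy=[("ch1", "ch1.1")],
    knowledge_points=["fractions", "ratios", "percent"],
    precedence=[("fractions", "ratios"), ("ratios", "percent")],
    tasks=["quiz_fractions", "video_ratios", "drill_percent"],
    resource_map=[("fractions", "quiz_fractions"),
                  ("ratios", "video_ratios"),
                  ("percent", "drill_percent")],
    target_tasks={"drill_percent"},
)
```

Keeping the matrices alongside the typed edge list is one possible design: the edge list preserves the relationship types, while the matrices give constant-time lookups for later constraint checks.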

[0009] Optionally, obtaining the learning status assessment result specifically includes: A learning status assessment model is constructed, which includes an input layer, a feature mapping layer, a hidden layer, and a multi-task output layer connected in sequence. The input layer receives the learner feature vector, the feature mapping layer performs feature mapping on the learner feature vector, the hidden layer extracts the learning state representation, and the multi-task output layer includes a knowledge mastery output layer, a task completion ability level output layer, a learning pace level output layer, and a goal achievement progress output layer. The learner feature vector is input into the feature mapping layer, where it undergoes a linear transformation and nonlinear mapping through the layer's weight parameters, bias parameters, and activation function to obtain the first hidden representation. The first hidden representation is input into the hidden layer to obtain the learning state representation. The learning state representation is input into the knowledge mastery output layer, whose weight and bias parameters are used to calculate the learner's mastery of each knowledge point; after compression mapping, the knowledge mastery vector is obtained, and the set of weak knowledge points is determined from the knowledge mastery vector. The learning state representation is input into the task completion ability level output layer to obtain the task completion ability level, into the learning pace level output layer to obtain the learning pace level, and into the goal achievement progress output layer to obtain the goal achievement progress. The learning status assessment result is formed by combining the knowledge mastery vector, the set of weak knowledge points, the task completion ability level, the learning pace level, and the goal achievement progress.
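The forward pass of such a multi-task assessment model can be sketched as below. This is a hedged, untrained illustration: the text does not name the activation functions, the layer sizes, or the rule for picking weak knowledge points, so the ReLU activations, the sigmoid used for the compression mapping, the argmax over level scores, the 0.5 weak-point threshold, and all names (`init_model`, `assess`) are assumptions.

```python
import math
import random


def linear(x, layer):
    """Affine map W x + b for a layer given as (W, b)."""
    W, b = layer
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def argmax(v):
    return max(range(len(v)), key=v.__getitem__)


def init_layer(n_out, n_in, rng):
    # Random, untrained weights; a real model would learn these from data.
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def init_model(n_features, n_hidden, n_kp, n_levels, seed=0):
    rng = random.Random(seed)
    return {
        "map": init_layer(n_hidden, n_features, rng),     # feature mapping layer
        "hidden": init_layer(n_hidden, n_hidden, rng),    # hidden layer
        "mastery": init_layer(n_kp, n_hidden, rng),       # knowledge mastery head
        "ability": init_layer(n_levels, n_hidden, rng),   # ability level head
        "pace": init_layer(n_levels, n_hidden, rng),      # pace level head
        "progress": init_layer(1, n_hidden, rng),         # goal progress head
    }


def assess(x, model, weak_threshold=0.5):
    h1 = relu(linear(x, model["map"]))            # first hidden representation
    state = relu(linear(h1, model["hidden"]))     # learning state representation
    mastery = sigmoid(linear(state, model["mastery"]))  # compressed to (0, 1)
    return {
        "mastery": mastery,
        "weak_points": [i for i, m in enumerate(mastery) if m < weak_threshold],
        "ability_level": argmax(linear(state, model["ability"])),
        "pace_level": argmax(linear(state, model["pace"])),
        "goal_progress": sigmoid(linear(state, model["progress"]))[0],
    }


model = init_model(n_features=6, n_hidden=8, n_kp=4, n_levels=3)
result = assess([0.8, 0.4, 0.6, 0.3, 0.7, 0.9], model)
```

All four heads read the same learning state representation, which is what makes the output layer multi-task; only the head weights differ.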

[0010] Optionally, obtaining and filtering the candidate learning task set specifically includes: Extract learning task nodes from the learning task graph that correspond to the set of weak knowledge points and the progress of goal achievement in the learning status assessment results, and form an initial set of learning tasks. Based on the sequential constraints between the learning task nodes in the learning task graph, the initial learning task set is constrained and verified to obtain the candidate learning task set. Determine the task priority for each candidate learning task in the candidate learning task set; Based on the task priority of each candidate learning task, the candidate learning task set is filtered to obtain the filtered candidate learning task set.
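The extraction, constraint verification, prioritization, and filtering steps above might look like the following sketch. The text does not specify the priority rule or the verification rule, so this example assumes that a task passes verification only when all prerequisite knowledge points are already mastered, that priority equals one minus the mastery of the target knowledge point, and that filtering keeps the top `top_k` tasks. The function name, thresholds, and sample data are hypothetical.

```python
def select_candidates(tasks, mastery, prereqs, weak_threshold=0.6, top_k=2):
    """Pick learning tasks for weak knowledge points (illustrative rules only)."""
    # Weak knowledge points taken from the assessment result.
    weak = {kp for kp, m in mastery.items() if m < weak_threshold}

    # Initial learning task set: tasks mapped to weak knowledge points.
    initial = [t for t in tasks if t["kp"] in weak]

    # Constraint verification: every prerequisite knowledge point must be mastered.
    verified = [
        t for t in initial
        if all(mastery[p] >= weak_threshold for p in prereqs.get(t["kp"], []))
    ]

    # Assumed priority: the weaker the knowledge point, the higher the priority.
    ranked = sorted(verified, key=lambda t: 1.0 - mastery[t["kp"]], reverse=True)
    return ranked[:top_k]


mastery = {"fractions": 0.9, "ratios": 0.4, "percent": 0.3, "algebra": 0.5}
prereqs = {"ratios": ["fractions"], "percent": ["ratios"], "algebra": ["fractions"]}
tasks = [
    {"id": "t_fractions", "kp": "fractions"},
    {"id": "t_ratios", "kp": "ratios"},
    {"id": "t_percent", "kp": "percent"},
    {"id": "t_algebra", "kp": "algebra"},
]
selected = select_candidates(tasks, mastery, prereqs)
```

In this example the `percent` task is dropped even though it is the weakest point, because its prerequisite `ratios` has not been mastered yet.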

[0011] Optionally, forming the initial learning plan specifically includes: The filtered candidate learning task set is input into the double-dueling PER-D3QN model, which includes a constraint-aware state embedding layer, a graph-guided attention aggregation layer, a dual-stream dueling value decomposition layer, and a double-Q collaborative decision-making layer connected in sequence. In the constraint-aware state embedding layer, the filtered candidate learning task set is embedded using state mapping weight parameters, state mapping bias parameters, and an activation function to obtain the task embedding vector corresponding to each candidate learning task. In the graph-guided attention aggregation layer, task association weights between candidate learning tasks are calculated from the task sequence constraint relationships between them in the learning task graph. Using these weights, the task embedding vectors that have a task sequence constraint relationship with the current candidate learning task are aggregated to obtain the graph-enhanced task representation vector of each candidate learning task. The initial state embedding vector is concatenated and fused with each graph-enhanced task representation vector to obtain the joint state representation vector of each candidate learning task, and the joint state representation vectors are input into the dual-stream dueling value decomposition layer. The state value branch of this layer maps the initial state embedding vector, using its state value mapping weight and bias parameters, to the state value of the current learning plan generation state. The action advantage branch maps each joint state representation vector, using its action advantage mapping weight and bias parameters, to the action advantage of the corresponding candidate learning task. The double-Q collaborative decision-making layer aggregates the state value with the action advantage of each candidate learning task to obtain the action value of each candidate learning task. The action with the highest action value among all candidate learning task actions is selected as the target action, and an immediate reward value is generated based on the target action. The double-Q collaborative decision-making layer uses an online value network and a target value network. The online value network selects the action with the highest action value among all actions corresponding to the task feature data at the next time step. The target value network calculates the action value of that selected action at the next time step, and the immediate reward value is added to the discounted next-step action value to obtain the target action value at the current time step. The network parameters of the double-dueling PER-D3QN model are updated based on the difference between the target action value and the action value at the current time step. Based on the value of the target action, the time allocation decision and sequence arrangement decision for each candidate learning task are output to form the initial learning plan.
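The value computations in this paragraph can be illustrated numerically. The sketch below assumes the commonly used dueling aggregation (state value plus advantage minus the mean advantage) and the standard double-Q target, in which the online network picks the next action and the target network scores it. The text only says the state value and action advantages are "aggregated", so the mean-subtraction is an assumption. The text also names PER (prioritized experience replay) without describing it; the `per_priority` helper shows the usual proportional priority and is likewise an assumption, as are all function names and numbers.

```python
def dueling_q(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a), subtracting the mean advantage."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_q_target(reward, gamma, next_q_online, next_q_target, done=False):
    """Online network selects the next action; target network evaluates it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


def per_priority(td_error, alpha=0.6, eps=1e-3):
    """Proportional replay priority from the TD error (assumed PER variant)."""
    return (abs(td_error) + eps) ** alpha


# Action values for three candidate learning tasks.
q_now = dueling_q(state_value=1.0, advantages=[0.5, -0.5, 0.0])
target_action = max(range(len(q_now)), key=q_now.__getitem__)

# Target action value at the current step and the resulting TD error.
y = double_q_target(reward=1.0, gamma=0.9,
                    next_q_online=[0.2, 0.8, 0.1],
                    next_q_target=[0.5, 0.3, 0.9])
td_error = y - q_now[target_action]
priority = per_priority(td_error)
```

Note that the target uses the target network's value 0.3 for the action the online network preferred, not the target network's own maximum 0.9; decoupling selection from evaluation in this way is what reduces overestimation in double Q-learning.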

[0012] Optionally, forming the execution monitoring result specifically includes: During the execution of the initial learning plan, new learning behavior data and execution feedback data corresponding to the learner are continuously collected. The new learning behavior data and execution feedback data are processed by time alignment, field matching, and unified encoding, and are segmented and mapped according to the start and end times of each learning task in the initial learning plan and the order of task execution to obtain the execution monitoring data corresponding to each learning task. Based on the execution monitoring data, the completion status of each learning task, the deviation from the initial learning plan, the changes in knowledge point mastery, and the changes in goal achievement are monitored. The execution monitoring result is formed by summarizing the task completion status, plan deviation, changes in knowledge point mastery, and changes in goal achievement.
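One way to picture the segmentation and per-task monitoring is the sketch below. It assumes that events already carry a matched `task_id` field, that times are minutes since the plan started, and that plan deviation is the relative difference between time spent and time planned. None of these details are fixed by the text, and the record layout is hypothetical.

```python
def monitor_execution(plan, events):
    """Map feedback events onto planned task windows and summarize each task."""
    report = {}
    for task in plan:
        # Segment events by task and by the task's planned start/end window.
        segment = [
            e for e in events
            if e["task_id"] == task["id"] and task["start"] <= e["t"] < task["end"]
        ]
        spent = sum(e["minutes"] for e in segment)
        report[task["id"]] = {
            "completed": any(e["done"] for e in segment),
            "minutes_spent": spent,
            # Relative deviation from the planned duration (assumed definition).
            "deviation": (spent - task["planned"]) / task["planned"],
        }
    return report


plan = [
    {"id": "t1", "start": 0, "end": 60, "planned": 40},
    {"id": "t2", "start": 60, "end": 120, "planned": 30},
]
events = [
    {"task_id": "t1", "t": 5, "minutes": 25, "done": False},
    {"task_id": "t1", "t": 35, "minutes": 25, "done": True},
    {"task_id": "t2", "t": 70, "minutes": 15, "done": False},
    {"task_id": "t2", "t": 130, "minutes": 20, "done": True},  # outside the window
]
report = monitor_execution(plan, events)
```

The last event falls outside the planned window for `t2`, so it is excluded from the segment and `t2` is reported as not completed within its slot.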

[0013] Optionally, generating the adjusted learning plan specifically includes: Based on the execution monitoring result, plan adjustment data are generated and the adjustment judgment value is calculated. The adjustment judgment value is compared with the preset adjustment threshold. When the adjustment judgment value meets the preset adjustment threshold, the new learning behavior data, execution feedback data, learner feature vector, learning status assessment result, and learning task graph are read. Based on the new learning behavior data and execution feedback data, the basic learning features, knowledge mastery features, learning ability features, learning preference features, learning activity features, learning stability features, and learning goal features in the learner feature vector are updated to obtain the updated learner feature vector. The updated learner feature vector is input into the learning status assessment model, and the learning status identification process is re-executed to obtain the updated learning status assessment result. Based on the updated learning status assessment result, the learning task nodes corresponding to the updated set of weak knowledge points and the updated goal achievement progress are re-extracted to form an updated initial learning task set. Based on the sequential constraint relationships between the learning task nodes in the learning task graph, the updated initial learning task set is constraint-verified and re-filtered to obtain an updated candidate learning task set. The updated candidate learning task set is input into the double-dueling PER-D3QN model, which outputs the time allocation decision and sequence arrangement decision for each candidate learning task. Task time periods are then reallocated, task order is rearranged, and task resources are rematched for the updated candidate learning task set to generate the adjusted learning plan.
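The text does not give a formula for the adjustment judgment value. Purely as an illustration, the sketch below assumes it is a weighted sum of four monitoring signals (share of incomplete tasks, mean absolute plan deviation, drop in mastery, and gap to the goal) and that "meeting" the threshold means reaching or exceeding it. The weights, the threshold, and the function names are hypothetical.

```python
def adjustment_value(completed_flags, deviations, mastery_drop, progress_gap,
                     weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted combination of monitoring signals (assumed form)."""
    incomplete_rate = 1.0 - sum(completed_flags) / len(completed_flags)
    mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)
    signals = (incomplete_rate, mean_abs_dev, mastery_drop, progress_gap)
    return sum(w * s for w, s in zip(weights, signals))


def needs_adjustment(value, threshold=0.3):
    """Trigger re-planning when the judgment value reaches the threshold."""
    return value >= threshold


value = adjustment_value(
    completed_flags=[True, False],
    deviations=[0.25, -0.5],
    mastery_drop=0.1,
    progress_gap=0.2,
)
replan = needs_adjustment(value)
```

When `replan` is true, the loop described above would restart from feature updating and run through assessment, candidate selection, and scheduling again.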

[0014] The beneficial effects of this invention are: This invention, through the collection, preprocessing, and standardization of multi-source learning data, further extracts multi-dimensional learning features of learners and constructs learner feature vectors. This allows learning plan generation to no longer rely on a single performance indicator or fixed rule templates, but rather comprehensively reflects the learner's true state across multiple dimensions, including learning foundation, knowledge mastery, learning ability, learning preferences, learning activity, learning stability, and learning goals. Therefore, this invention significantly improves the comprehensiveness and accuracy of learning state identification, providing a more reliable data foundation for subsequent candidate learning task selection and learning plan generation, thereby overcoming the problems of coarse-grained learning state assessment and insufficient feature representation in existing technologies.

[0015] This invention constructs a learning task graph by acquiring course structure information, knowledge point relationships, teaching resource mapping relationships, and learning objective constraints. It organizes and associates course units, knowledge points, and learning tasks within a unified graph structure, clearly expressing the sequential constraints, knowledge point dependencies, and resource mapping relationships between learning tasks. Based on this learning task graph, this invention can more accurately locate learning tasks corresponding to weak knowledge points and target achievement progress based on learning status assessment results. It also performs constraint verification and screening for learning tasks that do not meet prerequisites, effectively avoiding the problems of isolated learning task organization, unbalanced task order, and insufficient connection between tasks in existing technologies. This improves the rationality of learning task selection and the completeness of the plan structure.

[0016] This invention further inputs the selected set of candidate learning tasks into the double-dueling PER-D3QN model. By making time allocation and sequence scheduling decisions for the candidate learning tasks, intelligent decision optimization is achieved during the learning plan generation process. Compared to traditional methods based on fixed rules, simple sorting, or experience-based configuration, this invention can comprehensively consider learning status assessment results, task sequence constraints, task priorities, learning resource matching, and time constraints to perform more refined joint scheduling of candidate learning tasks. Therefore, it can improve the personalization, temporal rationality, and overall optimization capability of the generated learning plan, making the resulting initial learning plan more in line with the learner's current actual learning needs and goal achievement requirements.

[0017] This invention continuously collects new learning behavior data and execution feedback data during the initial learning plan execution process, and monitors task completion, plan deviation, changes in knowledge mastery, and goal achievement. When adjustment conditions are met, the learner's feature vector and learning status evaluation results are updated. Based on the updated results, candidate learning tasks are re-selected, time allocation is rearranged, and the order is reordered to generate an adjusted learning plan. Through this dynamic closed-loop mechanism, this invention can respond promptly to changes in the learner's learning process, overcoming the problem of existing technologies lacking process monitoring and automatic adjustment capabilities after learning plan generation. This allows the learning plan to be continuously revised and optimized according to the learner's execution status, thereby improving the adaptability, continuity, and execution effectiveness of the learning plan.

[0018] This invention realizes a complete technical chain from multi-source learning data processing, learning status assessment, learning task graph construction, and candidate learning task screening to intelligent generation and dynamic adjustment of learning plans. It not only improves the automation and intelligence level of learning plan generation, but also enhances the learning plan's adaptability to individual differences and its responsiveness to changes in the execution process. It has the beneficial effects of more accurate learning status identification, more reasonable task organization, more intelligent plan generation, and more timely dynamic adjustment.

Attached Figure Description

[0019] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings: Figure 1 is an overall flowchart of the method for intelligently generating learning plans based on big data analysis proposed in this invention; Figure 2 is a schematic diagram of the construction of the learning task graph in the method; Figure 3 is a schematic diagram of the formation of the initial learning plan in the method.

Detailed Implementation

[0020] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0021] Referring to Figures 1-3, a method for intelligently generating learning plans based on big data analysis includes the following steps: collecting and preprocessing multi-source learning data to obtain a standardized learning dataset; extracting multidimensional learning features of the learner from the standardized learning dataset to construct a learner feature vector; obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships to construct a learning task graph; inputting the learner feature vector into the learning status assessment model to identify the learner's current learning status and obtain a learning status assessment result; obtaining a candidate learning task set based on the learning status assessment result and the learning task graph, and filtering it; inputting the filtered candidate learning task set into the double-dueling PER-D3QN model, which outputs a time allocation decision and a sequence arrangement decision for each candidate learning task to form an initial learning plan; during execution of the initial learning plan, continuously collecting new learning behavior data and execution feedback data for the learner, and monitoring the execution of the initial learning plan to form an execution monitoring result; and, when the execution monitoring result meets the preset adjustment threshold, updating the learner feature vector and the learning status assessment result and generating an adjusted learning plan.

[0022] In this embodiment, the multi-source learning data includes basic information data, historical performance data, course learning record data, knowledge point practice data, learning duration data, resource access data, interaction behavior data, and stage assessment data. Preprocessing includes data cleaning, outlier removal, missing value completion, timestamp alignment, data deduplication, format conversion, standardization processing, and unified encoding processing.
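A few of the listed preprocessing steps can be sketched on a toy record set as follows. The record schema (`learner_id`, `task_id`, `timestamp`, `minutes`), the 600-minute plausibility bound, the median fill, and the min-max scaling are all assumptions chosen for the example; the text lists the step names but not how each is performed.

```python
from datetime import datetime, timezone
from statistics import median

MAX_MINUTES = 600  # assumed plausibility bound for a single study session


def preprocess(records):
    """Deduplicate, align, clean, fill, and scale study-session records."""
    # Data deduplication on an assumed natural key.
    seen, rows = set(), []
    for r in records:
        key = (r["learner_id"], r["task_id"], r["timestamp"])
        if key not in seen:
            seen.add(key)
            rows.append(dict(r))

    # Timestamp alignment: parse ISO 8601 strings and convert to UTC.
    for r in rows:
        r["timestamp"] = datetime.fromisoformat(r["timestamp"]).astimezone(timezone.utc)

    # Outlier removal: drop known durations outside the plausible range.
    rows = [r for r in rows
            if r["minutes"] is None or 0 < r["minutes"] <= MAX_MINUTES]

    # Missing-value completion: fill gaps with the median observed duration.
    fill = median(r["minutes"] for r in rows if r["minutes"] is not None)
    for r in rows:
        if r["minutes"] is None:
            r["minutes"] = fill

    # Standardization: min-max scale durations to [0, 1].
    lo = min(r["minutes"] for r in rows)
    span = (max(r["minutes"] for r in rows) - lo) or 1
    for r in rows:
        r["minutes_scaled"] = (r["minutes"] - lo) / span
    return rows


raw = [
    {"learner_id": "u1", "task_id": "t1", "timestamp": "2026-05-01T08:00:00+08:00", "minutes": 30},
    {"learner_id": "u1", "task_id": "t1", "timestamp": "2026-05-01T08:00:00+08:00", "minutes": 30},
    {"learner_id": "u1", "task_id": "t2", "timestamp": "2026-05-01T09:00:00+08:00", "minutes": None},
    {"learner_id": "u1", "task_id": "t3", "timestamp": "2026-05-01T10:00:00+08:00", "minutes": 50},
    {"learner_id": "u1", "task_id": "t4", "timestamp": "2026-05-01T11:00:00+08:00", "minutes": 5000},
]
clean = preprocess(raw)
```

Outliers are removed before missing values are filled so that an implausible duration does not distort the fill value.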

[0023] In this embodiment, constructing the learner feature vector specifically includes: Basic information data, knowledge point practice data, course learning record data, resource access data, learning duration data, interaction behavior data, and stage assessment data corresponding to the learner are obtained from the standardized learning dataset. Basic learning features are extracted from the basic information data to obtain a basic learning feature vector, which includes feature values corresponding to grade, major category, course category, learning stage, and terminal usage attributes. Knowledge mastery features are extracted from the knowledge point practice data and stage assessment data to obtain a knowledge mastery feature vector, each dimension of which is the knowledge mastery feature value of one knowledge point. The knowledge mastery feature value is a weighted sum of the accuracy rate, completion rate, and normalized average response time of the corresponding knowledge point, with the three weight coefficients summing to 1. Learning ability features are extracted from the course learning record data, knowledge point practice data, and stage assessment data to obtain a learning ability feature value. The learning ability feature value is a weighted sum of the normalized average stage assessment score, the on-time completion rate of learning tasks, and the efficiency value of completing effective learning tasks per unit time, with the three weight coefficients summing to 1. Learning preference features are extracted from the resource access data to obtain a learning preference feature vector, each dimension of which is the learning preference feature value of one type of learning resource. The learning preference feature value is the ratio of the number of times the learner accesses the corresponding type of learning resource to the total number of accesses across all types of learning resources. Learning activity features are extracted from the learning duration data and interaction behavior data to obtain a learning activity feature value. The learning activity feature value is a weighted sum of the normalized effective learning duration, number of interactions, and resource access frequency within a preset period, with the three weight coefficients summing to 1. Learning stability features are extracted from the course learning record data and learning duration data to obtain a learning stability feature value, which is determined by calculating the mean and standard deviation of the learning duration of each learning unit within a preset period and then deriving the value from the ratio of the standard deviation to the mean. The learner feature vector is obtained by concatenating the basic learning feature vector, knowledge mastery feature vector, learning ability feature value, learning preference feature vector, learning activity feature value, and learning stability feature value.
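The feature formulas in this paragraph reduce to a handful of small computations, sketched below with made-up numbers. Several details are assumptions: the specific weight values, the treatment of normalized response time as already oriented so that higher is better, the use of population standard deviation, and the mapping from the standard-deviation-to-mean ratio to a stability value (taken here as 1 / (1 + ratio)), which the text leaves open.

```python
from statistics import mean, pstdev


def weighted_sum(values, weights):
    """Weighted combination whose weight coefficients must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weight coefficients must sum to 1")
    return sum(v * w for v, w in zip(values, weights))


def preference_vector(access_counts):
    """Share of accesses per resource type."""
    total = sum(access_counts.values())
    return {kind: n / total for kind, n in access_counts.items()}


def stability(unit_durations):
    """Map the std/mean ratio to (0, 1]; the mapping itself is an assumption."""
    ratio = pstdev(unit_durations) / mean(unit_durations)
    return 1.0 / (1.0 + ratio)


def learner_feature_vector(basic, mastery, ability, preference, activity, stab):
    """Concatenate the six feature groups into one flat vector."""
    return list(basic) + list(mastery) + [ability] + list(preference) + [activity, stab]


# Accuracy, completion rate, normalized response time for one knowledge point.
mastery_kp = weighted_sum((0.8, 0.9, 0.5), (0.5, 0.3, 0.2))
# Normalized assessment score, on-time completion rate, efficiency value.
ability = weighted_sum((0.7, 0.6, 0.5), (0.4, 0.3, 0.3))
preference = preference_vector({"video": 6, "quiz": 3, "text": 1})
# Normalized effective duration, interaction count, access frequency.
activity = weighted_sum((0.5, 0.2, 0.3), (0.4, 0.3, 0.3))
stab = stability([30, 30, 30])

vector = learner_feature_vector(
    basic=[1.0, 0.0], mastery=[mastery_kp], ability=ability,
    preference=preference.values(), activity=activity, stab=stab,
)
```

Perfectly regular study durations give a ratio of 0 and therefore the maximum stability value of 1.0 under the assumed mapping.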

[0024] In this embodiment, the construction of the learning task graph specifically includes: obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships; and determining the course unit set from the course structure information, the knowledge point set from the knowledge point association relationships, and the learning task set from the teaching resource mapping relationships. The acquisition process is as follows. Course outline data, chapter directory data, knowledge point annotation data, teaching resource tag data, and learning objective configuration data corresponding to the target course are read from the teaching management platform, the course resource platform, and the learning plan configuration library. The read data undergoes field extraction, format standardization, identifier mapping, and association processing. The hierarchical relationships between course units are extracted from the course outline data and chapter directory data to form the course structure information. Precedence, succession, and parallel relationships between knowledge points are extracted from the knowledge point annotation data and knowledge point dependency configuration data to form the knowledge point association relationships. The correspondence between knowledge points and learning resources or tasks is extracted from the teaching resource tag data, knowledge point annotation results, and task type configuration data to form the teaching resource mapping relationships. The constraints linking learning tasks to the target completion time limit, target knowledge point coverage, target score range, and target task quantity are extracted from the learning objective configuration data, task completion condition data, and course progress constraint data to form the learning objective constraint requirements.
These requirements are then combined with the course unit set, knowledge point set, and learning task set to form the learning objective constraint relationships. A course unit node set is constructed from the course unit set, a knowledge point node set from the knowledge point set, and a learning task node set from the learning task set, and together these form the node set. The construction process is as follows. Each element in the course unit set, knowledge point set, and learning task set is uniquely identified and encoded, and a corresponding node identifier is assigned to each course unit, knowledge point, and learning task. Node objects are generated according to a predefined node data structure. Each node object contains a node identifier, node name, node type, and node attributes. The node type distinguishes course unit nodes, knowledge point nodes, and learning task nodes, while the node attributes record the corresponding descriptive information. For each course unit in the course unit set, the unit name, chapter, hierarchical position, and unit description are extracted and encapsulated as a course unit node, forming the course unit node set. For each knowledge point in the knowledge point set, the knowledge point name, course unit, difficulty level, and knowledge point description are extracted and encapsulated as a knowledge point node, forming the knowledge point node set. For each learning task in the learning task set, the task name, task type, corresponding knowledge point, estimated completion time, and task completion conditions are extracted and encapsulated as a learning task node, forming the learning task node set.
Inclusion relationships between course unit nodes are established from the course structure information, sequential constraint relationships between knowledge point nodes from the knowledge point association relationships, teaching resource mapping relationships between knowledge point nodes and learning task nodes from the teaching resource mapping relationships, and learning objective constraint relationships between learning task nodes and learning objectives from the learning objective constraint relationships, thus forming an edge set. The establishment process is as follows. For the inclusion relationships, based on the chapter hierarchy, unit affiliation, and unit sequence in the course structure information, each higher-level course unit node is pointed to its lower-level course unit nodes, establishing inclusion relationship edges between course unit nodes. For the sequential constraint relationships, based on the prerequisite, subsequent, and dependent knowledge points configured in the knowledge point association relationships, each knowledge point node that serves as a prerequisite is pointed to the knowledge point nodes learned after it, establishing sequential constraint relationship edges between knowledge point nodes. For the teaching resource mapping relationships, based on the correspondence between knowledge points and learning tasks recorded in the teaching resource mapping relationships, each knowledge point node is connected to the learning task nodes that carry out the learning, practice, assessment, or review of that knowledge point, establishing teaching resource mapping relationship edges between knowledge point nodes and learning task nodes. For the learning objective constraint relationships, based on the target completion time limit, target knowledge point coverage, target score range, and target task quantity requirements recorded in the learning objective constraint relationships, the learning task nodes that meet the corresponding requirements are connected to the corresponding learning objectives, establishing learning objective constraint relationship edges between learning task nodes and learning objectives.
The sequential constraint relationships between knowledge point nodes are converted into matrix form to obtain the sequential constraint matrix. The rows and columns of the sequential constraint matrix correspond to the knowledge point nodes in the knowledge point node set, and each element indicates whether a sequential constraint relationship exists between the two corresponding knowledge point nodes. The teaching resource mapping relationships between knowledge point nodes and learning task nodes are converted into matrix form to obtain the teaching resource mapping matrix. Its rows correspond to the knowledge point nodes in the knowledge point node set and its columns to the learning task nodes in the learning task node set, and each element indicates whether a teaching resource mapping relationship exists between the corresponding knowledge point node and learning task node. The learning objective constraint relationships between learning task nodes and learning objectives are vectorized to obtain the learning objective constraint vector. The vectorization process is as follows. First, vector position indices corresponding to the learning task nodes are established according to their arrangement order.
Then, the learning objective constraint data associated with each learning task node is read one by one, and the target completion time constraint value, target knowledge point coverage constraint value, target score range constraint value, and target task quantity constraint value corresponding to the learning task are extracted. These values are converted according to a unified numerical conversion rule: the target completion time constraint value is normalized, the target knowledge point coverage constraint value is expressed as a proportion, the target score range constraint value is expressed as a numerical value after interval mapping, and the target task quantity constraint value is expressed as an integer or a normalized value. Next, the constraint values of each learning task node are written, in a preset combination order, into the vector position corresponding to that node, forming the target constraint feature value of each learning task node. Finally, the target constraint feature values are arranged in the node order of all learning task nodes to obtain the learning objective constraint vector. A learning task graph is constructed from the node set, edge set, sequential constraint matrix, teaching resource mapping matrix, and learning objective constraint vector. The construction process is as follows. With the node set as the basic entity set of the learning task graph, the course unit nodes, knowledge point nodes, and learning task nodes are written into the graph storage space according to a unified graph data structure.
Then, with the edge set as the set of connections between nodes, the inclusion relationship edges between course unit nodes, the sequential constraint relationship edges between knowledge point nodes, the teaching resource mapping relationship edges between knowledge point nodes and learning task nodes, and the learning objective constraint relationship edges between learning task nodes and learning objectives are associated with the corresponding nodes one by one. The sequential dependencies between knowledge points represented by the elements of the sequential constraint matrix are loaded into the relationship attributes between knowledge point nodes. The mappings between knowledge points and learning tasks represented by the elements of the teaching resource mapping matrix are loaded into the relationship attributes between knowledge point nodes and learning task nodes. Finally, the learning objective constraint values held in the elements of the learning objective constraint vector are written into the relationship attributes between learning task nodes, or between a learning task node and a learning objective, thus forming the learning task graph.
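As a rough illustration of the node structure and the two relationship matrices described in this paragraph, the following Python sketch builds a small graph fragment. The `Node` class, the `relation_matrix` helper, and all identifiers and sample nodes are hypothetical; the embodiment does not prescribe a storage format, and a 0/1 encoding of "whether a relationship exists" is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    name: str
    node_type: str  # "course_unit", "knowledge_point" or "learning_task"
    attributes: dict = field(default_factory=dict)


def relation_matrix(row_ids, col_ids, edges):
    """0/1 matrix: entry [i][j] is 1 when (row_ids[i], col_ids[j]) is an edge."""
    row_index = {node_id: i for i, node_id in enumerate(row_ids)}
    col_index = {node_id: j for j, node_id in enumerate(col_ids)}
    matrix = [[0] * len(col_ids) for _ in row_ids]
    for source, target in edges:
        matrix[row_index[source]][col_index[target]] = 1
    return matrix


# Hypothetical fragment of a course: three knowledge points, two tasks.
knowledge_points = [
    Node("kp1", "data cleaning", "knowledge_point", {"difficulty": 1}),
    Node("kp2", "statistics basics", "knowledge_point", {"difficulty": 2}),
    Node("kp3", "regression analysis", "knowledge_point", {"difficulty": 3}),
]
tasks = [
    Node("t1", "cleaning exercise", "learning_task", {"est_minutes": 30}),
    Node("t2", "regression case study", "learning_task", {"est_minutes": 60}),
]
kp_ids = [node.node_id for node in knowledge_points]
task_ids = [node.node_id for node in tasks]

# Prerequisite knowledge point -> knowledge point learned later.
sequence_edges = [("kp1", "kp3"), ("kp2", "kp3")]
# Knowledge point -> learning task that practises or assesses it.
mapping_edges = [("kp1", "t1"), ("kp3", "t2")]

sequence_matrix = relation_matrix(kp_ids, kp_ids, sequence_edges)
mapping_matrix = relation_matrix(kp_ids, task_ids, mapping_edges)
```

Row `i`, column `j` of `sequence_matrix` is 1 when knowledge point `i` must precede knowledge point `j`, which matches the direction in which prerequisite nodes point to later nodes in the edge set.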

[0025] In this embodiment, obtaining the learning state evaluation result specifically includes: constructing a learning state evaluation model that comprises an input layer, a feature mapping layer, a hidden layer, and a multi-task output layer connected in sequence. The input layer receives the learner feature vector, the feature mapping layer performs feature mapping on the learner feature vector, the hidden layer extracts the learning state representation, and the multi-task output layer includes a knowledge mastery output layer, a task completion ability level output layer, a learning pace level output layer, and a goal achievement progress output layer. The learner feature vector is input into the feature mapping layer, where it undergoes a linear transformation and a nonlinear mapping through the weight parameters, bias parameters, and activation function of that layer to obtain the first hidden representation. The first hidden representation is input into the hidden layer, and features are extracted through the weight parameters, bias parameters, and activation function of the hidden layer to obtain the learning state representation. The learning state representation is input into the knowledge mastery output layer, whose weight and bias parameters are used to calculate the learning object's mastery of each knowledge point. After compression mapping, a knowledge mastery vector is obtained, each element of which represents the learning object's mastery of one knowledge point.
The set of weak knowledge points is determined from the knowledge mastery vector as follows: the mastery level of each knowledge point in the knowledge mastery vector is compared with a preset mastery threshold, and the knowledge points whose mastery level is below the threshold are identified as weak knowledge points and collected into the set of weak knowledge points. The learning state representation is input into the task completion ability level output layer. The task completion ability score is calculated through the weight and bias parameters of that layer, and the task completion ability level is obtained from the correspondence between the score and the preset level ranges. The learning state representation is input into the learning pace level output layer. The learning pace score is calculated through the weight and bias parameters of that layer, and the learning pace level is obtained from the correspondence between the score and the preset level ranges. The learning state representation is input into the goal achievement progress output layer. The goal achievement progress value is calculated through the weight and bias parameters of that layer, and the goal achievement progress is obtained through compression mapping. The learning state evaluation result is formed by combining the knowledge mastery, the set of weak knowledge points, the task completion ability level, the learning pace level, and the goal achievement progress.
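The forward pass of the evaluation model can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the model of the embodiment: ReLU is assumed for the unspecified activation function, a sigmoid is assumed for the "compression mapping", and the parameter values, the mastery threshold of 0.6, and the level boundaries are hypothetical.

```python
import math


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(x, weights, bias, activation=None):
    """One fully connected layer: activation(W x + b), with W as a list of rows."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [activation(v) for v in out] if activation else out


def score_to_level(score, upper_bounds):
    """Index of the first preset range whose upper bound exceeds the score."""
    for level, upper in enumerate(upper_bounds):
        if score < upper:
            return level
    return len(upper_bounds)


def assess(learner_vector, params, mastery_threshold=0.6, level_bounds=(0.4, 0.75)):
    first_hidden = dense(learner_vector, *params["feature_map"], relu)
    state = dense(first_hidden, *params["hidden"], relu)
    mastery = dense(state, *params["mastery"], sigmoid)
    return {
        "mastery": mastery,
        "weak_points": [i for i, m in enumerate(mastery) if m < mastery_threshold],
        "ability_level": score_to_level(dense(state, *params["ability"])[0], level_bounds),
        "pace_level": score_to_level(dense(state, *params["pace"])[0], level_bounds),
        "goal_progress": dense(state, *params["progress"], sigmoid)[0],
    }


# Hypothetical, hand-picked parameters as (weights, bias) pairs.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
params = {
    "feature_map": identity,
    "hidden": identity,
    "mastery": ([[2.0, 0.0], [-2.0, 0.0]], [0.0, 0.0]),
    "ability": ([[0.7, 0.0]], [0.0]),
    "pace": ([[0.9, 0.0]], [0.0]),
    "progress": ([[0.0, 0.0]], [0.0]),
}
result = assess([1.0, 0.0], params)
```

In this toy configuration the second knowledge point falls below the threshold and is reported as weak, which is the information the later task extraction step relies on.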

[0026] In this embodiment, obtaining and filtering the candidate learning task set specifically includes: extracting from the learning task graph the learning task nodes that correspond to the set of weak knowledge points and the goal achievement progress in the learning state evaluation result, forming an initial learning task set. Based on the sequential constraint relationships between the learning task nodes in the learning task graph, the initial learning task set is checked against the constraints to obtain the candidate learning task set. The specific process is as follows. Taking each learning task in the initial learning task set as the verification object, the sequential constraint relationship data corresponding to each learning task in the learning task graph is read one by one, and the set of preceding learning tasks, the set of succeeding learning tasks, and the task dependency order of each learning task are extracted. Based on the learning tasks the current learning object has already completed, the current learning state evaluation result, and the task distribution in the initial learning task set, the completion status of the preceding learning tasks of each learning task is matched and verified. A learning task is retained as satisfying the constraint relationship if its preceding learning tasks have been completed, or if they have been retained and occupy an earlier position in the current task arrangement. A learning task is removed from the initial learning task set if its preceding learning tasks are missing or incomplete, or if its task order is inconsistent with the sequential constraint relationship. The learning tasks that satisfy the sequential constraint relationship are collected to obtain the candidate learning task set. A task priority is then determined for each candidate learning task in the candidate learning task set.
The task priority is determined jointly from the sequential constraint relationship of the candidate learning task, its relevance to goal achievement progress, and its task importance. The sequential constraint relationship characterizes the degree of prerequisite dependence of the candidate learning task in the learning task graph. The relevance to goal achievement progress characterizes how strongly the candidate learning task affects the current goal achievement progress. The task importance characterizes the importance level of the candidate learning task within the learning goal. Based on the task priority of each candidate learning task, the candidate learning task set is filtered to obtain the filtered candidate learning task set. The filtering process is as follows. The task priorities of the candidate learning tasks are read one by one and sorted from highest to lowest to form a task priority sequence. Based on the learning object's current goal achievement progress, the available learning time, the estimated completion time of each task, and the task connections in the learning task graph, candidate learning tasks are selected in order from the task priority sequence. Candidate learning tasks that are highly relevant to the current weak knowledge points, clearly improve goal achievement progress, and maintain a sequential connection with previously selected learning tasks are retained in the filtered candidate learning task set. Candidate learning tasks that have low relevance to the current learning objective, conflict with previously selected learning tasks, or cannot be completed within the current learning time window are eliminated, yielding the final filtered candidate learning task set.
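The constraint check and priority-based filtering can be sketched as two small Python functions. This is a simplified illustration: the function names, task identifiers, priority values, and time budget are hypothetical, priorities are taken as given because the embodiment does not state a priority formula, and the relevance criteria are reduced to a time budget plus a prerequisite check.

```python
def verify_sequence_constraints(initial_tasks, prerequisites, completed):
    """Keep tasks whose prerequisites are completed or retained earlier in the list."""
    retained = []
    for task in initial_tasks:
        needed = prerequisites.get(task, set())
        if all(p in completed or p in retained for p in needed):
            retained.append(task)
    return retained


def filter_by_priority(candidates, priority, est_minutes, available_minutes,
                       prerequisites, completed):
    """Greedy pass over tasks sorted by priority, within the available time."""
    selected, used = [], 0
    for task in sorted(candidates, key=lambda t: priority[t], reverse=True):
        needed = prerequisites.get(task, set())
        connected = all(p in completed or p in selected for p in needed)
        fits = used + est_minutes[task] <= available_minutes
        if connected and fits:
            selected.append(task)
            used += est_minutes[task]
    return selected


# Hypothetical data: t2 depends on t1, t3 depends on a task that is absent.
prerequisites = {"t2": {"t1"}, "t3": {"t9"}}
candidates = verify_sequence_constraints(["t1", "t2", "t3"], prerequisites, set())
selected = filter_by_priority(
    candidates,
    priority={"t1": 0.9, "t2": 0.5},
    est_minutes={"t1": 30, "t2": 40},
    available_minutes=60,
    prerequisites=prerequisites,
    completed=set(),
)
```

Because the greedy pass visits tasks strictly in priority order, a dependent task ranked above its own prerequisite would be dropped in this sketch; a fuller implementation would need to reconcile priority order with the sequential constraints.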

[0027] In this embodiment, the formation of the initial learning plan specifically includes: inputting the filtered candidate learning task set into the dual duel PER-D3QN model, which comprises a constraint-aware state embedding layer, a graph-guided attention aggregation layer, a dual-stream duel value decomposition layer, and a dual-Q collaborative decision-making layer connected in sequence. In the constraint-aware state embedding layer, state embedding is performed on the filtered candidate learning task set through state mapping weight parameters, state mapping bias parameters, and an activation function to obtain the task embedding vector corresponding to each candidate learning task. The state embedding process is as follows. First, task feature data is obtained from the filtered candidate learning task set, including the task identifier, associated knowledge points, estimated task completion time, task priority, corresponding learning resources, and task sequence constraints. Second, the task feature data is uniformly encoded: task identifiers are index-encoded, associated knowledge points are encoded as multidimensional vectors, estimated task completion times are normalized, task priorities are quantified numerically, corresponding learning resources are encoded by category, and task sequence constraints are encoded as relational features. Third, the encoded task features are concatenated in a unified order to form the task input vector of each candidate learning task. The task input vector is linearly mapped using the state mapping weight parameters, and the state mapping bias parameter is added to obtain the mapping result. Finally, the mapping result is nonlinearly transformed by the activation function to obtain the task embedding vector of each candidate learning task.
In the graph-guided attention aggregation layer, the task association weights between candidate learning tasks are calculated from the task sequence constraint relationships between those tasks in the learning task graph. Based on these weights, the task embedding vectors of the tasks that have a task sequence constraint relationship with the current candidate learning task are aggregated to obtain the graph-enhanced task representation vector of each candidate learning task. The calculation process is as follows. The task sequence constraint relationships between candidate learning tasks are extracted from the learning task graph, and the associated task set of each current candidate learning task is determined; the associated task set includes its preceding and succeeding candidate learning tasks. The task embedding vector of the current candidate learning task and the task embedding vector of each associated candidate learning task in the associated task set are read. The association score between the current candidate learning task and each associated candidate learning task is calculated from the strength of the task sequence constraint relationship between them, the difference in task priority, the difference in estimated task completion time, and the degree of overlap of their associated knowledge points. The association scores are normalized to obtain the task association weight between the current candidate learning task and each associated candidate learning task. The initial state embedding vector is concatenated and fused with each graph-enhanced task representation vector to obtain the joint state representation vector corresponding to each candidate learning task.
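The aggregation step can be illustrated with a short Python sketch. Several points are assumptions: the association scores are supplied directly (the embodiment lists their inputs but not a formula), softmax is used as the normalization, a task with no associated tasks falls back to its own embedding, and the task's own embedding stands in for the initial state embedding vector in the concatenation. All names and values are hypothetical.

```python
import math


def softmax(scores):
    """Normalize association scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_enhanced_vector(task, embeddings, associated, scores):
    """Weighted sum of the embeddings of preceding and succeeding tasks."""
    neighbours = associated.get(task, [])
    if not neighbours:
        return list(embeddings[task])  # assumed fallback for isolated tasks
    weights = softmax([scores[(task, n)] for n in neighbours])
    dim = len(embeddings[task])
    return [
        sum(w * embeddings[n][d] for w, n in zip(weights, neighbours))
        for d in range(dim)
    ]


def joint_state_vector(task, embeddings, associated, scores):
    """Concatenate a task embedding with its graph-enhanced representation."""
    return list(embeddings[task]) + graph_enhanced_vector(
        task, embeddings, associated, scores
    )


# Hypothetical two-dimensional task embeddings and association scores.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
associated = {"a": ["b", "c"]}
scores = {("a", "b"): 0.0, ("a", "c"): 0.0}

joint_a = joint_state_vector("a", embeddings, associated, scores)
joint_b = joint_state_vector("b", embeddings, associated, scores)
```

With equal scores both neighbours receive weight 0.5, so the aggregated part of `joint_a` is simply the average of the two neighbour embeddings.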
The joint state representation vectors are then input into the dual-stream duel value decomposition layer. The state value branch in the dual-stream duel value decomposition layer maps the initial state embedding vector through built-in state value mapping weight parameters and state value mapping bias parameters to obtain the state value corresponding to the current learning plan generation state. The action advantage branch in the dual-stream duel value decomposition layer maps each joint state representation vector through built-in action advantage mapping weight parameters and action advantage mapping bias parameters to obtain the action advantage corresponding to each candidate learning task. The dual-Q collaborative decision-making layer aggregates the state value and the action advantage of each candidate learning task to obtain the action value of each candidate learning task. The action value of each candidate learning task is obtained by subtracting the average of the action advantages of all candidate learning tasks from the sum of the state value and the action advantage of the corresponding action. Based on the action value of each candidate learning task, the action with the highest action value is selected from all candidate learning task actions as the target action. The target action corresponds to the task selection result, time allocation result, and sequence arrangement result of the candidate learning task. An immediate reward value is generated based on the target action. The immediate reward value is obtained by weighting the contribution value of weak knowledge point coverage, the contribution value of task sequence constraint satisfaction, and the target achievement progress improvement value, and subtracting the duration load deviation value. The contribution value of weak knowledge point coverage represents the degree to which the target action covers the set of weak knowledge points. 
The contribution value of task sequence constraint satisfaction represents the degree to which the target action satisfies the task sequence constraint relationship. The goal achievement progress improvement value represents the degree to which the target action improves the goal achievement progress. The duration load deviation value represents the degree of deviation between the task time allocation corresponding to the target action and the available learning time of the learning object. The dual-Q collaborative decision-making layer uses an online value network and a target value network. The online value network selects the action with the highest action value from all actions corresponding to the task feature data at the next time step. The target value network calculates the action value of that selected action at the next time step, and the immediate reward value is added to the discounted next-step action value to obtain the target action value at the current time step. The network parameters of the dual duel PER-D3QN model are updated based on the difference between the target action value and the action value at the current time step. The update process is as follows. Using the difference as the basis for adjusting the network parameters, the difference is propagated back through the dual-Q collaborative decision-making layer, the dual-stream duel value decomposition layer, the graph-guided attention aggregation layer, and the constraint-aware state embedding layer. The weight and bias parameters in each layer are iteratively corrected so that the action values output under the updated parameters gradually approach the target action value, completing the update.
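The value aggregation, the reward, and the target action value described above can be written out as a minimal Python sketch. The function names, reward weights, and discount factor are hypothetical; the arithmetic follows the text: action value equals state value plus advantage minus the mean advantage, and the target is the immediate reward plus the discounted value that the target network assigns to the action chosen by the online network.

```python
def dueling_action_values(state_value, advantages):
    """Q(a) = V + A(a) - mean(A) for every candidate learning task action."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def immediate_reward(coverage, constraint_satisfaction, progress_gain,
                     load_deviation, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three contribution values minus the load deviation."""
    w_cov, w_con, w_gain = weights  # hypothetical weights
    weighted = (w_cov * coverage + w_con * constraint_satisfaction
                + w_gain * progress_gain)
    return weighted - load_deviation


def double_q_target(reward, next_q_online, next_q_target, gamma=0.9):
    """Online network picks the next action; target network evaluates it."""
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers for three candidate learning task actions.
q_values = dueling_action_values(1.0, [0.5, -0.5, 0.0])
target_action = max(range(len(q_values)), key=q_values.__getitem__)
reward = immediate_reward(1.0, 1.0, 0.5, 0.1)
target_value = double_q_target(reward, [0.2, 0.9, 0.1], [5.0, 2.0, 7.0], gamma=0.5)

# Difference that drives the parameter update at the current time step.
td_difference = target_value - q_values[target_action]
```

Note that the target network's own maximum (7.0 for the third action) is not used; the action is fixed by the online network, which is what separates this double-Q target from a single-network maximum.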
Based on the target action value, the time allocation decision and sequence arrangement decision of each candidate learning task are output to form the initial learning plan. The output process is as follows. The target action value corresponding to each candidate learning task in the filtered candidate learning task set is collected, and the order of the candidate learning tasks is determined by the magnitude of the target action value, with candidate learning tasks that have higher target action values entering the learning plan generation sequence first. For each candidate learning task entering the learning plan generation sequence, the corresponding action output result is read. The action output result includes the task execution time period selection result and the task arrangement position selection result. The task execution time period selection result is taken as the time allocation decision for the corresponding candidate learning task, and the task arrangement position selection result is taken as its sequence arrangement decision.

[0028] In this embodiment, the formation of the execution monitoring result specifically includes: during the execution of the initial learning plan, continuously collecting new learning behavior data and execution feedback data corresponding to the learning object. The new learning behavior data includes task start time, task end time, actual task learning time, resource access records, knowledge point practice records, task interaction records, and stage assessment records. The execution feedback data includes task completion markers, task completion results, knowledge point assessment results, and stage goal completion results. The new learning behavior data and execution feedback data undergo time alignment, field matching, and unified encoding, and are segmented and mapped according to the start and end times of each learning task in the initial learning plan and the order of task execution to obtain the execution monitoring data corresponding to each learning task. The completion status of each learning task is monitored based on the execution monitoring data. The task completion status is determined by the actual completion rate of the corresponding learning task, which is the amount of completed learning content divided by the amount of planned learning content for that task. The deviation of the initial learning plan is also monitored based on the execution monitoring data. For each learning task, the difference is obtained by subtracting the planned learning time from the actual learning time. The plan deviation value is then obtained by dividing the sum of the absolute values of these differences by the total number of learning tasks.
Changes in knowledge point mastery are monitored based on the execution monitoring data. The change in mastery of a knowledge point is the updated mastery of that knowledge point minus its initial mastery, where the updated mastery is calculated from the knowledge point practice records, stage assessment records, and knowledge point assessment results. Changes in goal achievement progress are monitored based on the execution monitoring data. The change in goal achievement progress is the difference between the current goal achievement progress and the initial goal achievement progress. The current goal achievement progress is a weighted combination of three ratios: the number of completed learning tasks to the total number of learning tasks in the initial learning plan, the number of covered target knowledge points to the total number of target knowledge points, and the current stage assessment score to the target score. The execution monitoring result is formed by combining the task completion status, the plan deviation, the change in knowledge point mastery, and the change in goal achievement progress.
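The monitoring quantities defined in this paragraph reduce to a few arithmetic formulas, sketched below in Python. The function names, the progress weights, and the sample figures are hypothetical; the formulas themselves follow the text (completed over planned content, mean absolute time difference, updated minus initial mastery, and a weighted combination of three ratios).

```python
def completion_rate(completed_amount, planned_amount):
    """Completed learning content divided by planned learning content."""
    return completed_amount / planned_amount


def plan_deviation(actual_minutes, planned_minutes):
    """Sum of |actual - planned| over all tasks, divided by the task count."""
    differences = [a - p for a, p in zip(actual_minutes, planned_minutes)]
    return sum(abs(d) for d in differences) / len(differences)


def mastery_change(updated_mastery, initial_mastery):
    """Per knowledge point: updated mastery minus initial mastery."""
    return [u - i for u, i in zip(updated_mastery, initial_mastery)]


def goal_progress(tasks_done, tasks_total, points_covered, points_total,
                  stage_score, target_score, weights=(0.4, 0.3, 0.3)):
    """Weighted combination of task, coverage, and score ratios."""
    ratios = (tasks_done / tasks_total,
              points_covered / points_total,
              stage_score / target_score)
    return sum(w * r for w, r in zip(weights, ratios))  # hypothetical weights


# Hypothetical execution data for three learning tasks.
deviation = plan_deviation([50, 30, 40], [40, 30, 60])
rate = completion_rate(8, 10)
changes = mastery_change([0.7, 0.5], [0.6, 0.5])
progress = goal_progress(3, 6, 21, 42, 60, 80)
monitoring_result = {
    "completion_rate": rate,
    "plan_deviation": deviation,
    "mastery_change": changes,
    "goal_progress": progress,
}
```

Taking absolute values before averaging means that running ten minutes over on one task and twenty minutes under on another still counts as deviation rather than cancelling out.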

[0029] In this embodiment, the generation of the adjusted learning plan specifically includes: generating plan adjustment data from the execution monitoring result and calculating an adjustment judgment value. The adjustment judgment value is compared with a preset adjustment threshold. When the adjustment judgment value meets the preset adjustment threshold, the new learning behavior data, execution feedback data, learner feature vector, learning state evaluation result, and learning task graph are read. Based on the new learning behavior data and execution feedback data, the basic learning features, knowledge mastery features, learning ability features, learning preference features, learning activity features, learning stability features, and learning goal features in the learner feature vector are updated to obtain the updated learner feature vector. The updated learner feature vector is input into the learning state evaluation model, and the learning state identification process is executed again to obtain the updated learning state evaluation result, which includes the updated knowledge mastery, the updated set of weak knowledge points, the updated task completion ability level, the updated learning pace level, and the updated goal achievement progress. Based on the updated learning state evaluation result, the learning task nodes corresponding to the updated set of weak knowledge points and the updated goal achievement progress are extracted again to form an updated initial learning task set. Based on the sequential constraint relationships between the learning task nodes in the learning task graph, the updated initial learning task set is checked against the constraints and filtered again to obtain an updated candidate learning task set.
The updated candidate learning task set is input into the dual duel PER-D3QN model, which outputs the time allocation decision and sequence arrangement decision for each candidate learning task. Task time periods are then reallocated, the task order is rearranged, and task resources are rematched for the updated candidate learning task set to generate the adjusted learning plan.
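The trigger that starts this adjustment cycle can be sketched as follows. The embodiment does not give a formula for the adjustment judgment value, so the combination used here (relative time deviation plus the share of unfinished content), the weights, the threshold of 0.3, and the reading of "meets the threshold" as greater than or equal are all assumptions made for illustration.

```python
def adjustment_judgment_value(plan_deviation, mean_planned_minutes,
                              completion_rates, weights=(0.5, 0.5)):
    """Hypothetical judgment value in [0, 1] built from monitoring results."""
    relative_deviation = min(plan_deviation / mean_planned_minutes, 1.0)
    unfinished_share = 1.0 - sum(completion_rates) / len(completion_rates)
    return weights[0] * relative_deviation + weights[1] * unfinished_share


def needs_adjustment(judgment_value, threshold=0.3):
    """Assumed reading: the plan is regenerated once the value reaches the threshold."""
    return judgment_value >= threshold


# Hypothetical monitoring results for two learners.
on_track = adjustment_judgment_value(10.0, 40.0, [1.0, 0.5])
drifting = adjustment_judgment_value(30.0, 40.0, [0.5, 0.5])

replan_flags = [needs_adjustment(on_track), needs_adjustment(drifting)]
```

Only the second learner would enter the update cycle in this example, after which the feature update, state evaluation, task filtering, and model inference steps described above would be run again.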

[0030] Example 1: An online learning scenario using a university course "Fundamentals of Data Analysis" as an application example. This course is designed for second-year students and includes modules such as data preprocessing, basic statistical analysis, visualization analysis, regression analysis, and simple machine learning methods. Due to the clear sequential dependencies between course knowledge points—for example, learners need to master data cleaning and statistical basics before they can effectively complete regression analysis and model evaluation tasks—this course is highly suitable as an application scenario for intelligent learning plan generation methods. The course comprises 5 course units, 42 knowledge points, 126 learning task nodes, and 214 task sequence constraints. Teaching resources include various types such as video lectures, text and image courseware, exercises, phase assessments, and case studies. A total of 120 learners participated in the test of this example, all using a unified intelligent learning platform to complete the course. The platform continuously collected learning behavior data generated by each learner over 6 weeks, including course access data, resource browsing data, knowledge point practice data, task completion data, learning duration data, interaction behavior data, and phase assessment data, which served as the data foundation for the method of this invention.

[0031] In this scenario, traditional methods typically involve teachers assigning learning tasks to the entire class in the order of course chapters, or the platform simply pushing practice content based on stage test scores. In practical application, this approach has been found to have significant shortcomings. Some learners, while achieving high test scores initially, have inconsistent mastery of certain fundamental knowledge points, making them prone to interruptions when completing comprehensive case studies. Other learners, despite investing considerable time in learning, employ inefficient learning paths, spending excessive time repetitively studying already mastered content without addressing their weaknesses. Still others, due to fragmented learning time, struggle to complete their plans on time under a uniform learning schedule, leading to a continuous accumulation of plan deviations. Ultimately, this manifests as low overall completion efficiency, inability to identify individual differences, and delayed adjustments to learning plans, failing to effectively address the problems of "incomplete learning status recognition, unreasonable learning task arrangement, and lack of dynamic update capabilities for learning plans" in existing technologies.

[0032] When the method of this invention is applied to this scenario, the platform first collects the multi-source learning data generated by each learner within a preset learning period. The data are cleaned, deduplicated, and time-aligned, outliers are removed, and the values are standardized to form a standardized learning dataset. Next, learning foundation features, knowledge mastery features, learning ability features, learning preference features, learning activity features, learning stability features, and learning goal features are extracted from the standardized learning dataset to construct a learner feature vector. For example, for a specific learner, the system considers not only the three most recent assessment scores but also video resource viewing time, accuracy and completion rate in knowledge point exercises, response time to questions, daily learning time distribution, task completion continuity, and available learning time windows. The resulting learner feature vector depicts the learner's true learning state more accurately than a single performance indicator could.
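The preprocessing steps named above can be sketched in a few lines. This is an illustrative sketch only: the record fields (`learner`, `task`, `ts`, `minutes`), the range rule used for outlier removal, the use of chronological sorting as a stand-in for time alignment, and z-score standardization are assumptions, since the text names the steps but not their concrete rules.

```python
from statistics import mean, pstdev


def preprocess(records, max_minutes=600):
    """Deduplicate, drop implausible durations, order by time, z-score durations."""
    # Deduplication: keep the first record per (learner, task, timestamp).
    seen, unique = set(), []
    for r in records:
        key = (r["learner"], r["task"], r["ts"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # Outlier removal: a simple range rule (assumed; the source names no rule).
    kept = [r for r in unique if 0 < r["minutes"] <= max_minutes]
    # Time alignment, simplified here to chronological ordering.
    kept.sort(key=lambda r: r["ts"])
    # Standardization: z-score of the study duration.
    minutes = [r["minutes"] for r in kept]
    mu = mean(minutes)
    sigma = pstdev(minutes) or 1.0  # avoid division by zero
    return [{**r, "minutes_z": (r["minutes"] - mu) / sigma} for r in kept]


records = [
    {"learner": "s01", "task": "t1", "ts": 3, "minutes": 40},
    {"learner": "s01", "task": "t1", "ts": 3, "minutes": 40},    # duplicate
    {"learner": "s01", "task": "t2", "ts": 1, "minutes": 20},
    {"learner": "s01", "task": "t3", "ts": 2, "minutes": 5000},  # outlier
]
clean = preprocess(records)
```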

[0033] In terms of learning task organization, the platform utilizes course outlines, knowledge point annotation files, task resource mapping tables, and target constraint configuration tables from the course resource center and teaching management configuration library to construct a learning task graph. This learning task graph integrates course unit nodes, knowledge point nodes, and learning task nodes into a unified structure, and expresses the dependencies between knowledge points, the mapping relationships between knowledge points and learning tasks, and the constraints of learning objectives on learning tasks. Thus, when the system identifies that a learner's mastery of knowledge points such as "outlier identification," "missing value handling," and "interpretation of linear regression results" is insufficient, it can accurately locate learning tasks associated with these knowledge points and automatically eliminate subsequent tasks that do not meet the prerequisites, avoiding directly assigning higher-order tasks that lack the necessary foundation for execution.
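How the graph lets the system locate tasks for weak knowledge points while dropping tasks whose prerequisites are unmet can be shown on a toy graph. The dictionaries, task identifiers, and the rule "every prerequisite knowledge point must already be mastered" are hypothetical simplifications, not the patent's data model.

```python
# Hypothetical miniature of the learning task graph; all names are illustrative.
KP_PREREQS = {  # knowledge point -> prerequisite knowledge points
    "missing value handling": [],
    "outlier identification": ["missing value handling"],
    "interpretation of linear regression results": ["outlier identification"],
}
TASK_KP = {  # learning task -> knowledge point it trains
    "task_clean_01": "missing value handling",
    "task_outlier_02": "outlier identification",
    "task_regress_03": "interpretation of linear regression results",
}


def executable_tasks(weak_points, mastered):
    """Tasks tied to weak knowledge points whose prerequisites are all mastered."""
    result = []
    for task, kp in TASK_KP.items():
        if kp in weak_points and all(p in mastered for p in KP_PREREQS[kp]):
            result.append(task)
    return result


weak = {"outlier identification", "interpretation of linear regression results"}
mastered = {"missing value handling"}
print(executable_tasks(weak, mastered))  # ['task_outlier_02']
```

The regression task is held back because its prerequisite, outlier identification, is itself still weak.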

[0034] During the learning status assessment phase, this invention inputs the learner's feature vector into the learning status assessment model to identify the learner's current knowledge mastery, set of weak knowledge points, task completion ability level, learning pace level, and progress towards achieving goals. Compared to traditional solutions, this invention's method does not simply output coarse-grained learning levels like "high," "medium," and "low," but rather outputs more targeted status results. For example, the system can identify that a learner has a good grasp of statistical mean and standard deviation calculations, but lacks proficiency in data cleaning process design and regression residual analysis; it can also determine that although the learner's overall learning time is sufficient, the learning pace fluctuates greatly, and the task completion ability is at a medium level. Therefore, it is not advisable to assign excessively long continuous tasks, but rather to prioritize learning tasks of moderate duration that are highly relevant to the weak knowledge points.
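The knowledge mastery branch of the assessment model, as described in claim 5, can be written compactly. The notation below is an assumption introduced for illustration: $x$ is the learner feature vector, $\phi$ the activation function of the feature mapping layer, $f_{\mathrm{hid}}$ the hidden layer, $\sigma$ a logistic sigmoid standing in for the "compression mapping," and $\tau$ a mastery threshold that the text does not specify.

```latex
\begin{aligned}
h^{(1)} &= \phi\left(W_1 x + b_1\right) \\
h &= f_{\mathrm{hid}}\left(h^{(1)}\right) \\
m &= \sigma\left(W_m h + b_m\right), \quad m_k \in (0, 1) \\
K_{\mathrm{weak}} &= \{\, k : m_k < \tau \,\}
\end{aligned}
```

Under this reading, the task completion ability level, learning pace level, and goal achievement progress heads would each apply their own output mapping to the same shared representation $h$.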

[0035] In the candidate learning task screening stage, based on the learning status assessment results and the learning task graph, this invention extracts from all 126 learning task nodes those that match the set of weak knowledge points, the goal achievement progress, and the learning objective constraints. Constraint verification and screening are then performed by combining task sequence constraints, task priority, and stage goal requirements. Taking a specific learner as an example, the system first extracts 31 initial learning tasks related to the learner's current problems from the 126 tasks. It then removes 8 tasks whose prerequisites are incomplete or that are unsuitable for execution at the current stage, retaining 23 candidate learning tasks for subsequent decision-making. Compared with the traditional method of directly assigning tasks in chapter order, this method significantly reduces interference from irrelevant and repetitive tasks, allowing the learning plan to focus on improving the current learning effect.
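The two-stage screening (constraint verification, then priority-based filtering) can be sketched as follows. The priority scores, the `top_k` cutoff, and the task identifiers are hypothetical; the text does not state how task priority is computed or how many tasks are kept.

```python
def screen(initial_tasks, completed, prereqs, priority, top_k):
    """Constraint check first, then keep the top_k tasks by priority."""
    # Constraint verification: all prerequisite tasks must be completed.
    candidates = [
        t for t in initial_tasks
        if all(p in completed for p in prereqs.get(t, []))
    ]
    # Priority filtering: highest priority first (ties keep input order).
    ranked = sorted(candidates, key=lambda t: priority[t], reverse=True)
    return ranked[:top_k]


initial = ["a", "b", "c", "d"]
completed = {"x"}
prereqs = {"b": ["x"], "c": ["y"]}  # "c" depends on the unfinished task "y"
priority = {"a": 0.2, "b": 0.9, "c": 0.8, "d": 0.5}

print(screen(initial, completed, prereqs, priority, top_k=2))  # ['b', 'd']
```

Task "c" has the second-highest priority but is removed before ranking because its prerequisite is incomplete, which mirrors the removal of 8 of the 31 initial tasks in the example above.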

[0036] During the learning plan generation phase, this invention inputs the selected set of candidate learning tasks into the Dual-Duel PER-D3QN model. The model combines the learner's current learning status, candidate learning task characteristics, task priority, expected completion time, and available learning time slots to make time allocation and sequence arrangements for each candidate learning task. For example, if a learner has 5.5 hours available for course learning per week, the system does not simply divide the 23 candidate tasks evenly across days. Instead, it prioritizes tasks directly related to weak knowledge points, with preconditions met, and high expected returns. Short, high-frequency practice tasks are allocated to weekday evenings, while case analysis tasks requiring continuous thinking are allocated to full weekend time slots, thus forming an initial learning plan that better suits actual execution conditions. This plan not only clarifies the task content and execution order but also specifies the start time, end time, and resource type for each task, enabling the learner to execute tasks directly according to the platform's schedule.
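Claim 7 describes the model's decision layer as combining a state value with per-task action advantages, and computing targets with an online network and a target network. The arithmetic can be illustrated with a minimal sketch. Subtracting the mean advantage is the common dueling-network convention and is an assumption here, since the text only says the two quantities are aggregated; the function names and all numbers are illustrative.

```python
def dueling_q(state_value, advantages):
    """Q(s, a) = V(s) + A(s, a) - mean(A), the usual dueling aggregation."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_q_target(reward, gamma, next_q_online, next_q_target):
    """The online network picks the next action; the target network scores it."""
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


# Three candidate learning tasks in the current planning state.
q_now = dueling_q(state_value=2.0, advantages=[1.0, 3.0, -1.0])  # [2.0, 4.0, 0.0]
action = q_now.index(max(q_now))                                 # task 1 is chosen

# Target value for the chosen action, using next-step estimates.
y = double_q_target(reward=1.0, gamma=0.5,
                    next_q_online=[0.5, 2.0, 1.0],
                    next_q_target=[3.0, 4.0, 9.0])               # 1.0 + 0.5 * 4.0
td_error = y - q_now[action]                                     # drives the update
```

Note that the target network's largest estimate (9.0) is not used: the action is selected by the online network and only evaluated by the target network, which is the point of the double-Q scheme.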

[0037] During the execution of the learning plan, the platform continuously collects new learning behavior data and execution feedback data, and monitors the execution of the initial learning plan. Monitoring includes task completion status, plan deviation, changes in knowledge mastery, and changes in goal achievement. If a learner experiences task timeouts, insignificant knowledge mastery growth, or goal achievement progress slower than planned within the first three days, the system will trigger a plan adjustment process. This involves updating the learner's feature vector and learning status assessment results, re-selecting the candidate learning task set, and regenerating the adjusted learning plan. For example, a learner originally planned to complete the tasks of "regression model building" and "results analysis" within two days. However, due to unstable mastery of the preceding knowledge point "outlier handling," the execution was poor. The system, in the adjustment, postponed the "results analysis" task and inserted two supplementary tasks: "outlier cleaning practice" and "regression input data checking," making the subsequent task arrangement more aligned with the learner's actual ability.
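The monitoring step can be sketched by mapping feedback events onto the planned time window of each task. The event format, the three status labels, and the definition of plan deviation as the share of tasks not completed on time are assumptions for illustration.

```python
def monitor_plan(plan, events):
    """plan: list of (task, start, end); events: list of (timestamp, task, done)."""
    results = {}
    for task, start, end in plan:
        # Segment events by the task's planned window.
        in_window = [e for e in events if e[1] == task and start <= e[0] <= end]
        done_on_time = any(e[2] for e in in_window)
        done_late = any(e[2] for e in events if e[1] == task and e[0] > end)
        results[task] = "on_time" if done_on_time else "late" if done_late else "open"
    # Plan deviation: share of tasks not completed on time (assumed definition).
    deviation = sum(status != "on_time" for status in results.values()) / len(plan)
    return results, deviation


plan = [("t1", 0, 10), ("t2", 10, 20), ("t3", 20, 30)]
events = [(5, "t1", True), (25, "t2", True)]
results, deviation = monitor_plan(plan, events)
# results == {"t1": "on_time", "t2": "late", "t3": "open"}
```

A deviation value like this one is the kind of signal that could feed the adjustment judgment described in paragraph [0029].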

[0038] To verify the beneficial effects of this invention, 120 learners were randomly divided into two groups under the same course, same duration, and similar learning foundation. 60 learners used the traditional fixed-template learning plan method, while the other 60 used the method of this invention to generate their learning plans. Comparison dimensions included on-time completion rate of learning tasks, improvement rate of weak knowledge points, plan deviation rate, average score of stage assessments, goal achievement rate, and learner satisfaction. The experimental results are shown in Table 1 below.

[0039] Table 1. Comparison of teaching application effects between the method of this invention and the traditional fixed template method.

[0040] As shown in Table 1, the experimental group significantly outperformed the control group in core indicators such as on-time completion rate of learning tasks, improvement rate of weak knowledge points, average score of stage assessment, and goal achievement rate. This indicates that the method of this invention can more accurately identify the current learning status of learners during the learning plan formulation stage and organize learning tasks around real weaknesses, making the learning plan more targeted. In particular, the improvement rate of weak knowledge points was 27.5 percentage points higher in the experimental group than in the control group, indicating that by combining the learning task map with the learning status assessment results, this invention can more effectively allocate learning resources and tasks to the knowledge points that most need to be strengthened.

[0041] In terms of plan execution, the experimental group showed a significant reduction in plan deviation rate and a substantial decrease in the average weekly duration of ineffective repeated learning, indicating that the time allocation and sequence arrangement decisions of this invention are more reasonable. In other words, learners in the experimental group did not simply "learn more" but "learned more accurately and more smoothly," which reflects the technical advantages of this invention in candidate learning task screening and reinforcement-learning-based scheduling.

[0042] In terms of dynamic adjustment capability, the experimental group achieved a task recovery completion rate of 85.1% after plan adjustments, significantly higher than the control group's 58.7%. This demonstrates that the invention can promptly update learner feature vectors and learning status assessment results after monitoring deviations in task completion, changes in knowledge mastery, and goal achievement, and regenerate a learning plan more suited to the current state. Combined with satisfaction indicators, this further illustrates that the invention not only outperforms existing technologies in objective effectiveness but also enjoys better acceptance from learners' subjective experience, proving its strong practical application value in intelligent education scenarios.

[0043] The above are merely preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

Claims

1. A method for intelligently generating learning plans based on big data analysis, characterized in that the method includes the following steps: collecting and preprocessing multi-source learning data to obtain a standardized learning dataset; extracting multidimensional learning features of a learner from the standardized learning dataset to construct a learner feature vector; obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships to construct a learning task graph; inputting the learner feature vector into a learning status assessment model to identify the current learning status of the learner and obtain a learning status assessment result; obtaining a candidate learning task set based on the learning status assessment result and the learning task graph, and screening the candidate learning task set; inputting the screened candidate learning task set into a dual-duel PER-D3QN model, which outputs a time allocation decision and a sequence arrangement decision for each candidate learning task to form an initial learning plan; during execution of the initial learning plan, continuously collecting newly added learning behavior data and execution feedback data corresponding to the learner, and monitoring the execution of the initial learning plan to form an execution monitoring result; and when the execution monitoring result meets a preset adjustment threshold, updating the learner feature vector and the learning status assessment result and generating an adjusted learning plan.

2. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the multi-source learning data includes basic information data, historical performance data, course learning record data, knowledge point practice data, learning duration data, resource access data, interaction behavior data, and stage assessment data; and the preprocessing includes data cleaning, outlier removal, missing value completion, timestamp alignment, data deduplication, format conversion, standardization, and unified encoding.

3. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the construction of the learner feature vector specifically includes: obtaining, from the standardized learning dataset, the basic information data, knowledge point practice data, course learning record data, resource access data, learning duration data, interaction behavior data, and stage assessment data corresponding to the learner; extracting learning foundation features based on the basic information data to obtain a learning foundation feature vector; extracting knowledge mastery features based on the knowledge point practice data and the stage assessment data to obtain a knowledge mastery feature vector; extracting learning ability features based on the course learning record data, the knowledge point practice data, and the stage assessment data to obtain a learning ability feature value; extracting learning preference features based on the resource access data to obtain a learning preference feature vector; extracting learning activity features based on the learning duration data and the interaction behavior data to obtain a learning activity feature value; extracting learning stability features based on the course learning record data and the learning duration data to obtain a learning stability feature value; and concatenating the learning foundation feature vector, the knowledge mastery feature vector, the learning ability feature value, the learning preference feature vector, the learning activity feature value, and the learning stability feature value to obtain the learner feature vector.

4. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the construction of the learning task graph specifically includes: obtaining course structure information, knowledge point association relationships, teaching resource mapping relationships, and learning objective constraint relationships; determining a course unit set based on the course structure information, a knowledge point set based on the knowledge point association relationships, and a learning task set based on the teaching resource mapping relationships; constructing a course unit node set from the course unit set, a knowledge point node set from the knowledge point set, and a learning task node set from the learning task set to form a node set; establishing inclusion relationships between course unit nodes based on the course structure information, sequential constraint relationships between knowledge point nodes based on the knowledge point association relationships, teaching resource mapping relationships between knowledge point nodes and learning task nodes based on the teaching resource mapping relationships, and learning objective constraint relationships between learning task nodes and learning objectives based on the learning objective constraint relationships, thereby forming an edge set; converting the sequential constraint relationships between knowledge point nodes into matrix form to obtain a sequential constraint matrix; converting the teaching resource mapping relationships between knowledge point nodes and learning task nodes into matrix form to obtain a teaching resource mapping matrix; converting the learning objective constraint relationships between learning task nodes and learning objectives into vector form to obtain a learning objective constraint vector; and constructing the learning task graph based on the node set, the edge set, the sequential constraint matrix, the teaching resource mapping matrix, and the learning objective constraint vector.

5. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the learning status assessment result is obtained specifically through: constructing a learning status assessment model that includes an input layer, a feature mapping layer, a hidden layer, and a multi-task output layer connected in sequence, wherein the input layer receives the learner feature vector, the feature mapping layer performs feature mapping on the learner feature vector, the hidden layer extracts a learning state representation, and the multi-task output layer includes a knowledge mastery output layer, a task completion ability level output layer, a learning pace level output layer, and a goal achievement progress output layer; inputting the learner feature vector into the feature mapping layer, where it undergoes linear transformation and nonlinear mapping through the weight parameters, bias parameters, and activation function of the feature mapping layer to obtain a first hidden representation; inputting the first hidden representation into the hidden layer to obtain the learning state representation; inputting the learning state representation into the knowledge mastery output layer, calculating the learner's mastery of each knowledge point using the weight parameters and bias parameters of the knowledge mastery output layer, and obtaining a knowledge mastery vector after compression mapping; determining the set of weak knowledge points based on the knowledge mastery vector; inputting the learning state representation into the task completion ability level output layer to obtain the task completion ability level; inputting the learning state representation into the learning pace level output layer to obtain the learning pace level; inputting the learning state representation into the goal achievement progress output layer to obtain the goal achievement progress; and combining the knowledge mastery, the set of weak knowledge points, the task completion ability level, the learning pace level, and the goal achievement progress to form the learning status assessment result.

6. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that obtaining and screening the candidate learning task set specifically includes: extracting, from the learning task graph, the learning task nodes that correspond to the set of weak knowledge points and the goal achievement progress in the learning status assessment result to form an initial learning task set; performing constraint verification on the initial learning task set based on the sequential constraint relationships between the learning task nodes in the learning task graph to obtain the candidate learning task set; determining a task priority for each candidate learning task in the candidate learning task set; and screening the candidate learning task set based on the task priority of each candidate learning task to obtain the screened candidate learning task set.

7. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the formation of the initial learning plan specifically includes: inputting the screened candidate learning task set into the dual-duel PER-D3QN model, which includes a constraint-aware state embedding layer, a graph-guided attention aggregation layer, a dual-stream duel value decomposition layer, and a dual-Q collaborative decision-making layer connected in sequence; in the constraint-aware state embedding layer, performing state embedding on the screened candidate learning task set through state mapping weight parameters, state mapping bias parameters, and an activation function to obtain a task embedding vector for each candidate learning task; in the graph-guided attention aggregation layer, calculating task association weights between the candidate learning tasks based on the task sequence constraint relationships between them in the learning task graph, and, based on these weights, aggregating the task embedding vectors that have a task sequence constraint relationship with the current candidate learning task to obtain a graph-enhanced task representation vector for each candidate learning task; concatenating and fusing the initial state embedding vector with each graph-enhanced task representation vector to obtain a joint state representation vector for each candidate learning task, and inputting the joint state representation vectors into the dual-stream duel value decomposition layer; in the state value branch of the dual-stream duel value decomposition layer, mapping the initial state embedding vector using built-in state value mapping weight parameters and state value mapping bias parameters to obtain the state value of the current learning plan generation state; in the action advantage branch of the dual-stream duel value decomposition layer, mapping the joint state representation vectors using built-in action advantage mapping weight parameters and action advantage mapping bias parameters to obtain the action advantage of each candidate learning task; in the dual-Q collaborative decision-making layer, aggregating the state value and the action advantage of each candidate learning task to obtain the action value of each candidate learning task, selecting the action with the highest action value among all candidate learning task actions as the target action, and generating an immediate reward value based on the target action; wherein the dual-Q collaborative decision-making layer adopts an online value network and a target value network, the online value network selects the action with the highest action value among all actions corresponding to the task feature data at the next time step, and the target value network calculates the action value of the selected action at the next time step and adds the immediate reward value to the discounted action value at the next time step to obtain the target action value at the current time step; updating the network parameters of the dual-duel PER-D3QN model based on the difference between the target action value and the action value at the current time step; and outputting, based on the target action value, the time allocation decision and the sequence arrangement decision for each candidate learning task to form the initial learning plan.

8. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the formation of the execution monitoring result specifically includes: during execution of the initial learning plan, continuously collecting newly added learning behavior data and execution feedback data corresponding to the learner; processing the newly added learning behavior data and execution feedback data by time alignment, field matching, and unified encoding, and segmenting and mapping them according to the start and end times of each learning task in the initial learning plan and the task execution order to obtain execution monitoring data for each learning task; monitoring the completion status of each learning task based on the execution monitoring data; monitoring the deviation from the initial learning plan based on the execution monitoring data; monitoring changes in knowledge mastery based on the execution monitoring data; monitoring changes in goal achievement based on the execution monitoring data; and summarizing the task completion status, the plan deviation, the changes in knowledge mastery, and the changes in goal achievement to form the execution monitoring result.

9. The intelligent learning plan generation method based on big data analysis according to claim 1, characterized in that the generation of the adjusted learning plan specifically includes: generating plan adjustment data from the execution monitoring result, calculating an adjustment judgment value, and comparing the adjustment judgment value with the preset adjustment threshold; when the adjustment judgment value meets the preset adjustment threshold, reading the newly added learning behavior data, the execution feedback data, the learner feature vector, the learning status assessment result, and the learning task graph; updating, based on the newly added learning behavior data and execution feedback data, the learning foundation features, knowledge mastery features, learning ability features, learning preference features, learning activity features, learning stability features, and learning goal features in the learner feature vector to obtain an updated learner feature vector; inputting the updated learner feature vector into the learning status assessment model and executing the learning status identification process again to obtain an updated learning status assessment result; extracting again, based on the updated learning status assessment result, the learning task nodes corresponding to the updated set of weak knowledge points and the updated goal achievement progress to form an updated initial learning task set; performing constraint verification and re-screening on the updated initial learning task set based on the sequential constraint relationships between the learning task nodes in the learning task graph to obtain an updated candidate learning task set; inputting the updated candidate learning task set into the dual-duel PER-D3QN model, which outputs a time allocation decision and a sequence arrangement decision for each candidate learning task; and reallocating task time periods, rearranging the task order, and rematching task resources for the updated candidate learning task set to generate the adjusted learning plan.
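The matrix and vector encoding of the learning task graph recited in claim 4 could look like the following sketch. The binary (0/1) encoding, the function name `build_matrices`, and the example knowledge points and tasks are assumptions for illustration; the claims do not fix the encoding.

```python
def build_matrices(kps, tasks, kp_order, task_kp, goal_tasks):
    """Encode graph edges as a constraint matrix, a mapping matrix, and a goal vector."""
    kp_idx = {k: i for i, k in enumerate(kps)}
    t_idx = {t: j for j, t in enumerate(tasks)}
    # Sequential constraint matrix: C[i][j] = 1 if kps[i] must precede kps[j].
    C = [[0] * len(kps) for _ in kps]
    for before, after in kp_order:
        C[kp_idx[before]][kp_idx[after]] = 1
    # Teaching resource mapping matrix: M[i][j] = 1 if tasks[j] trains kps[i].
    M = [[0] * len(tasks) for _ in kps]
    for task, kp in task_kp:
        M[kp_idx[kp]][t_idx[task]] = 1
    # Learning objective constraint vector: g[j] = 1 if the goal requires tasks[j].
    g = [1 if t in goal_tasks else 0 for t in tasks]
    return C, M, g


kps = ["cleaning", "statistics", "regression"]
tasks = ["t1", "t2", "t3", "t4"]
kp_order = [("cleaning", "regression"), ("statistics", "regression")]
task_kp = [("t1", "cleaning"), ("t2", "statistics"),
           ("t3", "regression"), ("t4", "regression")]
C, M, g = build_matrices(kps, tasks, kp_order, task_kp, goal_tasks={"t3", "t4"})
```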