An online learning information recommendation method and system

By constructing a ternary dynamic hypergraph and a hypergraph convolutional neural network, and combining a large language model with reinforcement learning, the problems of limited context modeling dimensions and policy disconnect in online learning information recommendation are solved, and a recommendation system with high-order association modeling and causal explanation is realized.

CN122240913A · Pending · Publication Date: 2026-06-19 · LUOYANG VOCATIONAL&TECHNICAL COLLEGE +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
LUOYANG VOCATIONAL&TECHNICAL COLLEGE
Filing Date
2026-02-10
Publication Date
2026-06-19


Abstract

This invention discloses an online learning information recommendation method and system, relating to the field of data processing technology. The method includes: constructing a ternary dynamic hypergraph containing learner nodes, knowledge point nodes, and context nodes, where context nodes are associated with learner nodes and knowledge point nodes via hyperedges; the ternary dynamic hypergraph is represented by a third-order adjacency tensor; using a hypergraph convolutional neural network for feature encoding to output a learner state representation; inputting the learner state representation into a large language model, and outputting the weight parameters of a reinforcement learning policy network through a parameter generator; constructing a reinforcement learning policy network to output recommendation probabilities, and backpropagating gradients to the third-order adjacency tensor to update hyperedge weights; constructing a structural causal model based on knowledge point nodes and prior relationships, calculating causal effects to identify weak knowledge points, and generating a causal diagnostic chain; determining recommended learning resources based on the recommendation probabilities, and outputting the recommended learning resources and the causal diagnostic chain. This invention achieves high-order context modeling, bidirectional evolution of policy and structure, and causally interpretable recommendation.

Description

Technical Field

[0001] This invention relates to the field of data processing technology, and in particular to an online learning information recommendation method and system.

Background Technology

[0002] Online learning information recommendation aims to recommend suitable learning resources based on learners' knowledge status and learning preferences, and is a key technology for adaptive learning systems and intelligent education platforms.

[0003] Current online learning information recommendation technologies primarily employ methods based on knowledge graphs and graph neural networks. For graph structure modeling, existing technologies construct bipartite or heterogeneous graphs between learners and knowledge points and use graph convolutional networks to encode node features. For context processing, existing technologies treat the learning context (the time of learning, the device used, the learner's emotions, and the social environment) as attribute vectors of nodes and integrate it into the recommendation model through feature concatenation or attention weighting. For decision generation, existing technologies use reinforcement learning frameworks to make recommendation decisions based on the encoded node features. For recommendation explanation, existing technologies use attention weight visualization or post hoc attribution methods to generate explanatory text.

[0004] However, existing technologies have at least the following shortcomings: First, context as an attribute vector can only model the binary relationship between learners and knowledge points, and it is difficult to represent the ternary higher-order association of a particular learner's preference for a particular knowledge point in a particular context, which leads to the disconnect between the recommendation results and the learning context.

[0005] Second, the construction and updating of knowledge graphs rely on predefined rules or behavioral data statistics, which are independent of the optimization process of the recommendation strategy, so the graph structure cannot readily adapt to the observed recommendation effect.

[0006] Third, large language models are mainly used for query expansion or content interpretation in recommender systems, and their output is text or embedding vectors. They lack an end-to-end parameter transfer mechanism with reinforcement learning policy networks.

[0007] Fourth, post hoc attribution methods can only reveal the correlation between features and recommendation results; they cannot answer counterfactual questions such as how the results would change if a certain precondition were changed.

[0008] Therefore, there is an urgent need for an online learning information recommendation technology that can achieve high-order context modeling, bidirectional evolution of strategy and structure, end-to-end mapping from cognition to decision-making, and causal interpretability.

Summary of the Invention

[0009] In view of the shortcomings of the prior art, embodiments of the present invention provide an online learning information recommendation method and system to solve the technical problems in the prior art, such as limited context modeling dimensions, disconnect between knowledge structure and recommendation strategy, separation between semantic understanding and decision generation, and lack of causal support for recommendation explanation.

[0010] A first aspect of this invention provides an online learning information recommendation method, comprising:

S1: Construction of a context-integrated ternary high-order hypergraph: construct a ternary dynamic hypergraph, which includes learner nodes, knowledge point nodes, and context nodes. The context nodes are independent nodes and are associated with the learner nodes and the knowledge point nodes through hyperedges. The hyperedges are generalized edges that associate multiple heterogeneous nodes. The ternary dynamic hypergraph is represented by a third-order adjacency tensor, and the element values of the third-order adjacency tensor are the hyperedge weights.

S2: Hypergraph convolutional feature encoding with context fusion: a hypergraph convolutional neural network is used to encode the features of the ternary dynamic hypergraph, and a learner state representation that integrates contextual information is output.

S3: End-to-end parameter generation from cognition to decision: input the learner state representation into the large language model, and output the weight parameters of the reinforcement learning policy network through the parameter generation head.

S4: Construct the reinforcement learning policy network based on the weight parameters and output the recommendation probability. Calculate the loss function according to the learning feedback and backpropagate the gradient of the loss function to the third-order adjacency tensor to update the hyperedge weights.

S5: Based on the knowledge point nodes and their prerequisite relationships in the ternary dynamic hypergraph, construct a structural causal model, calculate the causal effect of each prerequisite knowledge point, identify the prerequisite knowledge point with the largest causal effect as the weak knowledge point, and input the weak knowledge point into the large language model to generate a causal diagnostic chain.

S6: Recommended resources and causal diagnostic chain output: determine the recommended learning resources based on the recommendation probability, and output the recommended learning resources and the causal diagnostic chain.

[0011] A second aspect of the present invention provides an online learning information recommendation system, comprising a processor and a memory. The memory stores programs or instructions executable on the processor which, when executed by the processor, implement the steps of the online learning information recommendation method described in the first aspect.

[0012] The beneficial effects of the technical solutions provided in the embodiments of the present invention include at least the following: a ternary dynamic hypergraph containing learner nodes, knowledge point nodes, and context nodes is constructed. The context is modeled as an independent node and associated with learner nodes and knowledge point nodes through hyperedges, thus achieving an explicit representation of the high-order ternary association between learners, knowledge points, and context. The weight parameters of the policy network are directly output through the parameter generation head of the large language model, establishing an end-to-end mapping path from semantic understanding to decision parameters. By setting the third-order adjacency tensor as a differentiable variable, the gradient of the loss function can be backpropagated to the graph structure and the hyperedge weights are updated, thus achieving joint optimization of the recommendation strategy and the knowledge structure. The causal effect of prerequisite knowledge points is calculated based on the structural causal model to locate weak knowledge points, and a causal diagnostic chain containing weak knowledge point identifiers, learning obstacle descriptions, recommendation reasons, and expected effect predictions is generated through the large language model, thus achieving causal explanation of the recommendation.

Description of the Drawings

[0013] Figure 1 is a flowchart of an online learning information recommendation method provided in an embodiment of the present invention.

[0014] Figure 2 is a schematic structural diagram of an online learning information recommendation system provided in an embodiment of the present invention.

Detailed Implementation

[0015] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings.

[0016] Referring to Figure 1 of the accompanying drawings, a flowchart of an online learning information recommendation method provided by an embodiment of the present invention is shown.

[0017] This invention provides an online learning information recommendation method, which may include the following steps: S1: Construction of a context-integrated ternary high-order hypergraph: construct a ternary dynamic hypergraph, which includes learner nodes, knowledge point nodes, and context nodes. Context nodes are independent nodes and are associated with learner nodes and knowledge point nodes through hyperedges. Hyperedges are generalized edges that connect multiple heterogeneous nodes. The ternary dynamic hypergraph is represented by a third-order adjacency tensor, and the element values of the third-order adjacency tensor are the hyperedge weights.

[0018] It should be noted that the ternary dynamic hypergraph is denoted as $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ denotes the set of nodes and $\mathcal{E}$ denotes the set of hyperedges. The node set $\mathcal{V}$ contains three types of nodes: the learner node set $\mathcal{U} = \{u_1, \dots, u_M\}$, the knowledge point node set $\mathcal{K} = \{k_1, \dots, k_N\}$, and the context node set $\mathcal{C} = \{c_1, \dots, c_P\}$, where $u_i$ denotes the $i$-th learner node, $M$ denotes the number of learner nodes, $k_j$ denotes the $j$-th knowledge point node, $N$ denotes the number of knowledge point nodes, $c_q$ denotes the $q$-th context node, and $P$ denotes the number of context nodes.

[0019] In one possible implementation, the context node includes at least one of a time slice node, a device type node, an emotional state node, and a social environment node. The time slice node represents the time period during which learning occurs, the device type node represents the type of device used for learning, the emotional state node represents the learner's emotional state, and the social environment node represents the learner's social environment.

[0020] Furthermore, the embedding vectors of the various types of nodes are obtained through an embedding layer. Specifically, embedding matrices $E_U \in \mathbb{R}^{M \times d}$, $E_K \in \mathbb{R}^{N \times d}$, and $E_C \in \mathbb{R}^{P \times d}$ are set for learner nodes, knowledge point nodes, and context nodes respectively, where $\mathbb{R}^{M \times d}$ denotes a real matrix with $M$ rows and $d$ columns, $\mathbb{R}^{N \times d}$ denotes a real matrix with $N$ rows and $d$ columns, $\mathbb{R}^{P \times d}$ denotes a real matrix with $P$ rows and $d$ columns, and $d$ denotes the dimension of the embedding vector. The corresponding embedding vector is obtained by looking up the node index in the table: the embedding vector of learner node $u_i$ is $\mathbf{e}_{u_i}$, the embedding vector of knowledge point node $k_j$ is $\mathbf{e}_{k_j}$, and the embedding vector of context node $c_q$ is $\mathbf{e}_{c_q}$. The parameters of the embedding matrices are learned during training.

[0021] Furthermore, a hyperedge $\varepsilon \in \mathcal{E}$ is a generalized edge that simultaneously associates a learner node $u_i$, at least one knowledge point node $k_j$, and at least one context node $c_q$. It represents the interaction of learner node $u_i$ with knowledge point node $k_j$ under context node $c_q$.

[0022] In one possible implementation, the ternary dynamic hypergraph is represented by a third-order adjacency tensor $\mathcal{A} \in \mathbb{R}^{M \times N \times P}$, a three-dimensional real array of size $M \times N \times P$ whose three dimensions are indexed by learner nodes, knowledge point nodes, and context nodes. The tensor element $\mathcal{A}_{ijq}$ is the weight of the hyperedge representing the interaction of learner node $u_i$ with knowledge point node $k_j$ under context node $c_q$.

[0023] In one possible implementation, the weight of each hyperedge is calculated using a hypergraph attention mechanism. Specifically, for each hyperedge, the embedding vectors of the associated learner node $u_i$, knowledge point node $k_j$, and context node $c_q$ are projected and concatenated, and then passed through an activation function and a normalization function to obtain the hyperedge weight. The hyperedge weight is calculated as shown in formula (1):

$$\mathcal{A}_{ijq} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[W_u \mathbf{e}_{u_i} \,\|\, W_k \mathbf{e}_{k_j} \,\|\, W_c \mathbf{e}_{c_q}\right]\right)\right)}{\sum_{j'=1}^{N}\sum_{q'=1}^{P} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}\left[W_u \mathbf{e}_{u_i} \,\|\, W_k \mathbf{e}_{k_{j'}} \,\|\, W_c \mathbf{e}_{c_{q'}}\right]\right)\right)} \tag{1}$$

where $\mathcal{A}_{ijq}$ denotes the hyperedge weight in the third-order adjacency tensor corresponding to learner node $u_i$, knowledge point node $k_j$, and context node $c_q$; $\mathbf{e}_{u_i}$, $\mathbf{e}_{k_j}$, and $\mathbf{e}_{c_q}$ denote the embedding vectors of learner node $u_i$, knowledge point node $k_j$, and context node $c_q$, each a $d$-dimensional real vector; $d$ denotes the dimension of the embedding vector; $W_u$, $W_k$, and $W_c$ denote the projection matrices of the learner node, the knowledge point node, and the context node, each a real matrix with $d'$ rows and $d$ columns; $d'$ denotes the dimension after projection; $\|$ denotes vector concatenation; $\mathbf{a}$ denotes the attention vector and is a $3d'$-dimensional real vector; $\mathrm{LeakyReLU}$ denotes the LeakyReLU activation function; and $\exp$ denotes the exponential function. $j'$ and $q'$ are summation indices independent of $j$ and $q$, ranging over all $N$ knowledge points and all $P$ contexts respectively.

[0024] Optionally, the embedding dimension $d$ is 128 and the dimension after projection $d'$ is 64. It is understood that those skilled in the art can adjust the values of $d$ and $d'$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this.
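The attention-based hyperedge weighting of formula (1) can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: it assumes the weights are softmax-normalized over all (knowledge point, context) pairs for each learner, as the summation indices in the text suggest, and it uses toy dimensions instead of 128 and 64. The function names, the LeakyReLU slope of 0.2, and the random initialization are assumptions introduced for the example.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(z, slope=0.2):
    # The slope value is an assumption; the text only names LeakyReLU.
    return z if z > 0 else slope * z

def hyperedge_weights(E_u, E_k, E_c, W_u, W_k, W_c, a):
    """Return A[i][j][q], normalized over all (j, q) pairs for each learner i."""
    pu = [matvec(W_u, e) for e in E_u]
    pk = [matvec(W_k, e) for e in E_k]
    pc = [matvec(W_c, e) for e in E_c]
    A = []
    for u in pu:
        # Attention score of the concatenated projections [u || k || c].
        scores = [[leaky_relu(sum(ai * zi for ai, zi in zip(a, u + k + c)))
                   for c in pc] for k in pk]
        m = max(max(row) for row in scores)  # subtract the max for stability
        exps = [[math.exp(s - m) for s in row] for row in scores]
        total = sum(sum(row) for row in exps)
        A.append([[x / total for x in row] for row in exps])
    return A

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, d_proj = 4, 2      # toy sizes for illustration
M, N, P = 2, 3, 2     # learners, knowledge points, contexts
E_u = rand_matrix(M, d, rng)
E_k = rand_matrix(N, d, rng)
E_c = rand_matrix(P, d, rng)
W_u, W_k, W_c = (rand_matrix(d_proj, d, rng) for _ in range(3))
a = [rng.uniform(-1, 1) for _ in range(3 * d_proj)]
A = hyperedge_weights(E_u, E_k, E_c, W_u, W_k, W_c, a)
```

Under this normalization, the weights of each learner's slice `A[i]` sum to 1, so they can be read as how strongly that learner's interactions concentrate on particular knowledge point and context combinations.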

[0025] The hypergraph attention mechanism described above differs from the existing technology that treats context as a node attribute vector. In this embodiment of the invention, context is treated as an independent node in the calculation of hyperedge weights, which distinguishes the importance of different learners interacting with different knowledge points in different contexts. The ternary higher-order association information is encoded into the hyperedge weights.

[0026] S2: Hypergraph Convolutional Feature Encoding with Fusion Context: A hypergraph convolutional neural network is used to encode the features of the ternary dynamic hypergraph, outputting a learner state representation that integrates contextual information.

[0027] In one possible implementation, the feature encoding of a hypergraph convolutional neural network includes two stages: aggregation of node features to hyperedges and distribution of hyperedge features to nodes.

[0028] Furthermore, the first stage aggregates node features to hyperedges. For each hyperedge $\varepsilon$, the feature vectors of all its associated nodes are weighted and aggregated to obtain the feature representation of the hyperedge. The hyperedge feature aggregation is calculated according to formula (2), in which: $\mathbf{f}_{\varepsilon}^{(l)}$ denotes the feature representation of hyperedge $\varepsilon$ in the $l$-th hypergraph convolution layer and is a $d_l$-dimensional real vector; $d_l$ denotes the feature dimension of the $l$-th layer; $\mathcal{N}(\varepsilon)$ denotes the set of nodes associated with hyperedge $\varepsilon$; $w_{\varepsilon}$ denotes the weight of hyperedge $\varepsilon$, which equals $\mathcal{A}_{ijq}$ of formula (1) when $\varepsilon$ associates learner $u_i$, knowledge point $k_j$, and context $c_q$; $\delta(\varepsilon)$ denotes the degree of hyperedge $\varepsilon$, i.e., the number of its associated nodes; $d(v)$ denotes the degree of node $v$, i.e., the number of its associated hyperedges; $W_{\mathrm{agg}}^{(l)}$ denotes the weight matrix of the $l$-th layer aggregation stage and is a real matrix with $d_l$ rows and $d_{l-1}$ columns; $d_{l-1}$ denotes the feature dimension of the $(l-1)$-th layer; $\mathbf{h}_{v}^{(l-1)}$ denotes the feature vector of node $v$ input to the $l$-th layer and is a $d_{l-1}$-dimensional real vector; $\mathbf{b}_{\mathrm{agg}}^{(l)}$ denotes the bias vector of the $l$-th layer aggregation stage and is a $d_l$-dimensional real vector; and $\sigma$ denotes the ReLU activation function.

[0029] Furthermore, the second stage distributes hyperedge features to nodes. For each node $v$, the feature representations of all its associated hyperedges are weighted and aggregated to obtain the updated feature of the node. The node feature distribution is calculated according to formula (3), in which: $\mathbf{h}_{v}^{(l)}$ denotes the feature vector of node $v$ at the $l$-th layer and is a $d_l$-dimensional real vector; $\mathcal{E}(v)$ denotes the set of hyperedges associated with node $v$; $W_{\mathrm{dist}}^{(l)}$ denotes the weight matrix of the $l$-th layer distribution stage and is a real matrix with $d_l$ rows and $d_l$ columns; and $\mathbf{b}_{\mathrm{dist}}^{(l)}$ denotes the bias vector of the $l$-th layer distribution stage and is a $d_l$-dimensional real vector.

[0030] Furthermore, after $L$ hypergraph convolution iterations, the output is a learner state representation that incorporates contextual information, $\mathbf{s}_i = \mathbf{h}_{u_i}^{(L)}$, where $\mathbf{s}_i$ denotes the state representation of learner node $u_i$ and is a $d_s$-dimensional real vector, $d_s$ denotes the feature dimension of the final output, and $\mathbf{h}_{u_i}^{(L)}$ denotes the feature vector of learner node $u_i$ output by the $L$-th hypergraph convolution layer and is a $d_s$-dimensional real vector.

[0031] Optionally, the number of hypergraph convolutional layers $L$ is 3. It is understood that those skilled in the art can adjust the value of $L$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this.
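The two-stage message passing described above (nodes to hyperedges, then hyperedges back to nodes) can be sketched as below. The exact normalization of formulas (2) and (3) is not recoverable from this text, so the sketch assumes a simple scheme: each hyperedge averages its member nodes and scales by its weight, and each node averages its incident hyperedges. The function name, the data layout, and that normalization are all assumptions for illustration.

```python
def relu_vec(v):
    return [max(0.0, x) for x in v]

def affine(W, x, b):
    """Compute W x + b for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def hypergraph_conv_layer(features, hyperedges, W_agg, b_agg, W_dist, b_dist):
    """One layer of node -> hyperedge -> node propagation.

    features:   dict mapping node id -> feature vector
    hyperedges: list of (weight, [node ids]) pairs
    """
    dim = len(next(iter(features.values())))
    # Stage 1: aggregate member-node features into each hyperedge
    # (assumed: mean over members, scaled by the hyperedge weight).
    edge_feats = []
    for weight, members in hyperedges:
        mean = [weight * sum(features[v][t] for v in members) / len(members)
                for t in range(dim)]
        edge_feats.append(relu_vec(affine(W_agg, mean, b_agg)))
    # Stage 2: distribute hyperedge features back to nodes
    # (assumed: mean over the hyperedges incident to the node).
    edge_dim = len(b_agg)
    updated = {}
    for v in features:
        incident = [f for f, (_, members) in zip(edge_feats, hyperedges)
                    if v in members]
        if incident:
            mean = [sum(f[t] for f in incident) / len(incident)
                    for t in range(edge_dim)]
        else:
            mean = [0.0] * edge_dim
        updated[v] = relu_vec(affine(W_dist, mean, b_dist))
    return updated

identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [0.0, 0.0]
features = {"u1": [1.0, 0.0], "k1": [0.0, 1.0], "k2": [1.0, 1.0], "c1": [0.5, 0.5]}
hyperedges = [(0.7, ["u1", "k1", "c1"]), (0.3, ["u1", "k2", "c1"])]
updated = hypergraph_conv_layer(features, hyperedges, identity, zero, identity, zero)
```

In this toy graph the learner `u1` belongs to both hyperedges, so its updated feature mixes information from both knowledge points and from the shared context node `c1`, which is the route by which context reaches the learner state.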

[0032] The aforementioned hypergraph convolutional neural network differs from ordinary graph convolutional networks in the prior art. Ordinary graph convolutional networks only process binary node relationships, while the embodiments of the present invention process ternary higher-order associations through bidirectional information transmission between nodes and hyperedges. Contextual information is incorporated into the learner's state representation via the hyperedges.

[0033] S3: End-to-end parameter generation from cognition to decision: Input the learner state representation into the large language model, and output the weight parameters of the reinforcement learning policy network through the parameter generation head.

[0034] In one possible implementation, the learner state representation $\mathbf{s}_i$ is input into the large language model. Specifically, the learner state representation is mapped to the embedding space of the large language model through a linear projection layer to obtain the projection vector $\mathbf{p}_i = W_p \mathbf{s}_i$, where $\mathbf{p}_i$ denotes the projected vector and is a $d_{\mathrm{LLM}}$-dimensional real vector, $W_p$ denotes the weight matrix of the linear projection layer and is a real matrix with $d_{\mathrm{LLM}}$ rows and $d_s$ columns, and $d_{\mathrm{LLM}}$ denotes the embedding dimension of the large language model. The projection vector is concatenated, as a prefix vector, with the embedding vectors of the prompt text and then input into the large language model. After semantically understanding the learner's state, the large language model outputs a hidden state vector $\mathbf{h}_i^{\mathrm{LLM}}$, which denotes the hidden state vector output by the large language model for learner node $u_i$ and is a $d_{\mathrm{LLM}}$-dimensional real vector.

[0035] Furthermore, a parameter generation head is set at the output layer of the large language model. The parameter generation head maps the hidden state of the large language model to the weight parameters of the reinforcement learning policy network. The parameter generation head outputs the weight parameters using low-rank decomposition, which represents the weight parameters through the product of two low-rank matrices in order to reduce the dimensionality of the output parameters. The low-rank decomposition is calculated as shown in formula (4):

$$W_{\pi} = W_0 + A B \tag{4}$$

where $W_{\pi}$ denotes the weight parameters of the reinforcement learning policy network and is a real matrix with $n_a$ rows and $d_s$ columns; $n_a$ denotes the dimension of the action space, i.e., the number of candidate learning resources; $d_s$ denotes the dimension of the learner state; $W_0$ denotes the base weight matrix of the reinforcement learning policy network and is a real matrix with $n_a$ rows and $d_s$ columns; $A$ denotes the first matrix of the low-rank decomposition and is a real matrix with $n_a$ rows and $r$ columns; $B$ denotes the second matrix of the low-rank decomposition and is a real matrix with $r$ rows and $d_s$ columns; and $r$ denotes the rank of the low-rank decomposition.

[0036] Furthermore, the parameter generation head includes two linear layers. The weight matrix $W_1$ of the first linear layer maps the hidden state vector to the flattened vector of matrix $A$, where $W_1$ is a real matrix with $n_a r$ rows and $d_{\mathrm{LLM}}$ columns. The weight matrix $W_2$ of the second linear layer maps the hidden state vector to the flattened vector of matrix $B$, where $W_2$ is a real matrix with $r d_s$ rows and $d_{\mathrm{LLM}}$ columns. The flattened vectors are then reshaped into matrices of the corresponding dimensions.

[0037] Optionally, the rank $r$ of the low-rank decomposition is 16. It is understood that those skilled in the art can adjust the value of $r$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this.
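The parameter generation head can be illustrated with a small sketch: two linear layers emit the flattened low-rank factors, which are reshaped and combined with the base weight matrix. The additive combination (base matrix plus the product of the two factors) is an assumption consistent with the description of formula (4); the function and variable names are hypothetical, and toy sizes replace the rank of 16.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def reshape(flat, rows, cols):
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def linear(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def generate_policy_weights(hidden, W1, W2, W0, rank):
    """Map an LLM hidden state to policy weights via two low-rank factors."""
    n_actions, d_state = len(W0), len(W0[0])
    A = reshape(linear(W1, hidden), n_actions, rank)   # first factor
    B = reshape(linear(W2, hidden), rank, d_state)     # second factor
    delta = matmul(A, B)
    # Assumed additive form: base weights plus the low-rank product.
    return [[W0[i][j] + delta[i][j] for j in range(d_state)]
            for i in range(n_actions)]

hidden = [1.0, 2.0]                          # stand-in for the LLM hidden state
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # (3 actions * rank 1) rows
W2 = [[1.0, 0.0], [0.0, 0.5]]                # (rank 1 * 2 state dims) rows
W0 = [[0.0, 0.0] for _ in range(3)]
W_policy = generate_policy_weights(hidden, W1, W2, W0, rank=1)
```

The head emits r(n_a + d_s) numbers instead of the n_a * d_s entries of the full weight matrix, which is the dimensionality reduction the low-rank decomposition is meant to provide.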

[0038] The parameter generation head described above differs from the existing methods of outputting text or embedding vectors in large language models. In this embodiment, the parameter generation head directly outputs the weight parameters of the policy network, transforming the semantic understanding results of the large language model into decision parameters, thus shortening the mapping path from cognition to decision.

S4: Bidirectional Co-evolution of Policy and Structure: Construct a reinforcement learning policy network based on weight parameters and output recommendation probabilities. Calculate the loss function based on learning feedback and backpropagate the gradient of the loss function to the third-order adjacency tensor to update the hyperedge weights.

[0039] In one possible implementation, a reinforcement learning policy network is constructed based on the weight parameters $W_{\pi}$, and it outputs the recommendation probability of each candidate learning resource based on the learner state representation. The output probability of the reinforcement learning policy network is calculated as shown in formula (5):

$$\pi(a \mid \mathbf{s}_i) = \frac{\exp\left(\left[W_{\pi}\mathbf{s}_i + \mathbf{b}_{\pi}\right]_a / \tau\right)}{\sum_{a'=1}^{n_a} \exp\left(\left[W_{\pi}\mathbf{s}_i + \mathbf{b}_{\pi}\right]_{a'} / \tau\right)} \tag{5}$$

where $\pi(a \mid \mathbf{s}_i)$ denotes the recommendation probability of selecting action $a$ in learner state $\mathbf{s}_i$; $a$ denotes the index of the recommended learning resource; $\mathbf{b}_{\pi}$ denotes the bias vector of the reinforcement learning policy network and is an $n_a$-dimensional real vector; $[\cdot]_a$ denotes taking the $a$-th element of a vector; $\tau$ denotes the temperature coefficient, used to control the smoothness of the output probability distribution; and $a'$ is the summation index, ranging from 1 to $n_a$ over the indices of all candidate learning resources.

[0040] Optionally, the temperature coefficient $\tau$ is 1.0. It is understood that those skilled in the art can adjust the value of $\tau$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this.
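The temperature-scaled softmax policy of formula (5) can be sketched in a few lines. The function name and the toy weights are assumptions; the maximum logit is subtracted before exponentiation purely for numerical stability, which does not change the resulting probabilities.

```python
import math

def policy_probabilities(W, b, state, tau=1.0):
    """Softmax over the policy logits W s + b, scaled by temperature tau."""
    logits = [(sum(w * s for w, s in zip(row, state)) + bi) / tau
              for row, bi in zip(W, b)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 candidate resources, 2 state dims
b = [0.0, 0.0, 0.0]
state = [1.0, 2.0]
probs_sharp = policy_probabilities(W, b, state, tau=1.0)
probs_flat = policy_probabilities(W, b, state, tau=10.0)
```

A larger temperature flattens the distribution: the third resource is the most probable in both cases, but its probability is noticeably lower at `tau=10.0` than at `tau=1.0`.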

[0041] Furthermore, the loss function is calculated based on the recommendation results and learning feedback. The loss function includes the reinforcement learning loss, the hypergraph structure regularization loss, and the causal consistency loss. The joint loss function is calculated as shown in formula (6):

$$\mathcal{L} = \mathcal{L}_{\mathrm{RL}} + \lambda_1 \mathcal{L}_{\mathrm{reg}} + \lambda_2 \mathcal{L}_{\mathrm{causal}} \tag{6}$$

where $\mathcal{L}$ denotes the joint loss function, $\mathcal{L}_{\mathrm{RL}}$ denotes the reinforcement learning loss, $\mathcal{L}_{\mathrm{reg}}$ denotes the hypergraph structure regularization loss, $\mathcal{L}_{\mathrm{causal}}$ denotes the causal consistency loss, $\lambda_1$ denotes the weight coefficient of the hypergraph structure regularization loss, and $\lambda_2$ denotes the weight coefficient of the causal consistency loss.

[0042] Optionally, $\lambda_1$ is 0.1 and $\lambda_2$ is 0.05. It is understood that those skilled in the art can adjust the values of $\lambda_1$ and $\lambda_2$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this.

[0043] Furthermore, the reinforcement learning loss is calculated based on the policy gradient method, as shown in formula (7):

$$\mathcal{L}_{\mathrm{RL}} = -\sum_{t=1}^{T} b^{t}\, r_t \log \pi(a_t \mid \mathbf{s}_t) \tag{7}$$

where $T$ denotes the length of the learning sequence; $t$ denotes the time step index; $b$ denotes the discount factor, used to balance immediate rewards and long-term rewards; $r_t$ denotes the reward value at step $t$; $a_t$ denotes the action selected at step $t$; $\mathbf{s}_t$ denotes the learner state representation at step $t$; and $\pi(a_t \mid \mathbf{s}_t)$ denotes the recommendation probability of selecting action $a_t$ in learner state $\mathbf{s}_t$, calculated using formula (5). The reward value $r_t$ is determined by the learner's performance on a test taken after completing the recommended resource: the reward is positive when the test is passed and negative when the test is failed.

[0044] Optionally, the discount factor $b$ is 0.99. It is understood that those skilled in the art can adjust the value of $b$ according to actual circumstances, and this embodiment of the invention does not specifically limit this. Because the discount factor is close to 1, the reward weights in the reinforcement learning loss decay slowly across time steps, so the optimization objective of the policy network emphasizes long-term learning effects rather than only immediate feedback.
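A policy-gradient loss of the kind described for formula (7) can be sketched as follows. Two details are assumptions made for the example: the discount is applied as the factor raised to the step index (counting from 1), and the rewards are +1 for a passed test and -1 for a failed one, since the text only states that they are positive and negative respectively.

```python
import math

def reinforcement_loss(rewards, chosen_probs, discount=0.99):
    """Negative discounted sum of reward-weighted log-probabilities.

    rewards:      reward r_t at each step
    chosen_probs: probability pi(a_t | s_t) the policy gave the chosen action
    """
    return -sum((discount ** t) * r * math.log(p)
                for t, (r, p) in enumerate(zip(rewards, chosen_probs), start=1))

# One passed test followed by one failed test (assumed rewards +1 and -1).
rewards = [1.0, -1.0]
chosen_probs = [0.5, 0.5]
loss_default = reinforcement_loss(rewards, chosen_probs)            # discount 0.99
loss_short_sighted = reinforcement_loss(rewards, chosen_probs, 0.5)
```

Minimizing this loss raises the probability of actions that were followed by a passed test and lowers it for actions followed by a failed one; with a discount close to 1, late steps contribute almost as much as early ones.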

[0045] Furthermore, the hypergraph structure regularization loss is used to constrain the sparsity of the third-order adjacency tensor. The hypergraph structure regularization loss is calculated as shown in formula (8):

$$\mathcal{L}_{\mathrm{reg}} = \beta_1 \sum_{i,j,q} \left|\mathcal{A}_{ijq}\right| + \beta_2 \sum_{i,j,q} \mathcal{A}_{ijq}^{2} \tag{8}$$

where $\beta_1$ denotes the coefficient of the L1 regularization term, used to promote sparsity; $\beta_2$ denotes the coefficient of the L2 regularization term, used to prevent excessively large weights; and $|\cdot|$ denotes the absolute value.

[0046] Optionally, $\beta_1$ is 0.01 and $\beta_2$ is 0.001. It is understood that those skilled in the art can adjust the values of $\beta_1$ and $\beta_2$ according to the actual situation, and the embodiments of the present invention do not impose specific limitations on this. The L1 regularization term drives some hyperedge weights toward zero, suppressing weak associations in the hypergraph structure; the L2 regularization term limits the magnitude of the hyperedge weights, preventing a single overly large hyperedge weight from dominating the feature encoding.
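The structure regularization term and its combination into the joint loss can be sketched as below, using the optional coefficient values stated in the text (0.01 and 0.001 for the L1 and L2 terms, 0.1 and 0.05 for the two loss weights). The function names and the toy tensor are assumptions for illustration.

```python
def structure_regularization(A, l1=0.01, l2=0.001):
    """L1 plus L2 penalty over every element of the adjacency tensor."""
    flat = [w for plane in A for row in plane for w in row]
    return l1 * sum(abs(w) for w in flat) + l2 * sum(w * w for w in flat)

def joint_loss(rl_loss, reg_loss, causal_loss, w_reg=0.1, w_causal=0.05):
    """Weighted sum of the three loss terms."""
    return rl_loss + w_reg * reg_loss + w_causal * causal_loss

# Toy tensor: 1 learner, 2 knowledge points, 2 contexts.
A = [[[0.5, -0.5], [0.0, 1.0]]]
reg = structure_regularization(A)
total = joint_loss(rl_loss=1.0, reg_loss=reg, causal_loss=2.0)
```

For this tensor the absolute values sum to 2.0 and the squares to 1.5, so the regularization term is 0.01 * 2.0 + 0.001 * 1.5 = 0.0215, and the joint loss is 1.0 + 0.1 * 0.0215 + 0.05 * 2.0 = 1.10215.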

[0047] Furthermore, the causal consistency loss is used to constrain the consistency between the recommendation results and the causal diagnosis, meaning that the recommended learning resources should target the weak knowledge points identified by the causal diagnosis. The causal consistency loss is calculated according to formula (9), in which: $n_a$ denotes the dimension of the action space, i.e., the number of candidate learning resources; $a$ denotes the index of the learning resource and is the summation variable; $k^{*}$ denotes the weak knowledge point; $\mathrm{rel}(a, k^{*})$ denotes the degree of relevance of learning resource $a$ to weak knowledge point $k^{*}$, with $\mathrm{rel}(a, k^{*}) = 1$ when learning resource $a$ covers weak knowledge point $k^{*}$ and $\mathrm{rel}(a, k^{*}) = 0$ otherwise; $\mathbf{s}_i$ denotes the state representation of learner $u_i$; and $\pi(a \mid \mathbf{s}_i)$ denotes the recommendation probability of selecting action $a$ in learner state $\mathbf{s}_i$, calculated using formula (5). During model training, the weak knowledge point $k^{*}$ is computed by step S5 and used in the loss calculation for the current batch.
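One way to realize a loss that pushes recommendation probability toward resources covering the weak knowledge point is sketched below. The exact form of formula (9) is not recoverable from this text, so the negative log-probability summed over the covering resources is an assumed stand-in, and the binary coverage flags are likewise an assumption.

```python
import math

def causal_consistency_loss(probs, covers_weak_point):
    """Assumed form: sum of -log pi(a|s) over resources that cover the weak point.

    probs:             recommendation probability for each candidate resource
    covers_weak_point: 1 if the resource covers the weak knowledge point, else 0
    """
    return -sum(math.log(p) for p, rel in zip(probs, covers_weak_point) if rel)

covers = [1, 0, 0]   # only the first resource addresses the weak point
loss_on_target = causal_consistency_loss([0.5, 0.25, 0.25], covers)
loss_off_target = causal_consistency_loss([0.1, 0.45, 0.45], covers)
```

The loss is smaller when the policy already favors the covering resource, so minimizing it aligns the recommendation with the causal diagnosis.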

[0048] Furthermore, the gradient of the loss function is backpropagated to the third-order adjacency tensor via the chain rule to update the hyperedge weights. Because the learner state representation $\mathbf{s}_i$ depends on the third-order adjacency tensor $\mathcal{A}$, the gradient of the joint loss function $\mathcal{L}$ with respect to the third-order adjacency tensor, $\partial \mathcal{L} / \partial \mathcal{A}$, can be calculated using the chain rule. Gradient descent is then used to update the hyperedge weights, thereby realizing the bidirectional co-evolution of the recommendation strategy and the knowledge structure.
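The idea of treating the adjacency tensor as a trainable variable can be illustrated with a dependency-free sketch. Instead of true backpropagation, this example estimates the gradient with central finite differences, which is a deliberate simplification; in practice an automatic differentiation framework would compute it through the chain rule. The toy loss, the learning rate, and the function names are assumptions.

```python
def numerical_gradient(loss_fn, A, eps=1e-6):
    """Central-difference estimate of d loss / d A for a nested-list tensor."""
    grad = [[[0.0 for _ in row] for row in plane] for plane in A]
    for i, plane in enumerate(A):
        for j, row in enumerate(plane):
            for q, original in enumerate(row):
                row[q] = original + eps
                up = loss_fn(A)
                row[q] = original - eps
                down = loss_fn(A)
                row[q] = original          # restore the element
                grad[i][j][q] = (up - down) / (2 * eps)
    return grad

def gradient_step(A, grad, lr=0.1):
    """One gradient descent update of every hyperedge weight."""
    return [[[w - lr * g for w, g in zip(row, grow)]
             for row, grow in zip(plane, gplane)]
            for plane, gplane in zip(A, grad)]

def toy_loss(A):
    # Stand-in for the joint loss: pulls every weight toward 0.5.
    return sum((w - 0.5) ** 2 for plane in A for row in plane for w in row)

A = [[[0.2, 0.8]]]
grad = numerical_gradient(toy_loss, A)
A_new = gradient_step(A, grad, lr=0.1)
```

After one step the weights move from 0.2 and 0.8 toward 0.5 and the toy loss decreases, mirroring how hyperedge weights would drift under the gradient of the joint loss.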

[0049] The gradient backpropagation mechanism described above differs from the fixed or independently updated knowledge graph structure in existing technologies. In existing technologies, the graph structure relies on predefined rules or behavioral data statistics, and is independent of the optimization process of the recommendation strategy. In this embodiment of the invention, the third-order adjacency tensor is set as a differentiable variable to participate in end-to-end training, and the hyperedge weights are adaptively adjusted according to the recommendation effect. The recommendation strategy and knowledge structure are jointly updated under the same optimization objective.

[0050] S5: Counterfactual reasoning-driven weak knowledge point location: Based on the knowledge point nodes and their prerequisite relationships in the ternary dynamic hypergraph, a structural causal model is constructed, the causal effect of each prerequisite knowledge point is calculated, the prerequisite knowledge point with the largest causal effect is identified as the weak knowledge point, and the weak knowledge point is input into the large language model to generate a causal diagnostic chain.

[0051] In one possible implementation, a structural causal model is constructed based on knowledge point nodes and their prerequisite relationships in a ternary dynamic hypergraph. The prerequisite relationships between the knowledge point nodes are obtained from the course knowledge system or teaching syllabus, and the prerequisite relationships represent the knowledge points that must be mastered before learning a certain knowledge point. The structural causal model forms a directed graph describing the causal dependencies between knowledge points, with the knowledge point nodes as nodes and the prerequisite relationships between the knowledge point nodes as directed edges.

[0052] Furthermore, the causal effect of each prerequisite knowledge point is calculated; the causal effect characterizes the degree of influence of the prerequisite knowledge point on learning success. The causal effect is calculated using the do operator, which performs an intervention on the prerequisite knowledge point in the structural causal model. The causal effect is the difference in the probability of learning success between the two intervention settings. The causal effect is calculated as shown in formula (10):

$$\mathrm{CE}(k_p) = P\left(Y = 1 \mid do(X_{k_p} = 1)\right) - P\left(Y = 1 \mid do(X_{k_p} = 0)\right) \tag{10}$$

where $\mathrm{CE}(k_p)$ denotes the causal effect of prerequisite knowledge point $k_p$; $Y$ denotes the outcome of mastering the target knowledge point, with $Y = 1$ indicating successful learning; $X_{k_p}$ denotes the mastery state of prerequisite knowledge point $k_p$, with $X_{k_p} = 1$ indicating that it has been mastered and $X_{k_p} = 0$ indicating that it has not; $do(\cdot)$ denotes the do operator, which performs an intervention on the variable; $P(Y = 1 \mid do(X_{k_p} = 1))$ denotes the probability of successfully learning the target knowledge point when the intervention sets prerequisite knowledge point $k_p$ to mastered; and $P(Y = 1 \mid do(X_{k_p} = 0))$ denotes the probability of successfully learning the target knowledge point when the intervention sets prerequisite knowledge point $k_p$ to not mastered.

[0053] Furthermore, the intervention probability is calculated using the adjustment formula. Based on the graph structure of the structural causal model, the backdoor criterion is used to eliminate confounding bias. Let $Pa(k_p)$ denote the set of parent nodes of knowledge point $k_p$, which is taken to be the set of its prerequisite knowledge points. The intervention probability is calculated as a weighted sum over all possible combinations of values of the parent node set, and the conditional and marginal probabilities are estimated from historical learning data.
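A minimal sketch of the backdoor adjustment described above, assuming binary mastery and success variables and historical learning data supplied as a list of dictionaries. The record layout, the function names, and the toy data are hypothetical. For each combination of parent values, the code estimates the conditional success probability and weights it by the marginal frequency of that combination. Raising an error when a stratum has no matching records is a simplifying assumption; a practical system would likely need smoothing or another estimator there.

```python
from collections import Counter


def intervention_probability(records, treatment, parents, outcome, value):
    """Estimate P(outcome = 1 | do(treatment = value)) by backdoor adjustment.

    Computes sum over z of P(outcome = 1 | treatment = value, parents = z) * P(parents = z),
    with both factors estimated as frequencies in `records`.
    """
    if not records:
        raise ValueError("no historical records")
    n = len(records)
    # Marginal counts of each combination of parent values.
    strata = Counter(tuple(r[p] for p in parents) for r in records)
    prob = 0.0
    for z, count_z in strata.items():
        matched = [r[outcome] for r in records
                   if tuple(r[p] for p in parents) == z and r[treatment] == value]
        if not matched:
            # Assumption: fail loudly when a stratum lacks support.
            raise ValueError(f"no records with {treatment}={value} in stratum {z}")
        prob += (sum(matched) / len(matched)) * (count_z / n)
    return prob


def causal_effect(records, treatment, parents, outcome):
    """Difference between the two intervention probabilities."""
    return (intervention_probability(records, treatment, parents, outcome, 1)
            - intervention_probability(records, treatment, parents, outcome, 0))


# Toy data: A is the parent of prerequisite B; Y is success on the target.
records = [
    {"A": 0, "B": 1, "Y": 1}, {"A": 0, "B": 1, "Y": 0},
    {"A": 0, "B": 0, "Y": 0}, {"A": 0, "B": 0, "Y": 0},
    {"A": 1, "B": 1, "Y": 1}, {"A": 1, "B": 1, "Y": 1},
    {"A": 1, "B": 0, "Y": 1}, {"A": 1, "B": 0, "Y": 0},
]
effect_b = causal_effect(records, "B", ["A"], "Y")
```

With this toy data the two intervention probabilities are 0.75 and 0.25, so `effect_b` is 0.5.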

[0054] Furthermore, the prerequisite knowledge point with the largest causal effect is identified as the weak knowledge point, that is, $k^{*} = \arg\max_{k_p} CE(k_p)$, where $CE(k_p)$ is the causal effect of prerequisite knowledge point $k_p$ and $k^{*}$ denotes the weak knowledge point.
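The selection step can be sketched in a few lines, assuming the causal effects have already been computed and collected in a dictionary keyed by knowledge point. The function name and the numeric values are hypothetical. Ties are resolved in favor of the first entry encountered, which is an arbitrary choice not specified by the embodiment.

```python
def locate_weak_point(causal_effects):
    """Return the prerequisite knowledge point with the largest causal effect."""
    if not causal_effects:
        raise ValueError("no prerequisite knowledge points to compare")
    return max(causal_effects, key=causal_effects.get)


# Hypothetical causal effect values for three prerequisite knowledge points.
weak_point = locate_weak_point(
    {"fractions": 0.12, "ratios": 0.31, "exponents": 0.05}
)
```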

[0055] Furthermore, weak knowledge points are input into the large language model to generate a causal diagnostic chain. Specifically, the weak knowledge point identifier, causal effect value, and prerequisite knowledge path are input into the large language model as prompts. The large language model then generates a causal diagnostic chain based on these prompts. The causal diagnostic chain comprises four components: weak knowledge point identifier, description of learning obstacles, explanation of the recommendation rationale, and prediction of expected results. This causal diagnostic chain explains the basis for the recommendation and the expected results to the learner, rather than simply outputting the recommendation result.
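One way the prompt and the four-component chain might be represented is sketched below. The `DiagnosticChain` field names, the prompt wording, and the example values are assumptions for illustration; the call to the large language model itself is deliberately omitted because the embodiment does not specify a model or interface.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticChain:
    """The four components of a causal diagnostic chain."""
    weak_point: str
    obstacle_description: str
    recommendation_rationale: str
    expected_result: str


def build_diagnostic_prompt(weak_point, causal_effect, prerequisite_path):
    """Assemble the three prompt inputs into a single text prompt."""
    path = " -> ".join(prerequisite_path)
    return (
        f"Weak knowledge point: {weak_point}\n"
        f"Causal effect value: {causal_effect:.2f}\n"
        f"Prerequisite knowledge path: {path}\n"
        "Generate a causal diagnostic chain with four parts: "
        "weak knowledge point identifier, description of learning obstacles, "
        "explanation of the recommendation rationale, "
        "and prediction of expected results."
    )


# Hypothetical inputs, used only to show the prompt shape.
prompt = build_diagnostic_prompt(
    "ratios", 0.31, ["fractions", "ratios", "linear_equations"]
)
```

The model's reply would then be parsed into a `DiagnosticChain` so that each component can be shown to the learner separately.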

[0056] In this embodiment of the invention, the counterfactual reasoning-driven weak knowledge point location described above differs from the existing weak point identification method based on accuracy statistics. The existing method only counts the learner's historical accuracy rate on each knowledge point and identifies knowledge points with low accuracy rates as weak points; it does not distinguish between correlation and causation. In this embodiment of the invention, the intervention effect is calculated through the do operator, so that the prerequisite knowledge points that have a causal impact on learning success are identified, rather than knowledge points that are merely correlated with learning failure.

[0057] S6: Recommended Resources and Causal Diagnostic Chain Output: Output recommended learning resources and causal diagnostic chain.

[0058] In one possible implementation, the recommended learning resources are determined based on the recommendation probability: the several learning resources with the highest recommendation probabilities are selected as the final recommendations. The recommended learning resources and their corresponding causal diagnostic chains are output, allowing learners to understand the reasons for the recommendations and the expected learning outcomes.
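The selection of the highest-probability resources can be sketched as follows, assuming the policy network's output is available as a mapping from resource identifier to recommendation probability. The function name, the default of three resources, and the resource identifiers are hypothetical; the embodiment does not fix how many resources are selected.

```python
import heapq


def top_k_resources(recommendation_probs, k=3):
    """Return the k (resource, probability) pairs with the highest probability."""
    return heapq.nlargest(k, recommendation_probs.items(),
                          key=lambda item: item[1])


# Hypothetical recommendation probabilities from the policy network.
selected = top_k_resources(
    {"video_a": 0.1, "quiz_b": 0.5, "notes_c": 0.3, "lab_d": 0.1}, k=2
)
```

Each selected resource would then be paired with its causal diagnostic chain before being presented to the learner.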

[0059] In this embodiment of the invention, the above-mentioned recommendation output method differs from the prior art method of only outputting a recommendation list. A prior art recommendation system usually presents only a list of recommended learning resources, so learners cannot understand the reasons for the recommendations. This embodiment of the invention also outputs a causal diagnostic chain, which includes the weak knowledge point, a description of learning obstacles, the reasons for the recommendation, and the expected effects. On this basis, learners can understand the recommendation logic and decide for themselves whether to adopt the recommendations.

[0060] Referring to Figure 2 of the accompanying drawings, a structural schematic of an online learning information recommendation system provided by an embodiment of the present invention is shown.

[0061] This invention also provides an online learning information recommendation system 20, comprising a processor 201 and a memory 202. The memory 202 stores programs or instructions that can run on the processor 201. When executed by the processor 201, the programs or instructions implement the steps of the online learning information recommendation method described above and achieve the same technical effect. To avoid repetition, the details are not repeated here.

[0062] It should be understood that the processor 201 in this embodiment of the invention may be a central processing unit (CPU), or it may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor.

[0063] It should also be understood that the memory 202 in the embodiments of the present invention can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous DRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).

[0064] The above embodiments can be implemented, in whole or in part, by software, hardware (such as circuits), firmware, or any combination thereof. When implemented using software, the above embodiments can be implemented, in whole or in part, in the form of a computer program product. A computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. Computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless (e.g., infrared, radio, microwave, etc.) means. A computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that includes one or more sets of available media. Available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media. A semiconductor medium can be a solid-state drive.

[0065] It should be understood that, in various embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

[0066] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0067] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the devices, apparatuses, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.

[0068] In the embodiments provided by this invention, it should be understood that the disclosed devices, apparatuses, and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical, or in other forms.

[0069] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0070] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0071] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0072] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the embodiments of the present invention, and are not intended to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the protection scope of the present invention.

Claims

1. An online learning information recommendation method, characterized in that it comprises:
S1: constructing a ternary dynamic hypergraph, which includes learner nodes, knowledge point nodes, and context nodes, wherein the context nodes are independent nodes and are associated with the learner nodes and the knowledge point nodes through hyperedges, the hyperedges are generalized edges that associate multiple heterogeneous nodes, the ternary dynamic hypergraph is represented by a third-order adjacency tensor, and the element values of the third-order adjacency tensor are the hyperedge weights;
S2: using a hypergraph convolutional neural network to encode the features of the ternary dynamic hypergraph and output a learner state representation that incorporates contextual information;
S3: inputting the learner state representation into a large language model, and outputting the weight parameters of a reinforcement learning policy network through a parameter generator;
S4: constructing the reinforcement learning policy network based on the weight parameters and outputting a recommendation probability; calculating a loss function according to learning feedback, and backpropagating the gradient of the loss function to the third-order adjacency tensor to update the hyperedge weights;
S5: based on the knowledge point nodes and their prerequisite relationships in the ternary dynamic hypergraph, constructing a structural causal model, calculating the causal effect of each prerequisite knowledge point, identifying the prerequisite knowledge point with the largest causal effect as the weak knowledge point, and inputting the weak knowledge point into the large language model to generate a causal diagnostic chain;
S6: determining recommended learning resources based on the recommendation probability, and outputting the recommended learning resources and the causal diagnostic chain.

2. The online learning information recommendation method according to claim 1, characterized in that the context nodes include at least one of time slice nodes, device type nodes, emotional state nodes, and social environment nodes.

3. The online learning information recommendation method according to claim 1, characterized in that the third-order adjacency tensor is a three-dimensional array with learners, knowledge points, and contexts as its three dimensions, and the weight of each hyperedge is calculated through a hypergraph attention mechanism.

4. The online learning information recommendation method according to claim 1, characterized in that, in step S2, using the hypergraph convolutional neural network to encode the features of the ternary dynamic hypergraph includes two stages: aggregating node features to hyperedges, and distributing hyperedge features to nodes.

5. The online learning information recommendation method according to claim 1, characterized in that the parameter generator outputs the weight parameters of the reinforcement learning policy network using a low-rank decomposition method, whereby the weight parameters are represented as the product of two low-rank matrices.

6. The online learning information recommendation method according to claim 1, characterized in that the loss function includes a reinforcement learning loss, a hypergraph structure regularization loss, and a causal consistency loss.

7. The online learning information recommendation method according to claim 1, characterized in that the structural causal model uses the knowledge point nodes as nodes and the prerequisite relationships between the knowledge point nodes as directed edges.

8. The online learning information recommendation method according to claim 1, characterized in that the causal effect is calculated using the do operator, which performs an intervention operation on the prerequisite knowledge points in the structural causal model, and the causal effect is the difference in the probability of successful learning before and after the intervention.

9. The online learning information recommendation method according to claim 1, characterized in that the causal diagnostic chain includes a weak knowledge point identifier, a description of learning obstacles, an explanation of the recommendation rationale, and a prediction of expected results.

10. An online learning information recommendation system, characterized in that it comprises a processor and a memory, wherein the memory stores programs or instructions that can run on the processor and that, when executed by the processor, implement the steps of the online learning information recommendation method according to any one of claims 1 to 9.