Theoretical education interaction system and method based on knowledge graph

By constructing a dynamic knowledge graph and a cognitive state diagnosis module, the problems of learner cognitive differences and delayed teaching feedback in theoretical education were solved, enabling the planning of personalized learning paths and multimodal interaction, thereby improving learning efficiency and participation.

CN122201093A · Pending · Publication Date: 2026-06-12 · LIAONING UNIVERSITY OF TECHNOLOGY


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: LIAONING UNIVERSITY OF TECHNOLOGY
Filing Date: 2026-04-29
Publication Date: 2026-06-12


IPC: G09B19/00; G06N5/022; G06F40/30; G06F16/3329; G06Q50/20

AI Tagging

Application Domain

Digital data information retrieval; Data processing applications


AI Technical Summary

⚠Technical Problem

In existing theoretical education, learners have large differences in their cognitive starting points, preconceptions are difficult to identify accurately, the logical connections between concepts lack visualization, teaching feedback is lagging, and personalized learning paths cannot be provided.

⚗Method used

A dynamic knowledge graph is constructed, and the learner's cognitive state is identified through the cognitive state diagnosis module. Personalized learning paths are planned, and multimodal interaction and feedback updates are adopted to form a closed loop of diagnosis-planning-interaction-feedback.

🎯Benefits of technology

It achieved accurate identification of learners' cognitive states, improved learning efficiency by 30%, increased learners' participation and theoretical acceptance, and enabled the teaching knowledge system to self-optimize.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

The application discloses a theoretical education interactive system and method based on a knowledge graph, comprising: a knowledge graph construction module, for constructing a dynamic knowledge graph with core theoretical concepts as nodes; a cognitive state diagnosis module, which uses semantic recognition technology to identify the learner's preconception biases and misunderstanding types and generates a personalized cognitive state vector; a learning path planning module, which generates an adaptive learning path using a path search algorithm; a multimodal interaction module, which presents learning content in multiple forms and collects feedback; and a feedback evaluation and update module, which updates the cognitive state and dynamically adjusts the knowledge graph weights. Through semantic recognition and a closed-loop control strategy, the application addresses the lack of precise diagnosis and personalized paths in traditional education.


Description

Technical Field

[0001] This invention belongs to the interdisciplinary field of intelligent education technology and ideological and political education, specifically involving a theoretical education interactive system and method that combines knowledge graphs, cognitive diagnosis, and adaptive learning path planning.

Background Technology

[0002] Theoretical education plays a crucial role in cultivating learners' scientific worldview. However, current teaching methods face the following challenges: learners have vastly different cognitive starting points, making it difficult to accurately identify pre-existing concepts; the logical connections between theoretical concepts lack visual representation, hindering learners from building a systematic framework; teaching feedback is delayed, lacking real-time dynamic adjustment capabilities; and large classes cannot provide personalized learning paths for each learner. Existing intelligent education technologies are mostly geared towards mathematics and science, making it difficult to adapt to the specific needs of theoretical education regarding the dialectical logical relationships between concepts and value guidance. Therefore, there is an urgent need for an intelligent education system specifically designed for this field.

Summary of the Invention

[0003] This invention provides a theoretical education interaction system and method based on knowledge graphs. By constructing dynamic knowledge graphs, diagnosing learners' cognitive states, planning personalized paths, multimodal interaction, and feedback updates, it achieves precise and intelligent education.

[0004] This invention provides a knowledge graph-based theoretical education interactive system, including a knowledge graph construction module, a cognitive state diagnosis module, a learning path planning module, a multimodal interaction module, and a feedback, evaluation, and update module. These modules work collaboratively to form a closed loop of "diagnosis—planning—interaction—feedback."

[0005] The technical solution provided by this invention is as follows: A knowledge graph-based theoretical education interactive system, characterized in that it includes: The knowledge graph construction module is used to extract entities, relationships, and attributes from classic theoretical literature, textbooks, and policy documents to construct a dynamic knowledge graph with theoretical concepts as nodes. The cognitive state diagnosis module, connected to the knowledge graph construction module, is used to collect learners’ text input through interactive question answering, use natural language processing technology to identify learners’ preconcept distribution and the deviation between them and scientific concepts, and generate personalized cognitive state vectors. The learning path planning module, connected to the cognitive state diagnosis module, is used to generate an adaptive learning path containing a sequence of learning units based on the difference between the personalized cognitive state vector and the preset educational goal using a path search algorithm. A multimodal interaction module, connected to the learning path planning module, is used to present learning content in the form of text, images, audio, video or virtual reality according to the adaptive learning path, and to collect learner feedback data in real time. The feedback evaluation and update module is connected to the multimodal interaction module and the knowledge graph construction module, respectively. It is used to parse the feedback data, update the personalized cognitive state vector, and dynamically adjust the weight of the knowledge graph according to the group learning effect.

[0006] Preferably, the knowledge graph construction module includes: a document parsing unit, used to perform named entity recognition and relation extraction on theoretical texts; a concept hierarchy annotation unit, used to determine the hierarchical, parallel, and progressive relationships between concepts; a timeline annotation unit, used to annotate the time of introduction and development of each theoretical concept; and a misunderstanding library construction unit, used to collect common misunderstandings and compare them with the correct concepts.

[0007] Preferably, the cognitive state diagnosis module employs a text encoder for the ideological and political domain based on a pre-trained language model to encode the learner's text response into a semantic vector, and outputs the mastery probability of each concept by calculating the similarity with the concept node vectors in the knowledge graph; the cognitive state diagnosis module also includes a misunderstanding type identifier, used to identify the misunderstanding type matched by the learner's response.

[0008] Preferably, the learning path planning module employs a heuristic search algorithm or a reinforcement learning algorithm; the state space is defined as the learner's current cognitive state vector, the action space is defined as the set of next learning units that can be pushed, and the heuristic term in the cost function is the difference between the current concept mastery probability and the target mastery probability.

[0009] Preferably, the multimodal interaction module includes: a virtual historical scene reconstruction unit, used to construct a three-dimensional virtual environment in which learners participate in interactive activities through role-playing; an eye-tracking unit, used to collect the learner's gaze coordinates and gaze duration; a voice interaction unit, used to convert spoken responses into text and extract paralinguistic features; and a branching narrative engine, used to provide story scenarios with multiple branching choices.

[0010] A theoretical education interaction method, based on the knowledge graph-based theoretical education interaction system, includes the following steps:
S1: Construct a dynamic knowledge graph with core theoretical concepts as nodes, and pre-set an educational goal graph;
S2: Perform an initial cognitive diagnosis on the learner and obtain a basic cognitive state vector;
S3: Based on the difference between the basic cognitive state vector and the educational goal graph, plan an adaptive learning path;
S4: Following the adaptive learning path, present learning units to the learner through multimodal interaction, and collect feedback data in real time;
S5: Analyze the feedback data and update the learner's cognitive state vector;
S6: Determine whether the current cognitive state meets the termination condition. If not, return to S3; otherwise, output the learning report.

[0011] Compared with existing technologies, this invention has the following beneficial effects: it automatically diagnoses learners' mastery of theoretical concepts and types of misunderstandings through semantic recognition technology, with an accuracy rate of over 85%; it dynamically plans learning paths based on cognitive states, improving learning efficiency by approximately 30%; it enhances learners' participation and theoretical acceptance through multimodal interaction; and it achieves self-optimization of the teaching knowledge system through continuous feedback from group learning data. In summary, this invention significantly improves the personalization level and teaching effectiveness of theoretical education.

Detailed Implementation

[0012] The technical solution of the present invention will be described in detail below with reference to specific embodiments. This section focuses on the semantic recognition method and system control strategy of the present invention.

[0013] I. Semantic Recognition Methods Semantic recognition is the core technology of the cognitive diagnosis module of this system. Its goal is to accurately extract learners' comprehension status from their natural language responses, including: their mastery of a concept, the type of misunderstanding they hold, and the confidence level of their responses.

[0014] 1.1 Text Preprocessing and Vectorization The text input by learners is first cleaned to remove emojis, repeated punctuation, and meaningless interjections. Then, a pre-trained language model fine-tuned on a theoretical corpus is used as an encoder. This encoder maps input text of arbitrary length to a fixed-dimensional semantic vector. This vector encodes not only lexical information but also syntactic structure and contextual semantics. For example, the semantic vectors of the different statements "matter exists objectively" and "matter does not depend on human senses" are very close in spatial distance; while "matter is something that can be seen and touched" will fall into a different region.
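The cleaning step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the emoji code-point ranges, the filler-word list `FILLERS`, and the function name `clean_answer` are all assumptions, and the fine-tuned encoder itself is out of scope here.

```python
import re

# Hypothetical list of filler interjections; the source does not enumerate them.
FILLERS = {"um", "uh", "hmm", "er"}

# Common emoji code-point ranges (an approximation, not an exhaustive list).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Any of these punctuation marks repeated two or more times in a row.
REPEAT_PUNCT_RE = re.compile(r"([!?.,;:])\1+")


def clean_answer(text: str) -> str:
    """Remove emojis, collapse repeated punctuation, and drop filler words."""
    text = EMOJI_RE.sub("", text)
    text = REPEAT_PUNCT_RE.sub(r"\1", text)
    words = [w for w in text.split() if w.strip(".,!?").lower() not in FILLERS]
    return " ".join(words)
```

The cleaned string would then be passed to the encoder to obtain the fixed-dimensional semantic vector.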

[0015] 1.2 Concept Matching and Mastery Probability Calculation Each theoretical concept in the knowledge graph has a standard vector, which is obtained by encoding the standard definition text of the concept. For example, the standard definition of the concept of "matter" is "objective reality that is independent of human consciousness and can be reflected by human consciousness", and its vector representation is V_matter.

[0016] When a learner answers a question, the encoder produces an answer vector V_answer. The system calculates the cosine similarity between V_answer and the target concept vector V_concept:

similarity = (V_answer · V_concept) / (||V_answer|| × ||V_concept||)

Since learners' answers may not be entirely standard, directly using the similarity as the mastery probability would cause fluctuations. The system therefore uses a sigmoid function with a temperature parameter for smooth mapping:

P_mastery = 1 / (1 + exp(-k × (similarity - threshold)))

where k is a temperature coefficient that controls the sensitivity of the mapping (a larger k gives a steeper transition around the threshold), and threshold is the baseline threshold. For core fundamental concepts, a higher threshold is set (more stringent requirements); for advanced concepts, a lower threshold is set.
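The two formulas in this paragraph translate directly into a few lines of Python. The sketch below uses plain lists as vectors; the default `k=10.0` is an illustrative assumption, since the source does not give a value for the temperature coefficient.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def mastery_probability(similarity: float, threshold: float, k: float = 10.0) -> float:
    """Sigmoid mapping: P = 1 / (1 + exp(-k * (similarity - threshold)))."""
    return 1.0 / (1.0 + math.exp(-k * (similarity - threshold)))
```

A similarity exactly at the threshold maps to a mastery probability of 0.5; values above the threshold map above 0.5 and values below it map below 0.5.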

[0017] 1.3 Misunderstanding Type Identification The misunderstanding database stores vector representations of each typical misunderstanding. These vectors are aggregated from the textual codes of common expressions of that misunderstanding. For example, the vector V_mis1 for the misunderstanding "matter is concrete objects" comes from the average of the codes of dozens of actual student responses such as "matter is tables and chairs" and "matter is something that can be seen."

[0018] The system simultaneously calculates the similarity between V_answer and each misunderstanding vector V_mis_j, as well as the similarity with the correct concept vector V_correct. If max_j(sim(V_answer, V_mis_j)) is significantly greater than sim(V_answer, V_correct) (the difference is greater than a preset threshold of 0.2), the learner is determined to hold that misunderstanding type. The matching strength is also output for subsequent severity grading.
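The decision rule in this paragraph can be sketched as below, assuming the similarities have already been computed. The function name and the dictionary-based interface are hypothetical; the margin of 0.2 is the preset threshold stated in the text.

```python
from typing import Dict, Optional, Tuple


def identify_misunderstanding(
    sim_correct: float,
    sim_by_misunderstanding: Dict[str, float],
    margin: float = 0.2,
) -> Optional[Tuple[str, float]]:
    """Return (label, matching strength) if the best-matching misunderstanding
    exceeds the correct-concept similarity by more than `margin`, else None."""
    if not sim_by_misunderstanding:
        return None
    label, strength = max(sim_by_misunderstanding.items(), key=lambda kv: kv[1])
    if strength - sim_correct > margin:
        return label, strength
    return None
```

The returned strength is the raw similarity to the matched misunderstanding vector, which is what the text says is passed on for severity grading.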

[0019] 1.4 Confidence Level and Uncertainty Estimation The system outputs not only the mastery probability and misunderstanding label, but also a confidence score. When the learner's answer has a very low similarity to all concept vectors and misunderstanding vectors (maximum similarity < 0.4), it indicates that the system cannot reliably judge the answer, resulting in low confidence. In this case, the control strategy will trigger a follow-up questioning mechanism instead of directly accepting the answer. For example, the system could reply: "I didn't fully understand what you meant. Could you explain it in another way?"
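A minimal sketch of the low-confidence check follows. It assumes the caller passes in every similarity computed for the answer (against concept vectors and misunderstanding vectors alike); the function name and the `"low"`/`"ok"` labels are illustrative.

```python
from typing import Iterable, Optional, Tuple

FOLLOW_UP = (
    "I didn't fully understand what you meant. "
    "Could you explain it in another way?"
)


def assess_confidence(
    similarities: Iterable[float], floor: float = 0.4
) -> Tuple[str, Optional[str]]:
    """Flag the answer as low-confidence when no vector matches above `floor`."""
    best = max(similarities, default=0.0)
    if best < floor:
        return "low", FOLLOW_UP  # trigger the follow-up question
    return "ok", None
```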

[0020] 1.5 Intent Branch Identification for Open-Ended Responses Beyond simple concept matching, the system also needs to identify learners' intentions in answering open-ended questions. For example, for the question "Please talk about your understanding of practice," possible intentions include: definitional (giving a concept definition), illustrative (listing examples), relational (explaining its relationship with other concepts), and questioning (expressing confusion or differing opinions). The system uses a lightweight intention classifier (a softmax layer added on top of the encoder) to classify the intention of each answer. Different intentions trigger different follow-up interaction strategies: after a definitional answer, the system will ask, "Could you give an example?"; after a questioning answer, the system will first acknowledge the learner's concern and then guide the learner toward clarification.
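The softmax step of the intention classifier can be illustrated as follows. The logits would come from a classification head on the encoder, which is not reproduced here; the ordering of `INTENTS` and the function names are assumptions for this sketch.

```python
import math
from typing import List, Sequence, Tuple

# The four intent categories named in the text; their order is assumed.
INTENTS = ("definitional", "illustrative", "relational", "questioning")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_intent(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most probable intent and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return INTENTS[best], probs[best]
```

The returned label would then select the follow-up strategy, for example asking for an example after a definitional answer.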

[0021] II. System Control Strategy The control strategy determines the interaction timing, state transition conditions, and exception handling between modules. The system uses a finite state machine as the top-level controller, combined with a reinforcement learning action selection mechanism.

[0022] 2.1 Top-level Finite State Machine The system defines the following states and transitions between them: STATE_IDLE: Idle state. Waiting for learners to log in or select a course.

[0023] STATE_DIAGNOSIS: Diagnosis state. Cognitive diagnosis is being performed, collecting the learner's initial knowledge state.

[0024] STATE_PLANNING: Planning state. Generates the next learning path based on the current cognitive state.

[0025] STATE_TEACHING: Teaching state. Learning content is being presented, and real-time feedback is being collected.

[0026] STATE_EVALUATING: Evaluation state. Parsing feedback data and updating the learner model.

[0027] STATE_TERMINAL: Terminal state. Learning objectives completed; output a report.

[0028] State transition conditions: IDLE → DIAGNOSIS: The learner is logged in but has no valid history, or is requesting a re-diagnosis.

[0029] DIAGNOSIS → PLANNING: Diagnosis complete, initial cognitive state vector obtained.

[0030] PLANNING → TEACHING: Generate the next action, preparing to push the learning unit.

[0031] TEACHING → EVALUATING: The current learning unit has ended, and the interaction log has been collected.

[0032] EVALUATING → PLANNING: Assessment complete, cognitive status not up to standard, continue planning.

[0033] EVALUATING → TERMINAL: Assessment completed, cognitive status met, learning ends.
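The states and transitions listed above map naturally onto a lookup table. The following is a minimal sketch; the event names (`"needs_diagnosis"`, `"target_met"`, and so on) are hypothetical labels for the transition conditions, not identifiers from the source.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    DIAGNOSIS = auto()
    PLANNING = auto()
    TEACHING = auto()
    EVALUATING = auto()
    TERMINAL = auto()


# (current state, event) -> next state; event names are illustrative.
TRANSITIONS = {
    (State.IDLE, "needs_diagnosis"): State.DIAGNOSIS,
    (State.DIAGNOSIS, "diagnosis_complete"): State.PLANNING,
    (State.PLANNING, "action_generated"): State.TEACHING,
    (State.TEACHING, "unit_finished"): State.EVALUATING,
    (State.EVALUATING, "below_target"): State.PLANNING,
    (State.EVALUATING, "target_met"): State.TERMINAL,
}


def step(state: State, event: str) -> State:
    """Advance the controller, rejecting transitions not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None
```

Keeping the table explicit makes it easy to see that EVALUATING is the only state with two outgoing transitions, which is where the closed loop either repeats or ends.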

[0034] 2.2 Control Strategies in the Diagnostic Phase The diagnostic module also has its own state machine. At the start of the diagnostic process, the system first filters a list of concepts to be diagnosed from the knowledge graph. The filtering logic is as follows: using core concepts in the learning objective graph as seeds, a breadth-first traversal is performed, retaining concepts that are ≤2 hops away from the seed concept. After obtaining a number of concepts, they are sorted according to their concept level from basic to advanced.
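The filtering logic can be sketched as a bounded breadth-first traversal. The adjacency-dictionary representation of the knowledge graph and the `levels` mapping (lower number meaning more basic) are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List


def select_concepts(
    graph: Dict[str, Iterable[str]],
    seeds: Iterable[str],
    levels: Dict[str, int],
    max_hops: int = 2,
) -> List[str]:
    """Collect concepts within `max_hops` of any seed, sorted basic to advanced."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    # Sort by concept level; ties are broken alphabetically for determinism.
    return sorted(dist, key=lambda c: (levels.get(c, 0), c))
```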

[0035] The diagnosis is divided into three rounds. Round 1 (rapid screening): One multiple-choice question is presented for each concept. Four options are provided, including one correct answer and three commonly misunderstood options. After the learner answers, the system immediately records whether the answer is correct or not. At the end of this round, the system calculates the initial mastery probability P0 for each concept.

[0036] Round 2 (refined diagnosis): For concepts with P0 between 0.3 and 0.7 (the fuzzy area), open-ended questions are presented. The text of the open-ended questions is generated from a template, for example: "Please explain in your own words what [concept name] is." After the learner responds, the system performs semantic recognition to obtain the refined mastery probability P1 and the misunderstanding label.

[0037] Round 3 (in-depth diagnosis): For concepts that remain unclear (low confidence) after P1 is obtained, dilemma-based analytical questions are presented. For example: "Some say [misleading statement], while others say [correct statement]. Which do you think is more reasonable? Why?" Through the learner's reasoning process, deeper cognitive information is obtained. The total time for the three rounds of diagnosis is controlled within 15 minutes to avoid learner fatigue. During the diagnosis process, if a learner demonstrates correct understanding of a concept twice consecutively, the system will skip further diagnosis of that concept to improve efficiency.
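Two of the routing rules in the three-round diagnosis are simple enough to sketch directly: selecting concepts for the open-ended round, and the skip rule for two consecutive correct answers. The function names are hypothetical, and treating the 0.3 and 0.7 bounds as inclusive is an assumption, since the source does not say.

```python
from typing import Dict, List


def round_two_concepts(
    p0: Dict[str, float], low: float = 0.3, high: float = 0.7
) -> List[str]:
    """Concepts whose initial mastery probability falls in the fuzzy area."""
    return [concept for concept, p in p0.items() if low <= p <= high]


def can_skip(correct_history: List[bool], streak: int = 2) -> bool:
    """True once the learner has been correct `streak` times in a row."""
    return len(correct_history) >= streak and all(correct_history[-streak:])
```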

[0038] 2.3 Control Strategies in the Planning Phase The learning path planning module employs a hierarchical planning strategy. Top-level planning determines the learning order: based on the gap between the current cognitive state vector and the target vector, concepts are ordered from largest to smallest gap, while also considering the teaching dependencies in the knowledge graph (e.g., "commodity" must be learned before "currency"). These dependencies are then used to generate a feasible learning sequence through topological sorting.
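One way to combine gap ordering with prerequisite constraints is Kahn's topological sort with a priority queue, so that among the concepts whose prerequisites are satisfied, the one with the largest gap comes first. This is a sketch of that idea under stated assumptions, not necessarily the procedure the patent intends; the data structures and names are hypothetical.

```python
import heapq
from typing import Dict, List, Set


def order_concepts(
    gaps: Dict[str, float], prerequisites: Dict[str, Set[str]]
) -> List[str]:
    """Topologically sort concepts, preferring larger mastery gaps.

    `prerequisites[c]` is the set of concepts that must precede `c`;
    every concept mentioned is assumed to appear in `gaps`.
    """
    indegree = {c: 0 for c in gaps}
    dependents: Dict[str, List[str]] = {c: [] for c in gaps}
    for concept, required in prerequisites.items():
        for pre in required:
            indegree[concept] += 1
            dependents[pre].append(concept)

    # Max-heap on gap via negated keys; the name breaks ties.
    heap = [(-gaps[c], c) for c in gaps if indegree[c] == 0]
    heapq.heapify(heap)
    order: List[str] = []
    while heap:
        _, concept = heapq.heappop(heap)
        order.append(concept)
        for dep in dependents[concept]:
            indegree[dep] -= 1
            if indegree[dep] == 0:
                heapq.heappush(heap, (-gaps[dep], dep))
    if len(order) != len(gaps):
        raise ValueError("prerequisite graph contains a cycle")
    return order
```

With the example from the text, "currency" can never be scheduled before "commodity", even when its gap is larger.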

[0039] The underlying planning is responsible for selecting the optimal interaction form and difficulty for each concept in the sequence. The selection is based on a multi-factor decision function: Score(m, d) = α × effectiveness(m, d, learner_profile) + β × efficiency(m, d) - γ × cost(m, d); Effectiveness is the learning gain predicted based on the learner's historical data (response effects to different interaction methods); efficiency is the amount of knowledge acquired per unit of time; cost includes content preparation costs and equipment requirements (VR is more expensive, text is less expensive). α, β, and γ are dynamically adjusted weights. For learners with weak foundations, α (prioritizing effectiveness) is increased; for learners with limited time, β (prioritizing efficiency) is increased.
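The decision function can be written out as below. The three input terms are taken as already-estimated numbers, since the source does not specify how effectiveness, efficiency, and cost are computed; the option keys and sample values in the usage note are made up for illustration. The default weights match the configuration values given later in the document.

```python
from typing import Dict, Tuple

Option = Tuple[str, str]  # (interaction form m, difficulty d)


def score(
    effectiveness: float,
    efficiency: float,
    cost: float,
    alpha: float = 0.5,
    beta: float = 0.3,
    gamma: float = 0.2,
) -> float:
    """Score(m, d) = alpha * effectiveness + beta * efficiency - gamma * cost."""
    return alpha * effectiveness + beta * efficiency - gamma * cost


def best_option(
    candidates: Dict[Option, Tuple[float, float, float]], **weights: float
) -> Option:
    """Pick the (form, difficulty) pair with the highest score."""
    return max(candidates, key=lambda opt: score(*candidates[opt], **weights))
```

For instance, with hypothetical estimates `{("text", "easy"): (0.6, 0.8, 0.1), ("vr", "easy"): (0.9, 0.4, 0.9)}`, the default weights favor text, while raising `alpha` and lowering `gamma` can tip the choice toward VR.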

[0040] 2.4 Real-time control strategies during the teaching phase During the teaching phase, the multimodal interaction module not only passively presents content but also makes fine adjustments based on real-time sensor data. The following are some key control loops: Eye-tracking feedback control loop: The system collects eye-tracking data 30 times per second. It calculates the proportion of time the learner gazes at key areas of the current screen (such as definition text or example charts). If the gaze proportion is below 20% for 10 consecutive seconds, it is judged as inattentiveness, and the system automatically inserts a short interactive question: "Do you have any questions about what we just discussed?" If the learner's gaze repeatedly revisits a certain area (forming a closed loop in the saccade path), it is judged as confusion, and the system pops up a prompt at an appropriate time: "Does this part need further explanation?"
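The inattentiveness check in the eye-tracking loop can be sketched with a sliding window. This assumes each 30 Hz sample has already been reduced to a boolean (gaze inside a key area or not), and it interprets "below 20% for 10 consecutive seconds" as the proportion over the most recent 10 seconds; both are assumptions, as is the class name.

```python
from collections import deque


class GazeMonitor:
    """Sliding-window detector for low key-area gaze proportion."""

    def __init__(self, rate_hz: int = 30, window_s: int = 10, min_ratio: float = 0.2):
        self.window = deque(maxlen=rate_hz * window_s)
        self.min_ratio = min_ratio

    def add_sample(self, in_key_area: bool) -> bool:
        """Record one sample; return True when inattentiveness is detected."""
        self.window.append(in_key_area)
        if len(self.window) < self.window.maxlen:
            return False  # not enough data for a full window yet
        return sum(self.window) / len(self.window) < self.min_ratio
```

When `add_sample` returns True, the controller would insert the short interactive question described above. Detecting revisits in the saccade path is a separate problem and is not covered by this sketch.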

[0041] Voice emotion control loop: The voice interaction unit extracts features such as speech rate, fundamental frequency, energy, and pause patterns in real time. A lightweight classifier is trained (input is frame-level features extracted by deep learning, output is emotion labels). When frustrated emotions are detected (significantly slower speech rate, lower fundamental frequency, and more long pauses), the system adjusts subsequent interaction strategies: reducing difficulty, adding encouraging words, or switching the teaching format (from question-and-answer to watching videos). When positive emotions are detected (normal speech rate, rich fundamental frequency), the system maintains or slightly increases the challenge.

[0042] Response time control loop: The system records the time from when the learner sees the question to when they begin inputting. Normal response times are between 5 and 30 seconds. If the response time is less than 3 seconds (possibly due to random clicking), the system will ask for confirmation: "Please think carefully before answering." If the response time is longer than 60 seconds, the system will proactively intervene: "Need a hint?" and provide a progressive chain of hints (first keyword hints, then option hints if necessary). This mechanism prevents learners from wasting excessive time when stuck.
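The response-time rule reduces to two thresholds. In the sketch below the action labels are hypothetical; response times between the stated bounds (including the 3 to 5 second and 30 to 60 second ranges, which the text does not address) produce no intervention.

```python
def response_time_action(seconds: float) -> str:
    """Map the time-to-first-input onto a control action."""
    if seconds < 3:
        return "confirm"      # "Please think carefully before answering."
    if seconds > 60:
        return "offer_hint"   # "Need a hint?" then keyword hints, then option hints
    return "none"
```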

[0043] 2.5 Control Strategies in the Evaluation Phase Upon receiving the interaction log, the feedback evaluation update module performs the following sequential operations: 1. Instantly update the learner model: Update the learner's cognitive state vector and misunderstanding profile using the latest response data. An exponential decay weighting method is used, with a weight of 0.4 for new data and a cumulative historical weight of 0.6.
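With a weight of 0.4 on new data and 0.6 on accumulated history, the learner-model update is an exponentially weighted moving average. A minimal sketch for a single concept follows; the function name is an assumption.

```python
def update_mastery(previous: float, observed: float, new_weight: float = 0.4) -> float:
    """Blend the latest observation into the running mastery estimate."""
    return (1.0 - new_weight) * previous + new_weight * observed
```

Applied repeatedly, an observation made n updates ago contributes with weight 0.4 × 0.6^n, which is where the exponential decay comes from.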

[0044] 2. Update the experience replay pool: Store the (s, a, r, s') tuple from this interaction into the replay pool. The replay pool adopts a priority experience replay mechanism, giving higher sampling weights to samples with high rewards or high unexpectedness (significant difference between actual and expected rewards).

[0045] 3. Trigger strategy optimization: When the number of new samples accumulated in the replay pool reaches 32, a batch is randomly sampled from the pool (priority weighted), the loss function is calculated, and a gradient descent update is performed on the deep Q network of the path planning module.
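The replay-pool bookkeeping in steps 2 and 3 can be sketched without the network itself. The priority formula below (positive reward plus absolute surprise plus a small constant) is one plausible reading of "high rewards or high unexpectedness" and is an assumption, as are the class and method names; sampling is with replacement, and the gradient update on the deep Q network is left out.

```python
import random
from collections import deque
from typing import Any, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]  # (s, a, r, s')


class ReplayPool:
    """Fixed-capacity pool with priority-weighted sampling."""

    def __init__(self, capacity: int = 10000, batch_size: int = 32,
                 seed: Optional[int] = None):
        self.items = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.batch_size = batch_size
        self.new_samples = 0
        self.rng = random.Random(seed)

    def add(self, transition: Transition, expected_reward: float) -> bool:
        """Store a transition; return True when a training step is due."""
        reward = transition[2]
        # Assumed priority: favor high reward and high surprise.
        priority = max(reward, 0.0) + abs(reward - expected_reward) + 1e-3
        self.items.append(transition)
        self.priorities.append(priority)
        self.new_samples += 1
        return self.new_samples >= self.batch_size

    def sample(self) -> List[Transition]:
        """Draw a priority-weighted batch and reset the new-sample counter."""
        batch = self.rng.choices(
            list(self.items), weights=list(self.priorities), k=self.batch_size
        )
        self.new_samples = 0
        return batch
```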

[0046] 4. Check termination condition: Determine whether the current learner's mastery probability of all core concepts has reached the target threshold. If it has, the state machine transitions to TERMINAL; otherwise, it returns to PLANNING.

[0047] 2.6 Abnormal State Handling Strategy The system defines the following abnormal states and configures corresponding handling actions: Timeout exception: The learner's total time spent in a single interaction exceeds twice the expected duration. Handling: Interrupt the current unit, record a "Learning Interrupted" flag, and resume from the breakpoint upon the next login.

[0048] Repeated failure anomaly: Learners fail the same concept three times consecutively on tests. Solution: Reduce the difficulty level of the concept, change the interactive format (e.g., from abstract discussion to concrete case analysis), and add a review session.

[0049] Invalid input exception: The learner's answer is empty, contains garbled characters, or is obviously irrelevant. Handling: Ignore the status update, ask the question again, and prompt "Please answer with a complete sentence".

[0050] Equipment malfunction: Eye tracker or VR headset disconnected. Solution: Downgrade to plain text or video mode, log the equipment malfunction, and prompt the learner to check the hardware.

[0051] III. Examples of Collaborative Operation of Semantic Recognition and Control Strategies The following complete example illustrates how semantic recognition and control strategies work together.

[0052] Upon first login, the system enters DIAGNOSIS mode. In the first round of quick screening, the learner answered a multiple-choice question about the concept of "practice" incorrectly. In the second round of refined diagnosis, the system presented an open-ended question: "Please explain in your own words what 'practice' is." The learner answered: "Practice is doing things, such as workers doing work and scientists conducting experiments." The semantic recognition module calculated the similarity between the answer vector and the "practice" concept vector to be 0.65, which is below the threshold of 0.75. However, it also found that the similarity between this answer and the misunderstanding vector "practice is a concrete activity" was significantly higher at 0.82. The misunderstanding type identifier output: misunderstanding label "concretization prototype," with a strength of 0.82 and high confidence.

[0053] The control strategy evaluates the diagnostic result. Due to the high misunderstanding intensity (>0.7), the system decides not to immediately push standard teaching content, but instead to execute a "misunderstanding clarification" sub-process first. The system enters the TEACHING state, calls the branch narrative engine, and initiates a specially designed two-step clarification narrative: Step 1: Show a dialogue between two virtual characters. A says, "Practice is the material activity of people transforming the objective world." B says, "Then does thinking count as practice?" Then ask the learner, "Who do you think is right?" After the learner chooses, the system gives the correct answer and explains, "Practice is a material activity, and thinking does not belong to practice, but practice determines knowledge."

[0054] Step 2: A multiple-choice question is presented: "Which of the following is considered practice? A. Scientists propose hypotheses B. Workers produce parts C. Students memorize theories." The learner correctly selects B. The system records that the misunderstanding has been initially eliminated, and the probability of mastering the concept of "practice" has increased from 0.50 to 0.78.

[0055] After this interaction, the EVALUATING state calculated the reward: ΔP = 0.28, taking 3 minutes. The participation metric was good, resulting in a positive reward. This experience will be used for subsequent strategy optimization.

[0056] IV. System Initialization and Parameter Configuration The following control parameters need to be configured during the initial deployment of the system: Diagnostic phase: core concept threshold 0.85, advanced concept threshold 0.75; misunderstanding severity grading thresholds 0.5 and 0.8.

[0057] Planning phase: α, β, and γ have default values of 0.5, 0.3, and 0.2, respectively; the rolling window size is 10 concepts.

[0058] Teaching phase: eye-tracking sampling rate 30 Hz; inattention trigger threshold (key-area fixation ratio below 20% for 10 seconds); speech emotion classifier confidence threshold 0.7.

[0059] Evaluation phase: Exponentially weighted decay coefficient 0.4; replay pool capacity 10000; DQN batch size 32; learning rate 0.001.
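The parameters listed above could be gathered into a single immutable configuration object. The sketch below uses the values from the text; the field names, the `severity` helper, and its three grade labels are assumptions (the source gives the two grading thresholds but does not name the grades).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ControlConfig:
    # Diagnostic phase
    core_concept_threshold: float = 0.85
    advanced_concept_threshold: float = 0.75
    severity_thresholds: Tuple[float, float] = (0.5, 0.8)
    # Planning phase
    alpha: float = 0.5
    beta: float = 0.3
    gamma: float = 0.2
    planning_window_concepts: int = 10
    # Teaching phase
    eye_tracking_rate_hz: int = 30
    gaze_ratio_threshold: float = 0.2
    gaze_window_seconds: int = 10
    emotion_confidence_threshold: float = 0.7
    # Evaluation phase
    new_data_weight: float = 0.4
    replay_capacity: int = 10000
    dqn_batch_size: int = 32
    learning_rate: float = 0.001

    def severity(self, strength: float) -> str:
        """Grade a misunderstanding's matching strength (labels are illustrative)."""
        low, high = self.severity_thresholds
        if strength < low:
            return "mild"
        if strength < high:
            return "moderate"
        return "severe"
```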

[0060] Although embodiments of the present invention have been disclosed above, they are not limited to the applications listed in the specification and embodiments. They can be applied to various fields suitable for the present invention. For those skilled in the art, other modifications can be easily made. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details and embodiments shown and described herein.

Claims

1. A theoretical education interactive system based on knowledge graphs, characterized in that it includes: a knowledge graph construction module, used to extract entities, relationships, and attributes from classic theoretical literature, textbooks, and policy documents to construct a dynamic knowledge graph with theoretical concepts as nodes; a cognitive state diagnosis module, connected to the knowledge graph construction module, used to collect learners' text input through interactive question answering, use natural language processing technology to identify learners' preconception distribution and its deviation from scientific concepts, and generate personalized cognitive state vectors; a learning path planning module, connected to the cognitive state diagnosis module, used to generate, using a path search algorithm, an adaptive learning path containing a sequence of learning units based on the difference between the personalized cognitive state vector and the preset educational goal; a multimodal interaction module, connected to the learning path planning module, used to present learning content in the form of text, images, audio, video, or virtual reality according to the adaptive learning path, and to collect learner feedback data in real time; and a feedback evaluation and update module, connected to the multimodal interaction module and the knowledge graph construction module, respectively, used to parse the feedback data, update the personalized cognitive state vector, and dynamically adjust the weights of the knowledge graph according to the group learning effect.

2. The knowledge graph-based theoretical education interactive system according to claim 1, characterized in that the knowledge graph construction module includes: a document parsing unit, used to perform named entity recognition and relation extraction on theoretical texts; a concept hierarchy annotation unit, used to determine the hierarchical, parallel, and progressive relationships between concepts; a timeline annotation unit, used to annotate the time of introduction and development of each theoretical concept; and a misunderstanding library construction unit, used to collect common misunderstandings and compare them with the correct concepts.

3. The knowledge graph-based theoretical education interactive system according to claim 1, characterized in that the cognitive state diagnosis module employs a text encoder for the ideological and political domain, based on a pre-trained language model, to encode learners' text responses into semantic vectors; by calculating the similarity between these vectors and the vector of each concept node in the knowledge graph, the module outputs the mastery probability of each concept; the cognitive state diagnosis module also includes a misunderstanding type identifier, which is used to identify the misunderstanding type matched by the learner's response.

4. The knowledge graph-based theoretical education interactive system according to claim 1, characterized in that the learning path planning module employs a heuristic search algorithm or a reinforcement learning algorithm; the state space is defined as the learner's current cognitive state vector, the action space is defined as the set of next learning units that can be pushed, and the heuristic term in the cost function is the difference between the current concept mastery probability and the target mastery probability.

5. The knowledge graph-based theoretical education interactive system according to claim 1, characterized in that the multimodal interaction module includes: a virtual historical scene reconstruction unit, used to construct a three-dimensional virtual environment in which learners participate in interactive activities through role-playing; an eye-tracking unit, used to collect the learner's gaze coordinates and gaze duration; a voice interaction unit, used to convert spoken responses into text and extract paralinguistic features; and a branching narrative engine, used to provide story scenarios with multiple branching choices.

6. A theoretical education interaction method, based on the knowledge graph-based theoretical education interaction system described in any one of claims 1-5, characterized in that it includes: S1: Construct a dynamic knowledge graph with core theoretical concepts as nodes, and pre-set an educational goal graph; S2: Perform an initial cognitive diagnosis on the learner and obtain a basic cognitive state vector; S3: Based on the difference between the basic cognitive state vector and the educational goal graph, plan an adaptive learning path; S4: Following the adaptive learning path, present learning units to the learner through multimodal interaction, and collect feedback data in real time; S5: Analyze the feedback data and update the learner's cognitive state vector; S6: Determine whether the current cognitive state meets the termination condition. If not, return to S3; otherwise, output the learning report.