A Differentiated Learning Content Search Method Based on Knowledge State Assessment
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- XIAOBAO ONLINE HANGZHOU TECH CO LTD
- Filing Date
- 2026-04-02
- Publication Date
- 2026-06-30

Figure CN122309847A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of intelligent education technology, and more specifically to a differentiated learning content search method based on knowledge state assessment.
Background Technology
[0002] In recent years, with the rapid development of online education platforms, knowledge resources have experienced explosive growth. To help students find the content they need from this vast amount of resources, various platforms have widely adopted keyword-matching search engine technology. These systems analyze the titles, descriptions, or full texts of learning resources to build inverted indexes, quickly returning a list of relevant learning resources when users enter search terms. At the same time, users' needs for learning resources have evolved from simply finding relevant content to precisely matching their own needs.
[0003] However, existing technologies fall short of meeting user needs in two ways. First, they lack a precise resource filtering mechanism based on user capabilities. Existing technologies largely rely on keyword matching for retrieval; they neither quantify the user's knowledge level nor establish matching logic between user capability and resource difficulty. As a result, users with different knowledge levels who search for the same keyword receive highly homogeneous results. Users with weak foundations are prone to encountering content beyond their cognitive capacity, while advanced users must repeatedly filter out basic resources they have already mastered. Second, resource tags are uniform: they usually describe only the general attributes of the resources themselves. These tags are not personalized to different users' knowledge weaknesses and ability levels, and so fail to convey intuitively how suitable a resource is for a specific user. This leads to high user decision-making costs and low retrieval efficiency.
Summary of the Invention
[0004] This invention provides a differentiated learning content search method based on knowledge state assessment, which solves the problems in the existing technology, such as the inability of resources to be accurately adapted to user capabilities and the lack of personalization due to fixed tags, resulting in high user decision-making costs.
[0005] To achieve the above objectives, embodiments of the present invention provide a differentiated learning content search method based on knowledge state assessment, comprising the following steps:
Step S1: Based on a preset knowledge graph, analyze the content of learning resources, and label each learning resource in the learning resource library with its associated knowledge point nodes in the knowledge graph; construct a virtual coordinate space, and map the learning resources into the virtual coordinate space to generate multi-dimensional coordinates for each learning resource;
Step S2: Obtain the user's historical information, generate the user's knowledge state profile through a profile analysis strategy, and generate the user search vector based on the knowledge state profile;
Step S3: Obtain user search information, generate search constraints and an initial search point, and control the user to move from the initial search point along the direction of the user search vector in the virtual coordinate space to search for learning resources, so as to capture learning resources under the search constraints and form a learning resource set;
Step S4: For each learning resource in the learning resource set, generate a capture vector based on the coordinates of the search point where the user is located when the resource is captured and the coordinates of the learning resource, obtain the personalized tag that matches the capture vector of the learning resource, and associate it with the learning resource;
Step S5: Sort the learning resources in the learning resource set according to the user's knowledge state profile, so as to generate a sorted result list and output it.
[0006] Optionally, the step of labeling each learning resource in the learning resource library with its associated knowledge point nodes in the knowledge graph includes: performing text and semantic analysis on the content of the learning resource to identify the core knowledge point concept explained by the learning resource; calculating the semantic similarity between the textual expression of the core knowledge point concept and the standard name and alias of each knowledge point node in the knowledge graph; identifying knowledge point nodes whose semantic similarity exceeds a preset threshold as the knowledge point nodes associated with the learning resource; and storing the identifiers of the identified knowledge point nodes as metadata associated with the learning resource.
[0007] Optionally, the user's historical information includes the user's historical exam scores for each subject and browsing history of learning content. The profiling analysis strategy includes: extracting exam score data related to each knowledge point node from the historical exam scores for each subject; calculating the user's historical average score rate for each knowledge point node based on the exam score data, as the mastery score for the corresponding knowledge point node; and analyzing the proportion of browsing of learning resources of different difficulty levels for each knowledge point node from the user's historical browsing history of learning content, to obtain the user's preference for the difficulty of learning content for each knowledge point node.
[0008] Optionally, the process of generating the knowledge state profile further includes: for knowledge point nodes in the knowledge graph, obtaining the mastery score and learning content difficulty preference of the node; based on the learning content difficulty preference, correcting the mastery score to generate the comprehensive ability assessment score of the node, and combining the comprehensive ability assessment scores of each knowledge point node into the knowledge state profile.
[0009] Optionally, the process of generating the user search vector includes: determining the user's search direction in the virtual coordinate space based on the comprehensive ability assessment score in the user's knowledge state profile, and generating a vector pointing in that direction as the user search vector.
[0010] Optionally, the generation of search constraints and the initial search point includes: semantically parsing the search information input by the user to identify the knowledge point concepts involved; mapping the knowledge point concepts to constraints on the values of corresponding dimensions in a virtual coordinate space to generate search constraints; finding the knowledge point node with the highest semantic relevance to the knowledge point concept from the knowledge graph; obtaining the coordinates of all learning resources belonging to the knowledge point node and calculating its center point coordinates; and determining the calculated center point coordinates as the initial search point.
[0011] Optionally, the generation of the personalized tag includes: for each learning resource in the learning resource set, performing natural language processing on the text content of the current learning resource to extract key entity words and topic words to form a candidate keyword set; selecting target keywords from the candidate keyword set based on the relative spatial relationship between the user and the resource represented by the capture vector; and combining the selected target keywords to generate the personalized tag.
[0012] Optionally, the learning resource capture process includes: calculating the Euclidean distance between the user's current position and the coordinates of each learning resource in real time as the user moves along the search vector direction; capturing the learning resource when the Euclidean distance of the learning resource is less than a preset threshold and meets the search constraint conditions.
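The capture process described above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function name `capture_resources`, the discrete `step` size, the `max_steps` limit, and the sample coordinates are all assumptions introduced for illustration, and the "real time" movement is approximated by fixed-length steps along the normalized search direction.

```python
import math


def capture_resources(start, direction, resources, satisfies,
                      radius=1.0, step=0.5, max_steps=20):
    """Walk from `start` along `direction` and capture nearby resources.

    Returns a dict mapping each captured resource id to the search point
    at which it was captured (the point later used for the capture vector).
    `radius`, `step`, and `max_steps` are hypothetical tuning parameters.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("search direction must be non-zero")
    unit = [c / norm for c in direction]

    captured = {}
    point = list(start)
    for _ in range(max_steps + 1):
        for rid, coords in resources.items():
            if rid in captured:
                continue
            # Capture when the Euclidean distance is below the threshold
            # and the resource meets the search constraints.
            if math.dist(point, coords) < radius and satisfies(coords):
                captured[rid] = tuple(point)
        point = [p + step * u for p, u in zip(point, unit)]
    return captured


# Hypothetical coordinates: (knowledge point relevance, difficulty, content type).
resources = {
    "r1": (9.0, 3.2, 2.0),  # near the path and within the constraints
    "r2": (9.5, 3.4, 2.5),  # near the path but wrong content type
    "r3": (8.5, 1.0, 2.0),  # within the constraints but too far from the path
}


def satisfies(coords):
    relevance, difficulty, content_type = coords
    return relevance >= 8.0 and difficulty <= 4.0 and content_type == 2.0


captured = capture_resources((8.0, 3.0, 2.0), (1.0, 0.0, 0.0), resources, satisfies)
print(captured)
```

Recording the search point at the moment of capture keeps the information needed to form the capture vector in the next step without re-running the walk.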
[0013] Optionally, step S1 further includes: labeling each learning resource with a difficulty level based on a preset difficulty classification standard; the sorting of learning resources in the learning resource set includes: for each learning resource in the learning resource set, obtaining the user's mastery score of the knowledge point nodes associated with the learning resource, and the labeled difficulty level of the learning resource, and converting the mastery score into a corresponding baseline difficulty level; calculating the absolute value of the difference between the labeled difficulty level and the baseline difficulty level, and calculating a first matching degree based on the absolute value; calculating the historical interaction intensity value between the learning resource and the user based on historical learning content browsing records; calculating a personalized recommendation score for the learning resource by weighted summation based on the first matching degree and the historical interaction intensity value, and sorting the learning resources based on the personalized recommendation score.
[0014] Optionally, the calculation of the historical interaction intensity value includes: extracting multi-dimensional interaction behavior data of the user on the current learning resources from the user's historical learning content browsing records; and performing weighted calculation on the extracted interaction behavior data of each dimension according to preset weights to obtain the historical interaction intensity value.
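The sorting procedure of the two preceding paragraphs can be sketched in Python as follows. The text fixes only the overall structure (baseline difficulty from the mastery score, a matching degree from the absolute difference, a weighted interaction intensity, and a weighted sum); the linear mapping in `baseline_difficulty`, the form `1 - |difference| / 4` for the first matching degree, the behavior dimensions, and every weight below are assumptions chosen for illustration only.

```python
# Hypothetical interaction dimensions and weights; the description only
# states that preset weights are applied to multi-dimensional behavior data.
BEHAVIOR_WEIGHTS = {"views": 0.3, "dwell": 0.3, "completed": 0.4}


def baseline_difficulty(mastery):
    """Assumed linear mapping of a mastery score in [0, 1] to a level in [1, 5]."""
    clamped = min(max(mastery, 0.0), 1.0)
    return 1 + 4 * clamped


def first_matching_degree(labeled, baseline):
    """Assumed form: 1 when the levels coincide, 0 when they differ by 4."""
    return 1 - abs(labeled - baseline) / 4


def interaction_intensity(behavior):
    """Weighted sum of normalized behavior values (missing values count as 0)."""
    return sum(w * behavior.get(k, 0.0) for k, w in BEHAVIOR_WEIGHTS.items())


def rank(resources, mastery_by_node, w_match=0.7, w_interact=0.3):
    """Return resource ids sorted by personalized recommendation score."""
    scored = []
    for r in resources:
        base = baseline_difficulty(mastery_by_node[r["node"]])
        match = first_matching_degree(r["difficulty"], base)
        score = w_match * match + w_interact * interaction_intensity(r["behavior"])
        scored.append((score, r["id"]))
    scored.sort(reverse=True)
    return [rid for _, rid in scored]


resources = [
    {"id": "A", "node": "kp1", "difficulty": 4,
     "behavior": {"views": 0.2, "dwell": 0.1}},
    {"id": "B", "node": "kp1", "difficulty": 2,
     "behavior": {"views": 1.0, "dwell": 1.0, "completed": 1.0}},
    {"id": "C", "node": "kp1", "difficulty": 5, "behavior": {}},
]
ranked = rank(resources, {"kp1": 0.75})
print(ranked)
```

With these assumed weights, a resource whose labeled difficulty equals the baseline outranks a heavily used resource that is two levels too easy; other weightings would shift that balance.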
[0015] This invention provides a differentiated learning content search method based on knowledge state assessment. It utilizes a knowledge graph to achieve structured association and virtual coordinate space mapping between learning resources and knowledge points. Combined with user historical data, it generates a quantified knowledge state profile and search vector, guiding users to conduct targeted searches and capture suitable resources in the virtual coordinate space. Simultaneously, it generates personalized tags for resources that fit the user's state, and then sorts and outputs resources based on the user profile. This solution constructs a comprehensive differentiated mechanism from resource annotation, targeted search, personalized tag generation to precise sorting. It achieves precise resource selection based on user capabilities, ensuring a high degree of match between resource difficulty and user knowledge level. Furthermore, it intuitively conveys resource value through personalized tags, effectively reducing user decision-making costs, significantly improving the accuracy and efficiency of resource retrieval, and optimizing the personalized learning experience.
Brief Description of the Drawings
[0016] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings:
[0017] Figure 1 is a flowchart of the differentiated learning content search method provided in an embodiment of the present invention; Figure 2 is a flowchart of user profile and search vector generation provided in an embodiment of the present invention; Figure 3 is a flowchart of the personalized tag generation process provided in an embodiment of the present invention; Figure 4 is a flowchart of resource sorting provided in an embodiment of the present invention.
Detailed Description
[0018] The specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for illustration and explanation only and are not intended to limit the scope of the present invention.
[0019] It should be noted that the acquisition, transmission, storage, use, and processing of data in the technical solution of this application all comply with the relevant provisions of national laws and regulations. In the embodiments of this application, certain existing industry solutions such as software, components, and models may be mentioned. These should be considered exemplary, intended only to illustrate the feasibility of implementing the technical solution of this application, and do not imply that the applicant has already used or necessarily used such solutions.
[0020] As online education demands increasingly personalized learning experiences and precise teaching services, existing learning content search technologies suffer from several issues: single-keyword retrieval fails to match users' knowledge levels, traditional recommendations ignore dynamic knowledge states, deep knowledge graph-based retrieval is lacking, multi-dimensional fusion analysis is insufficient, and resource ranking does not comprehensively consider adaptability and interactive behavior. Therefore, developing a search solution that matches users' knowledge states, integrates multi-source information, and possesses differentiated retrieval and intelligent ranking capabilities is crucial.
[0021] To address this, this invention proposes a differentiated learning content search method based on knowledge state assessment. It constructs a knowledge graph and virtual coordinate space labeled with knowledge points as the retrieval benchmark. Based on user exam scores and browsing history, it generates a knowledge state profile and search vector. Semantic parsing determines search constraints and initial points, enabling targeted capture of suitable resources and generation of personalized tags. Finally, it integrates multiple factors such as difficulty matching degree and interaction intensity for weighted ranking, forming a closed-loop process from graph modeling, state assessment, resource capture to intelligent output. This significantly improves retrieval accuracy, personalization, and robustness to users' dynamic learning needs.
[0022] The invention is described in detail below with reference to Figures 1-4.
[0023] As shown in Figure 1, this embodiment of the invention provides a differentiated learning content search method based on knowledge state assessment, including the following steps:
Step S1: Based on a preset knowledge graph, analyze the content of learning resources, and label each learning resource in the learning resource library with its associated knowledge point nodes in the knowledge graph; construct a virtual coordinate space, and map the learning resources into the virtual coordinate space to generate multi-dimensional coordinates for each learning resource;
Step S2: Obtain the user's historical information, generate the user's knowledge state profile through a profile analysis strategy, and generate the user search vector based on the knowledge state profile;
Step S3: Obtain user search information, generate search constraints and an initial search point, and control the user to move from the initial search point along the direction of the user search vector in the virtual coordinate space to search for learning resources, thereby capturing learning resources under the search constraints and forming a learning resource set;
Step S4: For each learning resource in the learning resource set, generate a capture vector based on the coordinates of the user's search point when the resource is captured and the coordinates of the learning resource, obtain a personalized tag that matches the capture vector, and associate it with the learning resource;
Step S5: Based on the user's knowledge state profile, sort the learning resources in the learning resource set to generate a sorted result list and output it.
[0024] Among them, the knowledge graph refers to a structured semantic network built around a subject knowledge system, with knowledge points as nodes and logical relationships between knowledge points as edges, used to standardize the representation of the subject knowledge system. Each node contains metadata such as standard name, alias, and knowledge attributes. The virtual coordinate space refers to a three-dimensional space constructed with "knowledge point relevance, difficulty level, and content type" as dimensions to achieve quantitative mapping of resources. The knowledge state profile refers to a vector representation formed based on the user's mastery score of each knowledge point and difficulty preference, used to quantitatively describe the user's knowledge ability and learning needs. The user search vector refers to a spatial retrieval direction vector determined based on the knowledge state profile, used to guide targeted resource capture. The capture vector refers to a vector representing the relative relationship between the user's search position and resource coordinates, used for personalized tag generation. The personalized result set refers to a set of learning resources adapted to the user's knowledge state after targeted retrieval and constraint filtering. The personalized tag refers to a combination of keywords generated based on the capture vector and the core information of the resource, used to quickly reflect the resource's adaptation value.
[0025] The differentiated learning content search method provided in this invention first completes the structured annotation and quantitative mapping of resources through a knowledge graph and virtual coordinate space. Then, it constructs a precise knowledge state profile and search vector based on the user's multi-dimensional historical data. It achieves precise capture of suitable resources through constraints, initial search points, and directional vectors. Targeted tags are generated by combining the captured vectors, and finally, the results are output through multi-factor weighted ranking. This process deeply integrates the structured advantages of knowledge graphs, the quantitative retrieval capabilities of virtual space, and the personalized needs of user states, forming a closed-loop processing of resource modeling, state evaluation, directional capture, tag generation, and intelligent ranking. It breaks the limitations of traditional retrieval relying on single keywords, achieving precise matching of learning resources with users' knowledge capabilities and dynamic needs. This helps users quickly identify the core value of resources, improves the accuracy and robustness of retrieval, and effectively optimizes the learning experience and efficiency.
[0026] Preferably, the step of labeling each learning resource in the learning resource library with its associated knowledge point nodes in the knowledge graph includes: performing text and semantic analysis on the content of the learning resource to identify the core knowledge point concept explained by the learning resource; calculating the semantic similarity between the textual expression of the core knowledge point concept and the standard name and alias of each knowledge point node in the knowledge graph; determining the knowledge point nodes whose semantic similarity exceeds a preset threshold as the knowledge point nodes associated with the learning resource; and storing the identifiers of the determined knowledge point nodes as metadata associated with the learning resource.
[0027] The core knowledge point concepts refer to the knowledge units in the learning resource text that carry the core teaching objectives. They are the core framework of the resource content and are selected through "keyword weight ranking + semantic core positioning," distinguishing them from secondary information such as auxiliary explanations and extended examples. Text and semantic analysis adopts a standardized NLP process of "preprocessing - semantic encoding - key information extraction," specifically including: Preprocessing: using the Jieba word segmentation tool to segment the resource text, filtering out meaningless words, and then using the spaCy tool for part-of-speech tagging; Semantic encoding: inputting the preprocessed text into a pre-trained BERT-base model to generate sentence-level semantic vectors, capturing the semantic relationships within the text context; Key information extraction: calculating the importance score of each word after segmentation based on the TextRank algorithm, and combining this with a subject-specific dictionary to select a set of words with high professional relevance. The preset threshold refers to a pre-set semantic similarity judgment threshold used to select knowledge point nodes that match the core knowledge point concepts, typically ranging from 0.7 to 0.9. Node identifier: refers to the unique identification symbol assigned to each knowledge point node in the knowledge graph.
[0028] Specifically, semantic similarity is calculated using the word-vector cosine similarity algorithm, as shown in the following formula:
$$\mathrm{Sim}(A,B)=\frac{\vec{A}\cdot\vec{B}}{\lVert\vec{A}\rVert\,\lVert\vec{B}\rVert}=\frac{\sum_{i=1}^{n}A_i B_i}{\sqrt{\sum_{i=1}^{n}A_i^{2}}\,\sqrt{\sum_{i=1}^{n}B_i^{2}}}$$
[0029] where $\mathrm{Sim}(A,B)$ is the semantic similarity between the textual expression A of the core knowledge point concept and the knowledge point node name B in the knowledge graph, with a value in [0,1]; the closer the value is to 1, the more consistent the semantics. $\vec{A}$ is the word vector of the textual expression of the core knowledge point concept; $\vec{B}$ is the word vector of the standard name of the knowledge point node in the knowledge graph; $A_i$ and $B_i$ are the values of $\vec{A}$ and $\vec{B}$ in the i-th dimension, respectively; and $\lVert\vec{A}\rVert$ and $\lVert\vec{B}\rVert$ are the L2 norms of $\vec{A}$ and $\vec{B}$, respectively.
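As an illustration of the similarity matching and threshold screening described above, the following Python sketch computes cosine similarity on plain lists and keeps the nodes whose best name or alias similarity exceeds a threshold. The three-dimensional toy vectors, the node identifiers, and the default threshold of 0.8 (a value inside the 0.7 to 0.9 range mentioned earlier) are assumptions; in practice the vectors would come from the semantic encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate_nodes(concept_vec, node_vecs, threshold=0.8):
    """Return ids of nodes whose best name/alias similarity exceeds `threshold`.

    `node_vecs` maps a node identifier to the vectors of its standard name
    and aliases.
    """
    matched = []
    for node_id, vectors in node_vecs.items():
        best = max(cosine_similarity(concept_vec, v) for v in vectors)
        if best > threshold:
            matched.append(node_id)
    return matched


# Toy vectors for illustration only.
concept = [0.9, 0.1, 0.4]
nodes = {
    "KP-001": [[0.8, 0.2, 0.5], [0.1, 0.9, 0.0]],  # standard name, one alias
    "KP-002": [[0.0, 1.0, 0.1]],
}
matched = associate_nodes(concept, nodes)
print(matched)  # the identifiers to store as resource metadata
```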
[0030] The preferred embodiment of this invention achieves automated and high-precision association annotation of learning resources and knowledge graph knowledge point nodes by combining a standardized text semantic analysis process with precise semantic similarity matching. This not only solves the semantic mismatch problem of traditional keyword matching, but also supports large-scale resource batch processing, greatly improving annotation efficiency and accuracy, and laying a precise resource annotation foundation for subsequent personalized search.
[0031] As shown in Figure 2, preferably, the profile analysis strategy, which yields the mastery score and the learning content difficulty preference, includes: extracting exam score data related to each knowledge point node from the historical exam scores of each subject; calculating the user's historical average score rate on each knowledge point node based on the exam score data, as the mastery score of the corresponding knowledge point node; and analyzing the proportion of browsing of learning resources of different difficulty levels under each knowledge point node from the user's historical learning content browsing records, to obtain the user's learning content difficulty preference for each knowledge point node.
[0032] More preferably, the process of generating the knowledge state profile includes: for knowledge point nodes in the knowledge graph, obtaining the mastery score and learning content difficulty preference of the node; based on the learning content difficulty preference, correcting the mastery score, generating the comprehensive ability assessment score of the node, and combining the comprehensive ability assessment scores of each knowledge point node into a knowledge state profile in vector form.
[0033] More preferably, the process of generating the user search vector includes: determining the user's search direction in the virtual coordinate space based on the comprehensive ability assessment score in the user's knowledge state profile, and generating a vector pointing in that direction as the user search vector.
[0034] Among them, the exam score data related to each knowledge point node refers to the score extracted from the user's historical exam structured data through a pre-constructed "question-knowledge point" mapping table based on a knowledge graph, and assigned to the score set of the corresponding knowledge point node. The mastery score is the ratio of the user's average historical exam score for a specific knowledge point node to the average total score of the corresponding questions for that knowledge point; it is a core indicator for quantifying the user's mastery of the knowledge point. Learning content difficulty preference refers to the user's browsing behavior towards resources of different difficulty levels under a specific knowledge point node. The comprehensive ability assessment score is the mastery score adjusted for difficulty preference, more closely reflecting the user's actual learning ability and needs. The user search vector refers to the retrieval direction vector mapped to the virtual coordinate space, used to guide the targeted capture of suitable resources.
[0035] Specifically, the mastery score is calculated as follows:
$$M_i=\frac{\sum_{k=1}^{K_i}s_{i,k}}{\sum_{k=1}^{K_i}t_{i,k}}$$
[0036] where $M_i$ is the mastery score of knowledge point node i; $s_{i,k}$ is the score the user obtained on the questions corresponding to knowledge point node i in the k-th exam; $t_{i,k}$ is the total score available for the questions corresponding to knowledge point node i in the k-th exam; and $K_i$ is the number of exams the user has taken that contain questions related to this knowledge point node. The difficulty preference value of a knowledge point node is calculated as:
$$P_i=\frac{\sum_{j=1}^{5}w_j\,n_{i,j}}{\sum_{j=1}^{5}n_{i,j}}$$
[0037] where $P_i$ is the difficulty preference value of the knowledge point node, with a value range of [1,5]; the larger the value, the stronger the preference for more difficult resources. $n_{i,j}$ is the number of times the user browsed resources of difficulty level j under knowledge point node i, and $w_j$ is the preference weight of difficulty level j: basic (j=1) is assigned a weight of 1, intermediate (j=2) a weight of 1.2, intermediate (j=3) a weight of 1.5, advanced (j=4) a weight of 1.8, and challenging (j=5) a weight of 2.0, so that higher-difficulty resources carry a larger preference weight. The comprehensive ability assessment score is calculated as:
$$S_i=M_i\times\bigl(1+\alpha\,(P_i-3)\bigr)$$
[0038] where $S_i$ is the comprehensive ability assessment score of knowledge point node i, with a value range of [0, 1.4]; $\alpha$ is the correction factor, fixed at 0.1; and $(P_i-3)$ is the degree to which the difficulty preference deviates from the medium level. $P_i>3$ indicates a preference for high difficulty, and the score increases after correction; $P_i<3$ indicates a preference for basic difficulty, and the score decreases slightly after correction. The user search vector is quantified as:
$$\vec{V}=\sum_{i=1}^{m}\omega_i\,\vec{e}_i$$
[0039] where $\vec{V}$ is the user's personalized search vector; m is the total number of knowledge point nodes in the knowledge graph; $\vec{e}_i$ is the unit vector of knowledge point node i in the virtual coordinate space; and $\omega_i$ is a weighting coefficient, computed as $\omega_i=1-S_i/1.4$ in the example of paragraph [0040]: the lower the comprehensive ability assessment score, the greater the weight and the higher the search priority.
[0040] For example, take the knowledge point "quadratic formula for solving quadratic equations in one variable" in junior high school mathematics. In user Xiaoming's last three exams, the total scores available for the questions on this knowledge point were 10, 15, and 12 points, and he scored 8, 12, and 9 points respectively. Verification using the 3σ principle showed no abnormal data, so his mastery score is (8+12+9) / (10+15+12) × 100% ≈ 78.38%. In Xiaoming's browsing history for this knowledge point, he viewed resources of difficulty level 1 twice, level 2 five times, level 3 eight times, level 4 six times, and level 5 three times. The weighted difficulty preference value is (2×1 + 5×1.2 + 8×1.5 + 6×1.8 + 3×2.0) / (2+5+8+6+3) = 36.8 / 24 ≈ 1.533. Using the comprehensive ability assessment score formula, the comprehensive ability assessment score is 78.38% × (1 + 0.1 × (1.533 − 3)) ≈ 78.38% × 0.8533 ≈ 66.88%. Finally, the 66.88% for this knowledge point is combined, according to the knowledge graph, with the comprehensive ability assessment scores of other knowledge points such as "factorization" and "quadratic function graph" to form Xiaoming's knowledge state profile vector for mathematics: [66.88%, 82.10%, 59.32%, ...].
Subsequently, assume that the unit vector for "quadratic formula" in the three-dimensional virtual coordinate space is (7.8, 3.5, 2.1), the unit vector for "factorization" is (6.2, 4.1, 1.8), and the unit vector for "quadratic function graph" is (9.1, 3.8, 2.3). The weight coefficients of the knowledge points are: "quadratic formula" = 1 − 66.88% / 140% ≈ 1 − 0.4777 ≈ 0.5223; "factorization" = 1 − 82.10% / 140% ≈ 0.4136; "quadratic function graph" = 1 − 59.32% / 140% ≈ 0.5763. The contribution component of each node is: "quadratic formula" component = 0.5223 × (7.8, 3.5, 2.1) ≈ (4.07, 1.83, 1.10); "factorization" component = 0.4136 × (6.2, 4.1, 1.8) ≈ (2.56, 1.70, 0.74); "quadratic function graph" component = 0.5763 × (9.1, 3.8, 2.3) ≈ (5.24, 2.19, 1.33). The user search vector for Xiaoming's mathematics subject is then obtained from the quantification formula of the user search vector: V = (4.07 + 2.56 + 5.24 + ..., 1.83 + 1.70 + 2.19 + ..., 1.10 + 0.74 + 1.33 + ...) ≈ (21.3, 13.5, 8.7).
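The chain of calculations in this example (mastery score, difficulty preference, comprehensive ability assessment score, weight coefficient, and search vector) can be recomputed from the stated inputs with a short Python sketch. The function and variable names are hypothetical, and the weight rule `1 - score / 1.4` is taken from the worked example rather than from a separately stated formula. Because only three knowledge point nodes are given explicitly, the resulting vector is a partial sum and is not expected to equal the full vector over all m nodes.

```python
# Constants taken from the description; the names are illustrative.
DIFFICULTY_WEIGHTS = {1: 1.0, 2: 1.2, 3: 1.5, 4: 1.8, 5: 2.0}
ALPHA = 0.1      # correction factor
SCORE_MAX = 1.4  # stated upper bound of the comprehensive score


def mastery_score(earned, totals):
    """Sum of points earned divided by sum of points available."""
    return sum(earned) / sum(totals)


def difficulty_preference(browse_counts):
    """Mean of the difficulty weights, weighted by browsing counts."""
    views = sum(browse_counts.values())
    weighted = sum(DIFFICULTY_WEIGHTS[j] * n for j, n in browse_counts.items())
    return weighted / views


def comprehensive_score(mastery, preference):
    """Mastery score corrected by the deviation of the preference from 3."""
    return mastery * (1 + ALPHA * (preference - 3))


def search_vector(scores, node_vectors):
    """Sum of node vectors, each weighted by 1 - score / SCORE_MAX."""
    dims = len(next(iter(node_vectors.values())))
    total = [0.0] * dims
    for node, score in scores.items():
        weight = 1 - score / SCORE_MAX
        for d in range(dims):
            total[d] += weight * node_vectors[node][d]
    return tuple(total)


m = mastery_score([8, 12, 9], [10, 15, 12])
p = difficulty_preference({1: 2, 2: 5, 3: 8, 4: 6, 5: 3})
s = comprehensive_score(m, p)

scores = {"quadratic_formula": s, "factorization": 0.8210, "quadratic_graph": 0.5932}
vectors = {
    "quadratic_formula": (7.8, 3.5, 2.1),
    "factorization": (6.2, 4.1, 1.8),
    "quadratic_graph": (9.1, 3.8, 2.3),
}
v = search_vector(scores, vectors)  # partial sum over the three listed nodes
print(round(m, 4), round(p, 3), round(s, 4), [round(c, 2) for c in v])
```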
[0041] The preferred embodiment of this invention collects users' historical exam scores and learning browsing records from multiple dimensions, accurately analyzes the mastery scores and learning difficulty preferences of each knowledge point node, and then generates a comprehensive ability assessment score that matches the user's actual ability after difficulty preference correction. This constructs a quantitative knowledge status profile and generates a targeted search vector, which not only solves the problems of the traditional user status assessment being singular and one-sided, but also achieves a precise binding between the user's knowledge ability and the search direction, making subsequent resource retrieval more targeted. It provides an accurate and reliable user ability benchmark for differentiated resource screening, personalized tag generation, and intelligent sorting, significantly improving the personalization adaptability and retrieval accuracy of the entire search method.
[0042] Preferably, the generation of the search constraints and the initial search point includes: semantically parsing the search information input by the user to identify the knowledge point concepts involved; mapping the knowledge point concepts to constraints on the values of corresponding dimensions in a virtual coordinate space to generate search constraints; finding the knowledge point node with the highest semantic relevance to the knowledge point concept from the knowledge graph; obtaining the coordinates of all learning resources belonging to the knowledge point node and calculating its center point coordinates; and determining the calculated center point coordinates as the initial search point.
[0043] Among them, user search information refers to the search keywords, phrases, or natural language queries entered by the user, which is the core input information that triggers the retrieval; semantic parsing refers to the process of extracting core knowledge point concepts by segmenting, tagging, identifying entities, and understanding the search information through natural language processing technology; knowledge point concept mapping refers to the process of semantically matching the parsed core concepts with knowledge point nodes in the knowledge graph and transforming them into the corresponding dimensional value range in the virtual coordinate space; search constraints refer to the rules that limit the retrieval range within the virtual coordinate space, used to filter irrelevant resources and focus on the user's target knowledge points; the initial search point refers to the starting position of the retrieval, which is the coordinate center point determined based on the user's search intent and the distribution of knowledge graph node resources; semantic relevance refers to the degree of semantic matching between the parsed knowledge point concepts and the standard names and aliases of knowledge graph nodes, used to filter core related nodes.
[0044] Specifically, a BERT pre-trained model is used to segment and semantically encode the user search information. After keyword entities are extracted, non-knowledge-point entities are excluded through knowledge graph dictionary matching so that the core knowledge point concepts are accurately determined. The semantic encoding vector of each core knowledge point concept is then compared with the semantic vector of each knowledge point node in the knowledge graph using cosine similarity. The semantic relevance value ranges from 0 to 1, with a higher value indicating a stronger association between the concept and the node. Subsequently, the core knowledge point concept is mapped to the "knowledge point relevance" dimension in the virtual coordinate space, with the value of this dimension constrained to be no less than 8.0 points; value restrictions on the other dimensions are then added based on the difficulty preference in the user's knowledge state profile, forming a multi-dimensional search constraint. Next, the knowledge point node with a semantic relevance of no less than 0.9 is selected as the core relevance node, and the coordinates of all resources under this node that have been mapped to the virtual coordinate space are collected. After the total number of resources is counted, the coordinate values of all resources are summed separately in each of the three dimensions "knowledge point relevance", "difficulty level", and "content type", and each sum is divided by the total number of resources to obtain the center point coordinate of that dimension. The resulting three-dimensional coordinate is the initial search point for subsequent retrieval.
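The node selection step described above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the plain-list vector representation, and the toy two-dimensional vectors in the usage note are assumptions made for the example, and the BERT encoding that would produce real semantic vectors is not shown. The 0.9 threshold is the value given in the text.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_core_node(
    concept_vec: List[float],
    node_vecs: Dict[str, List[float]],
    min_relevance: float = 0.9,  # threshold stated in the description
) -> Optional[Tuple[str, float]]:
    """Return (node_id, relevance) of the most similar node, or None if it falls below the threshold."""
    best: Optional[Tuple[str, float]] = None
    for node_id, vec in node_vecs.items():
        score = cosine_similarity(concept_vec, vec)
        if best is None or score > best[1]:
            best = (node_id, score)
    if best is None or best[1] < min_relevance:
        return None
    return best
```

For instance, with hypothetical node vectors `{"quadratic_formula": [1.0, 0.0], "factoring": [0.0, 1.0]}`, the query vector `[0.9, 0.1]` selects `"quadratic_formula"`, while `[1.0, 1.0]` selects nothing because neither node reaches the 0.9 threshold.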
[0045] For example, continuing the case of "quadratic formula for solving quadratic equations" from junior high school mathematics: User Xiaoming inputs the search information "practice problems on the application of quadratic formula for solving quadratic equations". After semantic parsing, the core knowledge point concept "quadratic formula for solving quadratic equations" is extracted. After calculating the semantic relevance of this concept with the knowledge graph nodes, the similarity of the "quadratic formula for solving quadratic equations" node is determined to be 0.96, which is the node with the highest relevance. Combining the virtual coordinate space dimension and Xiaoming's difficulty preference, search constraints are generated: "knowledge point relevance ≥ 8.0 points", "difficulty level ≤ 4.0 points", and "content type = 2 points". There are 15 practice problems under this node. After collecting the three-dimensional coordinates of all the practice problems, the sum of the coordinate values of the three dimensions is calculated and divided by 15. Finally, the coordinates of the center point are approximately (9.1, 3.6, 2), which is the initial search point.
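The center point calculation and the constraint check from this example can be sketched as below. The coordinate order (knowledge point relevance, difficulty level, content type) and the constraint values follow the example; the function names and the two sample coordinates in the usage note are hypothetical and are not the 15 practice problems mentioned in the text.

```python
from typing import List, Tuple

# (knowledge point relevance, difficulty level, content type)
Coord = Tuple[float, float, float]


def center_point(coords: List[Coord]) -> Coord:
    """Per-dimension mean: the sum of each dimension divided by the number of resources."""
    if not coords:
        raise ValueError("no resources under the selected knowledge point node")
    n = len(coords)
    sums = [sum(c[d] for c in coords) for d in range(3)]
    return (sums[0] / n, sums[1] / n, sums[2] / n)


def meets_constraints(
    coord: Coord,
    min_relevance: float = 8.0,   # "knowledge point relevance >= 8.0 points"
    max_difficulty: float = 4.0,  # "difficulty level <= 4.0 points"
    content_type: float = 2.0,    # "content type = 2 points"
) -> bool:
    """Check one resource coordinate against the three search constraints of the example."""
    relevance, difficulty, ctype = coord
    return (
        relevance >= min_relevance
        and difficulty <= max_difficulty
        and ctype == content_type
    )
```

With the two made-up coordinates `(9.0, 3.0, 2.0)` and `(9.2, 4.2, 2.0)`, `center_point` returns approximately `(9.1, 3.6, 2.0)`; the first coordinate satisfies the constraints and the second does not, because its difficulty exceeds 4.0.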
[0046] In a preferred embodiment of this invention, core knowledge concepts are identified through precise semantic analysis of user search information. These concepts are then mapped to multi-dimensional search constraints in a virtual coordinate space. Simultaneously, the center point of the knowledge node resource coordinates with the highest relevance is selected as the initial search point. This solves the problem of invalid traversal caused by random starting points and broad search ranges in traditional retrieval methods. Furthermore, the constraints filter out irrelevant resources in advance, significantly improving the efficiency of targeted retrieval. More importantly, this technical solution ensures that the retrieval starting point closely matches the user's search intent, providing a scientific and precise starting point for accurately capturing suitable resources along the search vector. This further guarantees the accuracy and personalized adaptability of the entire differentiated search process.
[0047] As shown in Figure 3, preferably, the generation of the personalized tag includes: for each learning resource in the learning resource set, performing natural language processing on the text content of the current learning resource to extract key entity words and topic words to form a candidate keyword set; selecting target keywords from the candidate keyword set based on the relative spatial relationship between the user and the resource represented by the capture vector; and combining the selected target keywords to generate the personalized tag.
[0048] Here, the candidate keyword set refers to the set of key entity words and topic words that represent the core content of the resource, obtained by natural language processing of the resource's text content; the capture vector refers to the relative spatial vector between the user's search position in the virtual coordinate space and the resource's coordinate position, which reflects how well the resource fits the user's knowledge state; the target keywords refer to the keywords selected from the candidate keyword set, using the spatial relationship of the capture vector, that closely match the user's search needs and knowledge state; and the personalized tag refers to a short label composed of target keywords, used to intuitively convey the resource's core content and its suitability for the user.
[0049] Specifically, a word segmentation tool is used to segment the text of each resource in the resource set and remove stop words. After meaningless function words are filtered out, key entity words and topic words that represent the core content of the resource are extracted using the term frequency-inverse document frequency (TF-IDF) algorithm and merged into a candidate keyword set. The relative spatial relationship between the user and the resource represented by the capture vector is then used for filtering. If the direction of the capture vector points to the coordinate area corresponding to the user's weak knowledge point, keywords that are strongly related to the weak knowledge point are selected first. If the length of the capture vector is less than a preset threshold, the resource fits the user's knowledge state closely, and keywords that reflect the difficulty and type of the resource are additionally selected. Finally, the selected target keywords are combined in the order "knowledge point + difficulty + fit" to generate a concise and clear personalized tag.
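The selection logic described above might be sketched as follows. Several things here are assumptions made only for illustration: candidate keywords arrive already extracted and labeled with a category ("knowledge_point", "difficulty", or "type"); whether the capture vector points to the weak knowledge point area is decided by the caller and passed in as a boolean; and the fit label text "good fit" is a placeholder. The word segmentation and TF-IDF extraction steps are not shown.

```python
import math
from typing import Sequence, Set, Tuple

Coord = Tuple[float, float, float]


def capture_vector(user_pos: Coord, resource_pos: Coord) -> Coord:
    """Vector from the user's search position to the captured resource."""
    return (
        resource_pos[0] - user_pos[0],
        resource_pos[1] - user_pos[1],
        resource_pos[2] - user_pos[2],
    )


def build_tag(
    candidates: Sequence[Tuple[str, str]],  # (keyword, category), hypothetical format
    vec: Coord,
    points_to_weak_area: bool,              # decided by the caller (assumption)
    weak_terms: Set[str],
    distance_threshold: float,
    fit_label: str = "good fit",            # placeholder label text
) -> str:
    """Combine target keywords in the order knowledge point + difficulty/type + fit."""
    topic = [kw for kw, cat in candidates if cat == "knowledge_point"]
    if points_to_weak_area:
        # Stable sort: keywords tied to the weak knowledge point come first.
        topic.sort(key=lambda kw: kw not in weak_terms)
    parts = topic[:1]
    length = math.sqrt(sum(v * v for v in vec))
    if length < distance_threshold:
        # Close fit: additionally expose the difficulty and type keywords.
        parts += [kw for kw, cat in candidates if cat in ("difficulty", "type")]
        parts.append(fit_label)
    return " + ".join(parts)
```

Passing the direction test in as a boolean keeps the sketch neutral about how the weak knowledge point area is represented geometrically, which the description does not specify.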
[0050] In a preferred embodiment of this invention, core keywords of resources are accurately extracted using natural language processing technology to form a candidate set. These keywords are then combined with the spatial relationship between users and resources represented by capture vectors to filter target keywords, ultimately generating personalized tags. This solves the problems of traditional general tags lacking specificity and failing to reflect the adaptability of resources to users' knowledge states. It also allows users to quickly and intuitively identify the core content, difficulty level, and suitability value of resources, effectively reducing the cost of ineffective browsing. Furthermore, this technical solution closely integrates with the capture results of targeted retrieval, ensuring that tag generation highly matches the user's search intent and knowledge gaps. This not only enhances the practical value of the tags but also further improves the user experience and learning assistance effect of the entire search method.
[0051] Preferably, the learning resource capture process includes: calculating the Euclidean distance between the user's current position and the coordinates of each learning resource in real time as the user moves along the search vector direction; capturing the learning resource when the Euclidean distance of the learning resource is less than a preset threshold and meets the search constraint conditions.
[0052] Here, the user's current position refers to the user's real-time coordinates in the virtual coordinate space during directional retrieval along the search vector, which are continuously updated as the retrieval progresses; the Euclidean distance refers to the straight-line distance between the user's current position and the resource coordinates in the three-dimensional virtual space, and is the core quantitative indicator of how well the resource fits the user's retrieval position; the preset threshold refers to a predetermined critical value of the Euclidean distance, set by the system based on the resource distribution density and the personalized adaptation requirements, and used to decide whether a resource lies within the effective retrieval range; resource capture refers to the process of selecting the resources that satisfy both the distance threshold and the search constraints and adding them to the personalized result set.
[0053] Specifically, the user moves gradually along the generated personalized search vector within the virtual coordinate space. During this movement, the system retrieves the coordinates of all unfiltered learning resources in the virtual coordinate space in real time, calculates the Euclidean distance between the user's current position and each resource coordinate, and simultaneously verifies whether the resource meets the pre-set search constraints. Only when the Euclidean distance of a resource is less than a preset threshold and fully satisfies all dimensional restrictions in the search constraints will the system determine that the resource is a suitable resource and perform the capture operation. Resources that do not meet any of the conditions are skipped directly, and the system continues to move along the search vector to search, thereby ensuring the accuracy and suitability of the captured resources.
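The capture loop described above can be sketched as follows. The fixed step size, the step limit, and the function and parameter names are assumptions introduced for the example; the description does not state how far the user moves per step or when the movement stops. The constraint check is supplied by the caller as a predicate.

```python
import math
from typing import Callable, Dict, List, Tuple

Coord = Tuple[float, float, float]


def capture_resources(
    start: Coord,
    direction: Coord,
    resources: Dict[str, Coord],
    satisfies: Callable[[Coord], bool],
    threshold: float,
    step: float = 0.5,     # assumed step length along the search vector
    max_steps: int = 20,   # assumed stopping rule
) -> List[Tuple[str, Coord]]:
    """Move along the search vector and capture resources that are both near and allowed.

    Returns (resource_id, user_position_at_capture) pairs in capture order.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("search vector must be non-zero")
    unit = tuple(d / norm for d in direction)

    remaining = dict(resources)  # resources not yet captured
    captured: List[Tuple[str, Coord]] = []
    pos = start
    for _ in range(max_steps + 1):
        for rid, coord in list(remaining.items()):
            # Dual check: Euclidean distance below threshold AND constraints met.
            if math.dist(pos, coord) < threshold and satisfies(coord):
                captured.append((rid, pos))
                del remaining[rid]
        pos = tuple(p + step * u for p, u in zip(pos, unit))
    return captured
```

Recording the user's position at the moment of capture makes it possible to form the capture vector for each resource afterwards, as step S4 requires.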
[0054] The preferred embodiment of the present invention achieves accurate capture of suitable resources, effectively filters irrelevant resources, reduces invalid searches, and significantly improves the efficiency and personalized matching of targeted searches through a dual determination mechanism of Euclidean distance threshold and search constraints.
[0055] As shown in Figure 4, preferably, step S1 further includes: labeling the difficulty level of each learning resource based on a preset difficulty classification standard; the sorting of learning resources in the learning resource set includes: for each learning resource in the learning resource set, obtaining the user's mastery score of the knowledge point nodes associated with the learning resource, and the labeled difficulty level of the learning resource, and converting the mastery score into the corresponding baseline difficulty level; calculating the absolute value of the difference between the labeled difficulty level and the baseline difficulty level, and calculating a first matching degree based on the absolute value; calculating the historical interaction intensity value between the learning resource and the user based on historical learning content browsing records; calculating the personalized recommendation score of the learning resource by weighted summation based on the first matching degree and the historical interaction intensity value, and sorting the learning resources based on the personalized recommendation score.
[0056] More preferably, the calculation of the historical interaction intensity value includes: extracting multi-dimensional interaction behavior data of the user on the current learning resource from the user's historical learning content browsing record; and performing weighted calculation on the extracted interaction behavior data of each dimension according to preset weights to obtain the historical interaction intensity value.
[0057] The first matching degree refers to a quantitative indicator of how closely the labeled difficulty of the learning resource matches the user's baseline difficulty for the associated knowledge point. The smaller the absolute value of the difference, the higher the matching degree. A non-linear mapping is used to convert the absolute value into a score in the [0,1] interval, avoiding the insufficient discrimination that a linear transformation would cause. The historical interaction intensity value refers to a quantified preference value calculated from the user's multi-dimensional interaction behavior with the learning resource, reflecting the user's degree of interest in this type of resource. The multi-dimensional interaction behavior data includes five core behavior indicators: browsing time, favorite (collection) operation, like behavior, learning completion rate, and sharing behavior. Different behaviors are assigned different weights according to their importance.
[0058] Specifically, the preset difficulty classification standard is a five-level difficulty system based on the corresponding subject curriculum standards. The difficulty levels are divided into levels 1 to 5, corresponding to basic, intermediate, advanced, and challenging levels, respectively. Each level is defined by the cognitive level of knowledge points, the complexity of content, the number of cross-knowledge point connections, and the requirements for practical application. Level 1 is basic concept content that requires memorization without complex logical deduction and can be mastered with only a single knowledge point. Level 2 is simple application content of basic knowledge points that requires understanding the core connotation and involves 1-2 related basic knowledge points. Level 3 is comprehensive application content of knowledge points that requires mastering logical deduction and involves 3-4 related knowledge points and includes simple cross-module integration. Level 4 is in-depth variation application content of knowledge points that requires understanding the essence and extension and involves 5 or more related knowledge points and includes complex cross-module / cross-chapter integration. Level 5 is the final or extended application content of knowledge points that requires transfer and innovation capabilities and involves multi-dimensional cross-disciplinary integration. This difficulty level is labeled by extracting the text features of learning resources using the TextCNN+Softmax difficulty classification model.
[0059] Specifically, the absolute value of the difference is mapped to the first matching degree using the Sigmoid function, with the following formula:
$$M_1 = \frac{1}{1 + e^{\Delta D - 1}}$$
[0060] where $\Delta D = |D_{label} - D_{base}|$ denotes the absolute value of the difference between the labeled difficulty $D_{label}$ and the baseline difficulty $D_{base}$, and $M_1$ denotes the first matching degree. When $\Delta D = 0$, $M_1 \approx 0.731$, the highest matching degree; when $\Delta D = 1$, $M_1 = 0.5$; when $\Delta D \geq 3$, $M_1 \leq 0.119$, indicating a low match, with the matching degree decreasing non-linearly as the difference increases. The formula for calculating the historical interaction intensity value is:
$$H = w_C C + w_F F + w_L L + w_T T + w_S S$$
[0061] where $H$ denotes the historical interaction intensity value, with a range of [0,1]; $C$ denotes the learning completion rate, with weight $w_C = 0.4$; $F$ denotes the favorite operation, normalized to 0 or 1, with weight $w_F = 0.25$; $L$ denotes the like action, normalized to 0 or 1, with weight $w_L = 0.1$; $T$ denotes the percentage of browsing time, normalized to [0,1], with weight $w_T = 0.15$; and $S$ denotes the sharing behavior, normalized to 0 or 1, with weight $w_S = 0.1$.
[0062] The formula for calculating the personalized recommendation score is:
$$P = \alpha M_1 + (1 - \alpha) H$$
[0063] where $P$ denotes the personalized recommendation score, with a value range of [0,1]; the higher the value, the higher the ranking. $\alpha$ denotes the matching degree weight coefficient, which defaults to 0.8.
[0064] Taking the topic "Application of Quadratic Equations in One Variable" in junior high school mathematics as an example: the user's mastery score is 0.7, corresponding to the baseline difficulty level $D_{base} = 4$, and the labeled difficulty of a certain resource in the personalized result set is 4. First, the absolute difference is calculated as $\Delta D = |4 - 4| = 0$, so the first matching degree is $M_1 \approx 0.731$. The multi-dimensional interaction data of this resource are: learning completion rate $C = 0.9$, favorite $F = 1$, like $L = 0$, percentage of browsing time $T = 0.8$, and share $S = 0$. The historical interaction intensity value is calculated by weight as $H = 0.4 \times 0.9 + 0.25 \times 1 + 0.1 \times 0 + 0.15 \times 0.8 + 0.1 \times 0 = 0.73$. The final personalized recommendation score is $P = 0.8 \times 0.731 + 0.2 \times 0.73 \approx 0.7308$, and this score is used as the core basis for resource ranking.
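The scoring arithmetic of this example can be checked with a short Python sketch. It assumes the Sigmoid mapping has the form 1 / (1 + e^(ΔD − 1)), which reproduces the three values quoted in the description (about 0.731 at a difference of 0, 0.5 at 1, and about 0.119 at 3); the function names and the dictionary keys for the interaction signals are hypothetical. The weights and the default coefficient 0.8 are those given in the text.

```python
import math
from typing import Dict

# Weights from the description; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "completion": 0.4,    # learning completion rate, in [0, 1]
    "favorite": 0.25,     # 0 or 1
    "like": 0.1,          # 0 or 1
    "browse_time": 0.15,  # percentage of browsing time, in [0, 1]
    "share": 0.1,         # 0 or 1
}


def first_matching_degree(labeled: float, baseline: float) -> float:
    """Assumed Sigmoid form: 1 / (1 + exp(|labeled - baseline| - 1))."""
    delta = abs(labeled - baseline)
    return 1.0 / (1.0 + math.exp(delta - 1.0))


def interaction_strength(signals: Dict[str, float]) -> float:
    """Weighted sum of the normalized interaction signals (missing signals count as 0)."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())


def recommendation_score(m1: float, h: float, alpha: float = 0.8) -> float:
    """Weighted sum of the first matching degree and the interaction strength."""
    return alpha * m1 + (1.0 - alpha) * h


m1 = first_matching_degree(labeled=4, baseline=4)
h = interaction_strength(
    {"completion": 0.9, "favorite": 1, "like": 0, "browse_time": 0.8, "share": 0}
)
print(round(m1, 3), round(h, 2), round(recommendation_score(m1, h), 4))
```

Sorting a captured resource set by this score in descending order would then give the ranking used in step S5.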
[0065] In a preferred embodiment of the present invention, the resource difficulty matching degree is combined with the historical interaction intensity value quantified by the user's multi-dimensional interactive behavior, and a personalized recommendation score is calculated using dynamic weighting. This ensures the adaptability of resources to users' knowledge and abilities while also taking into account users' learning interests and preferences. At the same time, it relies on a distributed computing framework to achieve efficient batch processing, solving the problem that traditional recommendation ranking relies solely on popularity or keyword matching and lacks accuracy. This makes the recommendation results ranking more in line with users' personalized needs, improving the effectiveness of resource reach and the user's learning experience.
[0066] In summary, the differentiated learning content search method based on knowledge state assessment provided by this invention constructs a knowledge graph integrating the semantic information of knowledge points, together with a three-dimensional virtual coordinate space, to achieve structured annotation and quantitative mapping of learning resources. Then, based on the user's historical exam scores and browsing history, it generates a knowledge state profile and a personalized search vector. Semantic parsing is used to determine the search constraints and the initial search point. A dual judgment of Euclidean distance threshold and constraint conditions is used to achieve targeted capture of suitable resources. Personalized tags are generated based on the capture vectors. Finally, the difficulty matching degree and the historical interaction intensity are combined to calculate the recommendation score and complete resource ranking, forming a complete closed loop of resource modeling, state assessment, targeted retrieval, tag generation, and intelligent ranking. This effectively solves the problems of traditional learning content search: reliance on keywords alone, mismatch between recommended resources and the user's knowledge level, low retrieval efficiency, and lack of personalized tags. It significantly improves the accuracy and personalization of search results and the user's learning experience, providing a learning resource retrieval solution tailored to the needs of users with different knowledge levels.
[0067] The above description is merely a preferred embodiment of the technical solution of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A differentiated learning content search method based on knowledge state assessment, characterized in that it includes the following steps:
Step S1: based on a preset knowledge graph, parsing the content of the learning resources and labeling the knowledge point nodes in the knowledge graph associated with each learning resource in the learning resource library; constructing a virtual coordinate space and mapping the learning resources into the virtual coordinate space to generate multi-dimensional coordinates for each learning resource;
Step S2: obtaining the user's historical information, generating the user's knowledge state profile through a profile analysis strategy, and generating the user search vector based on the knowledge state profile;
Step S3: obtaining user search information and generating search constraints and an initial search point, and controlling the user to move in the virtual coordinate space from the initial search point along the direction of the user search vector and search for learning resources, so as to capture learning resources under the search constraints and form a learning resource set;
Step S4: for each learning resource in the learning resource set, generating a capture vector based on the coordinates of the search point where the user is located when the learning resource is captured and the coordinates of the learning resource; obtaining the personalized tag that matches the capture vector for the learning resource and associating it with the learning resource;
Step S5: based on the user's knowledge state profile, sorting the learning resources in the learning resource set to generate a sorted result list and outputting it.
2. The differentiated learning content search method according to claim 1, characterized in that the labeling of the knowledge point nodes in the knowledge graph associated with each learning resource in the learning resource library includes: performing text and semantic analysis on the content of the learning resource to identify the core knowledge point concepts explained in the learning resource; calculating the semantic similarity between the textual representation of each core knowledge point concept and the standard name and aliases of each knowledge point node in the knowledge graph; identifying the knowledge point nodes whose semantic similarity exceeds a preset threshold as the knowledge point nodes associated with the learning resource; and storing the identifiers of the identified knowledge point nodes as metadata associated with the learning resource.
3. The differentiated learning content search method according to claim 1, characterized in that the user's historical information includes the user's historical exam scores in each subject and the user's historical learning content browsing records; the profile analysis strategy includes: extracting the exam score data related to each knowledge point from the historical exam scores of each subject; calculating, based on the exam score data, the user's historical average score rate at each knowledge point node as the mastery score of the corresponding knowledge point node; and analyzing the user's historical learning content browsing records to determine the proportion of learning resources of each difficulty level browsed by the user under each knowledge point node, thereby obtaining the user's learning content difficulty preference for each knowledge point node.
4. The differentiated learning content search method according to claim 3, characterized in that the process of generating the knowledge state profile further includes: for each knowledge point node in the knowledge graph, obtaining the mastery score and the learning content difficulty preference of the node; correcting the mastery score based on the learning content difficulty preference to generate the comprehensive ability assessment score of the node; and combining the comprehensive ability assessment scores of the knowledge point nodes to form the knowledge state profile.
5. The differentiated learning content search method according to claim 4, characterized in that the process of generating the user search vector includes: determining the user's search direction in the virtual coordinate space based on the comprehensive ability assessment scores in the user's knowledge state profile, and generating a vector pointing in that direction as the user search vector.
6. The differentiated learning content search method according to claim 4, characterized in that the generation of the search constraints and the initial search point includes: semantically parsing the search information entered by the user to identify the knowledge point concepts involved; mapping the knowledge point concepts to constraints on the values of the corresponding dimensions in the virtual coordinate space to generate the search constraints; finding, from the knowledge graph, the knowledge point node with the highest semantic relevance to the knowledge point concept; obtaining the coordinates of all learning resources belonging to this knowledge point node and calculating their center point coordinates; and determining the calculated center point coordinates as the initial search point.
7. The differentiated learning content search method according to claim 1, characterized in that the generation of the personalized tag includes: for each learning resource in the learning resource set, performing natural language processing on the text content of the current learning resource to extract key entity words and topic words, forming a candidate keyword set; selecting target keywords from the candidate keyword set based on the relative spatial relationship between the user and the resource represented by the capture vector; and combining the selected target keywords to generate the personalized tag.
8. The differentiated learning content search method according to claim 1, characterized in that the process of capturing the learning resources includes: calculating, in real time as the user moves along the search vector direction, the Euclidean distance between the user's current position and the coordinates of each learning resource; and capturing a learning resource when its Euclidean distance is less than a preset threshold and the search constraints are met.
9. The differentiated learning content search method according to claim 3, characterized in that step S1 further includes: labeling each learning resource with a difficulty level based on a preset difficulty classification standard; the sorting of the learning resources in the learning resource set includes: for each learning resource in the learning resource set, obtaining the user's mastery score of the knowledge point nodes associated with the learning resource and the labeled difficulty level of the learning resource, and converting the mastery score into the corresponding baseline difficulty level; calculating the absolute value of the difference between the labeled difficulty level and the baseline difficulty level, and calculating a first matching degree based on the absolute value; calculating the historical interaction intensity value between the learning resource and the user based on the historical learning content browsing records; and calculating the personalized recommendation score of the learning resource by weighted summation of the first matching degree and the historical interaction intensity value, and sorting the learning resources based on the personalized recommendation score.
10. The differentiated learning content search method according to claim 9, characterized in that the calculation of the historical interaction intensity value includes: extracting multi-dimensional interaction behavior data of the user on the current learning resource from the user's historical learning content browsing records; and performing a weighted calculation on the extracted interaction behavior data of each dimension according to preset weights to obtain the historical interaction intensity value.