Manual navigation method and system based on knowledge graph
By extracting node update events and user access behaviors from the manual's knowledge graph and combining them with a Markov model to predict navigation paths, the method addresses dynamic changes in node content and user behavior. This enables accurate filtering and dynamic recommendation of navigation paths, improving navigation efficiency and user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA STATE SHIPBUILDING CORP LTD RESEARCH INSTITUTE 719
- Filing Date
- 2025-08-11
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies for knowledge graph navigation fail to adequately consider the dynamic changes in node content updates and user behavior, resulting in navigation paths failing to respond promptly to content updates, reduced accuracy and completeness of information retrieval, and a lack of specificity and flexibility.
By acquiring node update events and user access behavior data from the interactive electronic manual knowledge graph, the node update frequency and access volatility are calculated, behavior status labels are generated, Markov transition models are used to predict node behavior trends, the stability and reliability of navigation paths are evaluated, and the optimal navigation path sequence is output.
It enables dynamic responses to node content updates and user behavior, improving the accuracy and personalization of navigation paths, enhancing the adaptability of information organization and user experience, and significantly improving navigation efficiency and content matching.
Figure CN120975202B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of navigation technology, and in particular to a manual navigation method and system based on knowledge graphs.
Background Technology
[0002] Navigation technology encompasses multiple areas such as information retrieval, knowledge organization, and route planning, and is a key technology used to help users efficiently obtain the information or resources they need. The core of this technology lies in building clear and efficient navigation logic, presenting information through structured methods such as classification, tags, and relationships, enabling users to quickly locate target content within a large amount of information. Navigation technology often combines semantic networks, visual interface design, and structured knowledge representation methods to provide human-computer interaction support in complex information environments. It is widely used in scenarios such as document management, database retrieval, and knowledge system construction, and is characterized by a high dependence on data organization structure and query strategies.
[0003] The knowledge graph-based manual navigation method and system refers to a method and system that utilizes a knowledge graph constructed from entities and their semantic relationships to achieve structured expression of manual content and construction of navigation paths. This patent addresses the problem of low information retrieval efficiency for users when using manuals. It employs entity extraction, relationship construction, semantic classification, and path visualization to organize entries, chapters, and topics in the manual into knowledge graph nodes. By setting query conditions and navigation entry points, the system displays and locates content according to preset relationships. The system generates multi-level query paths based on user intent and performs navigation jumps based on category hierarchy and relationships between entities, thereby achieving semantic-level navigation expression of the manual content.
[0004] Existing technologies for constructing knowledge graphs and providing manual navigation rely on static entity relationships and fixed semantic classification structures, failing to fully consider the impact of node content updates and the dynamic changes in user access behavior. Due to a lack of in-depth characterization of node update frequency and user behavior fluctuations, navigation paths may fail to respond promptly to content updates, leading to problems such as path failures and distorted node recommendations when users search for information. For example, if newly added or modified entries in the manual content are not reflected in the navigation path in a timely manner, users may still be guided to old nodes after significant updates, reducing the accuracy and completeness of information retrieval. Furthermore, existing technologies typically rely on only a single relationship chain for navigation, failing to fully integrate user interaction feedback and node state evolution. This results in a lack of specificity and flexibility in the overall navigation, affecting the performance of the navigation system in dynamic scenarios and potentially leading to increased user dwell time, higher bounce rates, and decreased navigation trust.
Summary of the Invention
[0005] To address the technical problems existing in the prior art, embodiments of the present invention provide a manual navigation method and system for knowledge graphs. The technical solution is as follows:
[0006] To achieve the above objectives, the present invention adopts the following technical solution: a manual navigation method for knowledge graphs, comprising the following steps:
[0007] S1: Obtain the content version records of multiple nodes in the knowledge graph of the interactive electronic manual and extract node update events. Calculate the content update frequency based on the differences in node summary text, calculate the similarity of text structure changes, and generate a node update frequency sequence.
[0008] S2: Based on the node update frequency sequence, obtain the corresponding user access behavior data, calculate the node access behavior fluctuation, and quantify the fluctuation through sliding window variance to generate an access volatility score.
[0009] S3: Based on the access volatility score, obtain the user interaction behavior status to generate node behavior feature vectors, calculate the node response time, jump success rate and bounce rate, and classify and attach labels to generate behavior status labels.
[0010] S4: Extract label transition pairs based on the behavior state labels, analyze the conditions and rules of label transitions through Markov transition models, calculate the evolution trend of labels and predict node behavior, and generate label evolution path rules.
[0011] S5: Evaluate the stability of each candidate path based on the access volatility score and the label evolution path rules, calculate the path jump credibility score according to the current behavior state of the node in the interactive electronic manual, filter the paths, and output the optimal navigation path sequence.
[0012] As a further aspect of the present invention, the node update frequency sequence includes the number of content version changes, the magnitude of differences in summary text, and the similarity score of text structure changes; the access volatility score includes the standard deviation of node access frequency, the sliding window variance score, and the coefficient of variation of access behavior; the behavior status label specifically includes the response time interval label, the jump success rate level, the bounce rate level, and the behavior status category identifier; the label evolution path rule includes the state transition probability and the Markov transition path set; and the navigation path sequence specifically refers to the path stability score, the jump credibility score, and the optimal path node sequence.
[0013] As a further aspect of the present invention, step S1 specifically comprises:
[0014] S101: Obtain the content version records of all nodes in the interactive electronic manual knowledge graph and extract the node update association fields, including update timestamp, update summary text and corresponding node identifier, and perform aggregation and sorting. Call the sorted node update summary text data and obtain the node summary text difference value sequence.
[0015] S102: Based on the node summary text difference value sequence, count the number of differences value changes of multiple nodes in multiple time periods, combine with fixed time interval segments, call the difference value change count data and time period length for normalization processing, calculate the node content change ratio, and obtain the node content update frequency.
[0016] S103: Obtain the summary text of two adjacent versions based on the node content update frequency, calculate the consistency metric of the structural content, call the node version structure segmentation result and keyword distribution, calculate the structural change similarity of adjacent texts and traverse the nodes to generate a node update frequency sequence.
[0017] As a further aspect of the present invention, step S2 specifically comprises:
[0018] S201: Based on the node update frequency sequence, extract and aggregate the continuous access record data of users on the corresponding nodes, filter the access segments that meet the node frequency continuity condition, arrange them in time order and construct a continuous sequence of user access behavior under each node to obtain the node access continuous sequence.
[0019] S202: Divide the continuous sequence of node visits into sliding windows of fixed length, calculate the window adjustment variance of the frequency of visits within the window, analyze the dispersion of window visits, and combine the calculation results of all windows to form a set of visits fluctuations for multiple nodes, and obtain a set of node visits fluctuation values.
[0020] S203: Call the window variance data of each node in the set of access volatility values, normalize the mean of the volatility values of multiple nodes, calculate the difference between the normalized volatility value and the volatility score benchmark value, convert it into a value within the score interval, and generate an access volatility score.
[0021] As a further aspect of the present invention, the volatility score benchmark value is determined by percentile analysis of the distribution of sliding-window variance values for the access behavior of all nodes, with the 75th percentile of that distribution selected as the threshold benchmark.
[0022] As a further aspect of the present invention, step S3 specifically comprises:
[0023] S301: Based on the access volatility score, obtain the node click frequency, page dwell time and operation jump path in the user interaction behavior and normalize them. Calculate the normalized click frequency, normalized dwell time and normalized jump path depth, combine them into a multi-dimensional feature group and vectorize them to generate a node behavior vector group.
[0024] S302: Call the normalized jump path depth and normalized dwell time in the node behavior vector group, calculate the average response time and jump frequency of each node, combine the ratio of the node's normalized click frequency to the jump frequency, extract the corresponding jump success frequency and perform ratio conversion, and obtain the jump success rate set.
[0025] S303: Based on the numerical structure of the multiple nodes in the jump success rate set, the jump success rate is used as the first judgment condition and the bounce rate is used as the auxiliary condition for multi-interval classification judgment. The labels are assigned in layers according to the jump success rate interval threshold and combined with the bounce rate distribution to obtain the behavior status label group.
[0026] As a further aspect of the present invention, the jump success rate interval thresholds are determined by counting the successful jumps and the total jumps of all users across all nodes, computing the distribution of the resulting jump success rates, and taking the upper and lower quartiles of that distribution, or a set proportion interval, as the thresholds.
[0027] As a further aspect of the present invention, step S4 specifically comprises:
[0028] S401: Based on the behavior state label sequence, extract any two adjacent labels to form a label transition pair, arrange multiple transition pairs in order according to the node behavior time axis, count the occurrence frequency of each type of label transition pair, and obtain the label transition frequency dataset.
[0029] S402: Call the label transition frequency dataset, normalize the state transition probabilities between transition pairs, calculate the conditional probability value of transitioning from any state to the target state through the Markov transition model, and generate the label state transition probability distribution value.
[0030] S403: Based on the label state transition probability distribution value, taking the current node behavior label as the initial state, calculate the cumulative probability value of the state transition path within a finite step range, select the path with the optimal cumulative probability value as the behavior evolution path, and generate label evolution path rules.
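Step S403 can be illustrated with a minimal Python sketch. It assumes that the "cumulative probability" of a path is the product of its step probabilities and that the finite step range is a fixed number of steps; the function name `best_path` and the probabilities in `P` are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

Transition = Dict[str, Dict[str, float]]


def best_path(P: Transition, start: str, max_steps: int) -> Tuple[List[str], float]:
    """Return the path of exactly max_steps transitions from `start`
    whose product of transition probabilities is highest."""
    best: Tuple[List[str], float] = ([start], 0.0)

    def walk(path: List[str], prob: float) -> None:
        nonlocal best
        if len(path) - 1 == max_steps:
            if prob > best[1]:
                best = (path, prob)
            return
        for nxt, p in P.get(path[-1], {}).items():
            if p > 0.0:
                walk(path + [nxt], prob * p)

    walk([start], 1.0)
    return best


# Hypothetical transition probabilities between behavior labels.
P = {
    "AA": {"AB": 0.75, "AA": 0.25},
    "AB": {"BA": 1.0},
    "BA": {"AA": 1.0},
}
path, prob = best_path(P, "AA", 2)
print(path, prob)
```

Exhaustive enumeration is adequate for a handful of labels and a short horizon; a larger label set would call for dynamic programming instead.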
[0031] As a further aspect of the present invention, step S5 specifically comprises:
[0032] S501: Calculate the stability index of each path based on the access volatility score, and evaluate the stability of each path in combination with the label evolution path rules to obtain the path stability score.
[0033] S502: Based on the path stability score, calculate a jump credibility score for each path from the current behavior status of the nodes in the interactive manual, combined with each node's access frequency and bounce rate. The score reflects how well the node behavior status matches the path, and yields the path jump credibility score.
[0034] S503: Based on the path jump credibility score and path stability score, the candidate paths are filtered, and the path sequences with stability compliance indicators and jump credibility scores exceeding the credibility score benchmark value are selected and sorted to generate the optimal navigation path sequence.
[0035] The credibility score benchmark is set by statistically analyzing the mean of the jump stability of candidate paths under multiple node behavior states and the median of the credibility score distribution.
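The filtering in S503 might look like the following Python sketch. It is a simplification under stated assumptions: the credibility benchmark is taken to be the median credibility of the candidates alone, and the stability requirement is a fixed floor `min_stability`. The class `CandidatePath`, the function `select_paths`, and all scores are hypothetical.

```python
from statistics import median
from typing import List, NamedTuple


class CandidatePath(NamedTuple):
    nodes: List[str]
    stability: float    # path stability score (S501)
    credibility: float  # path jump credibility score (S502)


def select_paths(candidates: List[CandidatePath],
                 min_stability: float) -> List[CandidatePath]:
    """Keep paths that meet the stability floor and exceed the median
    credibility, sorted with the most credible path first."""
    if not candidates:
        return []
    benchmark = median(c.credibility for c in candidates)
    kept = [c for c in candidates
            if c.stability >= min_stability and c.credibility > benchmark]
    return sorted(kept, key=lambda c: (c.credibility, c.stability), reverse=True)


# Hypothetical candidate paths and scores.
candidates = [
    CandidatePath(["N1", "N2", "N5"], stability=0.82, credibility=0.74),
    CandidatePath(["N1", "N3", "N5"], stability=0.65, credibility=0.91),
    CandidatePath(["N1", "N4", "N5"], stability=0.90, credibility=0.58),
    CandidatePath(["N1", "N2", "N4", "N5"], stability=0.88, credibility=0.80),
]
ranked = select_paths(candidates, min_stability=0.7)
print([c.nodes for c in ranked])
```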
[0036] In another aspect, a manual navigation system for knowledge graphs is provided, which is used to carry out the manual navigation method for knowledge graphs described above. The system includes:
[0037] The content update module obtains the content version records of multiple nodes in the interactive manual knowledge graph and extracts node update events. It calculates the content update frequency based on the differences in node summary text, calculates the similarity of text structure changes, generates a node update frequency sequence, and transmits it to the fluctuation analysis module.
[0038] The fluctuation analysis module obtains the corresponding user access behavior data based on the node update frequency sequence, calculates the fluctuation of node access behavior, quantifies the fluctuation through sliding window variance, generates an access volatility score, and transmits it to the behavior labeling module.
[0039] The behavior labeling module obtains the user interaction behavior status based on the access volatility score, generates node behavior feature vectors, calculates the node's response time, jump success rate and bounce rate, classifies and adds labels, generates behavior status labels and transmits them to the evolution prediction module.
[0040] The evolution prediction module extracts label transition pairs based on the behavioral state labels, analyzes the conditions and rules of label transitions through a Markov transition model, calculates the evolution trend of labels and predicts node behavior, generates label evolution path rules and transmits them to the path filtering module.
[0041] The path filtering module evaluates the stability of each candidate path based on the access volatility score and the label evolution path rules, calculates the path jump credibility score according to the current behavior state of the node in the interactive manual, filters the paths, and outputs the optimal navigation path sequence.
[0042] The beneficial effects of the technical solutions provided by the embodiments of the present invention include at least the following:
[0043] By extracting node content version records and update events, and combining the differences in node summary text with text structure similarity, the dynamic update characteristics of nodes can be accurately characterized, thereby quantifying the sensitivity and activity of node content changes. Simultaneously, fluctuations in user access behavior are mined, and sliding window variance is used for detailed analysis of access sequences, helping to reveal users' immediate responses to information updates and their behavioral preferences. In generating node behavior feature vectors, multi-dimensional features such as response time, jump success rate, and bounce rate are considered. Combined with classification and label attachment, this allows for a more granular capture of users' behavioral intentions and state characteristics when migrating between nodes. Furthermore, by extracting label transition pairs and analyzing user behavior patterns with Markov transition probabilities, future node evolution trends can be predicted, making the dynamic adjustment of navigation paths more accurate and forward-looking. Combining access volatility scores and behavioral evolution paths, the stability and credibility of each candidate path are comprehensively scored, ultimately achieving accurate selection and dynamic recommendation of navigation paths. By continuously tracking node content updates and changes in user behavior, the overall solution improves information organization adaptability, navigation path personalization, and node recommendation stability, thereby improving navigation efficiency, content matching accuracy, and user experience.
Attached Figure Description
[0044] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0045] Figure 1 is a schematic diagram of the workflow of the present invention;
[0046] Figure 2 is a system flowchart of the present invention.
Detailed Implementation
[0047] The technical solution of the present invention will now be described with reference to the accompanying drawings.
[0048] In embodiments of the invention, the words "exemplarily," "for example," etc., are used to indicate that something is an example, illustration, or description. Any embodiment or design described as an "example" in this invention should not be construed as being more preferred or advantageous than other embodiments or designs. Rather, the use of the word "example" is intended to present the concept in a specific manner. Furthermore, in embodiments of the invention, "and/or" may mean both of the listed items or either one of them.
[0049] In this embodiment of the invention, the terms "image" and "picture" may sometimes be used interchangeably. It should be noted that, when the difference is not emphasized, they convey the same meaning. Similarly, the terms "of," "relevant," and "corresponding" may sometimes be used interchangeably, and when the difference is not emphasized, they convey the same meaning.
[0050] In this embodiment of the invention, a subscripted form such as W₁ may sometimes be written in a non-subscript form such as W1. When the difference is not emphasized, the meaning they express is the same.
[0051] To make the technical problems, technical solutions and advantages of the present invention clearer, a detailed description will be given below in conjunction with the accompanying drawings and specific embodiments.
[0052] Referring to Figure 1, this invention provides a technical solution, a manual navigation method for knowledge graphs, comprising the following steps:
[0053] S1: Obtain the content version records of multiple nodes in the interactive manual knowledge graph and extract node update events. Calculate the content update frequency based on the differences in node summary text, calculate the similarity of text structure changes, and generate a node update frequency sequence.
[0054] S2: Based on the node update frequency sequence, obtain the corresponding user access behavior data, calculate the node access behavior fluctuation, and quantify the fluctuation through sliding window variance to generate an access volatility score.
[0055] S3: Based on the access volatility score, obtain the user interaction behavior status to generate node behavior feature vectors, calculate the node response time, jump success rate and bounce rate, and classify and attach labels to generate behavior status labels.
[0056] S4: Extract label transition pairs based on the behavior state labels, analyze the conditions and rules of label transitions through a Markov transition model, calculate the evolution trend of labels and predict node behavior, and generate label evolution path rules.
[0057] S5: Evaluate the stability of each candidate path based on the access volatility score and the label evolution path rules, calculate the path jump credibility score according to the current behavior state of the node in the interactive manual, filter the paths, and output the optimal navigation path sequence.
[0058] The node update frequency sequence includes the number of content version changes, the magnitude of differences in summary text, and the similarity score of text structure changes. The access volatility score includes the standard deviation of node access frequency, the sliding window variance score, and the coefficient of variation of access behavior. The behavior status labels specifically include response time interval labels, jump success rate level, bounce rate level, and behavior status category identifier. The label evolution path rules include state transition probability and Markov transition path set. The navigation path sequence specifically refers to the path stability score, jump credibility score, and optimal path node sequence.
[0059] Referring to Figure 1, the specific steps of S1 are as follows:
[0060] S101: Obtain the content version records of all nodes in the interactive manual knowledge graph and extract the node update association fields, including update timestamp, update summary text and corresponding node identifier, and perform aggregation and sorting. Call the sorted node update summary text data and obtain the node summary text difference value sequence.
[0061] Obtain the version record table from the knowledge graph of the interactive electronic manual, extract the fields node-id, update-time, and update-summary, and sort them in descending order by update-time. After sorting, compare the summary texts of adjacent versions character by character, and count the total number of add, delete, and modify operations as the difference value. Taking the navigation node "emergency evacuation route" as an example, the summary of version v2.23.1 is "addition of the east side passage on level B1" and the summary of v2.23.0 is "closure of the maintenance passage in area C". After comparison, the difference value is: number of add operations (12) + number of delete operations (15) + number of modify operations (3) = 30. Set the difference threshold Δq = 25. When the difference value is ≥ 25, the change is marked as a major change. The difference value sequence generation process is shown in Table 1.
[0062] Table 1: Calculation of Node Update Differences
[0063]
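As a rough illustration of the difference value in S101, the following Python sketch counts character-level add, delete, and modify operations with the standard-library `difflib.SequenceMatcher`. Using `difflib` is an assumption (the patent does not name a diff algorithm), so the counts it produces for real summaries need not match the 12 / 15 / 3 split quoted above. The function names are hypothetical.

```python
from difflib import SequenceMatcher

DELTA_Q = 25  # difference threshold from the text


def diff_value(old: str, new: str) -> dict:
    """Count character-level add, delete, and modify operations
    needed to turn the old summary into the new one."""
    counts = {"add": 0, "delete": 0, "modify": 0}
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            counts["add"] += j2 - j1
        elif tag == "delete":
            counts["delete"] += i2 - i1
        elif tag == "replace":
            # Assumption: a replaced span counts once per character of
            # the longer side.
            counts["modify"] += max(i2 - i1, j2 - j1)
    counts["total"] = counts["add"] + counts["delete"] + counts["modify"]
    return counts


def is_major_change(total: int, threshold: int = DELTA_Q) -> bool:
    """A change is major when the difference value reaches the threshold."""
    return total >= threshold


# Worked example from the text: 12 adds + 15 deletes + 3 modifies = 30.
print(is_major_change(12 + 15 + 3))
```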
[0064] S102: Based on the node summary text difference value sequence, count the number of differences in multiple nodes over multiple time periods, combine a fixed time interval segment, call the difference value change count data and the time period length for normalization, calculate the node content change ratio, and obtain the node content update frequency.
[0065] A 7-day statistical period is set, and the difference value sequence is iterated. A node is counted as having a valid change when its daily difference value is ≥ Δq, where Δq is the difference threshold. Taking node NODE-457 as an example, the number of changes detected in the 2023-Q2 period is C=8, the total duration is T=91 days, and the update frequency is F = C / (T / 7) = 8 / (91 / 7) = 0.615 changes per 7-day period. Frequency levels are set as follows: Low (0≤F<0.5), Medium (0.5≤F<1.0), High (F≥1.0). When F≥0.5, a content review mechanism is triggered; for example, NODE-457 with F=0.615 requires a version rollback review.
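The update frequency and its levels can be checked with a short Python sketch; the function names are hypothetical, while the formula, the level boundaries, and the review trigger follow the paragraph above.

```python
def update_frequency(changes: int, total_days: int, period_days: int = 7) -> float:
    """Valid changes per statistical period: F = C / (T / period)."""
    return changes / (total_days / period_days)


def frequency_level(f: float) -> str:
    """Map F to the Low / Medium / High levels given in the text."""
    if f < 0.5:
        return "Low"
    if f < 1.0:
        return "Medium"
    return "High"


def needs_review(f: float) -> bool:
    """The content review mechanism is triggered when F >= 0.5."""
    return f >= 0.5


f = update_frequency(8, 91)  # NODE-457 in 2023-Q2
print(round(f, 3), frequency_level(f), needs_review(f))
```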
[0066] S103: Obtain the summary text of two adjacent versions based on the node content update frequency, calculate the consistency metric of the structural content, call the node version structural segmentation results and keyword distribution, calculate the structural change similarity of adjacent texts and traverse the nodes to generate a node update frequency sequence.
[0067] Extract the summary text of adjacent versions v2.23.1 and v2.23.0 of node NODE-457 and analyze its structural elements: the version number weight, the keyword matching rate, and the time interval between versions in hours. Compute the structural similarity from these elements and set a similarity threshold. When the structural similarity reaches the threshold, the change is treated as a continuous update; otherwise, it is marked as an abnormal change.
[0068] Referring to Figure 1, the specific steps of S2 are as follows:
[0069] S201: Extract continuous access record data of users on corresponding nodes based on node update frequency sequence and aggregate it, filter access segments that meet the node frequency continuity condition, arrange them in time order and construct a continuous sequence of user access behavior under each node to obtain the node access continuous sequence.
[0070] Extract the node-id and access-time fields from the user behavior logs, sort by timestamp, and keep access records whose interval from the preceding access is less than 30 minutes to construct a continuous sequence. Taking node NODE-112 as an example, user U123 generated the access record sequence 09:00, 09:25, 09:50, 10:20 on 2023-10-05. The 10:20 record is removed because its interval from 09:50 reaches the 30-minute limit, leaving the continuous segment 09:00, 09:25, 09:50. Set the continuity threshold K=3; when the number of consecutive accesses is ≥ K, the segment is retained. The access sequence filtering process is shown in Table 2:
[0071] Table 2: User access sequence list:
[0072]
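The segment filtering in S201 can be sketched in Python as follows. The sketch assumes a strict less-than-30-minute gap, since that is what reproduces the worked example (the gap from 09:50 to 10:20 is exactly 30 minutes); the function name `continuous_segments` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def continuous_segments(times: List[datetime],
                        max_gap: timedelta = timedelta(minutes=30),
                        min_len: int = 3) -> List[List[datetime]]:
    """Split sorted access times wherever the gap reaches max_gap,
    then keep only segments with at least min_len accesses (K)."""
    segments: List[List[datetime]] = []
    current: List[datetime] = []
    for t in sorted(times):
        if current and t - current[-1] >= max_gap:
            segments.append(current)
            current = []
        current.append(t)
    if current:
        segments.append(current)
    return [s for s in segments if len(s) >= min_len]


# Access records of user U123 on node NODE-112, 2023-10-05.
visits = [datetime.strptime(f"2023-10-05 {hm}", "%Y-%m-%d %H:%M")
          for hm in ("09:00", "09:25", "09:50", "10:20")]
kept = continuous_segments(visits)
print([[t.strftime("%H:%M") for t in seg] for seg in kept])
```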
[0073] S202: Divide the continuous sequence of node visits into sliding windows of fixed length, calculate the variance of the frequency of visits within the window, calculate the dispersion of the visits within each sliding window, and combine the calculation results of all windows to form a set of visits fluctuations for multiple nodes, and obtain a set of node visits fluctuation values.
[0074] Set the sliding window length n=5 and the step size s=1. To calculate the window adjustment variance, take the access frequency sequence of node NODE-112 over the time windows W1-W5: 3, 5, 4, 6, 2. Each value is the measured access frequency at one position in the window, and the average of the sequence is (3 + 5 + 4 + 6 + 2) / 5 = 4. The formula also involves a window position correction factor, the number of node updates (obtained from statistics), and a cross-window smoothing coefficient that prevents the denominator from becoming too small. Substituting these values into the formula gives an adjusted variance of 1.068. The variance levels are set as low (0-1.0), medium (1.0-2.5), and high (>2.5), so the result of 1.068 falls in the medium level of fluctuation.
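For orientation, the following Python sketch computes the plain (unadjusted) population variance over sliding windows. It deliberately omits the window position correction factor and the smoothing coefficient, whose values are not available here, so it yields 2.0 for the sequence above rather than the adjusted figure of 1.068. The function name is hypothetical.

```python
from typing import List


def window_variances(freqs: List[float], n: int = 5, s: int = 1) -> List[float]:
    """Population variance of access frequency in each sliding window
    of length n, advancing by step s."""
    result = []
    for start in range(0, len(freqs) - n + 1, s):
        window = freqs[start:start + n]
        mean = sum(window) / n
        result.append(sum((x - mean) ** 2 for x in window) / n)
    return result


# Access frequencies of NODE-112 in W1-W5; the mean is 4.
print(window_variances([3, 5, 4, 6, 2]))
```

A longer frequency sequence produces one variance per window position, which is the set of values that S203 then normalizes.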
[0075] S203: Call the window variance data of each node in the set of access volatility values, normalize the mean of the volatility values of multiple nodes, calculate the difference between the normalized volatility values and the volatility score benchmark value, convert them into values within the score range, and generate the access volatility score.
[0076] For node NODE-112, with variances of 1.068, 1.342, 0.893, 1.576, and 1.210 for five windows, the mean is μ = 1.218 and the standard deviation is σ = 0.268. The normalized fluctuation value is Z = (1.718 − μ) / σ = 0.852. Assuming a baseline value B = 0.8, the score is converted as Score = 50 + 20 × (Z − B) = 50 + 20 × (0.852 − 0.8) = 3.64. The scoring range is set as: Excellent (≥6), Good (4-5), Poor (<4). A score of 3.64 is classified as Poor, triggering the access path optimization mechanism.
[0077] Referring to Figure 1, the specific steps of S3 are as follows:
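The general shape of S203 (standardize a volatility value, then convert its distance from the benchmark linearly into a score) can be sketched as below. This is only a sketch under assumptions: the population standard deviation is used, and `offset` and `scale` are placeholder parameters to be calibrated to the chosen score range, so the sketch is not expected to reproduce the figures quoted above. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def z_score(value: float, values: List[float]) -> float:
    """Standardize a volatility value against the window variances.
    Assumption: population standard deviation."""
    return (value - mean(values)) / pstdev(values)


def volatility_score(z: float, benchmark: float,
                     offset: float, scale: float) -> float:
    """Linear conversion of the distance from the benchmark into a score.
    offset and scale are placeholders, not values from the patent."""
    return offset + scale * (z - benchmark)


variances = [1.068, 1.342, 0.893, 1.576, 1.210]  # five windows of NODE-112
mu = mean(variances)
print(round(mu, 3))
```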
[0078] S301: Based on the access volatility score, obtain the node click frequency, page dwell time and operation jump path in user interaction behavior and normalize them. Calculate the normalized click frequency, normalized dwell time and normalized jump path depth, combine them into a multi-dimensional feature group and vectorize them to generate a node behavior vector group.
[0079] Extract the raw data for node NODE-215 from the user behavior database of the interactive electronic manual knowledge graph: the click frequency, the dwell time in seconds, and the jump path depth in layers, together with the corresponding system-wide extreme values. Apply min-max normalization, x' = (x − x_min) / (x_max − x_min), to each quantity, and combine the normalized click frequency, normalized dwell time, and normalized jump path depth into the behavior vector. The feature vector generation process is shown in Table 3:
[0080] Table 3: Node Behavior Normalization Parameters
[0081]
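A minimal Python sketch of the min-max normalization in S301 follows. The raw values and bounds are purely hypothetical placeholders, because the figures for NODE-215 are not available here; only the normalization formula itself is standard.

```python
from typing import Tuple


def min_max(x: float, lo: float, hi: float) -> float:
    """Min-max normalization to [0, 1]; returns 0.0 for a degenerate range."""
    if hi == lo:
        return 0.0
    return (x - lo) / (hi - lo)


def behavior_vector(clicks: Tuple[float, float, float],
                    dwell: Tuple[float, float, float],
                    depth: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Each argument is (raw value, system minimum, system maximum)."""
    return (min_max(*clicks), min_max(*dwell), min_max(*depth))


# Hypothetical raw values and bounds, for illustration only.
vector = behavior_vector(clicks=(120, 10, 500),
                         dwell=(45.0, 5.0, 300.0),
                         depth=(3, 1, 8))
print(vector)
```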
[0082] S302: Call the normalized jump path depth and normalized dwell time in the node behavior vector group, calculate the average response time and jump frequency of each node, combine the ratio of the node's normalized click frequency to the jump frequency, extract the corresponding jump success frequency and perform ratio conversion to obtain the jump success rate set;
[0083] The response time metric for node NODE-215 is calculated from the time series of 10 valid jumps: 5.3, 6.1, 4.9, 5.8, 7.2, 6.5, 5.9, 6.3, 5.7, and 6.0 seconds, giving an average response time of 59.7 / 10 = 5.97 seconds. Count the number of bounce events and the total number of visits, and calculate the jump frequency. The jump success rate is the ratio of successful jumps to total jumps: statistics show that 7 out of 10 jumps were successful, so the jump success rate is 7 / 10 = 0.7. The success rate threshold is set to 0.6 based on the 75th percentile of the data. When the jump success rate reaches this threshold, the jumps are deemed valid and the data is retained.
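The arithmetic for NODE-215 can be verified with a few lines of Python. The function names are hypothetical, and treating "reaches the threshold" as greater-than-or-equal is an assumption.

```python
from typing import List

SUCCESS_THRESHOLD = 0.6  # threshold from the text


def average_response_time(times: List[float]) -> float:
    """Arithmetic mean of the response times of valid jumps, in seconds."""
    return sum(times) / len(times)


def jump_success_rate(successes: int, total: int) -> float:
    """Ratio of successful jumps to total jumps."""
    return successes / total


times = [5.3, 6.1, 4.9, 5.8, 7.2, 6.5, 5.9, 6.3, 5.7, 6.0]
avg = average_response_time(times)
rate = jump_success_rate(7, 10)
retained = rate >= SUCCESS_THRESHOLD  # assumption: inclusive comparison
print(round(avg, 2), rate, retained)
```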
[0084] S303: Based on the numerical structure of multiple nodes in the jump success rate set, the jump success rate is used as the first judgment condition and the bounce rate is used as the auxiliary condition for multi-interval classification judgment. The labels are assigned in layers according to the jump success rate interval threshold and combined with the bounce rate distribution to obtain the behavior status label group.
[0085] Define redirect success rate ranges: Low (0-0.5), Medium (0.5-0.75), High (0.75-1.0), combined with bounce rate classification: Excellent (… ), good (0.1< ),Difference( ), This represents the bounce frequency, calculated using the following formula: 3 out of 25 visits resulted in a bounce rate error. Node NODE-215 (Medium level) (Good grade), in generating composite tags - Good, represents the success rate of redirection. Set tag conversion rules: when and When crossing two levels, a review flag is triggered, including combinations of high and low levels that require manual review.
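The layered labelling in S303 might be sketched in Python as follows. The success-rate boundaries come from the text (treated here as half-open intervals, which is an assumption). The bounce-rate boundaries are not fully available, so `excellent_max=0.1` is inferred from the Good range starting above 0.1 and `good_max=0.3` is a purely hypothetical placeholder. The review rule reflects one reading of the two-level condition.

```python
def success_level(rate: float) -> str:
    """Low [0, 0.5), Medium [0.5, 0.75), High [0.75, 1.0]."""
    if rate < 0.5:
        return "Low"
    if rate < 0.75:
        return "Medium"
    return "High"


def bounce_grade(rate: float, excellent_max: float = 0.1,
                 good_max: float = 0.3) -> str:
    """Both bounds are assumptions; good_max is a placeholder value."""
    if rate <= excellent_max:
        return "Excellent"
    if rate <= good_max:
        return "Good"
    return "Poor"


RANK = {"Low": 0, "Medium": 1, "High": 2, "Poor": 0, "Good": 1, "Excellent": 2}


def composite_tag(success: float, bounce: float) -> str:
    return f"{success_level(success)}-{bounce_grade(bounce)}"


def needs_review(success: float, bounce: float) -> bool:
    """Flag tags whose two components are two levels apart."""
    gap = RANK[success_level(success)] - RANK[bounce_grade(bounce)]
    return abs(gap) == 2


tag = composite_tag(7 / 10, 3 / 25)  # NODE-215
print(tag, needs_review(7 / 10, 3 / 25))
```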
[0086] Referring to Figure 1, the specific steps of S4 are as follows:
[0087] S401: Based on the behavior state label sequence, extract any two adjacent labels to form a label transition pair, arrange multiple transition pairs in order according to the node behavior time axis, count the occurrence frequency of each type of label transition pair, and obtain the label transition frequency dataset;
[0088] Transition pairs are extracted from the label sequences Medium-Good, High-Excellent, Medium-Good, and Low-Poor, and their occurrence frequencies are counted: Medium-Good → High-Excellent = 2 times, High-Excellent → Medium-Good = 1 time, Medium-Good → Low-Poor = 1 time. Taking node NODE-501 as an example, the transition sequence AA→AB→BA→AA was generated within 7 days; the frequency of AA→AB was 3 times, AB→BA 2 times, and BA→AA 1 time, as shown in Table 4.
[0089] Table 4: Tag Status Transition Frequency Table
[0090]
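The pair extraction and counting in S401 can be sketched in a few lines of Python. The label sequence used here is a hypothetical, time-ordered example and is not data from the patent; `transition_counts` is an illustrative name.

```python
from collections import Counter
from typing import Sequence


def transition_counts(labels: Sequence[str]) -> Counter:
    """Count each (previous label, next label) pair in a time-ordered sequence."""
    return Counter(zip(labels, labels[1:]))


# Hypothetical label sequence for one node, ordered along the behavior time axis.
sequence = ["AA", "AB", "BA", "AA", "AB"]
counts = transition_counts(sequence)

for (src, dst), n in counts.items():
    print(f"{src}->{dst}: {n}")
```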
[0091] S402: Call the label transition frequency dataset, normalize the state transition probabilities between transition pairs, calculate the conditional probability value of transitioning from any state to the target state through the Markov transition model, and generate the label state transition probability distribution value.
[0092] The transition frequency from one state to another is obtained directly by counting the records of label transition pairs; from Table 4, the number of transitions from AA to AB is 3. The total transition frequency of a state is obtained by summing all transitions that start from that state, for example the total transition frequency of state AA (AA→AB and AA→other paths in Table 4). A transition path weight factor is defined in terms of the path length (the number of jumps), for example the length of the path AA→AB→BA. The mean of the weight factors is the arithmetic mean of all weight factors: for the weight sequence 0.333, 0.5, 0.25, the mean is (0.333 + 0.5 + 0.25) / 3 ≈ 0.361. The variance of the weight factors is calculated over the same weight sequence. A smoothing adjustment coefficient is likewise defined in terms of the path length. The average transition frequency across all states is the mean of the total transition frequencies of all states: if the total transition frequencies of states AA, AB, and BA are 4, 2, and 1 respectively, the average is (4 + 2 + 1) / 3 ≈ 2.33. The exponential base is given by the jump success rate. Substituting these quantities into the formula yields a transition probability for AA→AB that is higher than the transition probability threshold, which indicates that the transition from AA to AB is statistically significant.
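As a rough illustration of the normalization in S402, the sketch below row-normalizes transition counts into conditional probabilities. It is only the plain maximum-likelihood Markov estimate: the weight factor and smoothing adjustment described in the text are not modeled, so its output should not be expected to match the patent's adjusted values. The `AA→AA` entry standing in for the "AA→other paths" count and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Tuple

Pair = Tuple[str, str]


def transition_probabilities(counts: Dict[Pair, int]) -> Dict[Pair, float]:
    """Estimate P(next = dst | current = src) as count(src, dst) / total(src)."""
    totals: Dict[str, int] = defaultdict(int)
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


# Counts chosen so the per-state totals are 4 (AA), 2 (AB), and 1 (BA).
# ("AA", "AA") is a hypothetical stand-in for "AA -> other paths".
counts = {("AA", "AB"): 3, ("AA", "AA"): 1, ("AB", "BA"): 2, ("BA", "AA"): 1}
probs = transition_probabilities(counts)

print(probs[("AA", "AB")])
```

Each row of the resulting table sums to 1, which is the property a Markov transition model requires before multi-step path probabilities are multiplied together.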
[0093] S403: Based on the label state transition probability distribution value, with the current node behavior label as the initial state, calculate the cumulative probability value of the state transition path within a finite step size, select the path with the optimal cumulative probability value as the behavior evolution path, and generate label evolution path rules.
[0094] Starting from the current label AA, the cumulative probability of the 3-step path AA→AB→BA→AA is calculated as 0.172 × 0.214 × 0.108 ≈ 0.00398, and the cumulative probability of the 2-step path AA→AB→AA is calculated as 0.172 × 0.318 ≈ 0.0547. An optimal path threshold is then set, and the path AA→AB→AA, which has the higher cumulative probability, is selected as the evolution path. The rule ID R-AA-AB-AA is generated.
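The cumulative probabilities in this example can be reproduced with the sketch below, which multiplies the per-step transition probabilities quoted in the text and keeps the candidate with the largest product. Comparing only the two candidate paths from the example and building the rule ID by joining labels are simplifying assumptions; the optimal path threshold itself is not modeled because its value is not given.

```python
from math import prod
from typing import Dict, List, Tuple

# Per-step transition probabilities quoted in the example.
P: Dict[Tuple[str, str], float] = {
    ("AA", "AB"): 0.172,
    ("AB", "BA"): 0.214,
    ("BA", "AA"): 0.108,
    ("AB", "AA"): 0.318,
}


def path_probability(path: List[str], probs: Dict[Tuple[str, str], float]) -> float:
    """Product of transition probabilities along consecutive labels in the path."""
    return prod(probs.get(pair, 0.0) for pair in zip(path, path[1:]))


candidates = [["AA", "AB", "BA", "AA"], ["AA", "AB", "AA"]]
scored = {"-".join(p): path_probability(p, P) for p in candidates}

best = max(scored, key=scored.get)
rule_id = "R-" + best  # rule ID format follows the example in the text

print(scored, rule_id)
```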
[0095] Referring to Figure 1, the specific steps of S5 are as follows:
[0096] S501: Calculate the stability index of each path based on the access volatility score, and evaluate the stability of each path in combination with the tag evolution path rules to obtain the path stability score.
[0097] Stability data for path R-AA-AB-AA is extracted from the path rule base. The standard deviation of its 30-day access volatility score sequence (34, 28, 41, 37) is calculated, and a stability index is derived from it. Taking the path R-BB-BC-BB as a second example, the standard deviation of its volatility score series (22, 19, 25) is calculated in the same way. The stability levels are set as Poor (0-50), Medium (50-80), and Excellent (80-100); the path stability assessment is shown in Table 5:
[0098] Table 5 Path Stability Scoring Table:
[0099]
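The standard deviations of the two volatility score series in the S501 example can be computed as sketched below. The source does not say whether the population or the sample standard deviation is meant, so both are shown. The mapping from standard deviation to a stability index is not reproduced, since its formula is not available; only the level bands from the text are encoded, with half-open boundaries as an assumption.

```python
from statistics import pstdev, stdev

# Access volatility score series quoted in the example.
series = {
    "R-AA-AB-AA": [34, 28, 41, 37],
    "R-BB-BC-BB": [22, 19, 25],
}

for rule_id, scores in series.items():
    # pstdev divides by n, stdev divides by n - 1.
    print(rule_id, round(pstdev(scores), 3), round(stdev(scores), 3))


def stability_level(score: float) -> str:
    """Map a 0-100 stability score to the levels named in the text."""
    if score < 50:
        return "Poor"
    if score < 80:
        return "Medium"
    return "Excellent"
```

For example, `stability_level(85)` returns `"Excellent"` under these bands.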
[0100] S502: Based on path stability scoring, calculate the jump credibility score for each path according to the current behavior status of the node in the interactive manual, combined with the node's access frequency and bounce rate. The jump credibility score is calculated based on the node's behavior status and path matching degree to generate a path jump credibility score.
[0101] For path R-BB-BC-BB, take the access frequency (times / day), the bounce rate, and the matching degree of the current node behavior label, and calculate the jump credibility score. A scoring benchmark value is set; when the jump credibility score exceeds the benchmark value, the path is determined to be a trusted path.
[0102] S503: Based on the path jump credibility score and path stability score, candidate paths are filtered, and path sequences with stability compliance indicators and jump credibility scores exceeding the credibility score benchmark value are selected, sorted, and the optimal navigation path sequence is generated.
[0103] The credibility score baseline is set by statistically analyzing the mean of the jump stability of candidate paths under multiple node behavior states and the median of the credibility score distribution;
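[0104] For the candidate path set consisting of R-AA-AB-AA and R-BB-BC-BB, the filtering criteria are that the stability indicator meets the compliance level and that the jump credibility score exceeds the credibility score benchmark value. The paths that pass are sorted by a ranking formula over the stability indicator and the jump credibility score. R-BB-BC-BB ranks first, and the optimal path sequence R-BB-BC-BB is generated.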
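The filter-then-sort logic of S503 might look like the sketch below. All numeric scores and both thresholds are hypothetical placeholders chosen only so that the outcome matches the example (R-BB-BC-BB is selected); the source values and the exact ranking formula are not available, so sorting by credibility and then stability is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    rule_id: str
    stability: float    # path stability score, 0-100
    credibility: float  # path jump credibility score


def select_paths(cands: List[Candidate], min_stability: float,
                 min_credibility: float) -> List[Candidate]:
    """Keep paths that meet both criteria, best first."""
    kept = [c for c in cands
            if c.stability >= min_stability and c.credibility > min_credibility]
    # Assumed ranking: higher credibility first, stability as tie-breaker.
    return sorted(kept, key=lambda c: (c.credibility, c.stability), reverse=True)


# Hypothetical scores, not taken from the patent.
candidates = [
    Candidate("R-AA-AB-AA", stability=45.0, credibility=0.50),
    Candidate("R-BB-BC-BB", stability=88.0, credibility=0.80),
]
ranked = select_paths(candidates, min_stability=80.0, min_credibility=0.60)

print([c.rule_id for c in ranked])
```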
[0105] Referring to Figure 2, a knowledge graph manual navigation system is used to execute the aforementioned knowledge graph manual navigation method, the system comprising:
[0106] The content update module obtains the content version records of multiple nodes in the interactive manual knowledge graph and extracts node update events. It calculates the content update frequency based on the differences in node summary text, calculates the similarity of text structure changes, generates a node update frequency sequence, and transmits it to the fluctuation analysis module.
[0107] The fluctuation analysis module obtains corresponding user access behavior data based on the node update frequency sequence, calculates the fluctuation of node access behavior, quantifies the fluctuation through sliding window variance, generates an access fluctuation score, and transmits it to the behavior label module.
[0108] The behavior labeling module obtains the user interaction behavior status based on the access volatility score, generates node behavior feature vectors, calculates the node response time, jump success rate and bounce rate, classifies and adds labels, generates behavior status labels and passes them to the evolution prediction module.
[0109] The evolution prediction module extracts label transition pairs based on behavioral state labels, analyzes the conditions and rules of label transitions through Markov transition models, calculates the evolution trend of labels and predicts node behavior, generates label evolution path rules and passes them to the path filtering module.
[0110] The path filtering module evaluates the stability of each candidate path based on access volatility scores and tag evolution path rules, calculates path jump credibility scores based on the current behavior status of nodes in the interactive manual, filters paths, and outputs the optimal navigation path sequence.
[0111] It should be understood that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. A and B can be singular or plural. Additionally, the character " / " in this article generally indicates an "or" relationship between the preceding and following related objects, but it may also indicate an "and / or" relationship. Please refer to the context for a more detailed understanding.
[0112] In this invention, "at least one" means one or more, and "more than one" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or multiple items. For example, at least one of a, b, or c can represent: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.
[0113] It should be understood that, in various embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0114] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use multiple methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.
[0115] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the devices, apparatuses, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0116] In the embodiments provided by this invention, it should be understood that the disclosed devices, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.
[0117] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0118] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0119] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0120] The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A method for manual navigation based on a knowledge graph, characterized in that the method includes:
S1: Obtain the content version records of multiple nodes in the knowledge graph of the interactive electronic manual and extract node update events. Calculate the content update frequency based on the differences in node summary text, calculate the similarity of text structure changes, and generate a node update frequency sequence.
S2: Based on the node update frequency sequence, obtain the corresponding user access behavior data, calculate the node access behavior fluctuation, and quantify the fluctuation through sliding window variance to generate an access fluctuation score.
The specific steps of S2 are as follows:
S201: Based on the node update frequency sequence, extract and aggregate the continuous access record data of users on the corresponding nodes, filter the access segments that meet the node frequency continuity condition, arrange them in time order and construct a continuous sequence of user access behavior under each node to obtain the node access continuous sequence.
S202: Divide the continuous sequence of node visits into sliding windows of fixed length, calculate the window adjustment variance of the frequency of visits within the window, analyze the dispersion of window visits, and combine the calculation results of all windows to form a set of visits fluctuations for multiple nodes, and obtain a set of node visits fluctuation values.
S203: Call the window variance data of each node in the set of access fluctuation values, normalize the mean of the fluctuation values of multiple nodes, calculate the difference between the normalized fluctuation values and the fluctuation score benchmark value, convert them into values within the score range, and generate an access fluctuation score.
S3: Based on the access volatility score, obtain the user interaction behavior status to generate node behavior feature vectors, calculate the node response time, jump success rate and bounce rate, classify and attach labels to generate behavior status labels.
The specific steps of S3 are as follows:
S301: Based on the access volatility score, obtain the node click frequency, page dwell time and operation jump path in the user interaction behavior and normalize them. Calculate the normalized click frequency, normalized dwell time and normalized jump path depth, combine them into a multi-dimensional feature group and vectorize them to generate a node behavior vector group.
S302: Call the normalized jump path depth and normalized dwell time in the node behavior vector group, calculate the average response time and jump frequency of each node, combine the ratio of the node's normalized click frequency to the jump frequency, extract the corresponding jump success frequency and perform ratio conversion, and obtain the jump success rate set.
S303: Based on the numerical structure of the multiple nodes in the jump success rate set, the jump success rate is used as the first judgment condition and the bounce rate is used as the auxiliary condition to perform multi-interval classification judgment. According to the jump success rate interval threshold and combined with the bounce rate distribution, the labels are assigned in layers to obtain the behavior status label group.
S4: Extract label transition pairs based on the behavior state labels, analyze the conditions and rules of label transitions through Markov transition models, calculate the evolution trend of labels and predict node behavior, and generate label evolution path rules.
The specific steps of S4 are as follows:
S401: Based on the behavior state label sequence, extract any two adjacent labels to form a label transition pair, arrange multiple transition pairs in order according to the node behavior time axis, count the occurrence frequency of each type of label transition pair, and obtain the label transition frequency dataset.
S402: Call the label transition frequency dataset, normalize the state transition probabilities between transition pairs, calculate the conditional probability value of transitioning from any state to the target state through the Markov transition model, and generate the label state transition probability distribution value.
S403: Based on the label state transition probability distribution value, with the current node behavior label as the initial state, calculate the cumulative probability value of the state transition path within a finite step range, select the path with the optimal cumulative probability value as the behavior evolution path, and generate label evolution path rules.
S5: Evaluate the stability of each candidate path based on the access volatility score and the tag evolution path rule, calculate the path jump credibility score according to the current behavior state of the node in the interactive electronic manual, filter the paths, and output the optimal navigation path sequence.

2. The knowledge graph manual navigation method of claim 1, wherein:
The node update frequency sequence includes the number of content version changes, the magnitude of differences in summary text, and the similarity score of text structure changes.
The access volatility score includes the standard deviation of node access frequency, the sliding window variance score, and the coefficient of variation of access behavior.
The behavior status labels specifically include response time interval labels, jump success rate levels, bounce rate levels, and behavior status category identifiers.
The label evolution path rules include state transition probabilities and Markov transition path sets.
The navigation path sequence specifically refers to the path stability score, jump credibility score, and optimal path node sequence.

3. The manual navigation method of a knowledge graph according to claim 1, characterized in that the specific steps of S1 are as follows:
S101: Obtain the content version records of all nodes in the interactive electronic manual knowledge graph and extract the node update association fields, including update timestamp, update summary text and corresponding node identifier, and perform aggregation and sorting. Call the sorted node update summary text data and obtain the node summary text difference value sequence.
S102: Based on the node summary text difference value sequence, count the number of differences value changes of multiple nodes in multiple time periods, combine with fixed time interval segments, call the difference value change count data and time period length for normalization processing, calculate the node content change ratio, and obtain the node content update frequency.
S103: Obtain the summary text of two adjacent versions based on the node content update frequency, calculate the consistency metric of the structural content, call the node version structure segmentation result and keyword distribution, calculate the structural change similarity of adjacent texts and traverse the nodes to generate a node update frequency sequence.

4. The knowledge graph manual navigation method of claim 1, wherein:
The volatility score benchmark is determined by statistically analyzing the distribution of the moving variance of the volatility of all node access behaviors and performing percentile analysis on the variance distribution, and selecting the 75th percentile value in the distribution as the threshold benchmark.

5. The manual navigation method of a knowledge graph according to claim 1, characterized in that:
The threshold for the success rate interval is calculated by statistically analyzing the number of successful jumps and the total number of jumps for all users across all nodes, and then using the upper and lower quartiles or a set proportion interval of the distribution as the threshold.

6. The manual navigation method of a knowledge graph according to claim 1, characterized in that the specific steps of S5 are as follows:
S501: Calculate the stability index of each path based on the access volatility score, and evaluate the stability of each path in combination with the tag evolution path rules to obtain the path stability score.
S502: Based on the path stability score, calculate the jump credibility score for each path according to the current behavior status of the node in the interactive manual, combined with the node's access frequency and bounce rate. The jump credibility score is calculated based on the node's behavior status and path matching degree to generate a path jump credibility score.
S503: Based on the path jump credibility score and path stability score, the candidate paths are filtered, and the path sequences with stability compliance indicators and jump credibility scores exceeding the credibility score benchmark value are selected and sorted to generate the optimal navigation path sequence.
The credibility score benchmark is set by statistically analyzing the mean of the jump stability of candidate paths under multiple node behavior states and the median of the credibility score distribution.

7. A manual navigation system for a knowledge graph, characterized in that the system is used to implement the manual navigation method for knowledge graphs according to any one of claims 1-6, and the system comprises:
The content update module obtains the content version records of multiple nodes in the interactive manual knowledge graph and extracts node update events. It calculates the content update frequency based on the differences in node summary text, calculates the similarity of text structure changes, generates a node update frequency sequence, and transmits it to the fluctuation analysis module.
The fluctuation analysis module obtains the corresponding user access behavior data based on the node update frequency sequence, calculates the fluctuation of node access behavior, quantifies the fluctuation through sliding window variance, generates an access fluctuation score, and transmits it to the behavior label module.
The behavior labeling module obtains the user interaction behavior status based on the access volatility score, generates node behavior feature vectors, calculates the node's response time, jump success rate and bounce rate, classifies and adds labels, generates behavior status labels and transmits them to the evolution prediction module.
The evolution prediction module extracts label transition pairs based on the behavioral state labels, analyzes the conditions and rules of label transitions through a Markov transition model, calculates the evolution trend of labels and predicts node behavior, generates label evolution path rules and transmits them to the path filtering module.
The path filtering module evaluates the stability of each candidate path based on the access volatility score and the tag evolution path rule, calculates the path jump credibility score according to the current behavior state of the node in the interactive manual, filters the paths, and outputs the optimal navigation path sequence.