An accounting archive data management method and system based on a knowledge graph
By constructing a permission knowledge graph and utilizing the HAKE method, CompGCN network, UDGNet network, LayoutLMv3 model, and ELECTRE algorithm, the problems of insufficient permission expression and inaccurate target file location in accounting file management are solved, achieving efficient and accurate file management and permission determination.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SUZHOU LIANGBO INFORMATION TECH CO LTD
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing accounting record management methods are unable to effectively express the relationships between record data, management rules, and operating entities and permissions. This results in coarse-grained permission determination, insufficient utilization of record associations, untimely updates of management results, and difficulty in fully utilizing multimodal features, thus affecting the accuracy of management results.
By employing the HAKE method, CompGCN network, UDGNet network, LayoutLMv3 model, and ELECTRE algorithm, a permission knowledge graph is constructed to perform correlation modeling, permission matching, file mapping, and dynamic updates of accounting file data. By uniformly processing accounting file data, management rule data, and operation request data, the accuracy of permission determination and the traceability of the management process are achieved.
It has improved the structuring level and correlation management capabilities of accounting records management, enhanced the accuracy of authorization determination and the reliability of management results, and improved the traceability and overall efficiency of the management process.
Figure CN122240855A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of records management technology, and in particular to a method and system for managing accounting records data based on knowledge graphs.
Background Technology
[0002] With the advancement of accounting informatization, enterprises have generated a large amount of electronic accounting archive data during the financial management process, including voucher data, ledger data, report data, and related approval records. In existing technologies, accounting archives are typically archived, retrieved, borrowed, and audited through electronic archive management platforms, and access control for different users is managed using preset permission rules to meet daily accounting archive management needs.
[0003] Most existing accounting record management methods use database table structures, directory structures, or fixed field mapping to organize and retrieve record data. While these methods can accomplish basic record storage and access control, the relationships between different record data, between record data and management rules, and between operating entities and permissions are usually established through static configuration. This results in weak relationship expression capabilities and makes it difficult to reflect the correspondence between multiple types of objects in a unified structure.
[0004] On the other hand, existing technologies for determining the legality of accounting record operations typically rely on comparisons of account information, role and permission information, and preset rules. This approach is prone to problems such as coarse-grained permission determination, insufficient utilization of record associations, and untimely updates to management results when faced with multi-level permission constraints, complex operation requests, and scenarios involving the joint access of multi-source record data. It also struggles to dynamically determine the legality based on the relationships between the current operating entity, target record data, and management rules.
[0005] Furthermore, with the widespread use of scanned invoices, electronic forms, and image-based archives, accounting archive data now simultaneously contains text, layout, and image information. In existing technologies, many solutions still rely on single-field matching or single-content recognition to determine the target archive, failing to fully utilize the multimodal features within the archive. This impacts the accuracy of the archive mapping results and subsequent management and processing outcomes.
[0006] Therefore, how to provide a knowledge graph-based accounting record data management method and system is a problem that urgently needs to be solved by those skilled in the art.
Summary of the Invention
[0007] One objective of this invention is to propose a knowledge graph-based accounting record data management method and system. This invention comprehensively employs the HAKE method, CompGCN network, UDGNet network, LayoutLMv3 model, and ELECTRE algorithm to perform association modeling, permission matching, record mapping, legality determination, and dynamic updating of accounting record data, management rule data, identity data, and operation request data. It details the implementation process of permission knowledge graph construction, identity recognition, target record mapping, management instruction generation, and graph updating in the accounting record management process, and has the advantages of strong record association capabilities, high accuracy of permission determination, strong traceability of the management process, and high management efficiency.
[0008] An accounting record data management method based on a knowledge graph according to an embodiment of the present invention includes the following steps:
S1. Acquire accounting record data and management rule data and preprocess them to generate an archive data sequence and a rule data sequence;
S2. Extract nodes and relationships from the rule data sequence, perform hierarchical embedding on the extracted results using the HAKE method, and use the CompGCN network to perform relationship combination propagation and association aggregation on the embedded results to construct a permission knowledge graph;
S3. Obtain the identity data of the current operating entity, use the UDGNet network to perform identity feature extraction and update operations on the identity data, and match the update results in the permission knowledge graph to construct a permission relationship sequence;
S4. Obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform a multimodal information fusion operation through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
S5. Construct a set of legality judgment indicators based on the permission relationship sequence and the archive relationship sequence, and use the ELECTRE algorithm to perform outranking relation judgment and filtering operations on the set of legality judgment indicators to generate a management instruction sequence;
S6. Manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.
[0009] Optionally, the accounting record data represents a set of raw data related to accounting business and used for archiving management; the management rule data represents a set of rule-based data used to constrain the accounting record management process; the preprocessing includes format unification, field validation, time alignment, and deduplication; the current operating subject represents the identity object that initiates the current operation request; the identity data represents a set of data used to characterize the identity features of the current operating subject; and the current operation request represents the current management operation instruction information initiated for the accounting record data.
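The four preprocessing operations named above (format unification, field validation, time alignment, deduplication) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the field names `doc_id`, `doc_type` and `date`, the two accepted date formats, and the choice to deduplicate on `doc_id` are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical schema: the source does not name any record fields.
REQUIRED_FIELDS = ("doc_id", "doc_type", "date")


def _parse_date(text):
    """Time alignment helper: accept two assumed date formats."""
    for fmt in ("%Y-%m-%d", "%Y/%m/%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            pass
    raise ValueError(f"unrecognized date: {text!r}")


def preprocess(records):
    """Return a cleaned, de-duplicated record sequence ordered by date."""
    seen, cleaned = set(), []
    for rec in records:
        # Format unification: lower-case keys, strip string values.
        rec = {k.strip().lower(): (v.strip() if isinstance(v, str) else v)
               for k, v in rec.items()}
        # Field validation: drop records with a missing or empty required field.
        if any(not rec.get(field) for field in REQUIRED_FIELDS):
            continue
        # Time alignment: rewrite every date in ISO form.
        rec["date"] = _parse_date(rec["date"]).isoformat()
        # Deduplication: keep the first record seen for each identifier.
        if rec["doc_id"] in seen:
            continue
        seen.add(rec["doc_id"])
        cleaned.append(rec)
    return sorted(cleaned, key=lambda r: r["date"])
```

Sorting on the ISO date string gives chronological order, which is one simple way to obtain a "sequence" from an unordered batch of records.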
[0010] Optionally, S2 specifically includes:
S21. Divide the rule data sequence into rule units and extract the rule subject, rule object, rule constraint, and rule association from each rule unit. Treat the rule subject and rule object as nodes and the rule constraint and rule association as associations;
S22. Use the HAKE method to perform a hierarchical position mapping operation on nodes and a hierarchical semantic mapping operation on relationships, and then subject the position mapping results and semantic mapping results to a hierarchical embedding operation to obtain node representations and relationship representations;
S23. In the CompGCN network, perform the relationship combination propagation operation on the node representations according to the associations, and subject the relationship representations to association aggregation processing during the propagation process to generate the updated node representations and relationship representations;
S24. Write the updated node representations and relationship representations into the corresponding nodes and associations respectively to construct the permission knowledge graph.
[0011] Optionally, S22 specifically includes:
S221. Traverse each rule unit, count the number of times each node appears as the subject of a rule and the number of times it appears as the object of a rule, and subtract the two to obtain the position difference of each node in the corresponding rule unit;
S222. Accumulate the position differences of each node in each rule unit, and divide the accumulated result by the total number of times the node appears in all rule units to obtain the position mapping value of each node;
S223. Traverse each association, record the order of appearance of the nodes connected by each association in the corresponding rule unit, and subtract the order of appearance of the next node from the order of appearance of the previous node to obtain the order difference value corresponding to each association;
S224. Accumulate the order differences of each association, multiply the accumulated result by the number of times the association appears in all rule units, and then divide the product by the total number of all associations to obtain the semantic mapping value of each association;
S225. Use the HAKE method to perform hierarchical embedding operations on the position mapping values and semantic mapping values, specifically including:
Perform a numerical sorting operation on the position mapping value of each node, determine the hierarchical order of the corresponding nodes according to the numerical size rule, and concatenate the position mapping value of each node with the corresponding hierarchical order to form a node representation;
Perform a numerical sorting operation on the semantic mapping value of each association, determine the hierarchical order of the corresponding associations according to the numerical size rule, and concatenate the semantic mapping value of each association with the corresponding hierarchical order to form a relation representation.
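The arithmetic in S221 to S225 can be followed in a short sketch. Several points are assumptions made for illustration only: each rule unit is modeled as an ordered list whose first item is the rule subject, whose last item is the rule object, and whose middle items are associations; "the total number of all associations" is read as the number of distinct associations; and hierarchical order 1 is given to the largest value, with ties broken alphabetically.

```python
from collections import Counter, defaultdict


def mapping_values(rule_units):
    """Compute position mapping values (nodes) and semantic mapping values (associations)."""
    as_subject, as_object = Counter(), Counter()
    order_diff, rel_count = defaultdict(int), Counter()
    for unit in rule_units:
        head, *relations, tail = unit
        as_subject[head] += 1
        as_object[tail] += 1
        for rel in relations:
            # S223: order of the previous node (0) minus order of the next node.
            order_diff[rel] += 0 - (len(unit) - 1)
            rel_count[rel] += 1
    nodes = set(as_subject) | set(as_object)
    # S221-S222: (subject count - object count) / total appearances.
    position = {n: (as_subject[n] - as_object[n]) / (as_subject[n] + as_object[n])
                for n in nodes}
    # S224: accumulated difference * occurrence count / number of distinct associations.
    semantic = {r: order_diff[r] * rel_count[r] / len(rel_count) for r in rel_count}
    return position, semantic


def with_rank(values):
    """S225: pair each value with its hierarchical order (1 = largest, assumed)."""
    ordered = sorted(values, key=lambda k: (-values[k], k))
    return {k: (values[k], i + 1) for i, k in enumerate(ordered)}


units = [
    ["manager", "approves", "auditor"],
    ["auditor", "may_read", "voucher"],
    ["auditor", "may_read", "report"],
]
position, semantic = mapping_values(units)
node_repr = with_rank(position)      # e.g. node_repr["manager"] is (1.0, 1)
relation_repr = with_rank(semantic)
```

In this toy input, a node that appears only as a rule subject receives the value 1.0 and a node that appears only as a rule object receives -1.0, so the hierarchical order places pure subjects above pure objects.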
[0012] Optionally, S23 specifically includes:
S231. In the CompGCN network, read the relation representation of each association in sequence and multiply it, position by position, with the node representation of the previous node connected by that association to obtain the relationship combination value of each association;
S232. Sum the relationship combination values of each association, and divide the sum by the number of values in the corresponding relation representation to obtain the relationship weight value of each association;
S233. Multiply the relationship combination value of each association by the corresponding relationship weight value to obtain the weighted combination value of each association;
S234. For each node, read the weighted combination values of all associations to which the node is connected as the previous node and sum them, then divide the sum by the number of those associations to obtain the node propagation value of the node;
S235. Subtract the propagation value of each node from its corresponding node representation to obtain the propagation difference of each node;
S236. Add the propagation value of each node to the corresponding propagation difference to obtain the updated node representation;
S237. Accumulate the relation representations of all associations connected to the same node, and divide by the number of associations connected to that node to obtain the relation aggregation value of that node;
S238. Add the relation aggregation value to the relation representation of each association connected to the node to obtain the updated relation representation.
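A plain-Python sketch of S231 to S238 follows. It implements the steps as worded, not a trained CompGCN layer. The data layout is hypothetical: representations are lists of floats keyed by name, and edges are `(previous_node, association, next_node)` triples. Two further assumptions are labeled in the comments: a node with no outgoing association keeps its representation, and the aggregation in S237 is taken at the previous node of each edge.

```python
def propagate(node_rep, rel_rep, edges):
    """Apply S231-S238 and return (propagation values, updated nodes, updated relations)."""
    weighted = []
    for head, rel, _tail in edges:
        combo = [r * h for r, h in zip(rel_rep[rel], node_rep[head])]  # S231
        weight = sum(combo) / len(rel_rep[rel])                        # S232
        weighted.append((head, [weight * c for c in combo]))           # S233

    propagation, new_nodes = {}, {}
    for node, rep in node_rep.items():
        outgoing = [vec for head, vec in weighted if head == node]
        if not outgoing:
            # Assumption: a node that is never a previous node is left unchanged.
            new_nodes[node] = list(rep)
            continue
        prop = [sum(col) / len(outgoing) for col in zip(*outgoing)]    # S234
        diff = [x - p for x, p in zip(rep, prop)]                      # S235
        new_nodes[node] = [p + d for p, d in zip(prop, diff)]          # S236
        propagation[node] = prop

    new_rels = {}
    for head, rel, tail in edges:
        # Assumption: aggregate over every edge touching the previous node.
        attached = [rel_rep[r] for h, r, t in edges if head in (h, t)]
        agg = [sum(col) / len(attached) for col in zip(*attached)]     # S237
        new_rels[(head, rel, tail)] = [a + r for a, r in zip(agg, rel_rep[rel])]  # S238
    return propagation, new_nodes, new_rels
```

As worded, S236 adds back exactly the quantity that S235 subtracted, so the updated node representation equals the input representation up to floating-point rounding. The sketch therefore returns the propagation values separately so they can be inspected or used by a later step.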
[0013] Optionally, S3 specifically includes:
S31. Obtain the identity data of the current operating entity, perform missing item removal, field alignment, numerical normalization and feature sorting operations on each identity field in the identity data, divide the processed identity data into a basic identity vector, a voucher identity vector and an operation identity vector according to the preset field categories, and concatenate them in a preset order to obtain the identity vector;
S32. In the UDGNet network, perform an identity feature extraction operation on the identity vector, perform a layer-by-layer mapping operation on each dimension of the identity vector, extract the local identity representation corresponding to each dimension, and integrate all local identity representations to obtain the initial identity representation;
S33. Perform an identity representation update operation on the initial identity representation: read the representation value of each position in the initial identity representation, calculate the difference between the representation values of adjacent positions, perform a weighted adjustment operation on the representation value of each position according to the difference, and summarize the weighted results to obtain the updated identity representation;
S34. Read the node representation of each node in the permission knowledge graph, calculate the difference between the updated identity representation and each node representation, and perform distance comparison and size sorting operations according to the difference results to determine the target node representation corresponding to the updated identity representation;
S35. Read the corresponding relationships of the target node, perform a permission type identification operation on each relationship, extract the permission relationships corresponding to the current operating entity, and arrange them in a preset order to construct a permission relationship sequence.
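The matching in S34 and the sequence construction in S35 can be illustrated as below. The text speaks only of a "difference" and a "distance comparison", so the Euclidean distance used here is an assumption, as are the function names, the edge triples, and the requirement that the identity representation has the same length as the node representations. The feature extraction in S32 and the weighted adjustment in S33 are not sketched because the source does not give their formulas.

```python
import math


def match_target_node(identity_rep, node_reps):
    """S34: rank graph nodes by distance to the identity representation."""
    def distance(rep):
        # Assumed metric: Euclidean distance between equal-length vectors.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(identity_rep, rep)))

    ranked = sorted(node_reps, key=lambda name: (distance(node_reps[name]), name))
    return ranked[0], ranked


def permission_sequence(target, edges, permission_order):
    """S35: keep the target node's permission-type relationships in a preset order."""
    rank = {rel: i for i, rel in enumerate(permission_order)}
    found = [(rel, tail) for head, rel, tail in edges
             if head == target and rel in rank]
    return sorted(found, key=lambda pair: rank[pair[0]])


node_reps = {"auditor": [0.3, 2.0], "manager": [1.0, 1.0], "voucher": [-1.0, 4.0]}
edges = [
    ("auditor", "may_read", "voucher"),
    ("auditor", "may_export", "report"),
    ("auditor", "reports_to", "manager"),
]
target, _ = match_target_node([0.4, 2.0], node_reps)
sequence = permission_sequence(target, edges, ["may_export", "may_read"])
```

Here `reports_to` is dropped because it is not listed as a permission type, which mirrors the permission type identification step.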
[0014] Optionally, S4 specifically includes:
S41. Obtain the current operation request, and perform field splitting, sequence alignment and unified encoding operations on the request category, request object, request time and request association fields in the current operation request to form a request feature sequence;
S42. Extract the target archive data from the archive data sequence according to the request feature sequence, and perform segmentation and sorting operations on the text content, page position and image content in the target archive data to form an archive text sequence, an archive layout sequence and an archive image sequence;
S43. In the LayoutLMv3 model, perform feature extraction operations on the archive text sequence, archive layout sequence and archive image sequence to obtain text representations, layout representations and image representations;
S44. Perform multimodal information fusion operations on the text representations, layout representations and image representations, specifically including:
Read the layout representation corresponding to each text representation and concatenate them in a fixed order to obtain the preliminary fusion result;
Read the image representation corresponding to each preliminary fusion result and concatenate them in a fixed order to obtain the intermediate fusion result;
Accumulate the values in each intermediate fusion result step by step, and divide the accumulated result by the total number of values to obtain the fusion value corresponding to each intermediate fusion result;
Sort the fusion values according to the arrangement order of the archive text sequence, and then concatenate all the sorted fusion values in sequence to obtain the fusion result;
S45. Based on the fusion result, read the corresponding nodes and relationships in the permission knowledge graph and perform the association mapping operation. Based on the mapping results, establish the archive association relationship between the target archive data and the permission knowledge graph to form an archive relationship sequence.
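The fusion arithmetic in S44 reduces each segment to one number, as the following sketch shows. The inputs are assumed to be per-segment feature lists that have already been extracted (for example by a LayoutLMv3 encoder, which is not called here), and the function name `fuse` is hypothetical.

```python
def fuse(text_reps, layout_reps, image_reps):
    """S44: concatenate per-segment features, then average each concatenation."""
    fused = []
    # Iterating in the order of text_reps keeps the archive text sequence order.
    for text, layout, image in zip(text_reps, layout_reps, image_reps):
        preliminary = list(text) + list(layout)       # text + layout
        intermediate = preliminary + list(image)      # + image
        fused.append(sum(intermediate) / len(intermediate))
    return fused


fusion_result = fuse(
    text_reps=[[1.0, 3.0], [2.0, 2.0]],
    layout_reps=[[0.0, 4.0], [2.0, 2.0]],
    image_reps=[[2.0, 2.0], [8.0, 8.0]],
)
# fusion_result == [2.0, 4.0]
```

Because each segment is averaged over the concatenated vector, modalities with more feature values carry proportionally more weight in the fusion value.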
[0015] Optionally, S5 specifically includes:
S51. Read the permission relationship sequence and the archive relationship sequence in their arrangement order, subtract the archive relationship from the permission relationship at the same position, add up the values of each item in the subtraction result, and divide the sum by the total number of corresponding values to construct a set of legality judgment indicators;
S52. Read the current operation request, perform a unified encoding operation on the current management operation instruction information in the current operation request, perform a subtraction operation on each item of the encoding result and each legality judgment value in the set of legality judgment indicators, and sort the values in the subtraction result by size;
S53. Use the ELECTRE algorithm to perform an outranking relation determination operation on the sorting results to obtain the relation determination results;
S54. Filter the relation determination results, retain the sorting results whose relation determination results are greater than or equal to 0, remove the sorting results whose relation determination results are less than 0, and sort the retained sorting results in descending order of numerical value to generate a management instruction sequence.
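S51 and S52 can be sketched as follows. The wording of S52 leaves room for interpretation, so this sketch assumes that each legality judgment value is subtracted from every item of the encoded request, producing one list per indicator, and that "by size" means descending order. The function names and the numeric encoding of the request are hypothetical.

```python
def legality_indicators(permission_seq, archive_seq):
    """S51: mean of the item-wise difference at each sequence position."""
    indicators = []
    for perm, arch in zip(permission_seq, archive_seq):
        diff = [p - a for p, a in zip(perm, arch)]
        indicators.append(sum(diff) / len(diff))
    return indicators


def request_differences(request_code, indicators):
    """S52 (assumed reading): one sorted difference list per indicator."""
    return [sorted((c - v for c in request_code), reverse=True)
            for v in indicators]


indicators = legality_indicators(
    permission_seq=[[4.0, 2.0], [1.0, 1.0]],
    archive_seq=[[1.0, 1.0], [3.0, 2.0]],
)
sorting_results = request_differences([1.0, 3.0], indicators)
```

With these inputs the indicators are 2.0 and -1.5, and each indicator yields one sorted list that a later comparison step can consume.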
[0016] Optionally, S53 specifically includes:
S531. Using the ELECTRE algorithm, read the values in the sorting results in their arrangement order, and select each previous sorting result and the next sorting result in turn as a group of comparison objects;
S532. Compare the values in each group of comparison objects item by item, record the values in the previous sorting result that are greater than those in the next sorting result as satisfied values, record the values in the previous sorting result that are less than those in the next sorting result as unsatisfied values, and record the values in the previous sorting result that are equal to those in the next sorting result as held values;
S533. Count the number of satisfied values, the number of unsatisfied values, and the number of held values in each group of comparison objects, subtract the number of unsatisfied values from the number of satisfied values, and add the number of held values to the subtraction result to obtain the judgment value of each group of comparison objects;
S534. Write each judgment value into the relation determination result according to the order of the corresponding comparison objects.
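The pairwise comparison in S531 to S534 is a counting procedure and can be written directly. The sketch assumes that every sorting result is a list of numbers of equal length; the function name is hypothetical.

```python
def judgment_values(sorting_results):
    """S531-S534: compare each sorting result with the next one, item by item."""
    judgments = []
    for prev, nxt in zip(sorting_results, sorting_results[1:]):
        satisfied = sum(p > n for p, n in zip(prev, nxt))     # previous is larger
        unsatisfied = sum(p < n for p, n in zip(prev, nxt))   # previous is smaller
        held = sum(p == n for p, n in zip(prev, nxt))         # values are equal
        judgments.append(satisfied - unsatisfied + held)      # S533
    return judgments


values = judgment_values([[4.5, 2.5], [1.0, 2.5], [3.0, 3.0]])
# values == [2, -2]
```

A list of n sorting results yields n - 1 judgment values, one per adjacent pair, in the order of the comparison groups.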
[0017] An accounting record data management system based on a knowledge graph according to an embodiment of the present invention includes:
The data acquisition module is used to acquire and preprocess accounting record data and management rule data, and generate an archive data sequence and a rule data sequence;
The graph construction module is used to extract nodes and relationships from the rule data sequence, perform hierarchical embedding operations on the extracted results using the HAKE method, and perform relationship combination propagation and association aggregation processing on the embedded results through the CompGCN network to construct a permission knowledge graph;
The permission matching module is used to obtain the identity data of the current operating entity, use the UDGNet network to perform identity feature extraction and update operations on the identity data, and match the update results in the permission knowledge graph to construct a permission relationship sequence;
The relationship mapping module is used to obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform a multimodal information fusion operation through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
The instruction generation module is used to construct a set of legality judgment indicators based on the permission relationship sequence and the archive relationship sequence, and to perform outranking relation judgment and filtering operations on the set of legality judgment indicators using the ELECTRE algorithm to generate a management instruction sequence;
The management update module is used to manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.
[0018] The beneficial effects of this invention are: First, this invention unifies the processing of accounting record data and management rule data, and combines the HAKE method and CompGCN network to construct a permission knowledge graph. This integrates the scattered stored record data, rule relationships, and permission constraints into the same associated structure. Compared with the processing methods in the prior art that rely on table structure, directory structure, and static field mapping, this invention can more clearly express the correspondence between accounting record data, operating entities, and management rules, thereby improving the structuring degree and association management capabilities of accounting record management.
[0019] Secondly, this invention uses the UDGNet network to extract and update the identity features of the current operating entity's identity data, uses the LayoutLMv3 model to perform multimodal information fusion processing on the target archive data, and uses the ELECTRE algorithm to jointly evaluate the permission relationship sequence and the archive relationship sequence. Compared with the existing technology, which mainly relies on comparing account information, role permission information and fixed rules, this invention can improve the accuracy of permission determination and the stability of target archive mapping, thereby improving the rationality of management instruction generation and the reliability of management execution results.
[0020] Finally, after completing archive management, the present invention updates the permission knowledge graph according to the corresponding operation information, so that operation results, permission changes and archive relationship changes are continuously recorded and dynamically reflected. Compared with the existing technology, in which archive data management and permission rule management are separated and updates are delayed, this improves the timeliness and consistency of permission relationships and archive relationships in the subsequent management process, thereby enhancing the traceability of the accounting archive management process and the overall management efficiency.
Attached Figure Description
[0021] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:
Figure 1 is a flowchart of the knowledge graph-based accounting record data management method proposed in this invention;
Figure 2 is a flowchart of the legality determination of accounting record requests and the management update in the knowledge graph-based accounting record data management method proposed in this invention;
Figure 3 is a module structure diagram of the knowledge graph-based accounting record data management system proposed in this invention.
Detailed Implementation
[0022] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.
[0023] Referring to Figures 1-2, a knowledge graph-based accounting record data management method includes the following steps:
S1. Acquire accounting record data and management rule data and preprocess them to generate an archive data sequence and a rule data sequence;
S2. Extract nodes and relationships from the rule data sequence, perform hierarchical embedding on the extracted results using the HAKE method, and use the CompGCN network to perform relationship combination propagation and association aggregation on the embedded results to construct a permission knowledge graph;
S3. Obtain the identity data of the current operating entity, use the UDGNet network to perform identity feature extraction and update operations on the identity data, and match the update results in the permission knowledge graph to construct a permission relationship sequence;
S4. Obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform a multimodal information fusion operation through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
S5. Construct a set of legality judgment indicators based on the permission relationship sequence and the archive relationship sequence, and use the ELECTRE algorithm to perform outranking relation judgment and filtering operations on the set of legality judgment indicators to generate a management instruction sequence;
S6. Manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.
[0024] In this embodiment, accounting archive data represents the original data set related to accounting business and used for archiving management, management rule data represents the rule-based data set used to constrain the accounting archive management process, preprocessing includes format unification, field validation, time alignment and deduplication, current operation subject represents the identity object that initiates the current operation request, identity data represents the data set used to characterize the identity features of the current operation subject, and current operation request represents the current management operation instruction information initiated for accounting archive data.
[0025] In this embodiment, S2 specifically includes:
S21. Divide the rule data sequence into rule units and extract the rule subject, rule object, rule constraint, and rule association from each rule unit. Treat the rule subject and rule object as nodes and the rule constraint and rule association as associations;
S22. Use the HAKE method to perform a hierarchical position mapping operation on nodes and a hierarchical semantic mapping operation on relationships, and then subject the position mapping results and semantic mapping results to a hierarchical embedding operation to obtain node representations and relationship representations;
S23. In the CompGCN network, perform the relationship combination propagation operation on the node representations according to the associations, and subject the relationship representations to association aggregation processing during the propagation process to generate the updated node representations and relationship representations;
S24. Write the updated node representations and relationship representations into the corresponding nodes and associations respectively to construct the permission knowledge graph.
[0026] In this embodiment, S21 specifically includes:
S211. Segment each piece of rule data in the rule data sequence according to a preset delimiter mark to obtain multiple rule units;
S212. Arrange the data items in each rule unit in their order of appearance, determine the data item at the beginning as the rule subject, and determine the data item at the end as the rule object;
S213. For each data item in each rule unit other than the rule subject and the rule object, count its occurrences, using the number of times the data item appears together with the rule subject as its constraint value and the number of times it appears together with the rule object as its association value;
S214. When the constraint value is greater than the association value, mark the corresponding data item as a rule constraint item; when the constraint value is less than or equal to the association value, mark the corresponding data item as a rule association item;
S215. Compare the rule subjects and rule objects in all rule units item by item, merge the rule subjects that compare as identical, merge the rule objects that compare as identical, and use the merged rule subjects and rule objects as nodes;
S216. Compare the rule constraint items and rule association items in all rule units one by one, merge the rule constraint items that compare as identical, merge the rule association items that compare as identical, and use the merged rule constraint items and rule association items as the associations.
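S211 to S216 can be illustrated with the sketch below. It rests on assumptions that the text does not fix: rule data are strings, the preset delimiter is `;`, the data items inside a rule unit are separated by whitespace, and the co-occurrence counts in S213 are taken across all rule units (within a single unit every middle item co-occurs exactly once with both the subject and the object).

```python
from collections import Counter


def extract_graph(rule_data, delimiter=";"):
    """Return (nodes, constraint items, association items) from rule strings."""
    # S211: split rule data into rule units; keep units with a subject and an object.
    units = [u.split() for line in rule_data for u in line.split(delimiter)]
    units = [u for u in units if len(u) >= 2]

    # S213: count co-occurrences with the unit's subject and with its object.
    with_subject, with_object = Counter(), Counter()
    for head, *middle, tail in units:            # S212: first = subject, last = object
        for item in set(middle):
            with_subject[(item, head)] += 1
            with_object[(item, tail)] += 1

    nodes, constraints, associations = set(), set(), set()
    for head, *middle, tail in units:
        nodes.update((head, tail))               # S215: sets merge identical items
        for item in middle:
            # S214: constraint value > association value marks a constraint item.
            if with_subject[(item, head)] > with_object[(item, tail)]:
                constraints.add(item)            # S216: merged by the set
            else:
                associations.add(item)
    return nodes, constraints, associations


nodes, constraints, associations = extract_graph(
    ["auditor read voucher; auditor read report", "clerk archive voucher"]
)
```

In this example `read` co-occurs twice with the subject `auditor` but only once with each object, so it is marked as a rule constraint item, while `archive` has equal counts and is marked as a rule association item.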
[0027] In this embodiment, S22 specifically includes:
S221. Traverse each rule unit, count the number of times each node appears as the subject of a rule and the number of times it appears as the object of a rule, and subtract the two to obtain the position difference of each node in the corresponding rule unit;
S222. Accumulate the position differences of each node in each rule unit, and divide the accumulated result by the total number of times the node appears in all rule units to obtain the position mapping value of each node;
S223. Traverse each association, record the order of appearance of the nodes connected by each association in the corresponding rule unit, and subtract the order of appearance of the next node from the order of appearance of the previous node to obtain the order difference value corresponding to each association;
S224. Accumulate the order differences of each association, multiply the accumulated result by the number of times the association appears in all rule units, and then divide the product by the total number of all associations to obtain the semantic mapping value of each association;
S225. Use the HAKE method to perform hierarchical embedding operations on the position mapping values and semantic mapping values, specifically including:
Perform a numerical sorting operation on the position mapping value of each node, determine the hierarchical order of the corresponding nodes according to the numerical size rule, and concatenate the position mapping value of each node with the corresponding hierarchical order to form a node representation;
Perform a numerical sorting operation on the semantic mapping value of each association, determine the hierarchical order of the corresponding associations according to the numerical size rule, and concatenate the semantic mapping value of each association with the corresponding hierarchical order to form a relation representation.
[0028] In this embodiment, S23 specifically includes:
S231. In the CompGCN network, read the relationship representation of each association relationship in order, and multiply it, position by position, by the node representation of the previous node connected by that association relationship to obtain the relationship combination value of each association relationship;
S232. Sum the values in the relationship combination value of each association relationship, and divide the sum by the number of values in the corresponding relationship representation to obtain the relationship weight value of each association relationship;
S233. Multiply the relationship combination value of each association relationship by the corresponding relationship weight value to obtain the weighted combination value of each association relationship;
S234. For each node, read the weighted combination values of all association relationships to which the node is connected as the previous node, sum them, and divide the sum by the number of those association relationships to obtain the node propagation value of the node;
S235. Subtract the node propagation value of each node from its node representation to obtain the propagation difference of each node;
S236. Add the node propagation value of each node to the corresponding propagation difference to obtain the updated node representation;
S237. Accumulate the relationship representations of all association relationships connected to the same node, and divide by the number of association relationships connected to that node to obtain the relationship aggregation value of that node;
S238. Add the relationship aggregation value to the relationship representation of each association relationship connected to the node to obtain the updated relationship representation.
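A minimal sketch of the propagation arithmetic in S231 to S234 is given below. It assumes that node and relationship representations are equal-length lists of floats and uses plain Python rather than a graph learning library. The function names are hypothetical.

```python
def combine(relation_repr, head_repr):
    """S231: position-wise product of a relationship representation and
    the representation of the previous (head) node."""
    return [r * h for r, h in zip(relation_repr, head_repr)]

def weighted_combination(relation_repr, head_repr):
    """S232-S233: scale the combination value by its own weight."""
    combo = combine(relation_repr, head_repr)
    weight = sum(combo) / len(relation_repr)  # S232: relationship weight value
    return [c * weight for c in combo]        # S233: weighted combination value

def node_propagation(head_repr, outgoing_relation_reprs):
    """S234: average the weighted combinations over the relationships
    for which this node is the previous node."""
    weighted = [weighted_combination(r, head_repr) for r in outgoing_relation_reprs]
    n = len(weighted)
    return [sum(column) / n for column in zip(*weighted)]

propagation = node_propagation([1.0, 2.0], [[2.0, 1.0], [0.0, 1.0]])
```

For the toy input, the first relationship yields a weighted combination of `[4.0, 4.0]` and the second `[0.0, 2.0]`, so the node propagation value is their position-wise mean.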
[0029] In this embodiment, S3 specifically includes:
S31. Obtain the identity data of the current operating entity, perform missing-item removal, field alignment, numerical normalization and feature sorting on each identity field in the identity data, divide the processed identity data into a basic identity vector, a credential identity vector and an operation identity vector according to preset field categories, and concatenate them in a preset order to obtain the identity vector;
S32. In the UDGNet network, perform identity feature extraction on the identity vector: map each dimension of the identity vector layer by layer, extract the local identity representation corresponding to each dimension, and integrate all local identity representations to obtain the initial identity representation;
S33. Update the initial identity representation: read the representation value at each position of the initial identity representation, calculate the difference between the representation values at adjacent positions, adjust the representation value at each position by weighting it according to that difference, and aggregate the weighted results to obtain the updated identity representation;
S34. Read the node representation of each node in the permission knowledge graph, calculate the difference between the updated identity representation and each node representation, and compare and sort the resulting distances to determine the target node representation corresponding to the updated identity representation;
S35. Read the association relationships of the target node, identify the permission type of each association relationship, extract the permission relationships corresponding to the current operating entity, and arrange them in a preset order to construct a permission relationship sequence.
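The matching in S34 amounts to a nearest-neighbour search over node representations. The sketch below assumes Euclidean distance as the "difference" and assumes that the identity representation and the node representations have the same length. Neither assumption is stated in the embodiment, and the function name and sample data are hypothetical.

```python
import math

def rank_nodes_by_distance(identity_repr, node_reprs):
    """S34: return node names sorted from nearest to farthest."""
    def distance(node_repr):
        # Euclidean distance is an assumption; the text only says "difference".
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(identity_repr, node_repr)))
    return sorted(node_reprs, key=lambda name: distance(node_reprs[name]))

node_reprs = {
    "auditor": [1.0, 0.0],
    "clerk": [0.0, 1.0],
    "manager": [0.5, 0.5],
}
ranking = rank_nodes_by_distance([0.9, 0.1], node_reprs)
target_node = ranking[0]
```

The first entry of the ranking plays the role of the target node whose association relationships are then read in S35.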
[0030] In this embodiment, S4 specifically includes:
S41. Obtain the current operation request, and perform field splitting, sequence alignment and unified encoding on the request category, request object, request time and request association fields in the current operation request to form a request feature sequence;
S42. Extract the target archive data from the archive data sequence according to the request feature sequence, and segment and sort the text content, page layout positions and image content in the target archive data to form an archive text sequence, an archive layout sequence and an archive image sequence;
S43. In the LayoutLMv3 model, perform feature extraction on the archive text sequence, archive layout sequence and archive image sequence to obtain text representations, layout representations and image representations;
S44. Perform multimodal information fusion on the text representations, layout representations and image representations, specifically including:
read the layout representation corresponding to each text representation and concatenate the two in a fixed order to obtain a preliminary fusion result;
read the image representation corresponding to each preliminary fusion result and concatenate the two in a fixed order to obtain an intermediate fusion result;
accumulate the values in each intermediate fusion result one by one, and divide the accumulated result by the total number of values to obtain the fusion value corresponding to each intermediate fusion result;
sort the fusion values according to the order of the archive text sequence, and concatenate the sorted fusion values in sequence to obtain the fusion result;
S45. Based on the fusion result, read the corresponding nodes and association relationships in the permission knowledge graph and perform association mapping; based on the mapping result, establish the archive association relationships between the target archive data and the permission knowledge graph to form an archive relationship sequence.
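The fusion arithmetic of S44 can be sketched as follows. The sketch assumes that the text, layout and image representations have already been produced by an encoder and are available as aligned lists of float vectors. It does not call LayoutLMv3 itself, and the function name `fuse` is hypothetical.

```python
def fuse(text_reprs, layout_reprs, image_reprs):
    """S44: concatenate text, layout and image vectors per item, then
    reduce each concatenation to its mean; output keeps text order."""
    fused = []
    for text, layout, image in zip(text_reprs, layout_reprs, image_reprs):
        preliminary = list(text) + list(layout)    # text + layout
        intermediate = preliminary + list(image)   # + image
        fused.append(sum(intermediate) / len(intermediate))  # fusion value
    return fused

fusion_result = fuse(
    text_reprs=[[1.0, 3.0], [2.0, 2.0]],
    layout_reprs=[[0.0, 0.0], [4.0, 0.0]],
    image_reprs=[[2.0], [2.0]],
)
```

Because the loop walks the inputs in the order of the text sequence, the final ordering step of S44 is satisfied by construction in this sketch.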
[0031] In this embodiment, S5 specifically includes:
S51. Read the permission relationship sequence and the archive relationship sequence in order, subtract the archive relationship from the permission relationship at the same position, add up the items of the subtraction result, and divide the sum by the number of items to obtain a legality judgment value, the legality judgment values together forming a set of legality judgment indicators;
S52. Read the current operation request, perform unified encoding on the current management operation instruction information in the current operation request, subtract each legality judgment value in the set of legality judgment indicators from each item of the encoding result, and sort the values of the subtraction result by magnitude to obtain sorting results;
S53. Use the ELECTRE algorithm to perform outranking (transcendence) relation determination on the sorting results to obtain relation determination results;
S54. Filter the relation determination results: retain the sorting results whose relation determination result is greater than or equal to 0, remove the sorting results whose relation determination result is less than 0, and sort the retained sorting results in descending order of value to generate a management instruction sequence.
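Steps S51 and S52 can be sketched as below, under several assumptions made only for this example: each permission relationship and archive relationship is an equal-length list of floats, the subtraction is permission minus archive, the encoded request is a list of floats, one sorting result is produced per legality judgment value, and sorting is ascending. The function names are hypothetical.

```python
def legality_indicators(permission_seq, archive_seq):
    """S51: mean item-wise difference for each aligned pair of relationships."""
    indicators = []
    for perm, arch in zip(permission_seq, archive_seq):
        diffs = [p - a for p, a in zip(perm, arch)]  # permission minus archive (assumed)
        indicators.append(sum(diffs) / len(diffs))
    return indicators

def sorted_differences(request_code, indicators):
    """S52: for each indicator, the sorted differences between every
    encoded request item and that indicator (ascending order assumed)."""
    return [sorted(c - ind for c in request_code) for ind in indicators]

indicators = legality_indicators(
    permission_seq=[[1.0, 2.0], [0.5, 0.5]],
    archive_seq=[[0.0, 1.0], [1.0, 1.0]],
)
sorting_results = sorted_differences([2.0, 1.0], indicators)
```

The list of sorting results produced here is the input that S53 compares pairwise.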
[0032] In this embodiment, S53 specifically includes:
S531. Using the ELECTRE algorithm, read the sorting results in order, and take each sorting result together with the sorting result that follows it, in turn, as a group of comparison objects;
S532. Compare the values in each group of comparison objects item by item: record a value in the previous sorting result that is greater than the corresponding value in the next sorting result as a satisfied value, record a value that is less as an unsatisfied value, and record a value that is equal as a kept value;
S533. Count the number of satisfied values, unsatisfied values and kept values in each group of comparison objects, subtract the number of unsatisfied values from the number of satisfied values, and add the number of kept values to the subtraction result to obtain the judgment value of each group of comparison objects;
S534. Write each judgment value into the relation determination result in the order of the corresponding comparison objects.
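The counting scheme of S531 to S534 can be sketched as follows. It implements only the pairwise counts described above, not the weighted concordance and discordance indices of a full ELECTRE method. Sorting results are assumed to be equal-length lists of numbers, and the function name is hypothetical.

```python
def judgment_values(sorting_results):
    """S531-S534: compare each sorting result with the one that follows it."""
    values = []
    for prev, nxt in zip(sorting_results, sorting_results[1:]):
        satisfied = sum(1 for p, n in zip(prev, nxt) if p > n)
        unsatisfied = sum(1 for p, n in zip(prev, nxt) if p < n)
        kept = sum(1 for p, n in zip(prev, nxt) if p == n)
        values.append(satisfied - unsatisfied + kept)  # S533
    return values

judgments = judgment_values([
    [3.0, 2.0, 1.0],
    [1.0, 2.0, 5.0],
    [0.0, 0.0, 0.0],
])
```

Note that N sorting results give N - 1 judgment values, one per adjacent pair. A pair in which every value of the previous result is smaller produces a negative judgment value, which S54 then filters out.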
[0033] In this embodiment, S6 specifically includes:
S61. Read each management instruction in the management instruction sequence in order, and extract the corresponding archive data from the archive data sequence according to each management instruction;
S62. Write the values in each management instruction into the same positions in the corresponding archive data, and combine all written values in the original order to obtain the managed archive data;
S63. Subtract the managed archive data from the corresponding archive data before management item by item, add up the items of the subtraction result, and divide the sum by the number of items to obtain the operation value corresponding to each management instruction;
S64. Record each management instruction, each operation value and each managed archive data in order to form operation information;
S65. Update the corresponding nodes and their association relationships in the permission knowledge graph based on the operation information.
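Steps S62 and S63 can be sketched as below. The sketch assumes that archive data and a management instruction are aligned lists of numbers, and that `None` in an instruction leaves the position unchanged. That convention and the function names are hypothetical, introduced only for this example.

```python
def apply_instruction(archive, instruction):
    """S62: write instruction values into the same positions of the archive data.
    A None entry keeps the original value (assumed convention)."""
    return [old if new is None else new for old, new in zip(archive, instruction)]

def operation_value(before, after):
    """S63: mean of the item-wise difference (before minus after)."""
    diffs = [b - a for b, a in zip(before, after)]
    return sum(diffs) / len(diffs)

before = [1.0, 0.0, 3.0, 0.0]
after = apply_instruction(before, [None, 1.0, 1.0, None])
op_value = operation_value(before, after)
```

The resulting operation value, together with the instruction and the managed archive data, is what S64 records as operation information.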
[0034] Referring to Figure 3, a knowledge graph-based accounting record data management system includes:
a data acquisition module, used to acquire and preprocess accounting archive data and management rule data and to generate an archive data sequence and a rule data sequence;
a graph construction module, used to extract nodes and association relationships from the rule data sequence, perform hierarchical embedding on the extracted results using the HAKE method, and perform relationship combination propagation and association aggregation on the embedded results through the CompGCN network to construct a permission knowledge graph;
a permission matching module, used to obtain the identity data of the current operating entity, perform identity feature extraction and update on the identity data using the UDGNet network, and perform matching in the permission knowledge graph based on the update result to construct a permission relationship sequence;
a relationship mapping module, used to obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform multimodal information fusion through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
an instruction generation module, used to construct a set of legality judgment indicators based on the permission relationship sequence and the archive relationship sequence, and to perform outranking (transcendence) relation determination and filtering on the set of legality judgment indicators using the ELECTRE algorithm to generate a management instruction sequence;
a management update module, used to manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.
[0035] Example 1: To verify the feasibility of this invention in practice, it was applied to the electronic accounting record management scenario of a large organization. Over a long period of operation, this organization generated a large number of accounting vouchers, accounting voucher attachments, expense reports, invoice images, scanned copies of contracts, payment approval records, archived report files, and historical borrowing records. These data originated from the financial processing end, the approval workflow end, and the record collection end, respectively. As the scale of archives continues to increase, the original management methods have gradually revealed several problems. On the one hand, archive data from different sources differ in format, field naming, time recording methods, and association identifiers. When retrieving and verifying data, managers often need to first search in multiple data interfaces separately, and then rely on manual experience to confirm whether it belongs to the same business chain. The retrieval process is cumbersome and the association efficiency is low. On the other hand, the permission boundaries of the operating entities mainly rely on static permission tables for control. When the number of archive categories increases and the approval levels increase, the original method is unable to accurately reflect complex constraints such as "who can view which archives, who can initiate borrowing, who can perform archive adjustments, and who needs additional approval." This easily leads to problems such as insufficiently granular permission judgment, inaccurate archive retrieval, and incomplete operation traceability. 
Especially when dealing with mixed archives containing text content, page structure, and image pages, simply relying on field matching to determine the target archive can easily lead to inaccurate target archive extraction and unstable association mapping results, thereby affecting the accuracy of subsequent management operations.
[0036] In this scenario, historically accumulated accounting archive data and management rule data are first centrally organized. Accounting archive data includes archived vouchers, attached images, reimbursement records, approval records, borrowing records, and audit traces. Management rule data includes identity category constraints, job authorization constraints, archive category access constraints, approval workflow constraints, and operational restriction constraints. After standardizing the format, validating fields, aligning dates, and deduplicating all raw data, an archive data sequence and a rule data sequence are formed. Subsequently, rule subjects, rule objects, rule constraints, and rule associations are extracted from the rule data sequences. These are transformed into nodes and relationships, and the HAKE method is used to hierarchically embed these nodes and relationships. Then, the CompGCN network is used to perform relationship combination propagation and association aggregation processing to construct a permission knowledge graph. After this processing, the permission relationships, originally scattered across rule tables and configuration documents, are organized into a unified graph structure, creating stable associations between identity, permissions, archive categories, and approval constraints.
[0037] In daily use, when an operator initiates a request to access, borrow, supplement, or adjust archives, the operator's identity data is first collected. This identity data includes standardized basic identity fields, credential identity fields, and operational identity fields. The UDGNet network is used to extract and update identity features from the identity data. After obtaining the updated identity representation, it is matched against the permission knowledge graph to construct a sequence of permission relationships. This step solves the problem that traditional static role-based control methods are difficult to adapt to complex permission constraints. It ensures that the current operator no longer corresponds to a fixed role, but rather obtains a more granular permission mapping result in the graph based on their identity features, historical operations, and current constraint relationships.
[0038] Upon receiving a current operation request, the system first segments and uniformly encodes the request category, request object, request time, and request-related fields. Then, it extracts the target archive data from the archive data sequence based on the request feature sequence. Since the target archive data often contains text, layout, and image content simultaneously, the LayoutLMv3 model is used to perform multimodal information fusion processing on the target archive data. The fusion result is then mapped to the permission knowledge graph to form an archive relationship sequence. The advantage of this approach is that the system no longer relies on a single field to lock the archive, but instead incorporates the text content, document layout, and image structure of the archive into the judgment scope, thereby improving the accuracy of target archive extraction and association mapping. This is particularly suitable for accounting archive scenarios with many supporting documents and a high proportion of scanned copies.
[0039] After generating the permission relationship sequence and the archive relationship sequence, the system constructs a set of legality judgment indicators and uses the ELECTRE algorithm to perform outranking (transcendence) relation determination and filtering, generating a management instruction sequence. If the current operation request meets the permission constraints, archive association constraints, and approval constraints, an executable management instruction is generated; otherwise, a restrictive management instruction or a supplementary approval management instruction is generated. The management instruction sequence drives subsequent archive management operations, including archive retrieval, archive borrowing, archiving adjustment, and archive status write-back. After the operation is completed, the system automatically records the corresponding operation information and updates the association relationships in the permission knowledge graph based on that information, so that subsequent similar requests are judged against an up-to-date operating state. In this way, the entire management process forms a closed loop of "rule graph construction - identity matching - archive mapping - legality judgment - instruction execution - graph update", which addresses both the insufficient expression of permission relationships in accounting archive management and the instability of target archive retrieval and management judgment.
[0040] To further verify the practical effect of the present invention, multiple batches of continuously collected accounting archive data were selected for operational testing in this scenario. During the test, the results of manual review were retained for each type of operation request as a reference, so that the test data reflect real operation. The specific test data are shown in Table 1.
Table 1. Comparative Analysis of the Implementation Effects of Accounting Archives Data Management
[0041] As shown in Table 1, the method of this invention significantly outperforms the original management method in several core indicators. The accuracy rate of target document extraction increased from 91.38% to 98.64%, with more significant improvements in the extraction accuracy of multi-attachment documents and image-based invoice documents. This indicates that the multimodal information fusion method based on the LayoutLMv3 model can more effectively utilize the text, layout, and image features in accounting documents, solving the problem of insufficient recognition capability of single-field retrieval methods for complex documents.
[0042] The accuracy of permission determination improved from 89.57% to 98.12%, while the false interception rate and the missed interception rate decreased to 1.42% and 1.87%, respectively. This indicates that updating identity features through the UDGNet network and matching based on the permission knowledge graph can more accurately reflect the true relationship between the operating subject and permission constraints, reducing erroneous granting and restriction.
[0043] The completeness of the mapping of file relationships, the completeness of the borrowing approval chain restoration, and the completeness of operation traceability all increased by more than 12%, indicating that the present invention can organize and update file data, permission rules, and operation information in a unified graph structure, significantly enhancing the traceability and relevance of the accounting file management process.
[0044] Meanwhile, the average processing time per request and the average processing time under high concurrency conditions decreased by 42.61% and 43.41% respectively, and the proportion of manual review intervention decreased significantly, indicating that the present invention not only improved management accuracy but also enhanced overall processing efficiency.
[0045] In summary, this invention has a good implementation effect in solving problems such as insufficient expression of permissions, inaccurate positioning of target files and untimely updates of management results in accounting file management, and can meet the comprehensive requirements of accuracy, traceability and processing efficiency in practical application scenarios.
[0046] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A knowledge graph-based method for managing accounting record data, characterized in that it includes the following steps:
S1. Acquire accounting archive data and management rule data and preprocess them to generate an archive data sequence and a rule data sequence;
S2. Extract nodes and association relationships from the rule data sequence, perform hierarchical embedding on the extracted results using the HAKE method, and use the CompGCN network to perform relationship combination propagation and association aggregation on the embedded results to construct a permission knowledge graph;
S3. Obtain the identity data of the current operating entity, use the UDGNet network to perform identity feature extraction and update on the identity data, and perform matching in the permission knowledge graph according to the update result to construct a permission relationship sequence;
S4. Obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform multimodal information fusion through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
S5. Construct a set of legality judgment indicators based on the permission relationship sequence and the archive relationship sequence, and use the ELECTRE algorithm to perform outranking (transcendence) relation determination and filtering on the set of legality judgment indicators to generate a management instruction sequence;
S6. Manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.
2. The knowledge graph-based method for managing accounting record data according to claim 1, characterized in that: the accounting archive data refers to the original data set that relates to accounting business and is used for archiving management; the management rule data refers to the rule data set used to constrain the accounting archive management process; the preprocessing includes format unification, field validation, time alignment and deduplication; the current operating entity refers to the identity object that initiates the current operation request; the identity data refers to the data set used to characterize the identity features of the current operating entity; and the current operation request refers to the current management operation instruction information initiated for the accounting archive data.
3. The knowledge graph-based method for managing accounting record data according to claim 1, characterized in that S2 specifically includes:
S21. Divide the rule data sequence into rule units, extract the rule subject, rule object, rule constraint items and rule association items from each rule unit, use the rule subjects and rule objects as nodes, and use the rule constraint items and rule association items as association relationships;
S22. Use the HAKE method to perform hierarchical position mapping on the nodes and hierarchical semantic mapping on the association relationships, and perform hierarchical embedding on the position mapping results and semantic mapping results to obtain node representations and relationship representations;
S23. In the CompGCN network, perform relationship combination propagation on the node representations according to the association relationships, and perform association aggregation on the relationship representations during propagation to generate updated node representations and relationship representations;
S24. Write the updated node representations and relationship representations into the corresponding nodes and association relationships respectively to construct the permission knowledge graph.
4. The knowledge graph-based method for managing accounting record data according to claim 3, characterized in that S22 specifically includes:
S221. Traverse each rule unit, count the number of times each node appears as a rule subject and the number of times it appears as a rule object, and take the difference of the two counts to obtain the position difference of each node in the corresponding rule unit;
S222. Accumulate the position differences of each node over all rule units, and divide the accumulated result by the total number of times the node appears in all rule units to obtain the position mapping value of each node;
S223. Traverse each association relationship, record the order of appearance, in the corresponding rule unit, of the nodes connected by the association relationship, and subtract the order of appearance of the next node from the order of appearance of the previous node to obtain the order difference of each association relationship;
S224. Accumulate the order differences of each association relationship, multiply the accumulated result by the number of times the association relationship appears in all rule units, and divide the product by the total number of all association relationships to obtain the semantic mapping value of each association relationship;
S225. Use the HAKE method to perform hierarchical embedding on the position mapping values and semantic mapping values, specifically including:
sort the position mapping values of the nodes numerically, determine the hierarchical order of each node from its place in the sorted result, and concatenate the position mapping value of each node with its hierarchical order to form a node representation;
sort the semantic mapping values of the association relationships numerically, determine the hierarchical order of each association relationship from its place in the sorted result, and concatenate the semantic mapping value of each association relationship with its hierarchical order to form a relationship representation.
5. The knowledge graph-based method for managing accounting record data according to claim 3, characterized in that S23 specifically includes:
S231. In the CompGCN network, read the relationship representation of each association relationship in order, and multiply it, position by position, by the node representation of the previous node connected by that association relationship to obtain the relationship combination value of each association relationship;
S232. Sum the values in the relationship combination value of each association relationship, and divide the sum by the number of values in the corresponding relationship representation to obtain the relationship weight value of each association relationship;
S233. Multiply the relationship combination value of each association relationship by the corresponding relationship weight value to obtain the weighted combination value of each association relationship;
S234. For each node, read the weighted combination values of all association relationships to which the node is connected as the previous node, sum them, and divide the sum by the number of those association relationships to obtain the node propagation value of the node;
S235. Subtract the node propagation value of each node from its node representation to obtain the propagation difference of each node;
S236. Add the node propagation value of each node to the corresponding propagation difference to obtain the updated node representation;
S237. Accumulate the relationship representations of all association relationships connected to the same node, and divide by the number of association relationships connected to that node to obtain the relationship aggregation value of that node;
S238. Add the relationship aggregation value to the relationship representation of each association relationship connected to the node to obtain the updated relationship representation.
6. The knowledge graph-based method for managing accounting record data according to claim 1, characterized in that S3 specifically includes:
S31. Obtain the identity data of the current operating entity, perform missing-item removal, field alignment, numerical normalization and feature sorting on each identity field in the identity data, divide the processed identity data into a basic identity vector, a credential identity vector and an operation identity vector according to preset field categories, and concatenate them in a preset order to obtain the identity vector;
S32. In the UDGNet network, perform identity feature extraction on the identity vector: map each dimension of the identity vector layer by layer, extract the local identity representation corresponding to each dimension, and integrate all local identity representations to obtain the initial identity representation;
S33. Update the initial identity representation: read the representation value at each position of the initial identity representation, calculate the difference between the representation values at adjacent positions, adjust the representation value at each position by weighting it according to that difference, and aggregate the weighted results to obtain the updated identity representation;
S34. Read the node representation of each node in the permission knowledge graph, calculate the difference between the updated identity representation and each node representation, and compare and sort the resulting distances to determine the target node representation corresponding to the updated identity representation;
S35. Read the association relationships of the target node, identify the permission type of each association relationship, extract the permission relationships corresponding to the current operating entity, and arrange them in a preset order to construct a permission relationship sequence.
7. The accounting record data management method based on a knowledge graph according to claim 1, characterized in that S4 specifically includes:
S41. Obtain the current operation request, and perform field splitting, sequence alignment, and unified encoding operations on the request category, request object, request time, and request association fields in the current operation request to form a request feature sequence;
S42. Extract the target archive data from the archive data sequence according to the request feature sequence, and perform segmentation and sorting operations on the text content, page positions, and image content in the target archive data to form an archive text sequence, an archive page sequence, and an archive image sequence;
S43. In the LayoutLMv3 model, perform feature extraction operations on the archive text sequence, archive page sequence, and archive image sequence to obtain text representations, layout representations, and image representations;
S44. Perform a multimodal information fusion operation on the text representations, layout representations, and image representations, specifically including:
read the layout representation corresponding to each text representation and concatenate the two in a fixed order to obtain a preliminary fusion result;
read the image representation corresponding to each preliminary fusion result and concatenate the two in a fixed order to obtain an intermediate fusion result;
accumulate the values in each intermediate fusion result one by one, and divide the accumulated sum by the total number of values to obtain the fusion value corresponding to that intermediate fusion result;
sort the fusion values according to the order of the archive text sequence, and then concatenate all sorted fusion values in sequence to obtain the fusion result;
S45. Based on the fusion result, read the corresponding nodes and relationships in the permission knowledge graph and perform an association mapping operation; based on the mapping result, establish archive association relationships between the target archive data and the permission knowledge graph to form an archive relationship sequence.
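The fusion arithmetic of S44 reduces each element to the mean of its concatenated representations. A minimal sketch, assuming the three representation lists are plain lists of floats that are already aligned and already in the order of the archive text sequence; the function name `fuse_representations` is hypothetical, and no LayoutLMv3 call is made here.

```python
def fuse_representations(text_reps, layout_reps, image_reps):
    """Return one fusion value per element, in the order of the inputs."""
    if not (len(text_reps) == len(layout_reps) == len(image_reps)):
        raise ValueError("the three representation lists must be aligned")

    fused = []
    for text, layout, image in zip(text_reps, layout_reps, image_reps):
        preliminary = list(text) + list(layout)   # text + layout, fixed order
        intermediate = preliminary + list(image)  # then image, fixed order
        total = 0.0
        for value in intermediate:                # step-by-step accumulation
            total += value
        fused.append(total / len(intermediate))   # divide by the number of values
    return fused


# Two elements, each with a 2-value text, 1-value layout and 1-value image vector.
result = fuse_representations(
    [[1.0, 2.0], [0.0, 0.0]],
    [[3.0], [1.0]],
    [[6.0], [3.0]],
)
# (1 + 2 + 3 + 6) / 4 = 3.0 and (0 + 0 + 1 + 3) / 4 = 1.0
```

Because the inputs are assumed to follow the archive text sequence already, appending in loop order covers the final sorting and concatenation; an implementation that processes elements out of order would need an explicit sort key.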
8. The accounting record data management method based on a knowledge graph according to claim 1, characterized in that S5 specifically includes:
S51. Read the permission relationship sequence and the archive relationship sequence in order, perform a subtraction operation on the permission relationship and the archive relationship at the same position, add up the values of the items in each subtraction result, and divide the sum by the total number of corresponding values to construct a legality judgment indicator set;
S52. Read the current operation request, perform a unified encoding operation on the current management operation instruction information in the current operation request, perform a subtraction operation on each item of the encoding result and each legality judgment value in the legality judgment indicator set, and sort the values in the subtraction result by size;
S53. Use the ELECTRE algorithm to perform a transcendence relation determination operation on the sorting results to obtain relation determination results;
S54. Filter the relation determination results: retain the sorting results whose relation determination results are greater than or equal to 0, remove the sorting results whose relation determination results are less than 0, and sort the retained sorting results in descending order of value to generate a management instruction sequence.
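Steps S51 and S52 can be illustrated with a small sketch. Several points are assumptions rather than statements of the claim: each relationship is taken to be an already encoded list of floats, the subtraction direction is permission minus archive in S51 and encoding minus indicator in S52, and S52 is read as producing one sorted difference vector per legality judgment value. The function names are hypothetical.

```python
def legality_indicators(permission_seq, archive_seq):
    """S51: mean of the position-wise difference between the two sequences."""
    indicators = []
    for perm, arch in zip(permission_seq, archive_seq):
        diff = [p - a for p, a in zip(perm, arch)]  # assumed direction: permission - archive
        indicators.append(sum(diff) / len(diff))
    return indicators


def sorted_differences(instruction_code, indicators):
    """S52: subtract each indicator from every encoded item, then sort by size."""
    # Assumed reading: one ascending-sorted difference vector per indicator.
    return [sorted(item - ind for item in instruction_code) for ind in indicators]


indicators = legality_indicators(
    [[3.0, 1.0], [1.0, 1.0]],   # permission relationship sequence (encoded)
    [[1.0, 1.0], [2.0, 3.0]],   # archive relationship sequence (encoded)
)
ranked = sorted_differences([2.0, 0.5], indicators)
```

With these inputs the indicators are 1.0 and -1.5, and `ranked` holds the sorted vectors `[-0.5, 1.0]` and `[2.0, 3.5]`.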
9. The accounting record data management method based on a knowledge graph according to claim 8, characterized in that S53 specifically includes:
S531. Using the ELECTRE algorithm, read the values in the sorting results in order, and take each sorting result together with the sorting result that follows it as a group of comparison objects;
S532. Perform an item-by-item comparison operation on the values in each group of comparison objects: record the values in the previous sorting result that are greater than those in the next sorting result as satisfied values, record the values in the previous sorting result that are less than those in the next sorting result as unsatisfied values, and record the values in the previous sorting result that are equal to those in the next sorting result as kept values;
S533. Count the number of satisfied values, the number of unsatisfied values, and the number of kept values in each group of comparison objects, subtract the number of unsatisfied values from the number of satisfied values, and add the number of kept values to the subtraction result to obtain the judgment value of each group of comparison objects;
S534. Write each judgment value into the relation determination result in the order of the corresponding comparison objects.
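The counting rule in S531 to S534 can be sketched as below, assuming each sorting result is a list of numbers of equal length; the function name `judgment_values` is hypothetical. Note that this follows the simplified count stated in the claim, whereas classical ELECTRE variants use weighted concordance and discordance indices with thresholds.

```python
def judgment_values(sorting_results):
    """Compare each sorting result with its successor and score the pair."""
    judgments = []
    # S531: consecutive results form one group of comparison objects.
    for previous, following in zip(sorting_results, sorting_results[1:]):
        pairs = list(zip(previous, following))
        # S532: item-by-item comparison.
        satisfied = sum(1 for p, f in pairs if p > f)
        unsatisfied = sum(1 for p, f in pairs if p < f)
        kept = sum(1 for p, f in pairs if p == f)
        # S533: satisfied - unsatisfied + kept.
        judgments.append(satisfied - unsatisfied + kept)
    # S534: judgments are already in the order of the comparison groups.
    return judgments


scores = judgment_values([[3, 2, 1], [1, 2, 3], [1, 2, 3]])
# First pair: 1 satisfied, 1 unsatisfied, 1 kept -> 1 - 1 + 1 = 1
# Second pair: all three kept -> 0 - 0 + 3 = 3
```

A sequence of n sorting results yields n - 1 judgment values, so how each value is assigned back to a sorting result for the subsequent filtering step is a design decision the claim leaves open.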
10. A knowledge graph-based accounting record data management system for implementing the knowledge graph-based accounting record data management method according to any one of claims 1 to 9, characterized in that it includes:
The data acquisition module is used to acquire and preprocess accounting archive data and management rule data, and to generate an archive data sequence and a rule data sequence;
The graph construction module is used to extract nodes and relationships based on the rule data sequence, perform hierarchical embedding operations on the extraction results using the HAKE method, and perform relationship combination, propagation, and association aggregation processing on the embedding results through the CompGCN network to construct a permission knowledge graph;
The permission matching module is used to obtain the identity data of the current operating entity, use the UDGNet network to perform identity feature extraction and update operations on the identity data, and perform matching in the permission knowledge graph based on the update results to construct a permission relationship sequence;
The relationship mapping module is used to obtain the current operation request, extract the target archive data from the archive data sequence according to the current operation request, perform a multimodal information fusion operation through the LayoutLMv3 model, and map the fusion result to the permission knowledge graph to form an archive relationship sequence;
The instruction generation module is used to construct a legality judgment indicator set based on the permission relationship sequence and the archive relationship sequence, and to perform transcendence relation determination and filtering operations on the legality judgment indicator set using the ELECTRE algorithm to generate a management instruction sequence;
The management update module is used to manage the corresponding archive data in the archive data sequence according to the management instruction sequence, record the corresponding operation information, and update the permission knowledge graph according to the operation information.