An automated financial bill of lading system integrating AI Agent and headless browser technology
The invention builds an automated financial bill of lading system that integrates artificial intelligence agents with headless browser technology, addressing the unstable data collection and the lack of context awareness in accounting processing found in existing technologies. The system achieves high robustness and flexibility in complex business scenarios and provides an end-to-end solution that requires no manual intervention.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- MYRON INTELLIGENT TECH (SHANGHAI) CO LTD
- Filing Date
- 2026-03-18
- Publication Date
- 2026-06-19
AI Technical Summary
In existing financial bill of lading systems, the collaboration mechanism between AI Agent and headless browser is insufficient, resulting in instability in data collection, difficulty in adapting to dynamic webpage structure changes and interactive obstacles such as CAPTCHAs, and the lack of context awareness and exception rollback mechanism in accounting processing, which limits the system's generalization ability in complex business scenarios.
An automated financial bill of lading system integrating AI agent and headless browser technologies is constructed, including a task scheduling hub, intelligent agent engine, headless browser executor, context-aware module, and anomaly self-healing manager. This system enables deep collaboration, dynamically generates operational strategies, monitors task status in real time, automatically generates vouchers that comply with corporate accounting standards, and automatically repairs itself in case of anomalies.
It improves the stability of data collection and the task completion rate, supports highly personalized bill of lading processes, ensures the accuracy and compliance of accounting vouchers, and achieves highly available closed-loop execution that requires no manual intervention at any stage of the process.
Figure CN122240265A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the interdisciplinary field of artificial intelligence and financial automation, and in particular to an automated financial bill of lading system that integrates AI Agent and headless browser technology.
Background Technology
[0002] As enterprises deepen their digital transformation, the automation of financial bills of lading has become a key element in improving the efficiency and accuracy of financial management. In recent years, the integration of AI agents and headless browser technologies has provided a new technological path for financial process automation. However, existing technological solutions still have significant shortcomings in terms of system integration depth, adaptability to dynamic environments, and end-to-end automation loops, making it difficult to meet the demands of complex and ever-changing enterprise financial scenarios for high robustness, high flexibility, and fully automated processes.
[0003] A search revealed a patent, CN120876129A, entitled "Intelligent Financial Bill of Lading System Based on AI Agent and Browserless Technology," published on October 31, 2025. This patent proposes a financial bill of lading system combining AI Agent and Browserless technology, including modules for data collection, intelligent review, and accounting processing. It utilizes a headless browser for automatic data capture and reviews data through a rule engine and machine learning model. However, in this technical solution, the AI Agent operates only as an independent review module, failing to form a deep collaborative mechanism with the headless browser. It lacks autonomous decision-making and adaptive adjustment capabilities when facing dynamic webpage structure changes or interactive obstacles such as CAPTCHAs and login redirects, resulting in insufficient data collection stability. Furthermore, its accounting processing relies on preset rules and fails to implement context-aware intelligent voucher generation and anomaly rollback mechanisms, limiting the system's generalization ability in complex business scenarios.
Summary of the Invention
[0004] The purpose of this invention is to provide an automated financial bill of lading system that integrates artificial intelligence agent and headless browser technology, which can effectively solve the problems mentioned in the background technology.
[0005] To achieve the above objectives, the technical solution adopted by the present invention is as follows: An automated financial bill of lading system integrating artificial intelligence agent and headless browser technology includes a task scheduling hub, an intelligent agent engine, a headless browser executor, a context-aware module, a voucher generation unit, and an anomaly self-healing manager, wherein:
The task scheduling hub is used to receive bill of lading task instructions from the enterprise's financial system, to parse and distribute tasks according to task type, priority and target platform characteristics, and to coordinate the sequential execution of the functional components.
The intelligent agent engine is configured to dynamically generate operational strategies for the current financial bill of lading task based on natural language understanding and business rule reasoning, monitor the task execution status in real time, and adjust the subsequent action sequence according to environmental feedback.
The headless browser executor is used to simulate user operation behavior in an isolated operating environment, access the target financial platform webpage, perform interactive actions such as page loading, element recognition, form filling, button clicking and data submission, and send the page state and operation results back to the intelligent agent engine during execution.
The context-aware module is used to collect and analyze the business context information of the current task in real time, including but not limited to user identity, reimbursement type, expense details, approval process stage and historical operation trajectory, and to provide the structured context data to the intelligent agent engine to support its decision optimization.
The voucher generation unit is used to automatically generate accounting vouchers that comply with enterprise accounting standards based on the completed bill of lading data and the associated accounting rule base, and to ensure the consistency of voucher accounts, amounts, auxiliary accounting items and original business data.
The anomaly self-healing manager is used to automatically trigger preset recovery strategies when an operation failure, page anomaly, or logical conflict is detected during task execution. These strategies include retry mechanisms, path switching, CAPTCHA recognition requests, or manual intervention markers. When conditions permit, the manager can also repair the task autonomously and resume it.
[0006] Preferably, the intelligent agent engine has a built-in multi-level decision model, including a rule-based deterministic reasoning layer and a machine learning-based probabilistic prediction layer, which can dynamically select or fuse the two reasoning methods according to the task complexity to improve the operational robustness in unstructured web page environments.
[0007] Furthermore, the headless browser executor integrates a dynamic element positioning mechanism, which can automatically identify and locate key operation elements when the target webpage structure changes through semantic similarity matching and layout topology analysis, thus avoiding task interruption caused by page redesign.
[0008] Furthermore, the context-aware module establishes a real-time connection with the enterprise's organizational structure and financial system database, enabling it to dynamically obtain the current user's permission scope, budget balance, and compliance constraints. This constraint information is then embedded into the decision input of the intelligent agent engine to ensure that bill of lading behavior always remains within compliance boundaries.
[0009] Preferably, the voucher generation unit supports multi-dimensional mapping rule configuration, which can automatically match the corresponding accounting subjects and auxiliary accounting dimensions according to the expense type, business department, project number and supplier information, and trigger an early warning mechanism when subject conflicts or missing data are detected.
[0010] Furthermore, the anomaly self-healing manager is configured with a hierarchical response strategy library, which sets differentiated processing priorities for different types of anomalies. For minor anomalies that can be automatically recovered, local retry or parameter fine-tuning strategies are adopted. For severe anomalies involving security verification or logical ambiguity, a context snapshot is automatically recorded and transferred to the manual review queue.
[0011] Furthermore, the task scheduling hub has the ability to model task dependencies, which can identify the temporal or data dependencies between multiple bill of lading tasks and dynamically adjust the task execution order accordingly to ensure the consistency of cross-task data and the continuity of the process.
[0012] Preferably, the intelligent agent engine establishes a two-way communication channel with the headless browser executor, which not only receives the page status returned by the executor, but also sends fine-grained operation instructions to the executor, including parameters such as scroll position, waiting time, and input delay, in order to simulate a behavior pattern closer to that of a real user and avoid the interception of anti-automation mechanisms.
[0013] Furthermore, the system is deployed entirely in an enterprise private cloud environment, and all data transmission between components is encrypted. The headless browser executor automatically clears the local cache and session information after each task, ensuring the security and privacy of financial data.
[0014] The beneficial effects of this invention are as follows: The provided automated financial bill of lading system, which integrates artificial intelligence agents and headless browser technology, achieves strong adaptability to dynamic web page environments by constructing a deep collaborative mechanism between the intelligent agent engine and the headless browser executor, significantly improving the stability of data collection and task completion rate; the system introduces a context-aware module, enabling operational decisions to be closely integrated with specific business scenarios, thereby supporting highly personalized bill of lading processes; the voucher generation unit, based on a context-driven intelligent mapping mechanism, ensures the accuracy and compliance of accounting vouchers; the introduction of an anomaly self-healing manager effectively solves the problem that traditional automated systems are prone to getting stuck in a deadlock state when facing unexpected obstacles, achieving highly available closed-loop execution without human intervention throughout the entire process. Overall, this system maintains high flexibility while significantly reducing reliance on manual intervention, providing a highly robust and intelligent end-to-end solution for enterprise financial digital transformation.
Attached Figure Description
[0015] Figure 1 is a schematic diagram of the overall technical architecture of an automated financial bill of lading system that integrates AI Agent and headless browser technology according to an embodiment of this application.
Figure 2 is a schematic diagram illustrating the core principle framework of bidirectional collaboration between the intelligent agent engine and the headless browser executor in the system, according to an embodiment of this application.
Figure 3 is a flowchart illustrating the task execution logic of the system, based on business context awareness and an anomaly self-healing mechanism, according to an embodiment of this application.
Detailed Implementation
[0016] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0017] Specific implementation examples are given below.
Example 1
[0018] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to specific embodiments.
[0019] An automated financial bill of lading system integrating artificial intelligence agent and headless browser technology includes a task scheduling hub, an intelligent agent engine, a headless browser executor, a context-aware module, a voucher generation unit, and an anomaly self-healing manager. The task scheduling hub is connected to the intelligent agent engine and the context-aware module, respectively, and is used to receive bill of lading task instructions from the enterprise's financial system, and to parse and distribute tasks according to task type, priority, and target platform characteristics, coordinating the sequential execution of various functional components. The intelligent agent engine is connected to the headless browser executor, the context-aware module, and the anomaly self-healing manager, and is configured to dynamically generate operation strategies for the current financial bill of lading task based on natural language understanding and business rule reasoning, and to monitor the task execution status in real time, adjusting the subsequent action sequence according to environmental feedback. The headless browser executor is connected to the anomaly self-healing manager and is used to simulate user operation behavior in an isolated operating environment, access the target financial platform webpage, and perform interactive actions such as page loading, element recognition, form filling, button clicking, and data submission. It also sends the page state and operation results back to the intelligent agent engine during the execution process. The context-aware module is connected to the voucher generation unit and is used to collect and analyze the business context information of the current task in real time, and provide structured context data to the intelligent agent engine to support its decision optimization. 
The voucher generation unit is used to automatically generate accounting vouchers that comply with enterprise accounting standards based on the completed bill of lading data and the associated accounting rule base. The anomaly self-healing manager is used to automatically trigger a preset recovery strategy when operation failure, page anomaly, or logical conflict is detected during task execution.
[0020] The task scheduling hub includes a task receiving submodule, a task parsing engine, a priority sorting unit, a task dependency modeler, and a global coordinator. The task receiving submodule is configured to receive pending bill of lading task requests in real time from external enterprise resource planning systems, expense management systems, or office automation platforms via standard application programming interfaces (APIs) or message queues. Upon receiving a request, the task parsing engine uses a pre-defined financial business ontology model to perform semantic extraction on the task text, identifying the financial category to which the bill of lading task belongs, such as travel expense reimbursement, purchase payment, or fixed asset receipt. The priority sorting unit uses a multi-factor weighted algorithm to calculate a comprehensive priority score for each task from its urgency coefficient, the amount involved, and the set deadline, placing tasks with scores higher than a preset threshold at the front of the execution queue. The task dependency modeler identifies temporal or data dependencies between multiple bill of lading tasks and describes the logical relationships between them by constructing a directed acyclic graph (DAG). For example, a payment bill of lading task may run only after the corresponding invoice verification task has completed; the modeler dynamically adjusts the task execution order accordingly to ensure data consistency and process continuity across tasks. The global coordinator acts as the command center for the entire system, responsible for sending start commands to the intelligent agent engine based on the task queue status and monitoring the resource usage of each module to achieve load balancing.
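The priority scoring and dependency ordering described in [0020] can be illustrated with a minimal Python sketch. The weights, the normalization caps, and the names `Task`, `priority_score`, and `execution_order` are hypothetical and do not come from the patent; the sketch also simply orders ready tasks by score instead of applying a preset threshold.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Task:
    task_id: str
    urgency: float            # hypothetical urgency coefficient in [0, 1]
    amount: float             # amount involved
    hours_to_deadline: float  # time left until the set deadline


# Hypothetical weights and normalization caps (assumptions, not from the patent).
WEIGHTS = {"urgency": 0.5, "amount": 0.3, "deadline": 0.2}
AMOUNT_CAP = 100_000.0
DEADLINE_HORIZON_H = 72.0


def priority_score(task: Task) -> float:
    """Multi-factor weighted score: higher means more urgent."""
    amount_norm = min(task.amount / AMOUNT_CAP, 1.0)
    remaining = min(max(task.hours_to_deadline, 0.0) / DEADLINE_HORIZON_H, 1.0)
    return (WEIGHTS["urgency"] * task.urgency
            + WEIGHTS["amount"] * amount_norm
            + WEIGHTS["deadline"] * (1.0 - remaining))


def execution_order(tasks, depends_on):
    """Order task ids so prerequisites run first; ties go to the higher score."""
    by_id = {t.task_id: t for t in tasks}
    sorter = TopologicalSorter({tid: depends_on.get(tid, set()) for tid in by_id})
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(),
                       key=lambda tid: priority_score(by_id[tid]),
                       reverse=True)
        for tid in ready:
            order.append(tid)
            sorter.done(tid)
    return order


tasks = [
    Task("invoice_check", urgency=0.2, amount=5_000, hours_to_deadline=48),
    Task("payment", urgency=0.9, amount=80_000, hours_to_deadline=4),
    Task("travel", urgency=0.5, amount=1_200, hours_to_deadline=24),
]
order = execution_order(tasks, {"payment": {"invoice_check"}})
print(order)
```

Although the payment task has the highest score, it is emitted only after its invoice verification prerequisite, which is the behavior the dependency modeler is meant to guarantee.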
[0021] The intelligent agent engine includes a multi-level decision model, a natural language processing layer, a business rule inference engine, and a state feedback listener. The multi-level decision model consists of a rule-based deterministic inference layer and a machine learning-based probabilistic prediction layer. The rule-based deterministic inference layer stores a large amount of expert experience logic to handle standardized bill of lading procedures that are highly structured and follow fixed processes. The machine learning-based probabilistic prediction layer utilizes deep neural networks to model unstructured web page element interactions, predicting the optimal click position or input order when facing a new page layout based on historical operation trajectories. The system can dynamically select or fuse the two inference methods based on the complexity score of the current bill of lading task. When the task complexity score is less than a preset low complexity threshold, the deterministic inference layer is invoked first to ensure execution speed. When the task involves complex form validation logic or is in a highly volatile web page environment, the system uses a fusion mechanism in which the output of the probabilistic prediction layer serves as a supplementary constraint on the deterministic inference, thereby improving operational robustness in unstructured web page environments. The natural language processing layer is configured to convert business instructions into machine-understandable operational primitives. The business rule inference engine maintains a financial knowledge graph in real time to ensure that the generated operational strategies do not violate the company's internal control policies. 
The state feedback listener establishes a two-way communication channel with the headless browser executor, not only receiving the page status returned by the executor, but also issuing fine-grained operation instructions to the executor, including parameters such as scroll position, waiting time, and input delay, to simulate a behavior pattern closer to that of a real user, thereby effectively circumventing the financial platform's anti-automation defense mechanisms.
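The selection and fusion of the two inference layers in [0021] can be sketched as follows. This is one possible reading, written in Python, in which the prediction layer's score acts as a check on the rule layer's proposal. The thresholds, the rule table, and the names `Action`, `rule_layer`, `prediction_layer`, and `decide` are hypothetical, and the prediction layer is a stand-in for a trained model.

```python
from dataclasses import dataclass
from typing import Dict, Optional

LOW_COMPLEXITY_THRESHOLD = 0.3  # hypothetical preset threshold
MIN_SUPPORT = 0.5               # hypothetical minimum model support for a rule target


@dataclass(frozen=True)
class Action:
    kind: str
    target: str
    confidence: float


# Hypothetical expert rules: step name -> fixed action.
RULES = {
    "login": Action("click", "#login-button", 1.0),
    "submit": Action("click", "#submit", 1.0),
}


def rule_layer(step: str) -> Optional[Action]:
    """Deterministic layer: look up a fixed action for a known step."""
    return RULES.get(step)


def prediction_layer(candidate_scores: Dict[str, float]) -> Action:
    """Stand-in for a trained model: pick the highest-scoring page element."""
    target, score = max(candidate_scores.items(), key=lambda kv: kv[1])
    return Action("click", target, score)


def decide(step: str, complexity: float, candidate_scores: Dict[str, float]) -> Action:
    ruled = rule_layer(step)
    # Simple, well-structured task: trust the rule and skip the model for speed.
    if ruled is not None and complexity < LOW_COMPLEXITY_THRESHOLD:
        return ruled
    predicted = prediction_layer(candidate_scores)
    if ruled is None:
        return predicted
    # Fusion: keep the rule's target only if the model still supports it.
    support = candidate_scores.get(ruled.target, 0.0)
    if support >= MIN_SUPPORT:
        return Action(ruled.kind, ruled.target, support)
    return predicted


print(decide("submit", 0.1, {"#other": 0.9}))
print(decide("submit", 0.8, {"button.primary": 0.9}))
```

In the second call the page no longer contains the rule's `#submit` target, so the fused decision falls back to the element the prediction layer ranks highest.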
[0022] The headless browser executor includes an isolated environment container, a dynamic element locator, a simulated behavior generator, and a page state analyzer. The isolated environment container assigns an independent lightweight sandbox to each bill of lading task, ensuring that session information, cached data, and token credentials between different tasks do not interfere with each other. The dynamic element locator integrates a semantic similarity matching algorithm and a layout topology analysis engine. During execution, if the target webpage structure changes, causing traditional path positioning to fail, the dynamic element locator extracts the surrounding text features, coordinate attributes, and parent-child node relationships of key operation elements. By calculating the cosine similarity between the element to be matched and the target features, it automatically identifies and relocates the key operation elements, effectively avoiding task interruptions caused by page redesigns. The simulated behavior generator is responsible for executing specific webpage interactions. It supports asynchronous execution logic and can simulate complex actions such as keyboard input, mouse hovering, drag-and-drop verification, and file uploads based on the instruction stream issued by the intelligent agent engine. The page state analyzer scans the document object model rendered by the browser in real time, converting the page's visual features, prompts, and hidden form states into structured data, which is then sent back to the intelligent agent engine for further decision-making.
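The cosine-similarity relocation performed by the dynamic element locator can be illustrated with a small Python sketch. It assumes each element has already been reduced to a dictionary of tag, parent tag, own text, and nearby text; the bag-of-tokens features, the `min_similarity` value, and the names `featurize`, `cosine`, and `relocate` are hypothetical simplifications of the semantic and layout features the patent describes.

```python
import math
from collections import Counter


def featurize(element: dict) -> Counter:
    """Turn an element description into a sparse bag-of-tokens feature vector."""
    tokens = [f"tag:{element['tag']}", f"parent:{element['parent_tag']}"]
    tokens += [f"text:{w.lower()}" for w in element["text"].split()]
    tokens += [f"near:{w.lower()}" for w in element.get("nearby_text", "").split()]
    return Counter(tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relocate(target: dict, candidates: list, min_similarity: float = 0.5):
    """Return the candidate most similar to the remembered target, or None."""
    target_vec = featurize(target)
    scored = [(cosine(target_vec, featurize(c)), c) for c in candidates]
    if not scored:
        return None
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score >= min_similarity else None


# Element as remembered before the page redesign.
remembered = {"tag": "button", "parent_tag": "form",
              "text": "Submit claim", "nearby_text": "Total amount"}
# Elements found on the redesigned page (ids are illustrative).
page = [
    {"tag": "a", "parent_tag": "nav", "text": "Help", "nearby_text": "", "id": "help"},
    {"tag": "button", "parent_tag": "div", "text": "Submit claim",
     "nearby_text": "Total amount", "id": "btn-7f3a"},
]
match = relocate(remembered, page)
print(match["id"] if match else None)
```

Here the button keeps five of its six feature tokens after the redesign (only the parent tag changed), so it is still selected even though a fixed path to it would no longer resolve.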
[0023] The context-aware module includes a real-time data collector, a business semantic mapping unit, and an external policy connector. The real-time data collector gathers all environmental information relevant to the current task, including but not limited to the current operator's access level, the type of expense report, the detailed amount of each expense, the specific node in the approval flow, and the task's historical operation logs. The business semantic mapping unit maps this raw context information into a unified business feature vector, providing the intelligent agent engine with high-dimensional decision-making support. The external policy connector establishes a real-time connection with the enterprise's organizational structure database and financial policy rule base, dynamically acquiring the current user's travel allowance standards, remaining departmental budget, and compliance constraints. For example, when processing a travel expense reimbursement task exceeding the departmental budget, the context-aware module can identify the compliance conflict and embed this constraint information into the intelligent agent engine's decision input, prompting the agent engine to automatically generate warnings or adjust the reimbursement strategy to ensure that the reimbursement behavior always remains within compliance boundaries.
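The over-budget example in [0023] can be sketched in Python as a context object that carries compliance constraints into the agent's decision input. The field names, constraint labels, and the functions `attach_constraints` and `next_step` are hypothetical; a real deployment would pull the budget and allowance figures from the enterprise's own systems.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class TaskContext:
    user_id: str
    expense_type: str
    amount: Decimal
    remaining_budget: Decimal   # departmental budget still available
    daily_allowance: Decimal    # travel allowance standard per day
    days: int
    constraints: List[str] = field(default_factory=list)


def attach_constraints(ctx: TaskContext) -> TaskContext:
    """Embed detected compliance conflicts into the context passed to the agent."""
    if ctx.amount > ctx.remaining_budget:
        ctx.constraints.append("OVER_DEPARTMENT_BUDGET")
    if ctx.expense_type == "travel" and ctx.amount > ctx.daily_allowance * ctx.days:
        ctx.constraints.append("OVER_TRAVEL_ALLOWANCE")
    return ctx


def next_step(ctx: TaskContext) -> str:
    """Agent-side use of the constraints: warn instead of submitting."""
    return "raise_warning" if ctx.constraints else "proceed"


ctx = attach_constraints(TaskContext(
    user_id="u-001", expense_type="travel", amount=Decimal("3200.00"),
    remaining_budget=Decimal("2500.00"), daily_allowance=Decimal("800.00"), days=5,
))
print(ctx.constraints, next_step(ctx))
```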
[0024] The voucher generation unit includes an accounting rule matching engine, a subject mapping processor, an amount validator, and an automated voucher synthesizer. The voucher generation unit supports multi-dimensional mapping rule configuration and can automatically match corresponding accounting subjects and auxiliary accounting dimensions from the accounting subject library based on expense type, business department, associated project number, and supplier information. The accounting rule matching engine integrates enterprise accounting standards and specific accounting methods; upon task completion, it retrieves the final data successfully submitted by the headless browser executor. The subject mapping processor determines the debit and credit directions and corresponding details through multi-level association logic. The amount validator executes two-way reconciliation logic to ensure that the total amount of the generated accounting voucher is completely consistent with the amount on the original business document and the receipt after successful webpage submission. If subject conflicts, missing auxiliary accounting items, or imbalances in debit and credit amounts are detected during the mapping process, the voucher generation unit immediately triggers an early warning mechanism, stores the abnormal snapshot in the log, suspends the voucher download process, and awaits manual review or strategy correction.
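A minimal Python sketch of the mapping and reconciliation checks in [0024] follows. The account codes, the two-key mapping rule, and the names `ACCOUNT_RULES`, `Entry`, `VoucherWarning`, and `build_voucher` are hypothetical placeholders; an actual chart of accounts and rule base would be enterprise-specific.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical mapping rules: (expense type, department) -> debit account.
ACCOUNT_RULES = {
    ("travel", "sales"): "EXP-TRAVEL-SALES",
    ("travel", "rnd"): "EXP-TRAVEL-RND",
}
CREDIT_ACCOUNT = "PAYABLE-EMPLOYEE"  # hypothetical credit-side account


@dataclass(frozen=True)
class Entry:
    account: str
    debit: Decimal
    credit: Decimal


class VoucherWarning(Exception):
    """Raised to trigger the early warning path instead of writing a voucher."""


def build_voucher(expense_type, department, amount, receipt_amount):
    account = ACCOUNT_RULES.get((expense_type, department))
    if account is None:
        raise VoucherWarning(f"no account mapping for {expense_type}/{department}")
    # Two-way reconciliation: source document amount vs. platform receipt amount.
    if amount != receipt_amount:
        raise VoucherWarning("amount differs from the submission receipt")
    entries = [
        Entry(account, debit=amount, credit=Decimal("0")),
        Entry(CREDIT_ACCOUNT, debit=Decimal("0"), credit=amount),
    ]
    if sum(e.debit for e in entries) != sum(e.credit for e in entries):
        raise VoucherWarning("debits and credits do not balance")
    return entries


entries = build_voucher("travel", "sales", Decimal("120.50"), Decimal("120.50"))
for entry in entries:
    print(entry)
```

`Decimal` is used rather than `float` so that the equality checks on amounts are exact.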
[0025] The anomaly self-healing manager includes anomaly monitoring probes, a tiered response strategy library, a self-healing execution engine, and a manual intervention guide. The anomaly monitoring probes are deployed at various execution stages of the system to capture anomalies such as network timeouts, page crashes, CAPTCHA pop-ups, logic errors, or service interruptions in real time. The tiered response strategy library sets differentiated processing priorities and recovery paths for different types of anomalies. For minor anomalies that can be automatically recovered, such as page loading failures due to network fluctuations, the self-healing execution engine uses local retry or parameter fine-tuning strategies, attempting to repair the issue by increasing the waiting time or refreshing the page. For anomalies involving security verification (such as sliding CAPTCHAs), the anomaly self-healing manager sends a request to a dedicated identification service, obtains the verification result, and feeds it back to the headless browser executor for continued execution. For severe anomalies involving business logic ambiguity, account locking, etc., the anomaly self-healing manager automatically records the current system state snapshot, stack information, and page screenshots, transfers them to a manual review queue, and sends instant messaging notifications to relevant personnel to ensure timely manual intervention.
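The tiered response in [0025] can be sketched in Python as a lookup from anomaly type to severity, followed by retry, verification, or escalation. The anomaly names, the tier table, the linear wait increase, and the callables `retry` and `solve_captcha` are hypothetical stand-ins for the probes and services the patent describes.

```python
import time
from enum import Enum


class Severity(Enum):
    MINOR = 1         # automatically recoverable
    VERIFICATION = 2  # needs a security verification step
    SEVERE = 3        # needs manual review


# Hypothetical tier table; unknown anomalies are treated as severe.
STRATEGY_TIERS = {
    "network_timeout": Severity.MINOR,
    "page_load_failed": Severity.MINOR,
    "captcha": Severity.VERIFICATION,
    "account_locked": Severity.SEVERE,
    "logic_ambiguity": Severity.SEVERE,
}


def handle_anomaly(kind, retry, solve_captcha, review_queue, snapshot,
                   max_retries=3, base_wait_s=0.5):
    """Apply the recovery path for the anomaly's tier; escalate if it fails."""
    severity = STRATEGY_TIERS.get(kind, Severity.SEVERE)
    if severity is Severity.MINOR:
        for attempt in range(1, max_retries + 1):
            time.sleep(base_wait_s * attempt)  # wait longer on each attempt
            if retry():
                return "recovered"
    elif severity is Severity.VERIFICATION:
        if solve_captcha():  # e.g. a call to a dedicated identification service
            return "recovered"
    # Severe anomaly, or a lower tier whose recovery failed: hand over to a human.
    review_queue.append({"kind": kind, "snapshot": snapshot})
    return "manual_review"


queue = []
print(handle_anomaly("network_timeout", lambda: True, lambda: None, queue, {}, base_wait_s=0))
print(handle_anomaly("account_locked", lambda: True, lambda: "token", queue, {"step": "login"}))
```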
[0026] Furthermore, the entire system is deployed in an enterprise private cloud environment, and all data transmission between components uses a strongly encrypted transmission protocol. The headless browser executor has a thorough cleanup mechanism: after each financial bill of lading task is completed and its result has been confirmed and reported, it automatically destroys all temporary files, session information, and sensitive caches in the current container, protecting the security and privacy of enterprise financial data at the underlying architecture level.
Example 2
[0027] Based on the automated financial bill of lading system that integrates artificial intelligence agent and headless browser technology as described in Example 1, this embodiment provides a system implementation method based on a distributed cluster architecture to meet the needs of ultra-large-scale enterprises for extremely high concurrency processing capabilities and high system availability.
[0028] In an automated financial bill of lading system based on a distributed cluster architecture, the task scheduling hub is configured as a distributed cluster of central nodes, employing a master-slave architecture to ensure high availability of scheduling. The task receiving submodule integrates a distributed message middleware, such as a high-throughput stream processing platform, to receive reimbursement requests from branch offices worldwide. Since large enterprises' financial platforms are often distributed across different geographical regions or subnets, the task scheduling hub can intelligently distribute tasks to the nearest execution node based on the geographic location tag of the target financial platform, thereby reducing network latency and improving bill of lading efficiency.
[0029] In the distributed embodiment, the intelligent agent engine's multi-level decision-making model is broken down into independent decision microservices. These microservices are deployed on a containerized platform that supports dynamic scaling. When the system detects that the backlog of tasks exceeds a preset task alarm threshold, it automatically expands the number of instances of the decision microservices to improve the parallelism of logical reasoning. At this time, the machine learning-based probabilistic prediction layer utilizes a distributed model inference acceleration engine, calling upon GPU computing resource pools to provide low-latency operation prediction support for multiple concurrent tasks.
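The backlog-triggered scaling in [0029] can be illustrated with a short Python sketch. The function name `desired_instances` and its parameters are hypothetical; in practice this decision would normally be delegated to the container platform's own autoscaler rather than computed by hand.

```python
import math


def desired_instances(backlog: int, current: int, alarm_threshold: int,
                      tasks_per_instance: int, max_instances: int) -> int:
    """Return how many decision microservice instances should be running."""
    if backlog <= alarm_threshold:
        return current  # backlog is within limits: leave the pool unchanged
    needed = math.ceil(backlog / tasks_per_instance)
    # Only scale up here, and never beyond the configured ceiling.
    return min(max(needed, current), max_instances)


print(desired_instances(backlog=450, current=2, alarm_threshold=100,
                        tasks_per_instance=50, max_instances=20))
```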
[0030] The headless browser executor, in a distributed architecture, manifests as a large-scale cluster of browser instances. Each instance runs within a lightweight container and is uniformly managed by a container orchestration engine. To address potential IP access restrictions on the financial platform, the headless browser executor also integrates a distributed proxy resource management unit. This unit maintains a trusted pool of proxy addresses and dynamically allocates different egress network addresses to each browser instance based on the target platform's access frequency restriction policy. Furthermore, the headless browser executor has a health self-check function. If a container instance runs for too long, causing memory overflow or slow response, the container orchestration engine automatically removes it and starts a new instance, ensuring the stability of the execution environment.
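The proxy allocation described in [0030] can be sketched in Python as a pool that tracks recent requests per egress address and only hands out addresses still under a per-window limit. The class name `ProxyPool`, the sliding-window policy, and the placeholder address labels are assumptions; the patent does not specify the allocation algorithm.

```python
import time
from collections import deque
from typing import Optional


class ProxyPool:
    """Allocate egress addresses while respecting a per-address rate limit."""

    def __init__(self, addresses, max_requests: int, window_s: float):
        self.max_requests = max_requests
        self.window_s = window_s
        self._history = {addr: deque() for addr in addresses}

    def acquire(self, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        # Forget requests that have fallen out of the sliding window.
        for hits in self._history.values():
            while hits and now - hits[0] >= self.window_s:
                hits.popleft()
        available = [a for a, h in self._history.items() if len(h) < self.max_requests]
        if not available:
            return None  # every address is at its limit; the caller should wait
        # Prefer the least-used address to spread load evenly.
        addr = min(available, key=lambda a: len(self._history[a]))
        self._history[addr].append(now)
        return addr


pool = ProxyPool(["proxy-a", "proxy-b"], max_requests=1, window_s=60)
first = pool.acquire(now=0.0)
second = pool.acquire(now=1.0)
third = pool.acquire(now=2.0)
fourth = pool.acquire(now=60.5)
print(first, second, third, fourth)
```

Passing `now` explicitly keeps the example deterministic; when it is omitted the pool falls back to `time.monotonic()`.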
[0031] In the distributed embodiment, the context-aware module employs a state storage mechanism based on distributed caching technology. Since bill of lading tasks may flow between different physical execution nodes (e.g., switching nodes under a retry mechanism), the context-aware module ensures that all contextual feature data, including user permissions, budget status, and operation history, are synchronized in real-time to a globally shared distributed cache. This ensures that regardless of which intelligent agent instance takes over the task, it can immediately obtain the complete business semantic context, guaranteeing the consistency of decision-making. Simultaneously, the external policy connector addresses the query pressure on the enterprise policy database under high concurrency by establishing a high-performance read-only copy database, ensuring that compliance verification does not become a performance bottleneck for the system.
[0032] In this embodiment, the voucher generation unit introduces a distributed transaction consistency management mechanism. During the generation of accounting vouchers, the system needs to ensure strong consistency between the submission status of the financial bill of lading platform and the write status of the local voucher repository. The voucher generation unit uses a two-phase commit protocol or a compensating transaction mechanism so that the accounting voucher is persisted in the local voucher system only after "submission successful" has been confirmed on the financial webpage and a compliant receipt number has been obtained. If a network interruption occurs during the generation process, the system automatically applies reversal logic or retry logic to ensure that financial data does not diverge between systems.
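The compensating-transaction variant mentioned in [0032] can be sketched in Python as follows. The sketch persists the voucher only after a receipt number is returned, retries the local write on connection errors, and reverses the platform submission if the write still fails. The callables `submit`, `persist`, and `reverse`, and the exception name `SubmissionFailed`, are hypothetical; a two-phase commit would be structured differently.

```python
class SubmissionFailed(Exception):
    """Raised when the bill of lading could not be submitted and recorded."""


def submit_then_persist(voucher: dict, submit, persist, reverse,
                        max_persist_attempts: int = 2) -> str:
    """Persist locally only after a receipt exists; compensate if persisting fails."""
    receipt = submit(voucher)
    if not receipt:
        # Nothing was accepted by the platform, so there is nothing to undo.
        raise SubmissionFailed("no receipt number returned; voucher not persisted")
    last_error = None
    for _ in range(max_persist_attempts):
        try:
            persist(voucher, receipt)
            return receipt
        except ConnectionError as exc:  # retry logic for transient interruptions
            last_error = exc
    # Compensating action: undo the platform-side submission.
    reverse(receipt)
    raise SubmissionFailed("voucher store unreachable; submission reversed") from last_error


store, reversed_receipts = [], []


def ok_submit(voucher):
    return "R-001"


def ok_persist(voucher, receipt):
    store.append((voucher["id"], receipt))


def failing_persist(voucher, receipt):
    raise ConnectionError("voucher store down")


print(submit_then_persist({"id": "V1"}, ok_submit, ok_persist, reversed_receipts.append))
try:
    submit_then_persist({"id": "V2"}, ok_submit, failing_persist, reversed_receipts.append)
except SubmissionFailed as exc:
    print("compensated:", exc)
```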
[0033] The aforementioned anomaly self-healing manager possesses global collaborative features in a distributed environment. It maintains a global anomaly feature library. When an execution node encounters new pop-up interference or page structure changes while accessing a specific financial page, the anomaly self-healing manager synchronizes the anomaly's feature description and successful repair strategy to all other nodes in the cluster in real time. This group learning mechanism enables the system to quickly adapt to updates to the financial platform. Simultaneously, the tiered response strategy library supports differentiated configurations for different business lines. For example, for high-value fund disbursement tasks, the anomaly self-healing level is set to "high," and the system will more actively try multiple self-healing paths; while for ordinary reimbursement tasks, it relies more on standard retry strategies.
[0034] The distributed system in this embodiment also includes a centralized monitoring and alarm dashboard. This dashboard aggregates performance metrics, task success rates, anomaly distribution frequencies, and resource utilization of all execution nodes in real time, providing operations and maintenance personnel with a panoramic view of the system's operational status and supporting rapid location of faulty nodes and manual intervention.
Example 3
[0035] Based on the foregoing embodiments, this embodiment describes in detail the specific configuration logic and technical implementation of the machine learning-based probabilistic prediction layer in the intelligent agent engine when facing an unstructured web page environment.
[0036] The probabilistic prediction layer incorporates a multimodal perceptual neural network model, configured to simultaneously process both text flow features and visual layout features of a webpage. When processing a specific order step, the headless browser executor first transforms the document object model tree of the current page into a graph structure. The probabilistic prediction layer then uses the graph neural network to extract features from this webpage structure. The graph neural network is configured to use 128-dimensional vectors as node features, which include the tag names, class names, depth levels, and associated textual semantics of webpage elements.
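As a rough illustration of the first step (turning a document object model tree into a graph), the sketch below uses only the Python standard library. It collects the raw attributes named above (tag name, class names, depth, associated text) per node and leaves out the 128-dimensional embedding and the graph neural network itself. `DomGraphBuilder` and `dom_to_graph` are hypothetical names.

```python
from html.parser import HTMLParser


class DomGraphBuilder(HTMLParser):
    """Builds a node list and a parent-to-child edge list from HTML markup."""

    VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no closing tag

    def __init__(self):
        super().__init__()
        self.nodes = []   # one dict per element: tag, classes, depth, text
        self.edges = []   # (parent_index, child_index) pairs
        self._stack = []  # indices of the currently open elements

    def handle_starttag(self, tag, attrs):
        index = len(self.nodes)
        classes = (dict(attrs).get("class") or "").split()
        self.nodes.append({"tag": tag, "classes": classes,
                           "depth": len(self._stack), "text": ""})
        if self._stack:
            self.edges.append((self._stack[-1], index))
        if tag not in self.VOID_TAGS:
            self._stack.append(index)

    def handle_endtag(self, tag):
        if self._stack and self.nodes[self._stack[-1]]["tag"] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if self._stack and text:
            self.nodes[self._stack[-1]]["text"] += text


def dom_to_graph(html):
    builder = DomGraphBuilder()
    builder.feed(html)
    return builder.nodes, builder.edges
```

In a full pipeline, each node dictionary would then be encoded into a fixed-length feature vector before being passed to the graph model.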
[0037] Simultaneously, the probabilistic prediction layer also includes a computer vision sub-model. This sub-model is configured to analyze real-time page images captured by the headless browser executor. It extracts visual saliency features of various functional areas on the page through a convolutional neural network, thereby identifying button areas with visual features of "Submit," "Next," or "Save," even if these buttons have extremely messy code structures in the document object model tree or have been encrypted and obfuscated.
[0038] The intelligent agent engine uses a weighted fusion unit to fuse the structured feature vector output by the graph neural network with the visual feature vector output by the convolutional neural network. The fused feature vector is then input into a reinforcement learning-based action decision-maker, which is pre-set with a large number of financial operation templates. It compares the current fused feature vector with the feature vectors of historical successful operations to calculate the expected return score for each candidate operation (such as clicking element A or filling data C into input box B).
[0039] When the expected return score exceeds a preset operation confidence threshold, the intelligent agent engine translates the action into a specific instruction and sends it to the headless browser executor. If the confidence score of the highest-scoring action is still below the threshold, the system determines that it is in an unpredictable and unknown environment. At this point, the anomaly self-healing manager intervenes, triggering a deep page scan or suspending the task and requesting manual assistance. This probabilistic prediction mechanism enables the system to exhibit human-like intuitive judgment when facing complex financial system web pages with frequent updates, greatly improving the flexibility of automation.
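The fusion and thresholding logic can be sketched as follows. The sketch makes several simplifying assumptions that are not stated in the source: fusion is a weighted concatenation, the expected return score is approximated by cosine similarity to a stored template vector, and the weights and threshold are arbitrary illustrative values.

```python
import math


def fuse(structural, visual, w_struct=0.6):
    """Weighted concatenation of the structural and visual feature vectors."""
    return [w_struct * x for x in structural] + [(1.0 - w_struct) * x for x in visual]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def choose_action(fused, templates, threshold=0.8):
    """Return (action, score); action is None when no candidate clears the threshold."""
    best_action, best_score = None, -1.0
    for action, template_vec in templates.items():
        score = cosine_similarity(fused, template_vec)
        if score > best_score:
            best_action, best_score = action, score
    if best_score > threshold:
        return best_action, best_score
    return None, best_score  # caller hands over to the anomaly self-healing manager
```

A `None` action corresponds to the "unpredictable and unknown environment" case, in which a deep page scan or manual assistance is requested instead of issuing a browser instruction.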
[0040] In the data processing flow, the probabilistic prediction layer also possesses self-evolution capabilities. After each successful execution of a bill of lading task, the system stores the task's operation sequence, the corresponding page state evolution, and the final feedback result in a dedicated incremental learning database. The intelligent agent engine periodically triggers model fine-tuning training, adjusting the neural network's weight parameters using the backpropagation algorithm by comparing the deviation between predicted actions and actual successful actions. This process allows the system's operational accuracy to continuously optimize as the number of tasks processed increases, enabling it to learn the unique interaction logic and implicit rules specific to certain financial platforms.
[0041] To ensure the efficiency of model inference, the probabilistic prediction layer also employs model quantization and pruning strategies. High-dimensional floating-point operations are converted into low-bit-width fixed-point operations, significantly reducing the consumption of computing resources without substantially lowering prediction accuracy. This keeps the generation time of each decision action within tens of milliseconds, thus ensuring the smoothness of automated order processing.
Example 4
[0042] This embodiment focuses on the engineering implementation details of the dynamic element positioning mechanism in the headless browser executor, and how it solves the task failure problem caused by page redesign through semantic similarity matching and layout topology analysis.
[0043] During the execution of financial bills of lading, the dynamic element locator maintains a feature archive of target elements. This archive not only records traditional element location identifiers (such as element name, sequence path, or unique identifier), but also deeply extracts the element's "semantic fingerprint" and "layout fingerprint".
[0044] The "semantic fingerprint" includes the Chinese keywords contained in the element itself and its neighboring tags. For example, for a reimbursement amount input box, its semantic fingerprint might contain core words such as "reimbursement amount," "please enter," and "amount (yuan)." When the headless browser executor detects that the original identifier is invalid, it uses a preset word vector model (such as a pre-trained Chinese language model) to vectorize the associated text of all inputtable elements on the current page. The locator calculates the cosine distance between the semantic vector of each candidate element on the page and the semantic vector of the target element in the feature archive. If the cosine distance value of a candidate element is less than a preset similarity deviation threshold, then the element is considered a potential substitute for the target element semantically.
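A minimal sketch of the cosine-distance filter follows. The short vectors stand in for embeddings that a pre-trained word vector model would produce, and the 0.2 threshold and the element names are illustrative assumptions.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def semantic_candidates(target_vec, page_vecs, max_distance=0.2):
    """Return (element, distance) pairs below the threshold, closest first."""
    scored = sorted((cosine_distance(target_vec, vec), element)
                    for element, vec in page_vecs.items())
    return [(element, dist) for dist, element in scored if dist < max_distance]


# Toy vectors standing in for the archived fingerprint and two page elements.
archived = [1.0, 0.0, 1.0]
page = {"input#amt_v2": [0.9, 0.1, 1.0], "input#date": [0.0, 1.0, 0.0]}
matches = semantic_candidates(archived, page)
```

Here only `input#amt_v2` survives the filter; it would then proceed to layout topology analysis.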
[0045] To further confirm the accuracy of the location, the dynamic element locator initiates "layout topology analysis." This analysis mechanism treats the webpage as a coordinate space, extracting the relative positional and hierarchical topological relationships between the target element and other key anchor elements on the page (such as fixed navigation bars, page titles, logos, etc.). For example, the target input box is always located to the right of the text label "Reimbursement Amount," and its depth in the document object model tree is the same as the depth of the "Reimbursement Type" selection box above it. The locator constructs a local topology tree and compares whether the topological correlation between the candidate element and its surrounding elements is consistent with the archive.
[0046] The dynamic element locator only identifies a candidate element as a new operation target when its weighted total score (semantic similarity score and layout topology score) exceeds a preset compliance score. Subsequently, the locator automatically updates the element's feature archive, achieving adaptive evolution of the location logic. This mechanism significantly reduces the workload of manually maintaining automated scripts, enabling the system to smoothly navigate the front-end code refactoring period of the financial platform.
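The acceptance rule can be sketched as a weighted sum compared against a cutoff. The source does not specify how the layout topology score is computed; here it is assumed to be the fraction of archived anchor relations that still hold for the candidate, and the weights and compliance score are illustrative values.

```python
def layout_score(archived_relations, candidate_relations):
    """Fraction of archived anchor relations that the candidate still satisfies."""
    if not archived_relations:
        return 0.0
    return len(archived_relations & candidate_relations) / len(archived_relations)


def accept_candidate(semantic_similarity, archived_relations, candidate_relations,
                     w_semantic=0.6, w_layout=0.4, compliance_score=0.75):
    """Return (accepted, total) for one candidate element."""
    total = (w_semantic * semantic_similarity
             + w_layout * layout_score(archived_relations, candidate_relations))
    return total > compliance_score, total


# Relations taken from the example above: right of the label, same depth as the selector.
archive = {("right_of", "label:Reimbursement Amount"),
           ("same_depth", "select:Reimbursement Type")}
accepted, total = accept_candidate(0.9, archive, set(archive))
```

When a candidate is accepted, its relations and text would be written back to the feature archive so that the next lookup starts from the updated fingerprint.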
[0047] Furthermore, the headless browser executor is configured with a "visual alignment" function. When encountering extremely complex dynamic pages, it invokes an image recognition unit to compare screenshots of the page before and after the redesign. Through pixel-level feature point matching, it identifies functional components that are visually closest in shape, color, and position. This multi-dimensional positioning strategy ensures that even in highly dynamic and unstructured web page environments, the system can still accurately locate the operation entry point and complete data entry and submission.
Example 5
[0048] This embodiment further illustrates the interaction logic between the voucher generation unit and the context-aware module, and how to achieve highly personalized and compliant financial accounting.
[0049] The context-aware module is configured with deep permission and policy analysis capabilities. When the task scheduling hub assigns a bill of lading task, the context-aware module first retrieves, in real time through an external policy connector, the job level of the employee to whom the task belongs and the budget execution status of the cost center. For example, the system identifies that the current expense claimant belongs to the "Sales Department" and has a job level of "Manager". The context-aware module then retrieves the specific accounting policies for this level of personnel under the "Travel Expenses" category from the financial policy database, including the daily accommodation allowance limit and transportation selection criteria.
[0050] Upon receiving these context constraints, the voucher generation unit does not simply perform template filling; instead, it initiates "intelligent strategy mapping." Based on these dynamic parameters, it selects the most suitable voucher generation template from the accounting rule base. For example, if the reimbursement amount is within the budget limit, the generated voucher's accounting subject may be directly mapped to "Sales Expenses - Travel Expenses"; if it identifies that the expense is associated with a specific R&D project, the voucher generation unit will automatically add an auxiliary accounting dimension of "R&D Expenditure" and automatically fill in the auxiliary item based on the project code.
[0051] During the amount verification process, the voucher generation unit automatically converts amounts using exchange rate information provided by the context-aware module (for cross-border reimbursements). It obtains the official exchange rate on the day of financial settlement in real time, converts non-local currency amounts, and automatically generates an adjustment entry for "Financial Expenses - Exchange Gains and Losses" (if necessary) to ensure the balance of debits and credits on the voucher and the rigor of accounting logic.
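A worked sketch of the conversion and the balancing entry is given below. It assumes a scenario the source does not spell out: the payable was booked at an earlier rate and is settled at the settlement-day rate, so the difference is posted to "Financial Expenses - Exchange Gains and Losses". The other two account names, the rates, and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
ZERO = Decimal("0")


def to_local(amount, rate):
    """Convert a foreign-currency amount and round to cents."""
    return (amount * rate).quantize(CENT, rounding=ROUND_HALF_UP)


def settlement_entries(foreign_amount, booked_rate, settlement_rate):
    """Return voucher lines as (account, debit, credit) tuples."""
    booked = to_local(foreign_amount, booked_rate)
    settled = to_local(foreign_amount, settlement_rate)
    entries = [
        ("Other Payables - Employee Reimbursement", booked, ZERO),  # hypothetical account
        ("Bank Deposits", ZERO, settled),                           # hypothetical account
    ]
    diff = settled - booked
    if diff > 0:    # paid more local currency than booked: exchange loss
        entries.append(("Financial Expenses - Exchange Gains and Losses", diff, ZERO))
    elif diff < 0:  # paid less than booked: exchange gain
        entries.append(("Financial Expenses - Exchange Gains and Losses", ZERO, -diff))
    return entries


def is_balanced(entries):
    """Debits must equal credits for the voucher to be valid."""
    return sum(e[1] for e in entries) == sum(e[2] for e in entries)


lines = settlement_entries(Decimal("100"), Decimal("7.10"), Decimal("7.25"))
```

With these inputs the booked amount is 710.00, the settled amount is 725.00, and a debit line of 15.00 is added so that the voucher balances.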
[0052] Once the headless browser executor reports a successful bill of lading, the voucher generation unit encapsulates the final bill of lading receipt number, approval flow status bit, and index information of all original voucher image attachments into the extended fields of the accounting voucher. This business context-driven generation mechanism ensures that every automatically generated voucher has a complete audit trail, fully complies with the compliance requirements of corporate financial audits, and also achieves end-to-end unmanned flow of financial accounting from the business end to the accounting end.
Example 6
[0053] This embodiment elaborates in detail the internal hierarchical response strategy of the anomaly self-healing manager and its execution logic when handling complex financial business conflicts.
[0054] The hierarchical response strategy library of the anomaly self-healing manager is configured into four levels: basic retry level, logical switching level, environment reconstruction level, and manual takeover level.
[0055] In the basic retry level, when the anomaly monitoring probe detects transient network fluctuations or temporary server unresponsiveness, the self-healing execution engine is configured to execute an exponential backoff retry strategy. This means waiting 2 seconds after the first failure, 4 seconds after the second failure, and so on, until the preset maximum number of retries is reached. This approach effectively avoids short-term server failure peaks without causing continuous impact on the financial platform.
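The retry schedule described here can be sketched as follows. The 2-second base matches the paragraph; the default of three retries, the set of exceptions treated as transient, and the injectable `sleep` parameter are assumptions made for illustration.

```python
import time


def backoff_delays(max_retries, base_seconds=2.0):
    """Delay before each retry: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(max_retries)]


def retry_with_backoff(operation, max_retries=3, base_seconds=2.0, sleep=time.sleep):
    """Run operation(); on a transient error wait and retry, up to max_retries times."""
    delays = backoff_delays(max_retries, base_seconds)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == max_retries:
                raise  # retries exhausted: escalate to the next response level
            sleep(delays[attempt])


print(backoff_delays(3))  # [2.0, 4.0, 8.0]
```

Production schedulers commonly add random jitter to these delays so that many nodes do not retry against the platform at the same instant; that refinement is not mentioned in the source.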
[0056] At the logical switching level, when the system detects that a specific operation path is blocked, such as the "Quick Reimbursement" entry on a finance page being temporarily offline, the anomaly self-healing manager sends a path redirection request to the intelligent agent engine. Following an alternative operation strategy flow, the intelligent agent engine then guides the headless browser executor to enter through an alternative entry point such as "Comprehensive Business Order", achieving the same business objective through a different operation sequence.
[0057] At the environment reconstruction level, if the headless browser executor reports persistent page rendering anomalies or a serious memory leak risk, the self-healing execution engine will immediately forcibly terminate the current browser container process. It will then launch a new, identically configured isolated environment image and, using the task snapshot saved in the context-aware module, restore the task progress to the state of the last successful synchronization, re-executing the current steps from the beginning. This "hot start" capability greatly improves the fault tolerance of long-running financial tasks.
[0058] At the manual takeover level, for anomalies requiring advanced logic judgment (such as a financial platform notification stating "Payment is impossible due to credit risk of this supplier") or situations where the system's self-healing attempts have repeatedly failed, the anomaly self-healing manager will activate a "site preservation policy." This locks the current headless browser session from being destroyed and generates a "remote assistance request" containing all execution parameters, historical trajectory graphs, and real-time control permissions for the current page. Human reviewers can directly operate the browser instance in takeover mode via a dedicated monitoring terminal to resolve the issue. Once manual intervention completes the critical operations, the system can revert to automated operation based on personnel instructions.
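The four levels can be read as an escalation ladder: each level is tried in order until one succeeds, and the manual takeover level always ends the ladder. The sketch below is one possible reading; the mapping from anomaly type to starting level and the handler interface are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    BASIC_RETRY = 1
    LOGICAL_SWITCH = 2
    ENV_REBUILD = 3
    MANUAL_TAKEOVER = 4


# Hypothetical mapping from anomaly type to the level at which handling starts.
START_LEVEL = {
    "network_timeout": Level.BASIC_RETRY,
    "entry_offline": Level.LOGICAL_SWITCH,
    "render_failure": Level.ENV_REBUILD,
    "business_rejection": Level.MANUAL_TAKEOVER,
}


def heal(anomaly, handlers):
    """Escalate through the levels until a handler reports success.

    handlers maps each Level to a callable(anomaly) -> bool.
    Returns (level that ended the escalation, list of levels tried).
    """
    level = START_LEVEL.get(anomaly, Level.BASIC_RETRY)
    tried = []
    while True:
        tried.append(level)
        if level == Level.MANUAL_TAKEOVER:
            handlers[level](anomaly)  # lock the session and raise an assistance request
            return level, tried
        if handlers[level](anomaly):
            return level, tried
        level = Level(level + 1)
```

Starting higher-severity anomalies at a higher level avoids spending retries on failures, such as a business rejection, that no retry can resolve.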
[0059] Furthermore, the anomaly self-healing manager also possesses "experience feedback logic." For each successful self-healing operation, the correlation between its anomaly characteristics and the repair path is recorded and fed back to the intelligent agent engine. As system runtime accumulates, this self-healing logic gradually becomes internalized as a routine operating strategy of the intelligent agent engine, thereby continuously reducing the frequency of anomalies and achieving a closed-loop improvement in system robustness.
Example 7
[0060] In another embodiment, the system provided by the present invention can adopt an architecture based on edge computing and central cloud collaboration.
[0061] In this architecture, the headless browser executor is deployed on an edge server close to the enterprise's office location. This is because many large enterprises' internal financial systems have strict geographical access restrictions; by deploying the executor at the edge, more direct access can be gained to the financial platforms within these local area networks. Meanwhile, the intelligent agent engine and context-aware module are deployed on a central public cloud or a group's private cloud to leverage powerful cloud computing capabilities for large-scale natural language processing and decision model inference.
[0062] The task scheduling hub possesses cross-cloud edge task distribution capabilities. When a bill of lading instruction is issued, the central cloud scheduling hub sends the anonymized execution instruction stream to the corresponding edge server based on the network region of the subsystem involved in the task. The headless browser executor on the edge side executes the specific web page interaction locally and sends back the encrypted visual summary of the page to the cloud engine. This architecture satisfies the security requirements of the financial system for intranet access while also enabling highly intelligent decision-making by leveraging cloud resources.
[0063] In this embodiment, data transmission is protected by multi-layered tunnel encryption and authentication. A persistent, bidirectional secure socket layer connection is established between the edge-side headless browser executor and the central cloud engine. Before leaving the edge environment and entering the central cloud, all sensitive financial data undergoes pre-formatting and privacy masking by the voucher generation unit, so that only feature vectors free of sensitive personal information are transmitted. This collaborative architecture further enhances the system's applicability and security in complex enterprise network environments.
Example 8
[0064] This embodiment describes a security-enhanced implementation method applied to specific industry compliance requirements.
[0065] In this implementation, the system integrates a hardware security module specifically for storing and managing various digital certificates, electronic signatures, and high-level encryption keys required to access the financial platform. Before executing critical fund payment or high-authority contract order actions, the headless browser executor must initiate a call request to this hardware security module. Only after passing multi-factor authentication (such as combining hardware tokens and dynamic biometric features) can the hardware module inject a digital signature into the executor's communication stream.
[0066] Furthermore, the anomaly self-healing manager in this embodiment includes a "compliance audit recorder." It is configured to record the original instruction stream of all automated operations and the corresponding video playback data in an immutable manner. Each automatically generated financial voucher has a unique hash storage address in the blockchain evidence storage system, which is associated with the entire process log of generating the voucher. This mechanism ensures that in industries such as finance and government, where data integrity and traceability are extremely critical, every second of the automated financial voucher process is under strict supervision and audit, technically eliminating the risk of malicious tampering or exploitation of the automated scripts.
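The tamper-evidence idea can be illustrated with a hash-chained, append-only log. This is a simplified stand-in for the blockchain evidence storage system, not a description of it; `AuditLog` and `record_hash` are hypothetical names, and SHA-256 is an assumed choice of hash function.

```python
import hashlib
import json


def record_hash(prev_hash, payload):
    """Hash of the previous entry's hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    GENESIS = "0" * 64  # placeholder hash preceding the first entry

    def __init__(self):
        self.entries = []

    def append(self, payload):
        """Add an entry and return its hash, usable as the voucher's storage address."""
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        digest = record_hash(prev, payload)
        self.entries.append({"payload": payload, "prev": prev, "hash": digest})
        return digest

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or record_hash(prev, entry["payload"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous one, altering any recorded instruction invalidates every later entry, which is what makes after-the-fact tampering detectable.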
[0067] Meanwhile, the context-aware module is configured with real-time compliance monitoring capabilities. It can pre-filter every instruction generated by the intelligent agent engine. If the instruction sequence contains actions that may violate industry-specific regulatory policies (such as cross-domain transmission of sensitive data or non-compliant account misappropriation), the context-aware module will immediately suspend communication and force the system into a locked state. This proactive security strategy provides a solid foundation for the application of automation technology in serious financial scenarios.
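A minimal sketch of the pre-filter follows, assuming that instructions arrive as dictionaries with an `action` field and that the prohibited actions are kept in a simple deny list. Both the instruction format and the action names are hypothetical.

```python
# Hypothetical deny list; a real deployment would load this from the policy rule base.
BLOCKED_ACTIONS = frozenset({"export_sensitive_data", "transfer_to_unapproved_account"})


def prefilter(instructions, blocked=BLOCKED_ACTIONS):
    """Check an instruction sequence before any of it reaches the executor.

    Returns a dict with the resulting state, the index of the first violation
    (or None), and the instructions that may be forwarded.
    """
    for index, instruction in enumerate(instructions):
        if instruction["action"] in blocked:
            # One violation locks the system; nothing from the sequence is forwarded.
            return {"state": "locked", "violation_index": index, "forward": []}
    return {"state": "ok", "violation_index": None, "forward": list(instructions)}
```

Checking the whole sequence before forwarding anything is a deliberately conservative choice: a partially executed financial operation can be harder to reconcile than one that never started.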
[0068] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. An automated financial bill of lading system integrating artificial intelligence agent and headless browser technology, characterized in that the system comprises: a task scheduling hub, used to receive bill of lading task instructions from the enterprise's financial system, to parse and distribute tasks according to task type, priority and target platform characteristics, and to coordinate the sequential execution of the functional components; an intelligent agent engine, configured to dynamically generate operational strategies for the current financial bill of lading task based on natural language understanding and business rule reasoning, to monitor the task execution status in real time, and to adjust the subsequent action sequence according to environmental feedback; a headless browser executor, used to simulate user operation behavior in an isolated operating environment, access the target financial platform webpage, perform page loading, element recognition, form filling, button clicking and data submission interaction actions, and send the page state and operation results back to the intelligent agent engine during execution; a context-aware module, used to collect and analyze the business context information of the current task in real time and to provide structured context data to the intelligent agent engine to support decision optimization; a voucher generation unit, used to automatically generate accounting vouchers that comply with enterprise accounting standards based on the completed bill of lading data and the associated accounting rule base; and an anomaly self-healing manager, used to automatically trigger preset recovery strategies when operation failures, page anomalies, or logical conflicts are detected during task execution.
2. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology as described in claim 1, characterized in that, The task scheduling hub includes: The task receiving submodule is configured to obtain pending bill of lading task requests in real time from external enterprise resource planning systems, expense management systems, or office automation platforms via application programming interfaces or message queues. The task parsing engine is configured to use a preset financial business ontology model to perform semantic extraction on the bill of lading task request and identify the financial business category to which the bill of lading task request belongs, wherein the financial business category includes travel reimbursement, purchase payment and fixed asset warehousing. The priority sorting unit is configured to calculate the comprehensive priority score of each task based on the urgency coefficient of the task, the amount involved, and the set deadline using a multi-factor weighted algorithm, and place tasks with a comprehensive priority score higher than a preset priority threshold at the front of the execution queue. The task dependency modeler is configured to identify the temporal or data dependencies between multiple bill of lading tasks. It describes the logical relationships between tasks by constructing a directed acyclic graph and dynamically adjusts the task execution order accordingly to ensure the consistency of cross-task data and the continuity of the process. The global coordinator is configured as the system's command center, sending start commands to the intelligent agent engine based on the status of the execution queue, and monitoring the resource usage of each module.
3. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology as described in claim 1, characterized in that, The intelligent agent engine includes: The multi-level decision-making model consists of a rule-based deterministic reasoning layer and a machine learning-based probabilistic prediction layer. The natural language processing layer is configured to translate business instructions into machine-understandable operation primitives. The business rule inference engine is configured to maintain a financial knowledge graph in real time to ensure that the generated operational strategies do not violate the company's internal control policies. A status feedback listener is configured to establish a bidirectional communication channel with the headless browser executor. It not only receives the page status returned by the executor, but also sends fine-grained operation instructions to the headless browser executor. The fine-grained operation instructions include scroll position, waiting time and input delay parameters. The rule-based deterministic reasoning layer stores expert experience logic to handle highly structured standardized bill of lading steps. The machine learning-based probabilistic prediction layer uses deep neural networks to model unstructured web page element interactions. The intelligent agent engine dynamically selects or fuses two reasoning methods based on the complexity score of the current bill of lading task. When the complexity score is less than a preset low complexity threshold, the deterministic reasoning layer is invoked first. When the complexity score is not less than the preset low complexity threshold, the output of the probabilistic prediction layer is used as a supplementary constraint for the deterministic reasoning layer through a fusion mechanism.
4. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 3, characterized in that, The machine learning-based probabilistic prediction layer incorporates a multimodal sensing neural network model, which includes: A graph neural network is configured to use vectors of a preset dimension as node features to extract features from a web page document object model tree transformed by the headless browser executor. The node features include the tag name, class name, depth level, and associated text semantics of web page elements. The computer vision sub-model is configured to use a convolutional neural network to analyze the real-time page images captured by the headless browser executor, extract the visual saliency features of each functional area on the page, and identify button areas with specific functional attributes. The weighted fusion unit is configured to fuse the structured feature vector output by the graph neural network with the visual feature vector output by the convolutional neural network, and input the fused feature vector into the action decision-maker based on reinforcement learning. The action decision-maker is pre-set with financial operation templates. By comparing the fused feature vector with the feature vector of historical successful operations, it calculates the expected return score of each candidate operation and generates the operation strategy when the expected return score is greater than the preset operation confidence threshold.
5. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 1, characterized in that, The headless browser executor includes: The isolated environment container is configured to assign each bill of lading task an independent lightweight sandbox, so that session information, cached data and token credentials of different tasks do not interfere with each other. The dynamic element locator integrates a semantic similarity matching algorithm and a layout topology analysis engine to identify and locate key operational elements when the target webpage structure changes. The simulated behavior generator is configured to support asynchronous execution logic and simulates keyboard input, mouse hover, drag-and-drop verification, and file upload actions according to the instruction stream issued by the intelligent agent engine. The page state analyzer is configured to scan the document object model rendered by the browser in real time, convert the page's visual features, prompts, and hidden form states into structured data, and send it back to the intelligent agent engine.
6. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 5, characterized in that, The dynamic element locator maintains a feature archive of target elements, which records the semantic fingerprint and layout fingerprint of the elements. The dynamic element locator uses a preset word vector model to vectorize the associated text of all candidate elements on the current page, and calculates the cosine distance between the semantic vector of each candidate element and the semantic vector of the target element in the feature archive. When the cosine distance is less than the preset similarity deviation threshold, the dynamic element locator starts layout topology analysis and constructs a local topology tree by extracting the relative positional relationship and hierarchical topology relationship between the candidate element and the anchor element in the page. The dynamic element locator compares the local topology tree with the topological correlation in the feature archive. When the weighted total score of semantic similarity score and layout topology score exceeds the preset compliance score, the candidate element is determined as the new operation target and the feature archive is updated.
7. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 1, characterized in that, The context-aware module includes: The real-time data collector is used to collect environmental information of the current task. The environmental information includes the operator's identity and permission level, the type of expense report, the detailed amount of each expense, the specific node of the approval flow, and historical operation logs. The business semantic mapping unit is configured to map the environmental information into business feature vectors to provide decision-making basis for the intelligent agent engine. The external policy connector is configured to establish a real-time connection with the enterprise's organizational structure database and financial policy rule base to dynamically obtain the current user's travel allowance standards, departmental budget remaining amount, and compliance constraints. The context-aware module identifies compliance conflicts during business execution and embeds the compliance constraints into the decision input of the intelligent agent engine, prompting the intelligent agent engine to automatically generate early warning prompts or adjust bill of lading strategies.
8. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 1, characterized in that, The voucher generation unit includes: The accounting rule matching engine, which integrates enterprise accounting standards and accounting methods, is used to retrieve the final data successfully submitted by the headless browser executor after the task is completed. The account mapping processor is configured to determine the debit and credit directions and corresponding accounting subjects and auxiliary accounting dimensions based on the expense type, the business department to which it belongs, the associated project number, and the supplier information through multi-level association logic. The amount validator is configured to perform two-way reconciliation logic to ensure that the total amount of the generated accounting voucher is consistent with the amount of the original business document and the amount of the receipt after successful web submission. The voucher automation synthesizer is configured to immediately trigger an early warning mechanism and suspend the voucher download process when it detects account conflicts, missing auxiliary accounting items, or imbalances in debit and credit amounts. The voucher generation unit uses the exchange rate information provided by the context-aware module to automatically convert the data, generate the corresponding exchange gain / loss adjustment entries, and automatically add auxiliary accounting dimensions based on the project code.
9. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 1, characterized in that, The anomaly self-healing manager includes: Anomaly monitoring probes are deployed at various execution stages of the system to capture network timeouts, page crashes, CAPTCHA pop-ups, logic errors, or business interruption anomalies in real time. A tiered response strategy library sets differentiated processing priorities and recovery paths for different types of anomalies. The recovery paths are divided into basic retry level, logical switching level, environment reconstruction level, and manual takeover level. The self-healing execution engine is configured to execute an exponential backoff retry strategy at the basic retry level; send a path redirection request to the intelligent agent engine at the logical switching level to guide the headless browser executor to try an alternative entry point; and forcibly terminate the current browser container process and restart the image environment at the environment reconstruction level, using task snapshots to restore task progress. The human intervention guide is configured to lock the current headless browser session at the manual takeover level and generate a remote assistance request containing execution parameters, historical trajectory graphs, and real-time page control permissions for human intervention.
10. The automated financial bill of lading system integrating artificial intelligence agent and headless browser technology according to claim 1, characterized in that, The system adopts a distributed cluster architecture and is deployed in an enterprise private cloud environment: The task scheduling hub is configured as a distributed central node cluster, which distributes tasks to the nearest execution node based on the geographical location tag of the target financial platform. The decision-making function of the intelligent agent engine is broken down into multiple decision-making microservices and deployed on a containerized platform that supports dynamic scaling, automatically expanding the number of instances based on the backlog of tasks to be processed. The headless browser executor manifests as a large-scale cluster of browser instances, managed by a container orchestration engine, and integrated with a distributed proxy resource management unit to dynamically allocate different egress network addresses. The system also includes a hardware security module for storing digital certificates, electronic signatures, and encryption keys required to access the financial platform. Before performing critical operations, the headless browser executor sends an authentication request to the hardware security module, which then injects a digital signature into the communication stream. The headless browser executor has a thorough cleanup mechanism that automatically destroys temporary files, session information, and sensitive caches in the container after each task, and all data transmission between components uses an encrypted protocol.