Data processing methods, apparatus, computer equipment and readable storage media

By using the master control model and visual verification big model of the geographic intelligent agent to perform multi-dimensional verification and correction of the initial scene image, the problem of inconsistent visual presentation in the geographic intelligent agent is solved and the accuracy of data processing is improved.

CN121999092B · Active · Publication Date: 2026-06-30 · BEIJING INSTITUTE OF SURVEYING AND MAPPING

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING INSTITUTE OF SURVEYING AND MAPPING
Filing Date
2026-04-10
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In traditional technologies, when geographic intelligent agents perform tasks, network latency or rendering bottlenecks can cause inconsistencies between real-time visual images and the execution status reported by the API, resulting in poor data processing accuracy.

Method used

The initial scene image is verified layer by layer through the master control model and the visual verification model in the geographic intelligent agent. Abnormal information is identified, and the target correction logic is determined according to the preset correction strategy. The updated scene image is iteratively generated until the conditions are met.

Benefits of technology

It improves the accuracy of geographic intelligent agent data processing, ensures the accuracy of the visual presentation and content of the initial scene images, and avoids subsequent reasoning based on erroneous results.



Abstract

This application relates to a data processing method, apparatus, computer equipment, and readable storage medium. The method includes: acquiring an initial scene image generated by a geographic information system (GIS) in response to a target operation command; controlling a large visual verification model based on a master control model to perform multi-dimensional image verification on the initial scene image layer by layer, obtaining a structured verification result of the initial scene image; if the structured verification result indicates the presence of abnormal information, determining the target correction logic corresponding to the abnormal information in each layer of image verification dimensions based on the master control model, the abnormal information, and a preset correction strategy; controlling the GIS to generate an updated scene image based on the master control model and the target correction logic, until the updated scene image meets preset iteration conditions, obtaining a target scene image; and performing data processing based on the target scene image to obtain a data processing result. This method can improve the accuracy of data processing by a geographic intelligent agent.

Description

Technical Field

[0001] This application relates to the fields of geographic information systems and artificial intelligence, and in particular to a data processing method, apparatus, computer equipment, and readable storage medium.

Background Technology

[0002] With the development of artificial intelligence agent technology, geographic agents have emerged. Geographic agents have the ability to perceive spatial location, make spatial behavior decisions, and interact in space, and can automatically execute user commands.

[0003] In traditional technologies, after a geographic agent receives natural language commands input by a user, it maps these commands into target commands that the GIS (Geographic Information System) can process through the master control model (LLM). The GIS then executes the task according to the target commands, updates the current view, and sends a task completion notification to the geographic agent after successful task execution. Finally, the geographic agent determines that the task has been successfully executed based on this completion notification, thus achieving data processing such as inspection or monitoring of the target scene.

[0004] However, in current traditional technologies, GIS engines are affected by network latency or rendering bottlenecks during task execution. The visual images they present in real time are often inconsistent with the execution status reported by the API, causing the geographic agent to make subsequent inferences based on incorrect execution results, which in turn leads to poor accuracy in geographic agent data processing.

Summary of the Invention

[0005] Therefore, it is necessary to provide a data processing method, apparatus, computer equipment, and readable storage medium to address the aforementioned technical problems.

[0006] Firstly, this application provides a data processing method applied to a geographic intelligent agent, the geographic intelligent agent comprising a master control model and a large-scale visual verification model, including:

[0007] Acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0008] Based on the master control model, the visual verification big model is controlled to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image;

[0009] If the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension is determined based on the master control model, the abnormal information, and the preset correction strategy.

[0010] Based on the master control model and the target correction logic, the geographic information system is controlled to generate updated scene images until the updated scene images meet the preset iteration conditions, thereby obtaining the target scene image;

[0011] Data processing is performed on the target scene image to obtain the data processing result.

[0012] In one embodiment, the step of controlling the visual verification big model based on the master control model to perform multi-dimensional image verification on the initial scene image layer by layer to obtain the structured verification result of the initial scene image includes:

[0013] The system acquires natural language commands input by the user and view status information from the geographic information system, including camera parameters, view frustum information, and layer status information.

[0014] The main control model performs data processing on the natural language instructions and the view state information to dynamically synthesize the visual verification task description information corresponding to the initial scene image;

[0015] According to the preset priority corresponding to each image verification dimension, the initial scene image and the natural language instruction are verified layer by layer based on the visual verification big model and the visual verification task description information to obtain the structured verification result.
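The synthesis of the visual verification task description in [0013] to [0014] can be illustrated with a minimal sketch. In the method itself the master control model (an LLM) produces this description; the fixed template below is only a stand-in for that step. The `ViewState` class, its field names, and `build_task_description` are hypothetical names introduced for illustration and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass
class ViewState:
    """Hypothetical container for the view status information named in [0013]."""
    camera: dict    # e.g. position, heading, pitch
    frustum: dict   # e.g. field of view, near and far planes
    layers: dict = field(default_factory=dict)  # layer name -> visible flag


def build_task_description(command: str, view: ViewState) -> str:
    """Combine the user command and the view state into a verification prompt.

    A fixed template stands in for the master control model here (assumption).
    """
    visible = sorted(name for name, on in view.layers.items() if on)
    hidden = sorted(name for name, on in view.layers.items() if not on)
    lines = [
        f"User command: {command}",
        f"Camera: {view.camera}",
        f"View frustum: {view.frustum}",
        f"Layers expected to be visible: {', '.join(visible) or 'none'}",
        f"Layers expected to be hidden: {', '.join(hidden) or 'none'}",
        "Check 1 (rendering integrity): is the image fully rendered?",
        "Check 2 (semantic consistency): does the image match the command?",
    ]
    return "\n".join(lines)
```

The two numbered checks at the end of the template mirror the priority order described for the verification dimensions: rendering integrity first, semantic consistency second.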

[0016] In one embodiment, the structured verification result includes a rendering integrity status result and a semantic consistency result; the step of performing multi-dimensional image verification on the initial scene image and the natural language instruction according to the preset priority corresponding to each image verification dimension, based on the visual verification big model and the visual verification task description information, to obtain the structured verification result includes:

[0017] Based on the visual verification big model, semantic extraction is performed on the initial scene image and the visual verification task description information to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information.

[0018] Based on the underlying visual patterns in the visual semantic features, identify whether there are rendering anomalies in the initial scene image, and obtain the rendering integrity status result;

[0019] Based on the attention alignment mechanism between the visual semantic features and the text semantic features, the semantic matching degree between the initial scene image and the visual verification task description information is calculated to obtain the semantic consistency result.
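The semantic matching degree in [0019] can be sketched as a cross-attention style score. The application does not give a formula, so the following is one plausible reading and not the claimed computation: each text feature attends over the visual features with scaled dot-product weights, and the attended visual vector is compared with the text feature by cosine similarity. The function name `semantic_match` and the use of plain Python lists for feature vectors are assumptions; it also assumes non-empty feature lists of equal dimension.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cosine(a, b):
    norm = math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b))
    return _dot(a, b) / norm if norm else 0.0


def _softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_match(text_feats, visual_feats):
    """Average cosine similarity between each text feature and the
    visual vector it attends to (illustrative scoring rule)."""
    dim = len(visual_feats[0])
    scale = math.sqrt(dim)
    sims = []
    for token in text_feats:
        # Attention weights of this text feature over all image patches.
        weights = _softmax([_dot(token, patch) / scale for patch in visual_feats])
        # Weighted sum of patch features.
        attended = [
            sum(w * patch[i] for w, patch in zip(weights, visual_feats))
            for i in range(dim)
        ]
        sims.append(_cosine(token, attended))
    return sum(sims) / len(sims)
```

A score near 1 suggests that the image content aligns with the task description, while a low or negative score would be reported as a semantic consistency anomaly. Any decision threshold is left to the caller.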

[0020] In one embodiment, the step of controlling the visual verification big model based on the master control model to perform multi-dimensional image verification on the initial scene image layer by layer to obtain the structured verification result of the initial scene image includes:

[0021] The initial scene image is sampled multiple times according to a preset dynamic sampling strategy and multi-dimensional verification types to generate verification images corresponding to the verification types in each dimension.

[0022] For each level of verification type, image verification is performed on the verification image corresponding to the current verification type based on the visual verification big model, to obtain the initial verification result for the verification type of the current dimension;

[0023] Structured verification results are generated based on the initial verification results of the verification types for each dimension.
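The per-dimension sampling in [0021] can be sketched as follows. The application does not specify the dynamic sampling strategy, so this example assumes one simple policy: a downsampled overview for the rendering check and a full-resolution crop of a region of interest for the content check. The image is modeled as a list of rows of grayscale values, and the names `downsample`, `crop`, and `sample_for_verification` are hypothetical.

```python
def downsample(image, step):
    """Keep every `step`-th pixel in both directions (global overview)."""
    return [row[::step] for row in image[::step]]


def crop(image, top, left, height, width):
    """Cut out a region of interest at full resolution."""
    return [row[left:left + width] for row in image[top:top + height]]


def sample_for_verification(image, roi):
    """Return one verification image per verification type.

    The mapping from verification type to sampler is an assumed policy,
    not one taken from the application.
    """
    top, left, height, width = roi
    return {
        "rendering_integrity": downsample(image, step=2),
        "semantic_consistency": crop(image, top, left, height, width),
    }
```

Each entry in the returned dictionary would then be passed to the visual verification big model for the matching verification type, and the per-type results combined into the structured verification result.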

[0024] In one embodiment, the structured verification result includes the anomaly priority corresponding to the anomaly information; if the structured verification result indicates the presence of anomaly information, determining the target correction logic corresponding to the anomaly information in each level of image verification dimension based on the master control model, the anomaly information, and a preset correction strategy includes:

[0025] For the structured verification results of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, the abnormality priority and preset correction strategy are matched according to the master control model to determine the target correction logic corresponding to the abnormal information.

[0026] In one embodiment, for the structured verification results of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information is determined by matching the abnormality priority and the preset correction strategy according to the master control model, including:

[0027] If the structured verification result indicates the presence of rendering anomalies, the target correction logic is determined to control the geographic information system to re-render based on the anomaly priority of the rendering anomalies.

[0028] If the structured verification result indicates that there is an image content anomaly, the target correction logic is matched according to the type of the anomaly content in the preset correction strategy based on the anomaly priority of the image content anomaly.
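The two branches in [0027] and [0028] can be sketched as a priority-ordered lookup. The anomaly subtypes, the correction names, and the convention that a lower priority number is handled first are all assumptions made for illustration; the application only states that rendering anomalies lead to re-rendering and that content anomalies are matched by type against the preset correction strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    kind: str      # "rendering" or "content"
    subtype: str   # e.g. "black_block", "target_occluded" (hypothetical labels)
    priority: int  # lower number is handled first (assumption)


# Hypothetical preset correction strategy for image content anomalies.
CONTENT_STRATEGY = {
    "target_occluded": "adjust_camera_pitch_and_heading",
    "coordinate_offset": "realign_2d_layer_to_3d_scene",
    "wrong_layer_visible": "reset_layer_visibility",
}


def select_correction(anomalies):
    """Return (anomaly, correction) pairs ordered by anomaly priority."""
    plan = []
    for anomaly in sorted(anomalies, key=lambda a: a.priority):
        if anomaly.kind == "rendering":
            # Rendering anomalies always map to re-rendering, as in [0027].
            action = "re_render"
        else:
            # Content anomalies are matched by type, as in [0028].
            action = CONTENT_STRATEGY.get(anomaly.subtype, "report_to_user")
        plan.append((anomaly, action))
    return plan
```

Falling back to `report_to_user` for an unknown content anomaly type is a design choice of this sketch, not a behavior stated in the application.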

[0029] Secondly, this application also provides a data processing apparatus applied to a geographic intelligent agent, the geographic intelligent agent including a master control model and a large visual verification model, comprising:

[0030] The acquisition module is used to acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0031] The verification module is used to control the visual verification big model based on the main control model to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image.

[0032] The determination module is used to determine the target correction logic corresponding to the abnormal information in each level of image verification dimension if the structured verification result is that there is abnormal information, based on the master control model, the abnormal information and the preset correction strategy.

[0033] The correction module is used to control the geographic information system to generate updated scene images according to the main control model and the target correction logic, until the updated scene images meet the preset iteration conditions to obtain the target scene image;

[0034] The data processing module is used to process data based on the target scene image to obtain data processing results.

[0035] In one embodiment, the verification module is specifically used to obtain natural language commands input by the user and view status information in the geographic information system, wherein the view status information includes camera parameters, view frustum information, and layer status information.

[0036] The main control model performs data processing on the natural language instructions and the view state information to dynamically synthesize the visual verification task description information corresponding to the initial scene image;

[0037] According to the preset priority corresponding to each image verification dimension, the initial scene image and the natural language instruction are verified layer by layer based on the visual verification big model and the visual verification task description information to obtain the structured verification result.

[0038] In one embodiment, the structured verification result includes a rendering integrity status result and a semantic consistency result; the verification module is specifically used to perform semantic extraction on the initial scene image and the visual verification task description information based on the visual verification big model, to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information;

[0039] Based on the underlying visual patterns in the visual semantic features, identify whether there are rendering anomalies in the initial scene image, and obtain the rendering integrity status result;

[0040] Based on the attention alignment mechanism between the visual semantic features and the text semantic features, the semantic matching degree between the initial scene image and the visual verification task description information is calculated to obtain the semantic consistency result.

[0041] In one embodiment, the verification module is specifically used to perform multiple sampling processes on the initial scene image according to a preset dynamic sampling strategy and multi-dimensional verification types, to generate verification images corresponding to the verification types in each dimension.

[0042] For each level of verification type, image verification is performed on the verification image corresponding to the current verification type based on the visual verification big model, to obtain the initial verification result for the verification type of the current dimension;

[0043] Structured verification results are generated based on the initial verification results of the verification types for each dimension.

[0044] In one embodiment, the structured verification result includes the anomaly priority corresponding to the anomaly information; the determining module is specifically used to determine the target correction logic corresponding to the anomaly information by matching the anomaly priority and the preset correction strategy according to the master control model for the structured verification result of each level of image verification dimension.

[0045] In one embodiment, the determining module is specifically used to determine the target correction logic as controlling the geographic information system to re-render if the structured verification result indicates that there is a rendering anomaly, based on the anomaly priority of the rendering anomaly.

[0046] If the structured verification result indicates that there is an image content anomaly, the target correction logic is matched according to the type of the anomaly content in the preset correction strategy based on the anomaly priority of the image content anomaly.

[0047] Thirdly, this application also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:

[0048] Acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0049] Based on the master control model, the visual verification big model is controlled to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image;

[0050] If the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension is determined based on the master control model, the abnormal information, and the preset correction strategy.

[0051] Based on the master control model and the target correction logic, the geographic information system is controlled to generate updated scene images until the updated scene images meet the preset iteration conditions, thereby obtaining the target scene image;

[0052] Data processing is performed on the target scene image to obtain the data processing result.

[0053] Fourthly, this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the following steps:

[0054] Acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0055] Based on the master control model, the visual verification big model is controlled to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image;

[0056] If the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension is determined based on the master control model, the abnormal information, and the preset correction strategy.

[0057] Based on the master control model and the target correction logic, the geographic information system is controlled to generate updated scene images until the updated scene images meet the preset iteration conditions, thereby obtaining the target scene image;

[0058] Data processing is performed on the target scene image to obtain the data processing result.

[0059] Fifthly, this application also provides a computer program product, including a computer program that, when executed by a processor, performs the following steps:

[0060] Acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0061] Based on the master control model, the visual verification big model is controlled to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image;

[0062] If the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension is determined based on the master control model, the abnormal information, and the preset correction strategy.

[0063] Based on the master control model and the target correction logic, the geographic information system is controlled to generate updated scene images until the updated scene images meet the preset iteration conditions, thereby obtaining the target scene image;

[0064] Data processing is performed on the target scene image to obtain the data processing result.

[0065] The aforementioned data processing method, apparatus, computer equipment, and readable storage medium use the visual verification big model in the geographic agent to perform multi-level, multi-dimensional image verification on the initial scene image fed back by the geographic information system in response to the target operation command. This effectively identifies visual presentation errors in the initial scene image, and the target correction logic suited to the structured verification results in each dimension is matched and determined according to a preset correction strategy. Updated scene images are then generated iteratively according to the target correction logic produced in each iteration, until the structured verification results of the regenerated scene image contain no abnormal information. This ensures the accuracy of the target scene image processed by the geographic agent and improves the accuracy of geographic agent data processing.

Attached Figure Description

[0066] To more clearly illustrate the technical solutions in the embodiments of this application or related technologies, the drawings used in the description of the embodiments of this application or related technologies will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0067] Figure 1 is a flowchart illustrating a data processing method in one embodiment;

[0068] Figure 2 is a flowchart illustrating multi-level verification of an initial scene image in one embodiment;

[0069] Figure 3 is a flowchart illustrating the process of verifying the initial scene image for rendering state and content consistency in one embodiment;

[0070] Figure 4 is a schematic diagram of the process of targeted sampling for verification of the initial scene image at various levels in one embodiment;

[0071] Figure 5 is a flowchart illustrating the determination of the target correction logic in one embodiment;

[0072] Figure 6 is a flowchart illustrating an example of a data processing method in one embodiment;

[0073] Figure 7 is a structural block diagram of a data processing device in one embodiment;

[0074] Figure 8 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0075] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0076] It should be noted that the terms "first," "second," etc., used in this application can be used to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish the first element from the second element. The terms "comprising" and "having," and any variations thereof, used in this application, are intended to cover non-exclusive inclusion. The term "multiple" used in this application refers to two or more. The term "and / or" used in this application refers to one of the embodiments, or any combination of multiple embodiments.

[0077] In one embodiment, as shown in Figure 1, a data processing method is provided. This embodiment takes the application of the method to a geographic intelligent agent as an example. A geographic intelligent agent is a computational entity in a geospatial environment that possesses spatial location perception, spatial behavior decision-making, and spatial interaction capabilities. Compared with general intelligent agents, geographic intelligent agents not only possess autonomy and responsiveness but also explicitly include geographic features (such as coordinates and geometry) and spatial constraints (such as terrain, obstacles, and administrative boundaries). Geographic intelligent agents can simulate the evolution of individuals (such as pedestrians, vehicles, and land use units) in complex geographic environments, and they are a core component in constructing digital twin cities and geo-simulation systems. The geographic intelligent agent can be deployed on a terminal, on a server, or on a system including both, in which case it is implemented through interaction between the terminal and the server. The geographic intelligent agent includes a master control model and a large-scale visual verification model. The method includes the following steps:

[0078] Step 102: Obtain the initial scene image generated by the geographic information system in response to the target operation command.

[0079] In this embodiment, the target operation instruction can be generated by the geographic agent parsing the geographic operation intent input by the user. The geographic agent sends the target operation instruction to the geographic information system, the geographic information system generates an initial scene image based on that instruction, and the geographic agent then obtains the initial scene image. For example, the geographic agent sends, through the master control model, a target operation instruction for capturing the rendering window to the geographic information system (e.g., a two-dimensional or three-dimensional geographic information system). For a two-dimensional geographic information system, the instruction captures the two-dimensional scene, and the resulting two-dimensional scene screenshot serves as the initial scene image. For a three-dimensional geographic information system, the instruction captures the three-dimensional scene, and the resulting three-dimensional scene screenshot serves as the initial scene image.

[0080] Step 104: Based on the master control model, the visual verification big model is controlled to perform multi-dimensional image verification of the initial scene image layer by layer to obtain the structured verification result of the initial scene image.

[0081] The visual verification big model can be a VLM (Vision-Language Model), an artificial intelligence model that combines a large language model with a visual encoder to jointly process images, text, and video.

[0082] In this embodiment, the geographic agent controls a large-scale visual verification model based on a master control model to perform multi-level image verification on the initial scene image. Each level corresponds to image verification in different dimensions, achieving multi-dimensional image verification at each level and obtaining a structured verification result of the initial scene image corresponding to the current command. Specifically, the large-scale visual verification model of the geographic agent is pre-set with a multi-level verification strategy. According to the multi-level verification strategy, the large-scale visual verification model verifies the initial scene image at each level. If the verification result of the current level is that there is no anomaly, the verification of the next level is performed until the image verification of each level is completed, and a structured verification result is constructed based on the verification results of each level.
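The level-by-level control flow described in [0082] can be sketched as below. Each level is represented by a callable that stands in for one call to the visual verification big model; the result layout and the name `verify_layer_by_layer` are hypothetical, since the application does not define the schema of the structured verification result.

```python
from typing import Callable, List, Optional, Tuple

# A check returns None when its level passes, or a description of the anomaly.
Check = Callable[[object], Optional[str]]


def verify_layer_by_layer(image, levels: List[Tuple[str, Check]]) -> dict:
    """Run the checks in priority order and stop at the first failing level."""
    result = {"passed": True, "levels": []}
    for name, check in levels:
        anomaly = check(image)
        result["levels"].append({"level": name, "anomaly": anomaly})
        if anomaly is not None:
            result["passed"] = False
            break  # the next level is only verified once this one is normal
    return result
```

Stopping at the first failing level follows the statement that the next level is verified only when the current level reports no anomaly. It also avoids spending model calls on content checks for an image that has not finished rendering.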

[0083] In a specific embodiment, the visual verification big model can first verify the scene rendering status of the initial scene image. For example, when the geographic information system renders a scene in response to the master control model's instructions, network latency may cause 3D tile data to load slowly, so that "black blocks" or low-resolution "white models" appear in the captured initial scene image. In this case, the visual verification big model first verifies whether the initial scene image is rendered normally. If it is, the visual verification big model then verifies the content of the initial scene image to determine whether the target object to be inspected is occluded, or whether the 2D scene map information is offset from the 3D image coordinates. Through this layer-by-layer, multi-dimensional verification, the initial scene image is checked in different dimensions to ensure that its displayed content is clear and correct.
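The "black block" symptom can be illustrated with a simple pixel heuristic. This is a deliberately swapped-in stand-in and not the method of the application, which relies on the visual verification big model to recognize rendering anomalies. The thresholds and the names `dark_fraction` and `looks_unrendered` are assumptions, and the image is modeled as a non-empty list of rows of grayscale values in the range 0 to 255.

```python
def dark_fraction(image, threshold=10):
    """Fraction of pixels whose grayscale value is at or below `threshold`."""
    pixels = [value for row in image for value in row]
    return sum(1 for value in pixels if value <= threshold) / len(pixels)


def looks_unrendered(image, threshold=10, max_dark=0.3):
    """Flag the image when too much of it is near-black (assumed cutoff)."""
    return dark_fraction(image, threshold) > max_dark
```

A cheap pre-check of this kind could filter obvious failures before a model call, but it cannot detect low-resolution "white models" or distinguish a genuinely dark night scene from missing tiles, which is why a learned model is used in the embodiment.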

[0084] In an optional embodiment, the geographic agent can also be built on a multimodal large model, such as Gemini or ChatGPT4o. A multimodal large model combines natural language processing with visual processing capabilities. The geographic agent can therefore perform multi-dimensional image verification on the initial scene image layer by layer directly through the multimodal large model to obtain the structured verification result, without an additional master control model issuing the verification instructions.

[0085] Step 106: If the structured verification result indicates the presence of abnormal information, determine the target correction logic corresponding to the abnormal information in each level of image verification dimension based on the master control model, the abnormal information, and the preset correction strategy.

[0086] In this embodiment, a preset correction strategy is configured in the master control model of the geographic intelligent agent. If the structured verification result fed back by the visual verification big model to the master control model contains abnormal information, the master control model first analyzes the abnormal information, determines the verification level to which it belongs, and matches the type of abnormal information at that level against the preset correction strategy to determine the target correction logic for the abnormal information.

[0087] In an optional embodiment, the master control model can perform semantic abstraction and classification on the abnormal information to determine its type, run a root-cause analysis workflow based on that type, and then perform similarity calculation or graph reasoning over the preset correction strategy according to the root cause to determine the target correction logic.
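The similarity calculation mentioned in [0087] can be sketched with a token-overlap (Jaccard) score. The application does not name a similarity measure, so Jaccard similarity is an assumption chosen for simplicity; an embedding-based measure would be an equally valid reading. The strategy entries, the `min_score` cutoff, and the function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two short descriptions."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


# Hypothetical preset correction strategy: root cause -> correction logic.
PRESET_STRATEGY = {
    "tile data still loading": "wait_and_re_render",
    "target hidden behind building": "raise_camera_and_tilt",
    "2d layer offset from 3d scene": "realign_layers",
}


def match_correction(root_cause, strategy=PRESET_STRATEGY, min_score=0.2):
    """Return the correction whose root cause is most similar, or None."""
    best_key = max(strategy, key=lambda key: jaccard(root_cause, key))
    if jaccard(root_cause, best_key) < min_score:
        return None  # nothing in the preset strategy is close enough
    return strategy[best_key]
```

Returning `None` when no entry is similar enough leaves room for the master control model to fall back to another path, such as graph reasoning or reporting to the user.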

[0088] Step 108: Based on the master control model and target correction logic, control the geographic information system to generate updated scene images until the updated scene images meet the preset iteration conditions, and obtain the target scene image.

[0089] In this embodiment, the geographic agent controls the geographic information system (GIS) according to the master control model and target correction logic. The GIS regenerates an updated scene image and performs multi-dimensional image verification layer by layer on the updated scene image, following the same principle as step 104, to obtain a structured verification result corresponding to the updated scene image. If the updated scene image meets a preset iteration condition, the master control model uses the updated scene image as the target scene image. The preset iteration condition can be that the initial structured verification result corresponding to the updated scene image does not contain any abnormal information, or that the number of times the geographic agent performs multi-dimensional image verification on the updated scene image reaches a preset threshold. For example, after completing five multi-dimensional image verifications and corrections on the initial scene image, the geographic agent uses the currently corrected and updated scene image as the target scene image.

[0090] If the structured verification result still contains abnormal information after the updated scene image has undergone multi-dimensional image verification at each level, the master control model re-analyzes the abnormal information corresponding to the updated scene image to obtain new target correction logic. The geographic information system is then controlled to regenerate the scene image according to the new target correction logic, and this repeats until the structured verification result shows no abnormal information, at which point the target scene image is obtained.

[0091] While the geographic intelligent agent regenerates the updated scene image based on the master control model, the target correction logic, and the geographic information system, if the number of retries exceeds the preset retry threshold, the master control model reports the abnormal information to the user and stops regenerating the scene image.
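
As a non-authoritative illustration, the verify, correct, and regenerate loop of step 108 may be sketched as follows. The names `LoopOutcome`, `run_correction_loop`, `toy_verify`, and `toy_regenerate` are hypothetical, and the `verify` and `regenerate` callables stand in for the visual verification big model and the geographic information system, respectively. The sketch follows the variant in which the anomalies are reported once the retry threshold is reached.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LoopOutcome:
    target_image: Optional[object]  # None when the loop gave up
    retries: int                    # regenerations actually performed
    unresolved: List[str]           # anomalies to report to the user, if any


def run_correction_loop(
    initial_image: object,
    verify: Callable[[object], List[str]],
    regenerate: Callable[[object, List[str]], object],
    max_retries: int = 5,  # assumed threshold; the text gives five as an example
) -> LoopOutcome:
    """Verify, then regenerate, until no anomaly remains or retries run out."""
    image, retries = initial_image, 0
    while True:
        anomalies = verify(image)
        if not anomalies:
            return LoopOutcome(image, retries, [])
        if retries >= max_retries:
            # Retry budget exhausted: stop and hand the anomalies back.
            return LoopOutcome(None, retries, anomalies)
        image = regenerate(image, anomalies)
        retries += 1


# Toy stand-ins: the "image" is an integer that becomes clean at value 2.
def toy_verify(image: int) -> List[str]:
    return [] if image >= 2 else ["black block"]


def toy_regenerate(image: int, anomalies: List[str]) -> int:
    return image + 1


outcome = run_correction_loop(0, toy_verify, toy_regenerate)
```

With the toy stand-ins, the loop returns the clean image after two regenerations; with `max_retries=1` it gives up and returns the unresolved anomaly list instead.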

[0092] Step 110: Perform data processing based on the target scene image to obtain the data processing result.

[0093] In this embodiment, after the geographic agent completes the verification of the initial scene image, confirms that it does not contain any abnormal information, and obtains the target scene image, the geographic agent carries out the user instruction based on the target scene image to obtain the data processing result. For example, the user instruction may direct the geographic agent to conduct a security inspection of the target location. After the geographic agent confirms that the initial scene image containing the target location is clear and its content is correct, it uses this initial scene image as the target scene image. The master control model inputs this target scene image into the visual verification big model, which performs security analysis to obtain the security analysis result of the target location as the data processing result. In this embodiment, the visual verification big model is not limited to multi-dimensional verification of the initial scene image: it is in essence a visual language big model, capable of analyzing and reasoning about image content. Therefore, the visual verification big model can also respond to data processing instructions from the master control model to verify and evaluate image content. For example, it can reason about whether a security channel is occupied.

[0094] In the above data processing method, the visual verification model in the geographic agent performs multi-level and multi-dimensional image verification on the initial scene image fed back by the geographic information system for the target command. This can effectively identify visual presentation errors in the initial scene image and determine the target correction logic that matches the structured verification results under different dimensions according to the preset correction strategy. This enables the iterative generation of updated scene images according to the target correction logic generated in each iteration until the structured verification results of the regenerated initial scene image are all normal information. This ensures the accuracy of the target scene image processed by the geographic agent and improves the accuracy of geographic agent data processing.

[0095] In one exemplary embodiment, as shown in Figure 2, step 104 includes steps 202 to 206, wherein:

[0096] Step 202: Obtain the natural language commands input by the user and the view status information in the geographic information system.

[0097] The view status information includes camera parameters, view frustum information, and layer status information.

[0098] In this embodiment, the master control model (LLM) of the geographic intelligent agent receives the input geographic operation intent, parses and generates geospatial operation instructions, and executes the operation instructions through a geographic information system (e.g., Web GIS, a geographic information system that uses Web technology to enable communication between a server and a client). For example, the natural language instruction input by the user could be "retrieve a real-time 3D view of the fire escape on the east side of Building A in the Central Business District". Then, the master control model converts the natural language instruction into a geospatial operation instruction that the geographic information system can recognize and process, i.e., a GIS instruction. For example, the GIS instruction corresponding to the current natural language instruction is "moveTo(11X.XX, 3X.XX); setCamera(pitch:-45,heading:90)".

[0099] If the execution result of the geographic information system for the geospatial operation command fails, the relevant information of the execution failure is directly fed back to the master control model, and the error is handled by the master control model; if the geographic information system executes successfully, in addition to returning the execution result (which is the initial scene image), the geographic agent also obtains the view status information of the current geographic scene in real time (geometric situation snapshot) and feeds it back to the master control model.

[0100] For two-dimensional geographic information systems, the view status information includes camera parameters, view frustum information, and layer status information, such as the map center point, scale, rotation angle, and current view bounding box (Extent) corresponding to the initial scene image, as well as spatial coordinate parameters and layer status information containing the current viewpoint. For three-dimensional geographic information systems, the view status information further includes parameters such as camera height, pitch angle, roll angle, field of view (FOV), and near and far clipping planes. For two-dimensional and three-dimensional linked or integrated geographic information systems, in addition to the above-mentioned two-dimensional and three-dimensional parameters, the view status information should also include the two-dimensional and three-dimensional view association matrix and synchronization lock status parameters to determine the spatial alignment consistency between the two dimensions.
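
By way of a hedged example, the view status information described above may be organized as a nested record. All class and field names below (`View2D`, `Camera3D`, `LinkState`, `ViewStateSnapshot`) are hypothetical, and the sample values are illustrative only; the application does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class View2D:
    center: Tuple[float, float]                 # map center point
    scale: float
    rotation_deg: float
    extent: Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y)
    visible_layers: List[str]                   # layer status information


@dataclass
class Camera3D:
    height: float
    pitch_deg: float
    roll_deg: float
    fov_deg: float
    near_clip: float
    far_clip: float


@dataclass
class LinkState:
    association_matrix: List[List[float]]       # 2D/3D view association matrix
    sync_locked: bool                           # synchronization lock status


@dataclass
class ViewStateSnapshot:
    view_2d: View2D
    camera_3d: Optional[Camera3D] = None        # present for 3D systems
    link: Optional[LinkState] = None            # present for linked 2D/3D systems

    def mode(self) -> str:
        """Classify the snapshot by which parameter groups are populated."""
        if self.camera_3d is None:
            return "2d"
        return "3d" if self.link is None else "linked"


# Illustrative values only.
snapshot = ViewStateSnapshot(
    view_2d=View2D((0.0, 0.0), 5000.0, 0.0, (-1.0, -1.0, 1.0, 1.0), ["buildings"]),
    camera_3d=Camera3D(300.0, -45.0, 0.0, 60.0, 1.0, 10000.0),
)
```

Keeping the three parameter groups separate lets the master control model tell at a glance whether 2D/3D alignment checks apply to a given snapshot.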

[0101] Step 204: Based on the master control model, perform data processing on natural language instructions and view state information to dynamically synthesize visual verification task description information corresponding to the initial scene image.

[0102] In this embodiment, after receiving the user's input natural language commands and the view status information returned by the geographic information system, the master control model, based on its understanding of the operational intent and its context awareness, performs joint semantic parsing and spatial context modeling on the natural language commands and view status information. This involves combining the natural language commands with camera parameters, view frustum information, and layer status information for parsing and modeling, dynamically generating visual verification task description information that highly matches the current scene. This visual verification task description information clarifies the target semantics, area of interest, key elements, and expected spatial relationships for this image verification, serving as a guiding prompt for the subsequent large-scale visual verification model to verify the initial scene image.

[0103] Specifically, the master control model first performs cross-modal alignment between semantic elements in natural language instructions (e.g., target object "fire lane", directional term "east side", functional attribute "real-time 3D view") and geometric parameters in view state information (e.g., camera position, orientation, layer visibility, etc.) to construct the intent information of the current geographic agent's instructions to the geographic information system. Based on this, the master control model uses its pre-trained language generation capabilities to synthesize semantically complete natural language description information for the visual verification task. For example, this description information could be "verify whether the target building is fully loaded," serving as the control instruction for the large-scale visual verification model.

[0104] The visual verification task description information not only reflects the original user intent, but also integrates the actual spatial configuration of the current geographical scene. This can guide the visual verification big model to determine the target to be verified based on multi-dimensional information, thereby improving the targeting and accuracy of visual verification.
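
For illustration only, the synthesis of step 204 may be approximated by a fixed template that merges the parsed instruction elements with the view state. In the application this text is generated by the master control model itself; the function `build_task_description`, the dictionary keys, and the layer names below are assumptions of this sketch.

```python
from typing import Dict


def build_task_description(intent: Dict[str, str], view_state: Dict[str, object]) -> str:
    """Merge instruction semantics and view geometry into one verification prompt."""
    layers = ", ".join(view_state["visible_layers"]) or "none"
    return (
        f"Verify that the {intent['target']} on the {intent['direction']} of "
        f"{intent['reference']} is visible in the {intent['view_type']}. "
        f"Camera heading {view_state['heading_deg']} deg, "
        f"pitch {view_state['pitch_deg']} deg. "
        f"Visible layers: {layers}."
    )


# Semantic elements taken from the example instruction; layer names are made up.
description = build_task_description(
    {
        "target": "fire escape",
        "direction": "east side",
        "reference": "Building A",
        "view_type": "real-time 3D view",
    },
    {"heading_deg": 90, "pitch_deg": -45, "visible_layers": ["buildings", "roads"]},
)
```

A template cannot adapt its wording to the scene as a language model can, but it shows which inputs the description draws on.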

[0105] Step 206: According to the preset priority corresponding to each image verification dimension, perform multi-dimensional image verification of the initial scene image and natural language instructions layer by layer based on the visual verification big model and visual verification task description information to obtain structured verification results.

[0106] In terms of model selection, this application's embodiments adapt to mainstream open-source multimodal large models (e.g., InternVL2.5, Qwen2-VL, or Llava series) to utilize their pre-trained visual feature extraction capabilities. Alternatively, for the specific characteristics of geographical scenes, LoRA (Low-Rank Adaptation) or full-parameter fine-tuning techniques can be used to fine-tune on geographically specific datasets containing samples such as oblique photography, tile black blocks, and model drift, thereby improving the sensitivity to GIS rendering anomalies.

[0107] In this embodiment, the geographic agent inputs the initial scene image and visual verification task description information into a large-scale visual verification model, which then performs a multi-level, multi-dimensional automated image verification process. Specifically, the large-scale visual verification model performs progressive verification of the initial scene image at different abstraction levels according to a preset priority. First, it verifies the basic rendering state of the initial scene image, identifying any rendering defects caused by network latency, incomplete data loading, or system anomalies. If the rendering state is normal, it proceeds to the next verification level, further detecting spatial occlusion or 2D / 3D coordinate offset issues. After passing the aforementioned lower-level verifications, it continues to perform higher-level semantic and functional verifications, such as the integrity, geometric consistency, and attribute compliance of the target object. Each level of verification is guided by the visual verification task description information generated by the master model, combining the intent of natural language instructions with the current visual verification task description information to achieve multi-dimensional verification based on multi-level verification instructions. The verification results at each level are generated and accumulated sequentially, and finally integrated by the visual verification big model into a structured verification result that includes visual matching status, environmental element description and anomaly information details, which is then fed back to the main control model for subsequent judgment and decision-making.
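
The progressive, priority-ordered verification described above can be sketched as a short-circuiting loop over verification levels. This is a minimal sketch: the level names, the `verify_layer_by_layer` function, and the dictionary used as a stand-in for the scene image are hypothetical, and each check would in practice be a call to the visual verification big model.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict[str, bool]], List[str]]  # returns anomaly descriptions


def verify_layer_by_layer(image: Dict[str, bool], levels: List[Tuple[str, Check]]) -> dict:
    """Run checks in priority order; stop at the first level that reports anomalies."""
    report = {"passed_levels": [], "failed_level": None, "anomalies": []}
    for name, check in levels:
        anomalies = check(image)
        if anomalies:
            report["failed_level"] = name
            report["anomalies"] = anomalies
            break  # higher levels are meaningless if a lower one fails
        report["passed_levels"].append(name)
    return report


# Hypothetical levels, lowest abstraction first.
LEVELS: List[Tuple[str, Check]] = [
    ("rendering", lambda img: ["black block"] if img.get("black_block") else []),
    ("spatial", lambda img: ["target occluded"] if img.get("occluded") else []),
    ("semantic", lambda img: ["target missing"] if img.get("target_missing") else []),
]

report = verify_layer_by_layer({"occluded": True, "target_missing": True}, LEVELS)
```

In the example the spatial level fails, so the semantic level is never evaluated even though it would also have reported an anomaly.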

[0108] In this embodiment, the visual verification task description is dynamically generated by the master control model, and the visual verification big model is driven to carry out multi-dimensional image verification in a hierarchical manner according to priority. This realizes the full-chain automated verification of geographic scene images from rendering quality and spatial consistency to semantic correctness, improves the automated verification of the execution results of complex spatial instructions by the geographic agent, avoids the geographic agent from processing tasks based on the initial scene image with anomalies, and further improves the accuracy of data processing.

[0109] In one exemplary embodiment, as shown in Figure 3, the structured verification result includes the rendering integrity status result and the semantic consistency result; step 206 includes steps 302 to 306, wherein:

[0110] Step 302: Based on the visual verification big model, semantic extraction is performed on the initial scene image and the visual verification task description information to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information.

[0111] In this embodiment, the geographic agent inputs the initial scene image and the visual verification task description information generated by the master control model into the multimodal visual verification model (VLM). The image encoder built into the visual verification model first performs hierarchical feature extraction on the initial scene image, capturing low-level visual information (e.g., edges, textures, color distribution) and high-level semantic information (e.g., building outlines, road structures, 3D model integrity) in the image, generating high-dimensional vector representations of visual semantic features.

[0112] For the visual verification task description information, the visual verification big model uses a built-in text encoder to perform natural language parsing on the visual verification task description information, converting the natural language format visual verification task description information into text semantic features aligned with the feature space of visual semantic features. For example, the text semantic features encode semantic information such as user intent, spatial orientation information, and expected attributes.

[0113] Step 304: Based on the underlying visual patterns in the visual semantic features, identify whether there are rendering anomalies in the initial scene image and obtain the rendering integrity status result.

[0114] In this embodiment, the visual verification big model identifies whether there are rendering anomalies in the initial scene image based on the low-level visual patterns in the visual semantic features. It performs low-level image quality and rendering result analysis on the visual semantic features extracted from the initial scene image to determine whether there are any anomalies in the geographic information system during the view loading and rendering process, and obtains the rendering integrity status result.

[0115] Specifically, the rendering anomaly detection submodule analyzes the visual semantic features to identify whether there are typical anomaly patterns such as large smooth, textureless areas, abnormal color blocks (e.g., black or grayish-white blocks), broken model surfaces, blurred or missing textures, etc., and identifies whether problems such as "black screen", "white model", "mosaic" and "model collapse" occur due to network transmission delay, unloaded tile data, GPU (Graphics Processing Unit) rendering failure or cache errors.
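
As a rough stand-in for the rendering anomaly detection submodule, a pixel-statistics heuristic can flag large black or grayish-white blocks. This is not the model-based analysis described above; it is a simplified substitute, and the function names and thresholds (`low`, `high`, `max_ratio`) are assumptions chosen for the example.

```python
from typing import List, Tuple


def block_ratios(gray: List[List[int]], low: int = 16, high: int = 240) -> Tuple[float, float]:
    """Return the fractions of near-black and near-white pixels in a grayscale image."""
    total = sum(len(row) for row in gray)
    if total == 0:
        raise ValueError("empty image")
    dark = sum(1 for row in gray for p in row if p <= low)
    bright = sum(1 for row in gray for p in row if p >= high)
    return dark / total, bright / total


def render_status(gray: List[List[int]], max_ratio: float = 0.4) -> str:
    """Label the image by its dominant flat block, if any (assumed threshold)."""
    dark, bright = block_ratios(gray)
    if dark > max_ratio:
        return "black_block"
    if bright > max_ratio:
        return "white_block"
    return "normal"


black_frame = [[0] * 4 for _ in range(4)]
mid_gray_frame = [[128] * 4 for _ in range(4)]
half_white_frame = [[255, 255, 128, 128] for _ in range(2)]
```

Such a heuristic can only catch flat color blocks; broken model surfaces or blurred textures would still need the feature-level analysis performed by the model.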

[0116] Step 306: Based on the attention alignment mechanism between visual semantic features and textual semantic features, calculate the semantic matching degree between the initial scene image and the visual verification task description information to obtain the semantic consistency result.

[0117] In this embodiment, if the rendering integrity status result obtained by the visual verification big model is normal, the terminal calculates the semantic matching degree between the text semantic features and the visual semantic features based on the attention alignment mechanism that the visual verification big model applies between the two. This verifies whether the content actually presented in the initial scene image is semantically consistent with the user's instruction intent and the visual verification task description information, thereby obtaining a semantic consistency result. For example, if the semantic matching degree between the initial scene image and the visual verification task description information is greater than or equal to a preset matching degree threshold, the semantic consistency result is that the two are semantically consistent; if the semantic matching degree is less than the preset matching degree threshold, the semantic consistency result is that the two are semantically inconsistent, and the structured verification result is abnormal.
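
The threshold decision above can be illustrated with a simple similarity score. The sketch below substitutes cosine similarity between two already-aligned feature vectors for the attention alignment mechanism, which is internal to the model; the function names and the threshold value of 0.8 are assumptions.

```python
import math
from typing import Dict, Sequence, Union


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_consistency(
    visual: Sequence[float], text: Sequence[float], threshold: float = 0.8
) -> Dict[str, Union[float, bool]]:
    """Consistent when the matching degree is at or above the preset threshold."""
    score = cosine_similarity(visual, text)
    return {"score": score, "consistent": score >= threshold}


match = semantic_consistency([1.0, 0.0], [1.0, 0.0])
mismatch = semantic_consistency([1.0, 0.0], [0.0, 1.0])
```

The comparison uses greater-than-or-equal, matching the rule that a matching degree equal to the threshold still counts as consistent.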

[0118] In one specific embodiment, the visual verification big model calculates the semantic matching degree between the initial scene image and the visual verification task description information through an attention alignment mechanism between visual semantic features and textual semantic features, and performs multi-dimensional verification. For example, it computes the attention relationship between the terms "fire lane," "east side," and "unobstructed" in the visual verification task description information and the visual features in the initial scene image. Specifically, multi-dimensional verification includes verification of the existence of the target object, spatial location consistency, target object occlusion, and layer consistency. The target object existence verification determines whether the initial scene image contains the target object named in the visual verification task description information, for example, whether there is a fire lane on the east side of building A. The spatial location consistency verification checks whether the spatial orientation of the target object meets the instruction requirements, for example, whether the fire lane is located on the east side of the building rather than the south side. The target object occlusion verification detects whether the target object is partially or completely occluded by trees, vehicles, or other buildings because of the viewing angle. The layer consistency verification, in a 2D / 3D linked scene, determines whether the spatial coordinates of the 2D base map (Tianditu) and the 3D model are aligned and whether there is any offset or misalignment.

[0119] In this embodiment, a multimodal visual verification big model is used to perform phased semantic extraction, rendering state analysis, and content consistency comparison between the initial scene image and the visual verification task description information, thereby realizing multi-dimensional verification of the initial scene image and improving the accuracy of the verification of the initial scene image.

[0120] In one exemplary embodiment, as shown in Figure 4, step 104 includes steps 402 to 406, wherein:

[0121] Step 402: Perform multiple sampling processes on the initial scene image according to the preset dynamic sampling strategy and multi-dimensional verification types to generate verification images corresponding to each verification type.

[0122] In this embodiment, after acquiring the current visual presentation output by the geographic information system, the terminal first performs adaptive lightweight preprocessing. Based on a preset dynamic sampling strategy and multi-dimensional verification types, it performs multiple sampling processes on the initial scene image to generate verification images corresponding to each verification type. This optimizes the inference latency of the multimodal processing unit in the large visual verification model. Specifically, the visual verification task description information issued by the master control model of the geographic agent includes multi-dimensional verification types. Each verification type requires a different image resolution. Therefore, before performing multi-dimensional image verification, the geographic agent performs sampling processing on the same initial scene image at different resolutions according to a preset dynamic sampling strategy, obtaining verification images at multiple resolutions.

[0123] In a specific embodiment, for global anomaly verification tasks such as "black block detection" and "tile loading failure", the terminal determines that it has low dependence on detail resolution but high requirements for real-time response. Therefore, it adopts a single-patch resampling strategy to scale the original image proportionally to a uniform low resolution, such as 448×448 pixels, to generate a lightweight image to be verified. This significantly reduces the number of input tokens for the subsequent large visual verification model, achieves millisecond-level preliminary screening, and ensures real-time feedback of the instruction stream.

[0124] For tasks such as annotation text recognition and comparison of minute ground features, the geographic agent recognizes that this verification dimension type belongs to the high-precision requirement type. Based on the mapping matrix, the sampling resolution is increased to 896×896 pixels or the original image resolution is retained. A multi-patch segmentation strategy is adopted to divide the image into multiple overlapping sub-regions to ensure that minute targets are not lost due to downsampling. Each sub-region is used as an independent image to be verified and input into the subsequent verification process.
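
The two sampling strategies above may be sketched as follows. The verification type names, the `plan_sampling` and `overlapping_tiles` functions, and the overlap of 128 pixels are assumptions of this sketch; the application states only that the sub-regions overlap, not by how much.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)

# Hypothetical names for the high-precision verification types.
HIGH_PRECISION = {"annotation_text", "minute_features"}


def _starts(length: int, tile: int, stride: int) -> List[int]:
    """Tile origins along one axis; the last tile is pinned to the far edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def overlapping_tiles(width: int, height: int, tile: int = 896, overlap: int = 128) -> List[Box]:
    """Cover the image with tile-sized boxes that overlap by at least `overlap` pixels."""
    stride = tile - overlap
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, stride)
        for x in _starts(width, tile, stride)
    ]


def plan_sampling(check_type: str, width: int, height: int) -> Dict[str, object]:
    """Pick multi-patch tiling for high-precision checks, else one 448x448 resample."""
    if check_type in HIGH_PRECISION:
        return {"strategy": "multi_patch", "boxes": overlapping_tiles(width, height)}
    return {"strategy": "single_patch", "size": (448, 448)}


tiles = overlapping_tiles(2000, 896)
```

Pinning the final tile to the image edge avoids a narrow leftover strip, at the cost of a larger overlap between the last two tiles.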

[0125] Step 404: For each level dimension verification type, perform image verification on the image to be verified corresponding to the current verification type based on the visual verification big model to obtain the initial verification result of the current dimension verification type.

[0126] In this embodiment, the terminal sends images to be verified at different dimensions and resolutions to the visual verification model and combines the semantic priority of the verification type with the visual verification task description information to perform image verification at the current level. For example, for a globally abnormal image to be verified with an input resolution of 448×448, the visual verification model quickly determines whether there are typical rendering anomalies such as large black areas, gray-white blocks, and missing textures through shallow convolution and global pooling, and outputs a label of "abnormal rendering status" or "normal" and a confidence level. For a high-resolution multi-slice input of 896×896, the visual verification model enables a fine-grained object detection module to analyze each slice to determine whether there are specified annotation text, font clarity, and whether the orientation is correct, and fuses the results of multiple slices to generate a complete semantic judgment, thereby obtaining the initial verification results of different verification types.

[0127] Furthermore, the terminal dynamically allocates timeout thresholds for multimodal processing units based on the weight of the current task in the mapping matrix, preventing the agent logic from being suspended due to the pressure of rendering complex 3D scenes. Specifically, the terminal uses a dynamic adjustment mechanism based on the timeout thresholds pre-set in the preset dynamic sampling strategy to allocate differentiated maximum waiting times for verification tasks of different dimensions. For example, global anomaly verification is set to ≤1 second, while complex 3D semantic comparison can be relaxed to 3~5 seconds, preventing the main control logic from stalling due to the blocking of a single task.
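
One possible way to enforce differentiated maximum waiting times is to run each verification task in a worker thread and stop waiting once its budget expires. The task names, the `TIMEOUTS_S` table, and `run_with_timeout` are hypothetical; the 1-second and 5-second values follow the examples given above.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, Optional

# Assumed per-task budgets in seconds.
TIMEOUTS_S: Dict[str, float] = {"global_anomaly": 1.0, "semantic_3d": 5.0}


def run_with_timeout(
    task_type: str,
    fn: Callable[..., object],
    *args: object,
    timeouts: Optional[Dict[str, float]] = None,
) -> Dict[str, object]:
    """Wait for fn(*args) no longer than the budget assigned to task_type."""
    budget = (timeouts or TIMEOUTS_S)[task_type]
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args)
    try:
        return {"status": "ok", "result": future.result(timeout=budget)}
    except FutureTimeout:
        # The caller moves on; the worker thread itself is not interrupted.
        return {"status": "timeout", "result": None}
    finally:
        executor.shutdown(wait=False)


def slow_check(delay_s: float) -> str:
    """Toy verification task that simply sleeps before answering."""
    time.sleep(delay_s)
    return "normal"


fast = run_with_timeout("global_anomaly", slow_check, 0.0)
```

Because Python threads cannot be forcibly stopped, the timeout only unblocks the control logic; a production system would also need a way to cancel the underlying inference call.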

[0128] Step 406: Generate structured verification results based on the initial verification results of each dimension's verification type.

[0129] In this embodiment, the terminal aggregates the initial verification results from multiple verification dimensions to obtain structured verification results, which serve as the output of the geographic agent. The geographic agent categorizes the scattered detection items according to the verification dimensions. For example, the geographic agent classifies "missing textures," "white models," and "unloaded tiles" into the verification type "rendering anomaly," thus forming structured verification results.
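
The aggregation in step 406 can be illustrated by grouping individual detection items under their verification type. The mapping table and the `build_structured_result` function are hypothetical; the three rendering items come from the example above, and the image content entry is added for contrast.

```python
from typing import Dict, List

# Hypothetical mapping from detection item to verification type.
CATEGORY_OF: Dict[str, str] = {
    "missing textures": "rendering anomaly",
    "white models": "rendering anomaly",
    "unloaded tiles": "rendering anomaly",
    "target occluded": "image content anomaly",
}


def build_structured_result(findings: List[str]) -> Dict[str, object]:
    """Group scattered detection items by verification type."""
    by_type: Dict[str, List[str]] = {}
    for item in findings:
        category = CATEGORY_OF.get(item, "uncategorized")
        by_type.setdefault(category, []).append(item)
    return {"has_anomaly": bool(findings), "by_type": by_type}


structured = build_structured_result(["missing textures", "target occluded", "unloaded tiles"])
```

Items that the table does not recognize are kept under an "uncategorized" key rather than dropped, so the master control model still sees them.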

[0130] In this embodiment, the inference latency of the large visual verification model for multi-dimensional image verification is reduced by using a dynamic sampling strategy and a multi-dimensional hierarchical verification mechanism, thereby improving the efficiency of the geographic agent in performing multi-dimensional visual verification of the initial scene image.

[0131] In an exemplary embodiment, the structured verification result includes the anomaly priority corresponding to the anomaly information; step 106 includes step 1061. Wherein:

[0132] Step 1061: For the structured verification results of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, match the abnormality priority and the preset correction strategy according to the master control model to determine the target correction logic corresponding to the abnormal information.

[0133] In this embodiment, after generating the structured verification result, the terminal further inputs it into the master control model (LLM) of the geographic intelligent agent. The master control model performs semantic analysis on the abnormal information in the structured verification result and, in conjunction with a preset abnormality priority system and a correction strategy knowledge base, performs automated matching to determine the target correction logic appropriate to the current abnormality type. The structured verification result contains abnormal information and its corresponding abnormality priority (e.g., P0, P1, P2, etc.). Based on this priority label, the master control model calls different levels of response mechanisms to achieve differentiated intelligent correction decisions. By semantically aligning and logically comparing the visual abnormality feedback with the geometric situation snapshot, the master control model determines whether the current visual presentation deviates from the expected spatial logic. If an inconsistency exists, the corresponding correction logic is automatically triggered. This correction logic includes reissuing correction instructions, adjusting rendering parameters, changing the interaction perspective, or prompting user intervention until the visual verification state is consistent with the original geometric situation, thereby improving the system's closed-loop control capability and autonomous operation and maintenance level.

[0134] In this embodiment, the master control model matches the anomaly priority in the structured verification results with the preset correction strategy, thereby realizing the automatic correction of multi-level anomalies and improving the accuracy of the geographic information system in handling anomalies under visual feedback.

[0135] In one exemplary embodiment, as shown in Figure 5, step 1061 includes steps 502 to 504, wherein:

[0136] Step 502: If the structured verification result indicates the existence of rendering anomalies, determine the target correction logic as controlling the geographic information system to re-render based on the anomaly priority of the rendering anomalies.

[0137] In this embodiment, when the structured verification result contains anomaly information of the "rendering anomaly" category, such as black blocks, tile loading failures, white models, missing textures, etc., and its anomaly priority is marked as P0 (low-level error), the master control model determines that the problem affects the stability of the geographic agent's task execution, and therefore requires a correction operation on the geographic information system. At this time, the target correction logic is determined to be engine correction, specifically including issuing a redraw request to the geographic information engine, clearing local cache data, restarting the rendering pipeline, or switching the data source channel to force the system to re-execute the scene rendering process. For example, when a large area of black screen or unloaded textures is detected in the 3D geographic information system, the master control model will automatically trigger a "redraw current viewport" command, accompanied by resource cleanup actions, to ensure that the next round of rendering output returns to normal. This mechanism effectively avoids agent misjudgment or command stagnation caused by local rendering stuttering, ensuring the continuous and stable operation of the geographic information system in complex environments.

[0138] Step 504: If the structured verification result indicates that there are abnormal image contents, the target correction logic is matched according to the type of abnormal content in the preset correction strategy based on the abnormality priority of the abnormal image contents.

[0139] In this embodiment of the application, when abnormal information of the "image content abnormality" category is identified in the structured verification result, such as missing target features, incorrect annotations, spatial location offset, feature occlusion, etc., the master control model further matches in the preset correction strategy library according to its abnormality priority and specific type, and dynamically selects the optimal target correction logic.

[0140] Specifically, if the anomaly priority is P1 (visually invisible), such as when a key target is detected to be completely occluded or unable to be identified due to a limited viewpoint, the target correction logic is interactive correction. The master control model automatically adjusts the position, posture, or view zoom level of the virtual camera, such as increasing the observation height, rotating the viewpoint to avoid obstacles, or making the occluded objects transparent / semi-transparent, thereby restoring the visibility of the target. If the anomaly priority is P2 (position deviation), such as when the projection position of a 2D feature in the 3D scene is detected to be drifted or the coordinate offset of the ground feature exceeds the threshold, the target correction logic is logical correction. The master control model recalculates the spatial coordinate transformation parameters, calibrates the projection relationship, corrects the spatial anchoring position of the feature, and achieves integrated alignment of 2D and 3D data. In 2D and 3D overlay display scenarios, the master control model can also combine the semantic information fed back by the multimodal visual verification module to judge the consistency of the same geographic entity in different dimensional views. If a mismatch is found, the spatial consistency verification process is automatically started, and automatic correction is completed by reverse optimization of the projection matrix or updating the registration parameters.
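
The priority-based matching of steps 502 and 504 may be sketched as a lookup table. The table layout, the `select_corrections` function, and the fallback to user intervention for unknown priorities are assumptions; the three correction logics and their example actions are taken from the description above. Handling P0 before P1 before P2 is also an assumption of this sketch.

```python
from typing import Dict, List, Tuple

PRESET_CORRECTIONS: Dict[str, Dict[str, object]] = {
    "P0": {  # low-level rendering error
        "logic": "engine correction",
        "actions": ["redraw viewport", "clear local cache", "restart rendering pipeline"],
    },
    "P1": {  # target not visible
        "logic": "interactive correction",
        "actions": ["raise camera", "rotate viewpoint", "make occluders transparent"],
    },
    "P2": {  # position deviation
        "logic": "logical correction",
        "actions": ["recompute coordinate transform", "calibrate projection"],
    },
}
FALLBACK: Dict[str, object] = {"logic": "user intervention", "actions": ["report to user"]}


def select_corrections(anomalies: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Pair each anomaly with its correction logic, lowest priority label first."""
    # String sort is sufficient for single-digit labels such as P0..P9.
    ordered = sorted(anomalies, key=lambda a: a["priority"])
    return [
        (a["description"], PRESET_CORRECTIONS.get(a["priority"], FALLBACK)["logic"])
        for a in ordered
    ]


plan = select_corrections([
    {"priority": "P2", "description": "2D feature drift"},
    {"priority": "P0", "description": "black block"},
])
```

In the example the rendering error is scheduled ahead of the position deviation, since re-rendering may already change what the later checks observe.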

[0141] In this embodiment, the master control model matches the anomaly priority in the structured verification results with the preset correction strategy, thereby realizing the automatic correction of multi-level anomalies and improving the accuracy of the geographic information system in handling anomalies under visual feedback.

[0142] In a specific embodiment, as shown in Figure 6, an example of a data processing method is provided, wherein:

[0143] Step 601: The main control model acquires user instructions, performs intent understanding on the user instructions, and obtains the instruction intent.

[0144] Step 602: The master control model maps the instruction intent to the corresponding operation instructions of the geographic information system.

[0145] Step 603: The geographic information system executes the operation instructions.

[0146] Step 604: The geographic information system confirms whether the operation command was executed successfully. If it was executed successfully, proceed to step 605; if it failed, proceed to step 614.

[0147] Step 605: The geographic information system obtains the system snapshot (view status information) corresponding to the operation command and feeds the system snapshot back to the master control model.

[0148] Step 606: The master control model performs interpretation processing on the system snapshot to obtain the semantic information of the system snapshot.

[0149] Step 607: The geographic information system acquires the initial scene image obtained after the execution of the operation command.

[0150] Step 608: The master control model generates a verification task description based on the semantic information of the system snapshot and the initial scene image.

[0151] Step 609: The visual verification big model performs multimodal semantic extraction on the initial scene image and the verification task description, performs multi-level and multi-dimensional verification on the initial scene image, and obtains multi-dimensional verification results.

[0152] Step 610: The visual verification big model constructs a verification report based on the verification results of each dimension, and feeds the verification report back to the master control model.

[0153] Step 611: The master control model performs evidence fusion and logical comparison on the initial scene image based on the verification report to determine whether the initial scene image conforms to the logic.

[0154] Step 612: If the initial scene image does not conform to the logic, proceed to step 613; if the initial scene image conforms to the logic, proceed to step 614.

[0155] Step 613: The master control model determines the correction logic of the initial scene image, adjusts the parameters or performs retry processing according to the correction logic, and executes step 603.

[0156] Step 614: The master control model feeds back to the user the final result of the task processing based on the corrected target scene image.
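The control flow of steps 603 to 614 can be sketched as a bounded verify-and-correct loop. This is only an illustrative outline: the callables `execute`, `capture_image`, `verify`, and `correct` stand in for the geographic information system, the visual verification big model, and the master control model, and the `max_iterations` bound is an assumed example of a preset iteration condition.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Params = Dict[str, Any]


def run_task(
    execute: Callable[[Params], bool],
    capture_image: Callable[[], Any],
    verify: Callable[[Any], Dict[str, Any]],
    correct: Callable[[Params, Dict[str, Any]], Params],
    initial_params: Params,
    max_iterations: int = 3,
) -> Tuple[Optional[Any], int]:
    """Return (target scene image or None, number of attempts made)."""
    params = initial_params
    for attempt in range(1, max_iterations + 1):
        if not execute(params):            # steps 603-604: execution failed
            return None, attempt           # go straight to user feedback (step 614)
        image = capture_image()            # step 607: scene image after execution
        report = verify(image)             # steps 608-611: verification report
        if report["consistent"]:           # step 612: image conforms to the logic
            return image, attempt
        params = correct(params, report)   # step 613: adjust parameters and retry
    return None, max_iterations            # iteration budget exhausted
```

Returning `None` when the budget is exhausted leaves it to the caller to decide what is reported to the user in step 614; the source does not state what happens in that case.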

[0157] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages in other steps. It is understood that the steps in different embodiments can be freely combined as needed, and all non-contradictory solutions formed by such combinations are within the scope of protection of this application.

[0158] Based on the same inventive concept, this application also provides a data processing apparatus for implementing the data processing method described above. The solution provided by this apparatus is similar to the implementation scheme described in the above method; therefore, the specific limitations in one or more data processing apparatus embodiments provided below can be found in the limitations of the data processing method described above, and will not be repeated here.

[0159] In one exemplary embodiment, as shown in Figure 7, a data processing device 700 is provided, including: an acquisition module 701, a verification module 702, a determination module 703, a correction module 704, and a data processing module 705, wherein:

[0160] The acquisition module 701 is used to acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0161] The verification module 702 is used to control the visual verification big model based on the master control model to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image.

[0162] The determination module 703 is used to determine the target correction logic corresponding to the abnormal information in each level of image verification dimension if the structured verification result shows that there is abnormal information, based on the master control model, the abnormal information and the preset correction strategy.

[0163] The correction module 704 is used to control the geographic information system to generate updated scene images based on the master control model and target correction logic, until the updated scene images meet the preset iteration conditions and the target scene image is obtained.

[0164] The data processing module 705 is used to process data based on the target scene image to obtain the data processing result.

[0165] In one embodiment, the verification module 702 is specifically used to obtain the natural language command input by the user and the view status information in the geographic information system. The view status information includes camera parameters, view frustum information, and layer status information.

[0166] Based on the master control model, data processing is performed on natural language instructions and view state information to dynamically synthesize visual verification task description information corresponding to the initial scene image;

[0167] Based on the preset priority corresponding to each image verification dimension, the initial scene image and natural language instructions are verified layer by layer using the visual verification big model and visual verification task description information to obtain structured verification results.
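To make the inputs of this embodiment concrete, the sketch below groups the view status information (camera parameters, view frustum information, layer status information) into one record and fills a text template with it. In the application the description is synthesized dynamically by the master control model; the fixed template, the `ViewState` fields, and the function name here are assumptions used purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ViewState:
    # All field names are hypothetical; the source only names the three categories.
    camera_position: Tuple[float, float, float]  # longitude, latitude, height
    camera_heading_deg: float
    frustum_near: float
    frustum_far: float
    visible_layers: List[str] = field(default_factory=list)


def synthesize_task_description(instruction: str, view: ViewState) -> str:
    """Combine the user's instruction and the view state into a verification prompt."""
    lon, lat, height = view.camera_position
    layers = ", ".join(view.visible_layers) or "none"
    return (
        f"User instruction: {instruction}\n"
        f"Camera: lon={lon}, lat={lat}, height={height}, "
        f"heading={view.camera_heading_deg} deg\n"
        f"View frustum: near={view.frustum_near}, far={view.frustum_far}\n"
        f"Visible layers: {layers}\n"
        "Check in priority order: rendering integrity, then semantic consistency."
    )
```

The last template line reflects the layer-by-layer ordering by preset priority; the particular order shown is an assumption consistent with the two result types named in the next embodiment.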

[0168] In one embodiment, the structured verification result includes the rendering integrity status result and the semantic consistency result; the verification module 702 is specifically used to perform semantic extraction on the initial scene image and the visual verification task description information based on the visual verification big model, so as to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information.

[0169] Based on the underlying visual patterns in visual semantic features, the system identifies whether there are rendering anomalies in the initial scene image and obtains the rendering integrity status result.

[0170] Based on the attention alignment mechanism between visual semantic features and textual semantic features, the semantic matching degree between the initial scene image and the visual verification task description information is calculated to obtain the semantic consistency result.
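The application does not give a formula for the semantic matching degree. One possible reading, offered as an assumption, is a cross-attention score: each text feature attends over the visual features with softmax weights, and the cosine similarity between the text feature and its attended visual summary is averaged over all text features. The sketch below uses plain Python lists in place of real model features, and all names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def _softmax(scores: List[float]) -> List[float]:
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_matching_degree(text_feats: List[Vector], visual_feats: List[Vector]) -> float:
    """Average cosine similarity between each text feature and its attended visual feature."""
    if not text_feats or not visual_feats:
        raise ValueError("both feature lists must be non-empty")
    similarities = []
    for t in text_feats:
        weights = _softmax([_dot(t, v) for v in visual_feats])
        attended = [
            sum(w * v[i] for w, v in zip(weights, visual_feats))
            for i in range(len(t))
        ]
        denom = math.sqrt(_dot(t, t)) * math.sqrt(_dot(attended, attended))
        similarities.append(_dot(t, attended) / denom if denom else 0.0)
    return sum(similarities) / len(similarities)
```

With this definition the score lies in [-1, 1]; a threshold on it could then be used to decide whether the semantic consistency result is reported as a pass or as an anomaly.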

[0171] In one embodiment, the verification module 702 is specifically used to perform multiple sampling processes on the initial scene image according to a preset dynamic sampling strategy and multi-dimensional verification types, and generate verification images corresponding to each verification type.

[0172] For each level of verification type, the verification image corresponding to the current verification type is verified based on the visual verification big model to obtain the initial verification result of the current verification type.

[0173] Structured verification results are generated based on the initial verification results of each verification type.
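The sampling and aggregation described in this embodiment can be sketched as follows. The choice of one full-frame sample for rendering integrity and four quadrant samples for semantic consistency is an assumption made for illustration, as are the function names and the result layout; `verify_box` stands in for a call to the visual verification big model.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


def sample_boxes(width: int, height: int, verification_type: str) -> List[Box]:
    """Return the regions to verify for one verification type."""
    if verification_type == "rendering_integrity":
        return [(0, 0, width, height)]                  # whole frame
    if verification_type == "semantic_consistency":
        half_w, half_h = width // 2, height // 2        # four quadrants
        return [
            (0, 0, half_w, half_h),
            (half_w, 0, width, half_h),
            (0, half_h, half_w, height),
            (half_w, half_h, width, height),
        ]
    raise ValueError(f"unknown verification type: {verification_type!r}")


def build_structured_result(
    width: int,
    height: int,
    verification_types: List[str],
    verify_box: Callable[[str, Box], bool],
) -> Dict[str, Dict[str, object]]:
    """Verify each type in the given (priority) order and aggregate the findings."""
    result: Dict[str, Dict[str, object]] = {}
    for vtype in verification_types:
        findings = [verify_box(vtype, box) for box in sample_boxes(width, height, vtype)]
        result[vtype] = {"samples": len(findings), "passed": all(findings)}
    return result
```

The list order of `verification_types` plays the role of the level-by-level ordering; a fuller implementation might stop early when a higher-priority type fails.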

[0174] In one embodiment, the structured verification result includes the anomaly priority corresponding to the abnormal information; the determination module 703 is specifically used to, for the structured verification result of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, match the anomaly priority with the preset correction strategy according to the master control model to determine the target correction logic corresponding to the abnormal information.

[0175] In one embodiment, the determination module 703 is specifically used to, if the structured verification result indicates the presence of a rendering anomaly, determine, based on the anomaly priority of the rendering anomaly, that the target correction logic is to control the geographic information system to re-render.

[0176] If the structured verification result indicates the presence of an image content anomaly, the target correction logic is matched in the preset correction strategy according to the type of the anomalous content, based on the anomaly priority of the image content anomaly.

[0177] Each module in the aforementioned data processing device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.

[0178] In one exemplary embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 8. The computer device includes a processor, memory, input / output interfaces, a communication interface, a display unit, and an input device. The processor, memory, and input / output interfaces are connected via a system bus, and the communication interface, display unit, and input device are also connected to the system bus via the input / output interfaces. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The input / output interfaces are used for exchanging information between the processor and external devices. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, Near Field Communication (NFC), or other technologies. When the computer program is executed by the processor, it implements a data processing method. The display unit is used to form a visible image and can be a display screen, a projection device, or a virtual reality imaging device. The display screen can be an LCD screen or an e-ink screen. The input device of the computer device can be a touch layer covering the display screen, or buttons, trackballs, or touchpads set on the casing of the computer device, or external keyboards, touchpads, or mice, etc.

[0179] Those skilled in the art will understand that the structure shown in Figure 8 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0180] In one exemplary embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:

[0181] Acquire the initial scene image generated by the geographic information system in response to the target operation command;

[0182] Based on the master control model, the visual verification big model is used to perform multi-dimensional image verification of the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image.

[0183] If the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension is determined based on the master control model, the abnormal information, and the preset correction strategy.

[0184] Based on the master control model and target correction logic, the geographic information system is controlled to generate updated scene images until the updated scene images meet the preset iteration conditions, thus obtaining the target scene image.

[0185] Data processing is performed on the target scene image to obtain the data processing results.

[0186] In one embodiment, the processor, when executing a computer program, also performs the following steps:

[0187] Acquire the natural language instructions input by the user and the view status information in the geographic information system, the view status information including camera parameters, view frustum information, and layer status information.

[0188] Based on the master control model, data processing is performed on natural language instructions and view state information to dynamically synthesize visual verification task description information corresponding to the initial scene image;

[0189] Based on the preset priority corresponding to each image verification dimension, the initial scene image and natural language instructions are verified layer by layer using the visual verification big model and visual verification task description information to obtain structured verification results.

[0190] In one embodiment, the processor, when executing a computer program, also performs the following steps:

[0191] Based on the visual verification big model, semantic extraction is performed on the initial scene image and the visual verification task description information to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information.

[0192] Based on the underlying visual patterns in visual semantic features, the system identifies whether there are rendering anomalies in the initial scene image and obtains the rendering integrity status result.

[0193] Based on the attention alignment mechanism between visual semantic features and textual semantic features, the semantic matching degree between the initial scene image and the visual verification task description information is calculated to obtain the semantic consistency result.

[0194] In one embodiment, the processor, when executing a computer program, also performs the following steps:

[0195] The initial scene image is sampled multiple times according to the preset dynamic sampling strategy and multi-dimensional verification types to generate verification images corresponding to each verification type.

[0196] For each level of verification type, the verification image corresponding to the current verification type is verified based on the visual verification big model to obtain the initial verification result of the current verification type.

[0197] Structured verification results are generated based on the initial verification results of each verification type.

[0198] In one embodiment, the processor, when executing a computer program, also performs the following steps:

[0199] For the structured verification result of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, the anomaly priority is matched with the preset correction strategy according to the master control model to determine the target correction logic corresponding to the abnormal information.

[0200] In one embodiment, the processor, when executing a computer program, also performs the following steps:

[0201] If the structured verification result indicates the presence of a rendering anomaly, it is determined, based on the anomaly priority of the rendering anomaly, that the target correction logic is to control the geographic information system to re-render.

[0202] If the structured verification result indicates the presence of an image content anomaly, the target correction logic is matched in the preset correction strategy according to the type of the anomalous content, based on the anomaly priority of the image content anomaly.

[0203] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps in the above method embodiments.

[0204] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.

[0205] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with relevant regulations.

[0206] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile memory and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, artificial intelligence (AI) processors, etc., and are not limited to these.

[0207] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this application.

[0208] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A data processing method, characterized in that the method is applied to a geographic intelligent agent that includes a master control model and a visual verification big model, the method comprising: acquiring the initial scene image generated by the geographic information system in response to the target operation command; controlling, based on the master control model, the visual verification big model to perform multi-dimensional image verification on the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image; if the structured verification result indicates the presence of abnormal information, determining, based on the master control model, the abnormal information, and the preset correction strategy, the target correction logic corresponding to the abnormal information in each level of image verification dimension; controlling, based on the master control model and the target correction logic, the geographic information system to generate updated scene images until the updated scene images meet the preset iteration conditions, thereby obtaining the target scene image; and performing data processing on the target scene image to obtain the data processing result; wherein the controlling, based on the master control model, the visual verification big model to perform multi-dimensional image verification on the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image, includes: acquiring the natural language instruction input by the user and the view status information in the geographic information system, the view status information including camera parameters, view frustum information, and layer status information; performing, based on the master control model, data processing on the natural language instruction and the view status information to dynamically synthesize the visual verification task description information corresponding to the initial scene image; and performing, according to the preset priority corresponding to each image verification dimension, multi-dimensional image verification on the initial scene image and the natural language instruction layer by layer based on the visual verification big model and the visual verification task description information, to obtain the structured verification result; wherein the structured verification result includes a rendering integrity status result and a semantic consistency result, and the performing, according to the preset priority corresponding to each image verification dimension, multi-dimensional image verification on the initial scene image and the natural language instruction layer by layer based on the visual verification big model and the visual verification task description information, to obtain the structured verification result, includes: performing, based on the visual verification big model, semantic extraction on the initial scene image and the visual verification task description information to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information; identifying, based on the underlying visual patterns in the visual semantic features, whether there are rendering anomalies in the initial scene image, to obtain the rendering integrity status result; and calculating, based on the attention alignment mechanism between the visual semantic features and the text semantic features, the semantic matching degree between the initial scene image and the visual verification task description information, to obtain the semantic consistency result.

2. The method according to claim 1, characterized in that the controlling, based on the master control model, the visual verification big model to perform multi-dimensional image verification on the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image, includes: sampling the initial scene image multiple times according to a preset dynamic sampling strategy and multi-dimensional verification types, to generate images to be verified corresponding to the verification type of each dimension; for the verification type of each level dimension, verifying the corresponding image to be verified based on the visual verification big model, to obtain the initial verification result of the verification type of the current dimension; and generating the structured verification result based on the initial verification results of the verification types of each dimension.

3. The method according to claim 1, characterized in that, The structured verification result includes the anomaly priority corresponding to the anomaly information; if the structured verification result indicates the presence of anomaly information, the target correction logic corresponding to the anomaly information in each level of image verification dimension is determined based on the master control model, the anomaly information, and the preset correction strategy, including: For the structured verification results of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, the abnormality priority and preset correction strategy are matched according to the master control model to determine the target correction logic corresponding to the abnormal information.

4. The method according to claim 3, characterized in that, For the structured verification results of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information is determined by matching the abnormality priority and the preset correction strategy according to the master control model, including: If the structured verification result indicates the presence of rendering anomalies, the target correction logic is determined to control the geographic information system to re-render based on the anomaly priority of the rendering anomalies. If the structured verification result indicates that there is an image content anomaly, the target correction logic is matched according to the type of the anomaly content in the preset correction strategy based on the anomaly priority of the image content anomaly.

5. A data processing apparatus, characterized in that the apparatus is applied to a geographic intelligent agent that includes a master control model and a visual verification big model, the apparatus comprising: an acquisition module, used to acquire the initial scene image generated by the geographic information system in response to the target operation command; a verification module, used to control, based on the master control model, the visual verification big model to perform multi-dimensional image verification on the initial scene image layer by layer, so as to obtain the structured verification result of the initial scene image; a determination module, used to determine, if the structured verification result indicates the presence of abnormal information, the target correction logic corresponding to the abnormal information in each level of image verification dimension, based on the master control model, the abnormal information, and the preset correction strategy; a correction module, used to control the geographic information system to generate updated scene images according to the master control model and the target correction logic, until the updated scene images meet the preset iteration conditions, so as to obtain the target scene image; and a data processing module, used to perform data processing based on the target scene image to obtain the data processing result; wherein the verification module is specifically used to: acquire the natural language instruction input by the user and the view status information in the geographic information system, the view status information including camera parameters, view frustum information, and layer status information; perform, based on the master control model, data processing on the natural language instruction and the view status information to dynamically synthesize the visual verification task description information corresponding to the initial scene image; and perform, according to the preset priority corresponding to each image verification dimension, multi-dimensional image verification on the initial scene image and the natural language instruction layer by layer based on the visual verification big model and the visual verification task description information, to obtain the structured verification result; wherein the structured verification result includes a rendering integrity status result and a semantic consistency result, and the verification module is specifically used to: perform, based on the visual verification big model, semantic extraction on the initial scene image and the visual verification task description information to obtain the visual semantic features corresponding to the initial scene image and the text semantic features corresponding to the visual verification task description information; identify, based on the underlying visual patterns in the visual semantic features, whether there are rendering anomalies in the initial scene image, to obtain the rendering integrity status result; and calculate, based on the attention alignment mechanism between the visual semantic features and the text semantic features, the semantic matching degree between the initial scene image and the visual verification task description information, to obtain the semantic consistency result.

6. The apparatus according to claim 5, characterized in that the verification module is specifically used to: sample the initial scene image multiple times according to a preset dynamic sampling strategy and multi-dimensional verification types, to generate images to be verified corresponding to the verification type of each dimension; for the verification type of each level dimension, verify the corresponding image to be verified based on the visual verification big model, to obtain the initial verification result of the verification type of the current dimension; and generate the structured verification result based on the initial verification results of the verification types of each dimension.

7. The apparatus according to claim 5, characterized in that the structured verification result includes the anomaly priority corresponding to the abnormal information; and the determination module is specifically used to, for the structured verification result of each level of image verification dimension, if the structured verification result indicates the presence of abnormal information, match the anomaly priority with the preset correction strategy according to the master control model to determine the target correction logic corresponding to the abnormal information.

8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 4.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 4.

10. A computer program product, comprising a computer program, characterized in that when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 4.