A VR large-space spatial mapping precision evaluation method

By constructing a 3D map and collecting real-time data from immersive terminal devices, combined with visual, UWB, IMU, and acoustic assessments, the problem of inconsistency between equipment locations and virtual displays in emergency drills of large public buildings was solved. This enabled automatic error correction and high-frequency training, improving the effectiveness of the drills and reducing costs.

CN122196477A · Pending · Publication Date: 2026-06-12 · 中视互联(北京)科技有限公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
中视互联(北京)科技有限公司
Filing Date
2026-03-13
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In emergency drills in large public buildings, existing technologies are prone to discrepancies between equipment locations and virtual displays under low light or smoke conditions, affecting path selection and safe distance determination, and increasing the cost of drill organization and debriefing.

Method used

By constructing a 3D map, marking the coordinates of key points, collecting real-time data using immersive terminal devices, and displaying the relative positions of key points and objects in real time, combined with real-time quality assessments using vision, UWB, IMU, and acoustics, an automated inspection report is generated, enabling automatic error correction and high-frequency training.

🎯Benefits of technology

Maintaining controllable mapping error under occlusion conditions reduces the incomparability of evaluation results caused by environmental fluctuations, improves training effectiveness, and reduces organizational and review costs.

✦ Generated by Eureka AI based on patent content.

Abstract

The application relates to the technical field of VR large space and provides a VR large-space spatial mapping precision evaluation method, which comprises the following steps: constructing a three-dimensional map, labeling the coordinates of predetermined key points, collecting real-time data, obtaining an accurate position distribution through mapping, and displaying relative positions in real time; synchronously mapping the positions of multiple people, displaying their mutual distances and formation changes, quantifying the longitudinal and transverse spacing of the team and the formation stability, and combining the collected data with the mapping result; calculating the mapping deviation from the preset key points and comparing it with a preset threshold to obtain a quantifiable error index, automatically issuing a reminder and triggering a correction process when the error exceeds the limit, and saving the record as a replayable data packet for post-event analysis; and, in the same large space, switching disturbances according to a preset disturbance sequence, collecting the mapping results under each disturbance condition, constructing training samples and sensitivity curves, standardizing the mapping evaluation process, and generating an automatic inspection report.

Description

Technical Field

[0001] This invention relates to the field of VR large-space technology, specifically to a method for evaluating the accuracy of VR large-space spatial mapping.

Background Technology

[0002] VR large-space spatial mapping refers to a technology that uses positioning technology to combine a large, freely movable physical space with a virtual world, matching the virtual content with the geometry of the real environment. In emergency drills, it can align the position of the drillers with key points on the site in real time, ensuring the accuracy of retrieving tools and selecting routes.

[0003] Chinese patent application CN202510653640.0 discloses a method for evaluating the accuracy of VR large-space spatial mapping. This method includes: real-time acquisition of raw sensor data at the front end, with feature extraction and time synchronization to generate preliminary pose estimates, observation factors, and keyframes; the front end stores the packaged keyframe data and sensor metadata in a circular buffer and transmits them to the back end according to timestamps; after receiving the packaged keyframe data, the back end constructs pose, velocity, bias, and environmental feature state nodes for each keyframe in the factor graph. By detecting the instantaneous jumps of "bright drift" in the short term and compensating for the sensor biases of "dark drift" in the long term, it overcomes the inability of traditional SLAM to address short-term abrupt changes and long-term error accumulation at the same time, suppressing both short-term positioning anomalies and long-term map distortion and thereby improving the accuracy and robustness of VR large-space mapping.

[0004] In the field of VR large-space technology, existing solutions can detect the instantaneous jumps of "bright drift" in the short term and compensate for the sensor biases of "dark drift" in the long term, overcoming the inability of traditional SLAM to handle both short-term abrupt changes and long-term error accumulation. In emergency drill scenarios in large public buildings, these technologies provide a relatively stable position display under good lighting. In low light or smoke, however, the device position becomes inconsistent with the virtual display, and the accurate position cannot be restored for a long time after a temporary change in the environment. This affects the training effectiveness of path selection and safe distance determination, and increases the cost of drill organization and review.

[0005] To address the aforementioned issues, this invention proposes a VR large-space spatial mapping accuracy evaluation method. This method reconstructs paths and locations, identifies key equipment, maps and quantifies collaborative capabilities, substitutes references under occlusion conditions, quantifies errors and provides automatic warnings, summarizes review evidence, conducts high-frequency training and scenario variations, and generates automatic inspection reports. This reduces the cost of organizing and reviewing exercises and improves training effectiveness.

Summary of the Invention

[0006] In view of the existing problems mentioned above, a VR large-space spatial mapping accuracy evaluation method is proposed.

[0007] The technical solution adopted by this invention to solve the above-mentioned technical problems is: a VR large-space spatial mapping accuracy evaluation method, comprising:

[0008] A 3D map is constructed at the target site, and the coordinates of pre-determined key points are marked. Real-time data is collected using immersive terminal devices, and the precise location distribution is obtained through mapping. The relative positions of key points and objects are displayed in real time, which is used to present the location information of personnel picking up and using equipment.

[0009] The system synchronously maps and displays the positions of multiple people and their mutual spacing and formation changes, quantifies the longitudinal and transverse spacing of the formation and the stability of the formation, and combines the collected data with the mapping results to form reference position information under occlusion conditions.

[0010] The mapping deviation is calculated based on preset key points and compared with preset thresholds to obtain quantifiable error indicators. When the error exceeds the limit, an automatic reminder is given and a correction process is triggered. By recording pose sequences, error curves and relocation events, the records are saved as a replayable data package for post-event review and analysis.

[0011] In the same large space, the perturbation is switched according to a preset perturbation sequence, and the mapping results under each perturbation condition are collected to construct training samples and sensitivity curves. The mapping evaluation process is standardized, and an automated inspection report is generated for acceptance and daily inspection.

[0012] As a preferred embodiment, the specific steps for constructing a three-dimensional map of the target site and marking the coordinates of predetermined key points are as follows:

[0013] The physical building floor plan of the target site is imported into the mapping system. The absolute coordinates of key points are measured and recorded using a laser rangefinder and registered with the 3D map to obtain a unified global coordinate system. The 3D map is obtained by first scanning the ground with LiDAR to obtain a high-density point cloud, then using a head-mounted camera to take panoramic photos and run a reconstruction algorithm to generate a mesh to fill details. The 3D map is coarsely aligned with the physical building floor plan and the marked key points, and the data is saved in point cloud format.

[0014] Structural corner points are selected as anchor points for the 3D map. The 3D coordinates of key points are marked based on the anchor points. A unique ID, name, and description of the actual objects on site are recorded for each key point. Real-time data of the trainees are collected using immersive terminal devices. The sensor data is mapped to the 3D coordinate system by hardware calibration of the immersive terminal devices.

[0015] As a preferred embodiment, the specific steps for collecting real-time data using an immersive terminal device are as follows:

[0016] The intrinsic parameters and distortion coefficients of the camera in the immersive terminal device are calibrated to obtain the camera's focal length, principal point coordinates, and tangential distortion parameters. The static zero bias, scale factor, and temperature drift characteristics of the device's IMU are calibrated to obtain the IMU's bias, scaling, and noise model parameters. Camera observations are collected at key points in the site. The extrinsic parameters between the device and the 3D map coordinate system are estimated using the least squares method to obtain the transformation matrix from the device coordinate system to the map coordinate system. By performing extrinsic parameter calibration of the camera and IMU, the relative pose between the camera and the IMU is obtained.

[0017] As a preferred embodiment, the specific steps for real-time display of the relative positions of key points and physical objects are as follows:

[0018] Based on the transformation matrix from the device coordinate system to the map coordinate system, the device pose is transformed, and the obtained global coordinates are used to draw the trajectory and current position on the 3D map. When the trainees approach the key point, the key point name, object description, and current deviation are displayed on the visualization interface, providing a basis for distinguishing between positioning error and object movement.

[0019] As a preferred embodiment, the specific steps for synchronously mapping and displaying the positions of multiple people and their mutual spacing and formation changes are as follows:

[0020] Before the exercise, a unique identifier is established for each participant and bound to the immersive terminal device, and a mapping table between the participants' unique identifiers and the terminal devices is established. Time synchronization is implemented for each immersive terminal device using PTP, supplemented by periodic clock deviation estimation. Each immersive terminal device timestamps each acquisition stream locally and attaches a time source identifier. The server corrects the data of each immersive terminal device according to the timestamp and performs frame-level alignment of the multiple data streams according to the time window.

[0021] Using the pre-calibrated extrinsic parameter matrix from the device coordinate system to the map coordinate system, the local pose of each terminal is converted into the global pose in the map coordinate system. The formula is as follows:

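[0022] $$T_{\mathrm{map}}(t) = T_{\mathrm{ext}} \circ T_{\mathrm{dev}}(t),$$

[0023] where $t$ represents the current time, $T_{\mathrm{map}}(t)$ represents the global pose in the map coordinate system, $T_{\mathrm{dev}}(t)$ represents the local pose in the device coordinate system, $T_{\mathrm{ext}}$ represents the extrinsic parameter matrix, and $\circ$ represents the composition operator of rigid transformations;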

[0024] Quality metrics are calculated for each channel of the immersive terminal device. The visual quality metric is the feature matching success rate within the sliding window. The UWB metric is the smoothed signal-to-noise ratio after linear normalization. The IMU integrity metric is derived from the frame loss rate. Each channel's metric is smoothed by an exponential moving average and then linearly mapped to generate non-negative weights, which are then normalized to form the fusion weights. An extended Kalman filter is used to establish a state estimator for each trainee, with the state vector set as $\mathbf{s} = [x, y, z, \dot{x}, \dot{y}, \dot{z}, \theta]^{\mathsf{T}}$, where $x$, $y$, and $z$ represent the three-dimensional coordinates of the trainee's position, $\dot{x}$, $\dot{y}$, and $\dot{z}$ represent the time derivatives of the position in the corresponding directions, and $\theta$ represents the rotation angle in the horizontal plane. IMU data is used for prediction, and visual pose, UWB distance, and acoustic measurements are used as correction observations. The filter outputs the instantaneous pose and covariance matrix. The pairing distance between any two trainees is calculated at each alignment time using the following formula:

[0025] $$d_{ij}(t) = \lVert \mathbf{p}_i(t) - \mathbf{p}_j(t) \rVert_2,$$

[0026] where $i$ and $j$ represent the indexes of the participants in the exercise, $d_{ij}(t)$ represents the pairing distance between trainee $i$ and trainee $j$ at time $t$, $\mathbf{p}_i(t)$ represents the instantaneous pose of trainee $i$ at time $t$, and $\mathbf{p}_j(t)$ represents the instantaneous pose of trainee $j$ at time $t$.

[0027] As a preferred embodiment, the specific steps for quantifying the longitudinal and transverse spacing and formation stability of the formation are as follows:

[0028] All pairing distances are constructed into a matrix, and the reference travel vector is determined by the team's centroid velocity to obtain the longitudinal and lateral components. A local neighborhood is defined for each trainee with their three nearest teammates, and the local average distance and variance are calculated within a time window. When the visual statistics exceed the corresponding threshold, it is judged as a failure. UWB measurement calculation, IMU odometry prediction, acoustic echo ranging, and map semantic constraint projection are used in sequence according to the determined priority to generate the reference pose. The pose estimates from each source are fused according to real-time weights, and the identifier and uncertainty value of each source are recorded simultaneously.

[0029] As a preferred embodiment, the specific steps of calculating the mapping deviation based on preset key points and comparing it with a preset threshold are as follows:

[0030] During the mapping initialization phase, a set of key points is pre-registered in the 3D map, and the coordinates of each key point are used as the benchmark for calculating the mapping deviation. During online operation, the system-labeled position of each key point is obtained at each alignment time frame, and the point position deviation and global statistics are calculated. When any indicator exceeds the preset threshold, the system generates a structured over-limit event, writes it to the event log as a set of predefined fields, and triggers an automatic warning at the same time. The message is pushed to the visualization front-end and the operation and maintenance back-end in a unified JSON format. The message includes an event summary and a replay package index. The interface displays the affected personnel and the corresponding time interval. The reference pose recovery time is defined as the minimum time difference from the moment the threshold is first exceeded to the moment the corresponding indicator returns to below the threshold and remains below the threshold within a continuous time window. The system saves the average and maximum values of the reference pose recovery time in the event log and generates a replay package according to the event.

[0031] As a preferred embodiment, the specific steps for constructing the training samples and sensitivity curve are as follows:

[0032] Within the same large space, perturbations are applied sequentially according to a preset perturbation sequence. A standardized test route is executed at each perturbation level to collect mapping data. The perturbation sequence includes illumination perturbation, visual occlusion perturbation, smoke perturbation, and geometric structure perturbation. For each perturbation level, the perturbation value and application method are specified, and the corresponding quantitative indicators are recorded. The perturbation sequence is sent to the execution device. After waiting for the parameters to stabilize for 30 seconds, the pose sequence, sensing quality indicators, reference pose, and 3D point cloud slices of the immersive terminal device are recorded. The instantaneous point position deviation sequence is calculated again for each run, and the point position deviation and global statistics are calculated simultaneously. The time series data of each run is written into the training sample storage area in a structured format. Statistical summaries are obtained for multiple runs at the same perturbation level, and a sensitivity curve is plotted with the perturbation value on the horizontal axis and the statistical error on the vertical axis. A quadratic polynomial is used for fitting, and the threshold inflection point is calculated on the fitted curve. Training record entries are generated for each collected sample according to time frames.

[0033] Beneficial effects

[0034] Compared with the prior art, the present invention has the following advantages:

[0035] 1. Through real-time quality assessment and weighted fusion of vision, UWB, IMU and acoustics, a reference pose is generated according to a preset priority sequence when vision fails, and a controllable mapping error is maintained under occlusion conditions, so that the error can be controlled within a preset threshold range.

[0036] 2. By collecting samples in the same large space according to a standardized perturbation sequence and constructing sensitivity curves, quantitative standards for site-level acceptance and daily inspections can be achieved, reducing the incomparability of assessment results caused by environmental fluctuations.

Attached Figure Description

[0037] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation on the scope of this application.

[0038] Figure 1 is a flowchart illustrating the present invention;

[0039] Figure 2 is a comparison diagram of the effects of the present invention and the prior art, where gray bars represent the prior art and black bars represent the present invention.

Detailed Implementation

[0040] To make the technical means, creative features, objectives, and effects of this invention easier to understand, the invention is further described below with reference to specific embodiments. However, the following embodiments are merely preferred embodiments of this invention and not all of them. Other embodiments obtained by those skilled in the art based on the embodiments described herein without creative effort are all within the protection scope of this invention.

[0041] Example 1:

[0042] To achieve the above objectives, and referring to Figures 1 and 2, this invention provides a method for evaluating the accuracy of VR large-space spatial mapping, the method comprising the following steps:

[0043] Step S1: Reconstruct the path and location, and pinpoint the key equipment;

[0044] Step S2: Map and quantify collaborative capability, and substitute references under occlusion conditions;

[0045] Step S3: Quantify the error and issue automatic warnings, and summarize the review evidence;

[0046] Step S4: High-frequency training and scene variations to generate automatic inspection reports.

[0047] This method is implemented in the order of S1–S4, and its overall process is as follows:

[0048] A 3D map is constructed at the target site, and the coordinates of pre-determined key points are marked. Real-time data is collected using immersive terminal devices, and the precise location distribution is obtained through mapping. The relative positions of key points and objects are displayed in real time, which is used to present the location information of personnel picking up and using equipment.

[0049] The system synchronously maps and displays the positions of multiple people and their mutual spacing and formation changes, quantifies the longitudinal and transverse spacing of the formation and the stability of the formation, and combines the collected data with the mapping results to form reference position information under occlusion conditions.

[0050] The mapping deviation is calculated based on preset key points and compared with preset thresholds to obtain quantifiable error indicators. When the error exceeds the limit, an automatic reminder is given and a correction process is triggered. By recording pose sequences, error curves and relocation events, the records are saved as a replayable data package for post-event review and analysis.

[0051] In the same large space, the perturbation is switched according to a preset perturbation sequence, and the mapping results under each perturbation condition are collected to construct training samples and sensitivity curves. The mapping evaluation process is standardized, and an automated inspection report is generated for acceptance and daily inspection.

[0052] With reference to Figure 1, the specific implementation and effects of steps S1–S4 are explained in turn. The specific steps for reconstructing the path and location and pinpointing the key equipment are as follows:

[0053] The physical building plan of the target site is imported into the mapping system. The absolute coordinates of key points, such as the center of the door frame, the center of the fire extinguisher, and the edge of the platform, are measured and recorded using a laser rangefinder. These coordinates are then registered with the 3D map to obtain a unified global coordinate system. The 3D map is obtained by first scanning the ground with LiDAR to obtain a high-density point cloud, and then using a head-mounted camera to take panoramic photos and run a reconstruction algorithm to generate a mesh to fill in details. The 3D map is then coarsely aligned with the physical building plan and the marked key points, for example, through automatic registration. The data is saved in a general point cloud format, such as a PLY coordinate list.

[0054] Structural corner points are selected as anchor points for the 3D map, such as the four corners of the target site. The 3D coordinates of key points are marked according to the anchor points. A unique ID, name and description of the actual objects on site are recorded for each key point. Real-time data of the trainees are collected using immersive terminal devices, including pose sequences, original images, interactive events, audio streams, etc. The sensor data is mapped to the 3D coordinate system by hardware calibration of the immersive terminal devices.

[0055] Specifically, the process involves hardware calibration of the immersive terminal device, intrinsic parameter and distortion coefficient calibration of the camera to obtain the camera's focal length, principal point coordinates and tangential distortion parameters, static zero bias, scale factor and temperature drift characteristic calibration of the inertial measurement unit (IMU) to obtain the IMU's bias, scaling and noise model parameters, acquisition of camera observations at key points in the site, estimation of extrinsic parameters between the device and the 3D map coordinate system using the least squares method to obtain the transformation matrix from the device coordinate system to the map coordinate system, and acquisition of the relative pose between the camera and the IMU by performing extrinsic parameter calibration of the camera and the IMU.
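The least-squares extrinsic estimation described above can be illustrated with a deliberately simplified sketch. It assumes a planar case (yaw plus translation on the floor plane) rather than the full 3D transformation in the text, and it assumes that matched point pairs between the device frame and the surveyed map frame are already available. The function names `fit_planar_rigid` and `apply_planar_rigid` are hypothetical.

```python
import math

def fit_planar_rigid(device_pts, map_pts):
    """Least-squares yaw and translation that map device-frame points onto map-frame points."""
    n = len(device_pts)
    if n < 2 or n != len(map_pts):
        raise ValueError("need at least two matched point pairs")
    # Centroids of both point sets
    cax = sum(p[0] for p in device_pts) / n
    cay = sum(p[1] for p in device_pts) / n
    cbx = sum(p[0] for p in map_pts) / n
    cby = sum(p[1] for p in map_pts) / n
    # Accumulate the dot and cross terms of the centred pairs
    s_cos = s_sin = 0.0
    for (ax, ay), (bx, by) in zip(device_pts, map_pts):
        ax, ay, bx, by = ax - cax, ay - cay, bx - cbx, by - cby
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    yaw = math.atan2(s_sin, s_cos)
    c, s = math.cos(yaw), math.sin(yaw)
    # Translation that carries the rotated device centroid onto the map centroid
    tx = cbx - (c * cax - s * cay)
    ty = cby - (s * cax + c * cay)
    return yaw, tx, ty

def apply_planar_rigid(yaw, tx, ty, pt):
    """Transform one device-frame point into the map frame."""
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * pt[0] - s * pt[1] + tx, s * pt[0] + c * pt[1] + ty)
```

A full 3D version would normally replace the closed-form angle with an SVD-based or quaternion-based solution; the centroid-then-rotation structure stays the same.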

[0056] Based on the transformation matrix from the device coordinate system to the map coordinate system, the device pose is transformed, and the obtained global coordinates are used to draw the trajectory and current position on the 3D map. When the trainees approach the key point, for example, when the distance is less than or equal to 1m, the key point name, object description and current deviation are displayed on the visualization interface, which provides the basis for distinguishing between positioning error and object movement.
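As a rough illustration of the proximity display, the sketch below lists the key points within 1 m of a trainee together with each point's current deviation. The `KeyPoint` fields and the function name are assumptions made for this example; only the 1 m radius and the displayed items (name, object description, deviation) come from the text.

```python
import math
from dataclasses import dataclass

@dataclass
class KeyPoint:
    kp_id: str
    name: str
    description: str
    surveyed: tuple   # reference coordinates (x, y, z) in metres
    labelled: tuple   # position currently labelled by the system, in metres

def nearby_keypoint_overlays(trainee_pos, keypoints, radius_m=1.0):
    """Return overlay entries for every key point within radius_m of the trainee."""
    overlays = []
    for kp in keypoints:
        if math.dist(trainee_pos, kp.surveyed) <= radius_m:
            overlays.append({
                "id": kp.kp_id,
                "name": kp.name,
                "description": kp.description,
                # Deviation between the system label and the surveyed reference
                "deviation_m": round(math.dist(kp.labelled, kp.surveyed), 3),
            })
    return overlays
```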

[0057] The specific steps for mapping and quantifying collaborative capability and substituting references under occlusion conditions are as follows:

[0058] Before the exercise, a unique identifier is established for each participant and bound to the immersive terminal device. A mapping table between the unique identifier of the participant and the terminal device is established. Time synchronization is implemented for each immersive terminal device, using PTP supplemented by periodic clock deviation estimation. Each immersive terminal device timestamps each acquisition stream locally and attaches a time source identifier. The server corrects the data of each immersive terminal device according to the timestamp and performs frame-level alignment of multiple data streams according to time windows, such as 100ms.
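The server-side correction and frame-level alignment can be sketched as follows. This is a minimal illustration, assuming integer millisecond timestamps and a per-device clock offset that has already been estimated; the function name, the sample layout, and the rule of keeping the latest sample per device in each window are all assumptions.

```python
from collections import defaultdict

def align_streams(samples, clock_offsets_ms, window_ms=100):
    """Group samples from several devices into common time windows.

    samples: iterable of (device_id, local_timestamp_ms, payload)
    clock_offsets_ms: device_id -> estimated offset (local clock minus server clock)
    """
    frames = defaultdict(dict)
    for device_id, t_local_ms, payload in samples:
        # Correct the local timestamp onto the server time base
        t_server_ms = t_local_ms - clock_offsets_ms.get(device_id, 0)
        index = t_server_ms // window_ms
        # Keep the latest sample of each device within a window
        previous = frames[index].get(device_id)
        if previous is None or t_server_ms >= previous[0]:
            frames[index][device_id] = (t_server_ms, payload)
    return {index: frames[index] for index in sorted(frames)}
```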

[0059] Using the pre-calibrated extrinsic parameter matrix from the device coordinate system to the map coordinate system, the local pose of each terminal is converted into the global pose in the map coordinate system. The formula is as follows:

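[0060] $$T_{\mathrm{map}}(t) = T_{\mathrm{ext}} \circ T_{\mathrm{dev}}(t),$$

[0061] where $t$ represents the current time, $T_{\mathrm{map}}(t)$ represents the global pose in the map coordinate system, $T_{\mathrm{dev}}(t)$ represents the local pose in the device coordinate system, $T_{\mathrm{ext}}$ represents the extrinsic parameter matrix, and $\circ$ represents the composition operator of rigid transformations;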
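The conversion from a local device pose to a global map pose amounts to composing two rigid transformations. The sketch below shows one way to express this with 4x4 homogeneous matrices; the helper names and the numeric values are illustrative assumptions, and rotation is restricted to yaw to keep the example short.

```python
import math

def compose(a, b):
    """Return the 4x4 homogeneous transform a ∘ b (apply b first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def yaw_translation(yaw_rad, tx, ty, tz):
    """Build a rigid transform from a yaw angle and a translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# Example values only: extrinsic matrix (device frame -> map frame) and a local pose
T_ext = yaw_translation(math.pi / 2, 5.0, 2.0, 0.0)
T_dev = yaw_translation(0.0, 1.0, 0.0, 1.6)

# Global pose in the map coordinate system
T_map = compose(T_ext, T_dev)
position_in_map = [row[3] for row in T_map[:3]]
```

The order of composition matters: the extrinsic matrix is applied on the left, so the local pose is first expressed in the device frame and then carried into the map frame.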

[0062] Quality metrics are calculated for the visual, UWB, and IMU channels of the immersive terminal device. The visual quality metric is the feature matching success rate within the sliding window; the UWB metric is the smoothed signal-to-noise ratio after linear normalization; and the IMU integrity metric is derived from the frame loss rate. Each channel's metric is smoothed using an exponential moving average and then linearly mapped to generate non-negative weights, which are then normalized to form the fusion weights. An extended Kalman filter is used to establish a state estimator for each participant, with the state vector set as $\mathbf{s} = [x, y, z, \dot{x}, \dot{y}, \dot{z}, \theta]^{\mathsf{T}}$, where $x$, $y$, and $z$ represent the three-dimensional coordinates of the trainee's position, $\dot{x}$, $\dot{y}$, and $\dot{z}$ represent the time derivatives of the position in the corresponding directions, and $\theta$ represents the rotation angle in the horizontal plane. IMU data is used for prediction, and visual pose, UWB distance, and acoustic measurements are used as correction observations. The filter outputs the instantaneous pose and covariance matrix. The pairing distance between any two trainees is calculated at each alignment time using the following formula:

[0063] $$d_{ij}(t) = \lVert \mathbf{p}_i(t) - \mathbf{p}_j(t) \rVert_2,$$

[0064] where $i$ and $j$ represent the indexes of the participants in the exercise, $d_{ij}(t)$ represents the pairing distance between trainee $i$ and trainee $j$ at time $t$, $\mathbf{p}_i(t)$ represents the instantaneous pose of trainee $i$ at time $t$, and $\mathbf{p}_j(t)$ represents the instantaneous pose of trainee $j$ at time $t$;
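The weight generation and the pairing distance can be sketched in a few lines. The example assumes quality scores already scaled to [0, 1], an EMA factor of 0.3, a linear mapping `gain * q + offset` clipped at zero, and a Euclidean distance between positions; none of these specifics are given in the text, and the function names are hypothetical.

```python
import math
from itertools import combinations

def ema(values, alpha=0.3):
    """Exponential moving average of a non-empty sequence, oldest value first."""
    smoothed = values[0]
    for v in values[1:]:
        smoothed = alpha * v + (1 - alpha) * smoothed
    return smoothed

def fusion_weights(quality_history, alpha=0.3, gain=1.0, offset=0.0):
    """Turn per-channel quality histories into normalized, non-negative fusion weights."""
    raw = {ch: max(0.0, gain * ema(h, alpha) + offset)
           for ch, h in quality_history.items()}
    total = sum(raw.values())
    if total == 0:
        # No channel is usable: fall back to equal weights
        return {ch: 1.0 / len(raw) for ch in raw}
    return {ch: w / total for ch, w in raw.items()}

def pairwise_distances(positions):
    """positions: trainee_id -> (x, y, z); returns {(i, j): distance} for every pair."""
    return {(i, j): math.dist(positions[i], positions[j])
            for i, j in combinations(sorted(positions), 2)}
```

In a full estimator, these weights would typically scale the measurement noise of each observation channel in the filter update rather than being applied to poses directly.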

[0065] All pairing distances are constructed into a matrix, and the reference travel vector is determined by the team's centroid velocity to obtain the longitudinal and lateral components. A local neighborhood is defined for each trainee with their three nearest teammates. The local average distance and variance are calculated within a time window, such as 2 seconds. When the visual statistics exceed the corresponding threshold, it is judged as a failure, such as the standard deviation exceeding 0.25m or the average distance exceeding 0.5m. According to the determined priority sequence, UWB measurement calculation, IMU odometry prediction, acoustic echo ranging and map semantic constraint projection are used in sequence to generate reference pose. The pose estimates from each source are fused according to real-time weights, and the identifier and uncertainty value of each source are recorded synchronously.
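A possible reading of the formation metrics and the fallback order is sketched below. It assumes horizontal (x, y) positions, computes the neighborhood statistics for a single aligned frame (the 2-second window in the text would aggregate these over frames), and uses hypothetical source labels for the priority list.

```python
import math

# Fallback order taken from the text; the string labels are illustrative
FALLBACK_PRIORITY = ("uwb", "imu_odometry", "acoustic", "map_semantic")

def split_spacing(pos_i, pos_j, centroid_velocity):
    """Split the horizontal offset from trainee i to j into longitudinal and lateral parts."""
    vx, vy = centroid_velocity
    speed = math.hypot(vx, vy)
    if speed == 0:
        raise ValueError("team is stationary; travel direction is undefined")
    ux, uy = vx / speed, vy / speed            # unit reference travel vector
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    longitudinal = dx * ux + dy * uy           # component along the travel direction
    lateral = -dx * uy + dy * ux               # component perpendicular to it
    return longitudinal, lateral

def local_spacing_stats(me, teammates, k=3):
    """Mean and variance of the distances to the k nearest teammates."""
    nearest = sorted(math.dist(me, p) for p in teammates)[:k]
    mean = sum(nearest) / len(nearest)
    variance = sum((d - mean) ** 2 for d in nearest) / len(nearest)
    return mean, variance

def pick_reference_source(vision_failed, available):
    """Return the first available source in priority order once vision has failed."""
    if not vision_failed:
        return "vision"
    for source in FALLBACK_PRIORITY:
        if source in available:
            return source
    return None
```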

[0066] The specific steps for quantifying errors, issuing automatic warnings, and summarizing and reviewing evidence are as follows:

[0067] During the mapping initialization phase, a set of key points is pre-registered in the 3D map, and the coordinates of each key point are used as the benchmark for calculating the mapping deviation. During online operation, the system-labeled position of each key point is obtained at each alignment time frame, and the point position deviation and global statistics are calculated. When any indicator exceeds the preset threshold, such as the root mean square error being greater than 0.3m, the system generates a structured over-limit event and writes it to the event log with the following fields: event ID, session_id, map version, set of trainee IDs, trigger timestamp, trigger indicator, fusion confidence curve index, etc., and triggers an automatic warning simultaneously. The message is pushed to the visualization front-end and the operation and maintenance back-end in a unified JSON format. The message includes an event summary and a replay package index. The interface displays the affected trainees and the corresponding time interval. The reference pose recovery time is defined as the minimum time difference from the moment the threshold is first exceeded to the moment the corresponding indicator returns to below the threshold and remains below the threshold within a continuous time window. The system saves the average and maximum values of the reference pose recovery time in the event record and generates an immutable replay package according to the event.
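The deviation check, the over-limit event, and the recovery time can be sketched as below. The JSON key spellings other than `session_id`, the hold-window parameter, and the function names are assumptions for illustration; the 0.3 m RMSE threshold is the example value given in the text.

```python
import json
import math

def keypoint_rmse(labelled, surveyed):
    """Root mean square of the point position deviations over all key points."""
    squared = [math.dist(labelled[k], surveyed[k]) ** 2 for k in surveyed]
    return math.sqrt(sum(squared) / len(squared))

def over_limit_event(event_id, session_id, map_version, trainee_ids,
                     trigger_timestamp_ms, indicator, value, threshold):
    """Return a JSON over-limit event, or None if the indicator is within the limit."""
    if value <= threshold:
        return None
    return json.dumps({
        "event_id": event_id,
        "session_id": session_id,
        "map_version": map_version,
        "trainee_ids": sorted(trainee_ids),
        "trigger_timestamp_ms": trigger_timestamp_ms,
        "trigger_indicator": {"name": indicator, "value": round(value, 3),
                              "threshold": threshold},
    })

def recovery_time_ms(series, threshold, hold_ms):
    """Time from the first over-limit sample to the start of a sustained recovery.

    series: list of (timestamp_ms, value) sorted by time. A recovery counts only
    if the value stays at or below the threshold for at least hold_ms.
    """
    first_over = None
    below_since = None
    for t, v in series:
        if v > threshold:
            if first_over is None:
                first_over = t
            below_since = None          # recovery attempt interrupted
        elif first_over is not None:
            if below_since is None:
                below_since = t
            if t - below_since >= hold_ms:
                return below_since - first_over
    return None
```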

[0068] The specific steps for generating automatic inspection reports based on high-frequency training and scene variations are as follows:

[0069] Within the same large space, perturbations are applied sequentially according to a preset perturbation sequence. A standardized test route is executed at each perturbation level to collect mapping data. The perturbation sequence includes illumination perturbation, visual occlusion perturbation, smoke perturbation, and geometric structure perturbation. For each perturbation level, the perturbation value and application method are specified, and the corresponding quantitative indicators are recorded. The perturbation sequence is sent to the execution device. After waiting 30 seconds for the parameters to stabilize, the pose sequence, sensor quality indicators, reference pose, and 3D point cloud slices of the immersive terminal device are recorded. For each run, the instantaneous point position deviation sequence is recalculated, along with the point position deviation and global statistics. The time series data of each run is written into the training sample storage area in a structured format. Statistical summaries are then obtained for multiple runs at the same perturbation level, and a sensitivity curve is plotted with the perturbation value on the horizontal axis and the statistical error on the vertical axis. A quadratic polynomial is used for fitting, and the threshold inflection point is calculated on the fitted curve. For each acquired sample, training record entries are generated according to time frames. Record fields include perturbation type and level, timestamp, system pose, reference pose, single-point deviation, sensor quality, fusion weights, and event markers. The training records are archived in a compressed file. Based on the statistical results, the automated inspection generates a standardized report. The report includes: site identification and map version, perturbation sequence and parameters, error table for each perturbation level, sensitivity curve, threshold inflection point location, recovery time distribution, event list and representative playback segment index, calibration recommendations, and versioned conclusions.
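As an illustrative sketch only, the per-level statistical summary that feeds the error table of the report might be computed as shown below. The input layout (one tuple of perturbation value and per-frame deviations for each run) and the row keys are assumptions, and the choice of mean and maximum RMSE as the reported statistics is likewise assumed rather than prescribed here.

```python
import math
from collections import defaultdict
from statistics import mean


def run_rmse(deviations):
    """Root mean square of the per-frame point deviations (metres) of one run."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def error_table(runs):
    """Group runs by perturbation value and summarize the RMSE of each group.

    `runs` is an iterable of (perturbation_value, [per-frame deviations]).
    Returns one row per perturbation level, sorted by perturbation value."""
    by_level = defaultdict(list)
    for level, deviations in runs:
        by_level[level].append(run_rmse(deviations))
    return [
        {
            "level": level,
            "runs": len(rmses),
            "mean_rmse": mean(rmses),
            "max_rmse": max(rmses),
        }
        for level, rmses in sorted(by_level.items())
    ]
```

The `mean_rmse` column of such a table can serve as the vertical-axis value of the sensitivity curve for the corresponding perturbation value.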

[0070] Figure 2 presents a comparison chart of the effects of the VR large-space spatial mapping accuracy evaluation method. The horizontal axis lists key performance indicators, and the vertical axis shows illustrative performance scores ranging from 0 to 100, where higher values indicate better performance. The chart is intended to show intuitively the expected improvement of the present invention in key capabilities compared with typical prior art.

[0071] Example 2:

[0072] Based on the above embodiment 1, a VR large-space spatial mapping accuracy evaluation method for emergency drill scenarios in large public buildings is specifically as follows:

[0073] Step 1: Import the building plan of the target site into the mapping system in vector CAD format. Place control markers within the site according to the rules and measure them with the main station to obtain their absolute coordinates. Write the absolute coordinates of these control markers into the map metadata as control points. Use a structured light depth camera and a panoramic camera on a wearable head-mounted display to acquire high-resolution images and depth data. Generate a textured 3D mesh using structured light reconstruction and image-based multi-view reconstruction algorithms. Use the pixel coordinates of the control markers identified in the images as PnP observations. Solve the rigid transformation matrix from the device coordinate system to the map coordinate system using least squares extrinsic parameter estimation. Extract the four corners of the target site from the 3D mesh as anchor points. Label the 3D coordinates of key objects based on these anchor points and assign a unique ID, name, and physical description to each key point. Calibrate the offset of the IMU built into the panoramic camera and map the sensor data to the global pose flow within the map coordinate system. When the 3D distance between the trainee and any key point is ≤1.0 m, display the ID, name, and physical description of that key point on the visualization interface, along with its deviation from the current system label. Use the historical displacement sequence of the key point to distinguish between positioning deviation and physical movement.
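By way of a minimal sketch, the proximity rule of Step 1 (display a key point when the trainee is within 1.0 m) could look like the following. The `KeyPoint` structure, its field names, and the choice to measure trainee distance against the registered coordinates are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class KeyPoint:
    kp_id: str
    name: str
    description: str
    registered_xyz: tuple  # coordinates registered in the map (m)
    labeled_xyz: tuple     # current system-labeled position (m)


def nearby_key_points(trainee_xyz, key_points, radius_m=1.0):
    """Return display records for key points within `radius_m` of the trainee."""
    hits = []
    for kp in key_points:
        if math.dist(trainee_xyz, kp.registered_xyz) <= radius_m:
            hits.append({
                "id": kp.kp_id,
                "name": kp.name,
                "description": kp.description,
                # deviation between the system label and the registered benchmark
                "deviation_m": math.dist(kp.labeled_xyz, kp.registered_xyz),
            })
    return hits
```

Each returned record carries the ID, name, physical description, and current deviation that the visualization interface is described as showing.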

[0074] Step 2: Before the exercise begins, assign a unique identifier to each participant and bind it to the corresponding terminal. Perform frame-level alignment of multiple data streams according to the time window. Based on the calibrated extrinsic parameter matrix, map the device pose to the global pose in the map coordinate system. Calculate quality indicators for the visual, UWB, IMU, and acoustic channels and smooth them with an exponential moving average. Generate non-negative weights using linear mapping and normalize them into fusion weights. Maintain an EKF state estimator and state vector for each participant, using IMU data for prediction and visual, UWB, and acoustic measurements for correction, to output the instantaneous pose and covariance. Calculate the pairing distances to form a distance matrix. Determine the reference vector based on the team's centroid velocity and decompose the spacing into longitudinal and lateral components. Take the three nearest teammates of each participant as the local neighborhood. Calculate the local average distance and variance within the window and compare them with the threshold. When visual failure occurs, generate a reference pose from UWB, IMU, acoustic, and map projection sources, and fuse them using information filtering to obtain the reference position and uncertainty. Write the statistics, out-of-limit events, fusion confidence, and original observation index into the playback package in a structured format.
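Two of the computations in Step 2 can be sketched as follows, under stated assumptions: the fusion weights (exponential moving average, linear mapping, normalization) and the split of pairing distances into longitudinal and lateral components. The smoothing factor `alpha`, the `gain` and `offset` of the linear mapping, the equal-weight fallback, and the use of an unsigned lateral magnitude are all illustrative choices that this description does not fix.

```python
import math
from itertools import combinations


def ema(samples, alpha=0.2):
    """Exponential moving average of a non-empty sequence; returns the latest value."""
    smoothed = samples[0]
    for s in samples[1:]:
        smoothed = alpha * s + (1.0 - alpha) * smoothed
    return smoothed


def fusion_weights(quality, gain=1.0, offset=0.0):
    """Map each channel quality linearly to a non-negative weight, then normalize."""
    raw = {ch: max(gain * q + offset, 0.0) for ch, q in quality.items()}
    total = sum(raw.values())
    if total == 0.0:
        # Assumed fallback: no usable channel, so weight all channels equally.
        return {ch: 1.0 / len(raw) for ch in raw}
    return {ch: w / total for ch, w in raw.items()}


def spacing_components(positions, centroid_velocity):
    """For every pair, return (distance, longitudinal, lateral) in metres.

    The longitudinal axis is the unit vector of the team's centroid velocity;
    lateral is the magnitude of the remaining perpendicular offset."""
    speed = math.sqrt(sum(c * c for c in centroid_velocity))
    if speed == 0.0:
        raise ValueError("team is stationary; travel direction is undefined")
    u = [c / speed for c in centroid_velocity]
    out = {}
    for i, j in combinations(sorted(positions), 2):
        d = [b - a for a, b in zip(positions[i], positions[j])]
        longitudinal = sum(dc * uc for dc, uc in zip(d, u))
        lateral = math.sqrt(sum((dc - longitudinal * uc) ** 2 for dc, uc in zip(d, u)))
        out[(i, j)] = (math.dist(positions[i], positions[j]), longitudinal, lateral)
    return out
```

The dictionary returned by `spacing_components` holds the entries of the distance matrix for each unordered pair of participants.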

[0075] Step 3: During the mapping initialization phase, a set of reference points is pre-registered in the 3D map, and their coordinates are written into the metadata. During online operation, the system-labeled positions are obtained in each alignment frame, and the point position deviation and instantaneous root mean square error are calculated. When any threshold is exceeded, the system generates a structured over-limit event and writes it into the event log. After the event is generated, the correction process is automatically triggered: First, three pairs of observations are collected within 5 seconds before and after the triggering period to construct a rigid transformation estimation problem and solve the rigid transformation using least squares; Second, the transformation is updated to a rigid transformation matrix, and ICP fine registration is performed in the triggering area with initial values, recording the residuals before and after registration; Third, when the RMSE is less than or equal to 0.1m, it is written into the transformation database and versioned; otherwise, the correction failure is recorded and marked as awaiting manual review, generating an immutable review package for each event.
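The least-squares step and the acceptance rule of the correction process in Step 3 can be illustrated with the simplified sketch below. It is deliberately restricted to the horizontal plane (a yaw angle plus a 2D translation) so that it needs no external libraries; a full 3D rigid transformation would typically be solved with an SVD-based method, and the ICP fine registration is not shown. The function names and status strings are assumptions.

```python
import math


def fit_planar_rigid(src, dst):
    """Least-squares yaw angle and translation mapping 2D points src onto dst."""
    n = len(src)
    scx = sum(p[0] for p in src) / n
    scy = sum(p[1] for p in src) / n
    dcx = sum(q[0] for q in dst) / n
    dcy = sum(q[1] for q in dst) / n
    cos_sum = sin_sum = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - scx, py - scy  # centred source point
        bx, by = qx - dcx, qy - dcy  # centred target point
        cos_sum += ax * bx + ay * by
        sin_sum += ax * by - ay * bx
    theta = math.atan2(sin_sum, cos_sum)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    t = (dcx - (c * scx - s * scy), dcy - (s * scx + c * scy))
    return theta, t


def alignment_rmse(src, dst, theta, t):
    """RMSE (metres) between the transformed source points and the targets."""
    c, s = math.cos(theta), math.sin(theta)
    sq = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ex = c * px - s * py + t[0] - qx
        ey = s * px + c * py + t[1] - qy
        sq += ex * ex + ey * ey
    return math.sqrt(sq / len(src))


def correct(src, dst, accept_rmse_m=0.1):
    """Fit the transform, then accept it or flag the event for manual review."""
    theta, t = fit_planar_rigid(src, dst)
    rmse = alignment_rmse(src, dst, theta, t)
    status = "accepted" if rmse <= accept_rmse_m else "manual_review"
    return {"theta": theta, "translation": t, "rmse": rmse, "status": status}
```

Three observation pairs, as collected in Step 3, are enough to determine this planar transform; an "accepted" result corresponds to the case in which the transformation is written into the transformation database and versioned.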

[0076] Step 4: Within the same large space, issue disturbance commands to the execution device according to a preset disturbance sequence, and wait 30 seconds for the parameters to stabilize. The disturbance sequence includes illumination disturbance, visual occlusion disturbance, smoke disturbance, and geometric structure disturbance. The disturbance levels and application methods are fixed as follows: illumination = {100 lx, 10 lx, 5 lx}, occlusion rate = {0%, 10%, 30%, 60%}, smoke optical density = {0.00, 0.10, 0.30, 0.60}, and critical object offset = {0.00 m, 0.10 m, 0.30 m, 0.50 m}. Under each disturbance level, a preset script executes three runs at a constant speed along a standardized test route, collecting and saving the immersive terminal device pose sequence, sensor quality indicators, system reference pose, and corresponding 3D point cloud slices. For each run, the instantaneous deviation of key points and the global statistics are calculated frame by frame, together with the sliding statistics within a 2-second window. The time series of each run is written into the training sample storage area in a structured format. Statistical summaries are obtained for multiple runs at the same disturbance level. Sensitivity data points are plotted with the disturbance value on the horizontal axis and the statistical error on the vertical axis. The mean RMSE is fitted with a quadratic polynomial, and the intersection of the fitted curve with the preset error threshold is used as the threshold inflection point. Based on the statistical summaries, a standardized inspection report is automatically generated, including: site identification and map version, disturbance sequence and parameters, error table for each disturbance level, sensitivity curve and fitting parameters, threshold inflection point location, recovery time distribution, event list, etc.
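As a non-limiting sketch, the quadratic fit of mean RMSE against disturbance value and the intersection with the error threshold could be computed as below. The solver (normal equations with Gauss-Jordan elimination), the rule of taking the smallest intersection inside the tested range, and the function names are assumptions.

```python
import math


def fit_quadratic(xs, ys):
    """Least-squares coefficients (a, b, c) of y = a*x^2 + b*x + c."""
    s = [sum(x ** k for x in xs) for k in range(5)]                 # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^k
    # Augmented normal-equation matrix, unknowns ordered (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def threshold_crossing(a, b, c, threshold, lo, hi):
    """Smallest x in [lo, hi] where the fitted curve equals `threshold`, else None."""
    c0 = c - threshold
    if abs(a) < 1e-12:  # curve is effectively linear or constant
        roots = [] if abs(b) < 1e-12 else [-c0 / b]
    else:
        disc = b * b - 4.0 * a * c0
        if disc < 0.0:
            return None
        r = math.sqrt(disc)
        roots = [(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)]
    inside = [x for x in roots if lo <= x <= hi]
    return min(inside) if inside else None
```

For example, with the occlusion rates 0, 0.10, 0.30, and 0.60 on the horizontal axis, `fit_quadratic` yields the fitting parameters listed in the report, and `threshold_crossing` with a threshold such as 0.3 m gives the threshold inflection point if one lies within the tested range.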

[0077] The embodiments of the present invention described above are subject to modification and change of method by those skilled in the art without departing from the embodiments and broader aspects of the present invention. The appended claims are intended to include all such modifications and changes of method that do not depart from the present invention.

Claims

1. A method for evaluating the accuracy of VR large-space spatial mapping, characterized in that it includes: A 3D map is constructed at the target site, and the coordinates of predetermined key points are marked. Real-time data is collected using immersive terminal devices, and the precise location distribution is obtained through mapping. The relative positions of key points and objects are displayed in real time to present the location information of personnel picking up and using equipment. The positions of multiple people are synchronously mapped and displayed together with their mutual spacing and formation changes, the longitudinal and transverse spacing of the formation and the stability of the formation are quantified, and the collected data is combined with the mapping results to form reference position information under occlusion conditions. The mapping deviation is calculated based on preset key points and compared with preset thresholds to obtain quantifiable error indicators. When the error exceeds the limit, an automatic reminder is given and a correction process is triggered. Pose sequences, error curves, and relocation events are recorded, and the records are saved as a replayable data package for post-event review and analysis. In the same large space, the perturbation is switched according to a preset perturbation sequence, and the mapping results under each perturbation condition are collected to construct training samples and sensitivity curves. The mapping evaluation process is standardized, and an automated inspection report is generated for acceptance and daily inspection.

2. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, The specific steps for constructing a 3D map of the target site and marking the coordinates of predetermined key points are as follows: The physical building floor plan of the target site is imported into the mapping system. The absolute coordinates of key points are measured and recorded using a laser rangefinder and registered with the 3D map to obtain a unified global coordinate system. The 3D map is obtained by first scanning the ground with LiDAR to obtain a high-density point cloud, then using a head-mounted camera to take panoramic photos and run a reconstruction algorithm to generate a mesh to fill details. The 3D map is coarsely aligned with the physical building floor plan and the marked key points, and the data is saved in point cloud format. Structural corner points are selected as anchor points for the 3D map. The 3D coordinates of key points are marked based on the anchor points. A unique ID, name, and description of the actual objects on site are recorded for each key point. Real-time data of the trainees are collected using immersive terminal devices. The sensor data is mapped to the 3D coordinate system by hardware calibration of the immersive terminal devices.

3. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, The specific steps for collecting real-time data using an immersive terminal device are as follows: The intrinsic parameters and distortion coefficients of the camera in the immersive terminal device are calibrated to obtain the camera's focal length, principal point coordinates, and tangential distortion parameters. The static zero bias, scale factor, and temperature drift characteristics of the device's IMU are calibrated to obtain the IMU's bias, scaling, and noise model parameters. Camera observations are collected at key points in the site. The extrinsic parameters between the device and the 3D map coordinate system are estimated using the least squares method to obtain the transformation matrix from the device coordinate system to the map coordinate system. By performing extrinsic parameter calibration of the camera and IMU, the relative pose between the camera and the IMU is obtained.

4. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, The specific steps for real-time display of the relative positions of key points and physical objects are as follows: Based on the transformation matrix from the device coordinate system to the map coordinate system, the device pose is transformed, and the obtained global coordinates are used to draw the trajectory and current position on the 3D map. When the trainees approach the key point, the key point name, object description, and current deviation are displayed on the visualization interface, providing a basis for distinguishing between positioning error and object movement.

5. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, the specific steps for synchronously mapping and displaying the positions of multiple people and their mutual spacing and formation changes are as follows: Before the exercise, a unique identifier is established for each participant and bound to the immersive terminal device, and a mapping table between the unique identifier of the participant and the terminal device is established. Time synchronization is implemented for each immersive terminal device using PTP, supplemented by periodic clock deviation estimation. Each immersive terminal device timestamps each acquisition stream locally and attaches a time source identifier. The server corrects the data of each immersive terminal device according to the timestamp and performs frame-level alignment of multiple data streams according to the time window.

6. The VR large-space spatial mapping accuracy evaluation method according to claim 5, characterized in that, the specific steps for synchronously mapping and displaying the positions of multiple people and their mutual spacing and formation changes also include: Using the pre-calibrated extrinsic parameter matrix from the device coordinate system to the map coordinate system, the local pose of each terminal is converted into the global pose in the map coordinate system. The formula is as follows: $T_{\mathrm{map}}(t) = T_{\mathrm{ext}} \oplus T_{\mathrm{dev}}(t)$, where $t$ represents the current time, $T_{\mathrm{map}}(t)$ represents the global pose in the map coordinate system, $T_{\mathrm{dev}}(t)$ represents the pose in the device coordinate system, $T_{\mathrm{ext}}$ represents the extrinsic parameter matrix, and $\oplus$ represents the compound operator of rigid transformations.

7. The VR large-space spatial mapping accuracy evaluation method according to claim 5, characterized in that, the specific steps for synchronously mapping and displaying the positions of multiple people and their mutual spacing and formation changes also include: Quality metrics are calculated for each channel of the immersive terminal device. The visual quality metric is the feature matching success rate within the sliding window. The UWB metric is the smoothed signal-to-noise ratio after linear normalization. The IMU integrity metric is derived from the frame loss rate. Each channel's metric is smoothed by an exponential moving average and then linearly mapped to generate non-negative weights, which are then normalized to form the fusion weights. An extended Kalman filter is used to establish a state estimator for each trainee, with the state vector set as follows: $\mathbf{s} = [x, y, z, \dot{x}, \dot{y}, \dot{z}, \psi]^{\mathsf{T}}$, where $x$, $y$, and $z$ represent the three-dimensional coordinates of the position of the trainee, $\dot{x}$, $\dot{y}$, and $\dot{z}$ represent the time derivatives of the position in the corresponding directions, and $\psi$ represents the rotation angle in the horizontal plane. IMU data is used for prediction, and visual pose, UWB distance, and acoustic measurements are used as correction observations. The filter outputs the instantaneous pose and covariance matrix. The pairing distance between any two trainees is calculated at each alignment using the following formula: $d_{ij}(t) = \lVert \mathbf{p}_i(t) - \mathbf{p}_j(t) \rVert_2$, where $i$ and $j$ represent the indexes of the participants in the exercise, $d_{ij}(t)$ represents the pairing distance between trainee $i$ and trainee $j$ at time $t$, $\mathbf{p}_i(t)$ represents the instantaneous pose of trainee $i$ at time $t$, and $\mathbf{p}_j(t)$ represents the instantaneous pose of trainee $j$ at time $t$.

8. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, The specific steps for quantifying the longitudinal and transverse spacing and formation stability of the formation are as follows: All pairing distances are constructed into a matrix, and the reference travel vector is determined by the team's centroid velocity to obtain the longitudinal and lateral components. A local neighborhood is defined for each trainee with their three nearest teammates, and the local average distance and variance are calculated within a time window. When the visual statistics exceed the corresponding threshold, it is judged as a failure. UWB measurement calculation, IMU odometry prediction, acoustic echo ranging, and map semantic constraint projection are used in sequence according to the determined priority to generate the reference pose. The pose estimates from each source are fused according to real-time weights, and the identifier and uncertainty value of each source are recorded simultaneously.

9. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, the specific steps for calculating the mapping deviation based on preset key points and comparing it with a preset threshold are as follows: During the mapping initialization phase, a set of key points is pre-registered in the 3D map, and the coordinates of each key point are used as the benchmark for calculating the mapping deviation. During online operation, the system-labeled position of each key point is obtained at each aligned time frame, and the point position deviation and global statistics are calculated. When any indicator exceeds the preset threshold, the system generates a structured over-limit event, writes it to the event log, and synchronously triggers an automatic warning. The message is pushed to the visualization front-end and the operation and maintenance back-end in a unified JSON format and includes an event summary and a replay package index. The interface displays the affected personnel and the corresponding time interval. The reference pose recovery time is defined as the minimum time difference from the moment the threshold is first exceeded to the moment the corresponding indicator returns below the threshold and remains below it throughout a continuous time window. The system saves the average and maximum values of the reference pose recovery time in the event log and generates a replay package for each event.

10. The VR large-space spatial mapping accuracy evaluation method according to claim 1, characterized in that, The specific steps for constructing the training samples and sensitivity curves are as follows: Within the same large space, perturbations are applied sequentially according to a preset perturbation sequence. A standardized test route is executed at each perturbation level to collect mapping data. The perturbation sequence includes illumination perturbation, visual occlusion perturbation, smoke perturbation, and geometric structure perturbation. For each perturbation level, the perturbation value and application method are specified, and the corresponding quantitative indicators are recorded. The perturbation sequence is sent to the execution device. After waiting for the parameters to stabilize for 30 seconds, the pose sequence, sensing quality indicators, reference pose, and 3D point cloud slices of the immersive terminal device are recorded. The instantaneous point position deviation sequence is calculated again for each run, and the point position deviation and global statistics are calculated simultaneously. The time series data of each run is written into the training sample storage area in a structured format. Statistical summaries are obtained for multiple runs at the same perturbation level, and a sensitivity curve is plotted with the perturbation value on the horizontal axis and the statistical error on the vertical axis. A quadratic polynomial is used for fitting, and the threshold inflection point is calculated on the fitted curve. Training record entries are generated for each collected sample according to time frames.