Method for creating a 3D volumetric scene

By collecting data from multiple vehicles, merging the resulting point clouds in edge/cloud infrastructure, and using high-definition maps and algorithms to align and compress the data, the method overcomes the single perspective and limited field of view of an individual vehicle, yielding a more accurate 3D volumetric point cloud representation and improving navigation and safety.

CN116071488B · Active · Publication Date: 2026-06-23 · GM GLOBAL TECHNOLOGY OPERATIONS LLC

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Filing Date: 2022-10-18
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing vehicles, due to their single perspective and limited field of view, struggle to create complete 3D volumetric point clouds. Furthermore, the range of onboard visual sensors limits their ability to observe the surrounding environment, resulting in vehicles being unable to fully understand their surroundings.

Method used

Data is collected by visual and motion sensors on multiple vehicles, scene point clouds are generated by computer processors on each vehicle, and these point clouds are merged in edge/cloud infrastructure. High-definition maps and a normal distribution transformation algorithm are used to align the data, the data is compressed for transmission, and overlap search and iterative closest point algorithms are applied for alignment and merging.

Benefits of technology

It achieves a more accurate 3D volumetric point cloud representation of the environment surrounding vehicles, improving the accuracy of vehicle navigation and safety decisions.

✦ Generated by Eureka AI based on patent content.

Abstract

A system for creating a 3D volumetric scene, comprising: a first vision sensor located on a first vehicle to acquire first vision images; a first motion sensor located on the first vehicle to acquire first motion data; a first computer processor located on the first vehicle to generate a first scene point cloud; a second vision sensor located on a second vehicle to acquire second vision images; a second motion sensor located on the second vehicle to acquire second motion data; and a second computer processor located on the second vehicle to generate a second scene point cloud, wherein the first and second computer processors are further configured to send the first and second scene point clouds to a third computer processor, and the third computer processor is located within an edge / cloud infrastructure and configured to create a stitched point cloud.

Description

Technical Field

[0001] This disclosure relates to methods and systems for creating 3D volumetric point clouds of traffic scenes by merging 3D volumetric point clouds (“volumetric point clouds”) from multiple vehicles.

[0002] Automated vehicles use motion and vision sensors to capture images of their surroundings and create a 3D volumetric point cloud representation of the vehicle's environment and the vehicle's position within it. This 3D volumetric point cloud is limited by the single viewpoint provided by the vehicle. Furthermore, objects within the vehicle's field of view prevent the creation of a complete 3D volumetric point cloud of the surrounding environment. Finally, the ability of a vehicle to "see" and create a 3D volumetric point cloud of its surroundings using onboard motion and vision sensors is limited by the range of the onboard vision sensors.

[0003] Therefore, while current systems and methods achieve their intended purpose, there is a need for new and improved systems and methods for creating 3D volumetric point clouds of traffic scenes by merging multiple 3D volumetric point clouds created by multiple vehicles.

Summary of the Invention

[0004] According to several aspects of this disclosure, a method for creating a 3D volumetric scene includes: acquiring a first visual image from a first visual sensor on a first vehicle; acquiring first motion data from a plurality of first motion sensors on the first vehicle; generating a first scene point cloud using the first visual image and the first motion data via a first computer processor on the first vehicle; acquiring a second visual image from a second visual sensor on a second vehicle; acquiring second motion data from a plurality of second motion sensors on the second vehicle; generating a second scene point cloud using the second visual image and the second motion data via a second computer processor on the second vehicle; sending the first scene point cloud and the second scene point cloud to a third computer processor located within an edge / cloud infrastructure; and merging the first scene point cloud and the second scene point cloud via the third computer processor to create a "stitched point cloud".

[0005] According to another aspect, the method further includes: generating a first original point cloud using a first visual image via a first computer processor; generating a first coarsely transformed point cloud using first motion data via the first computer processor to transform the first original point cloud; generating a second original point cloud using a second visual image via a second computer processor; and generating a second coarsely transformed point cloud using second motion data via the second computer processor to transform the second original point cloud.

[0006] According to another aspect, the method further includes: generating a first scene point cloud using a high-definition map via a first computer processor, and applying a normal distribution transformation algorithm to the first coarsely transformed point cloud; generating a second scene point cloud using a high-definition map via a second computer processor, and applying the normal distribution transformation algorithm to the second coarsely transformed point cloud.

[0007] According to another aspect, generating a first scene point cloud using a high-definition map via a first computer processor and applying a normal distribution transformation algorithm to the first coarsely transformed point cloud further includes: deleting dynamic objects from the first coarsely transformed point cloud before applying the normal distribution transformation algorithm; generating a second scene point cloud using a high-definition map via a second computer processor and applying a normal distribution transformation algorithm to the second coarsely transformed point cloud further includes: deleting dynamic objects from the second coarsely transformed point cloud before applying the normal distribution transformation algorithm.

[0008] According to another aspect, generating a first scene point cloud using a high-definition map via a first computer processor and applying a normal distribution transformation algorithm to the first coarsely transformed point cloud further includes: reusing the obtained first transformation matrix by inserting the obtained first transformation matrix back into the normal distribution transformation algorithm to improve the accuracy of the first scene point cloud; generating a second scene point cloud using a high-definition map via a second computer processor and applying a normal distribution transformation algorithm to the second coarsely transformed point cloud further includes: reusing the obtained second transformation matrix by inserting the obtained second transformation matrix back into the normal distribution transformation algorithm to improve the accuracy of the second scene point cloud.

[0009] According to another aspect, the method further includes: generating a first original point cloud using a first visual image via a first computer processor; generating a first scene point cloud using the first motion data via the first computer processor to transform the first original point cloud; generating a second original point cloud using a second visual image via a second computer processor; and generating a second scene point cloud using the second motion data via the second computer processor to transform the second original point cloud.

[0010] According to another aspect, sending the first scene point cloud and the second scene point cloud to the third computer processor further includes: compressing the first scene point cloud by the first computer processor before sending it to the third computer processor, and decompressing the first scene point cloud by the third computer processor after it has been received; compressing the second scene point cloud by the second computer processor before sending it to the third computer processor, and decompressing the second scene point cloud by the third computer processor after it has been received.

[0011] According to another aspect, the first scene point cloud and the second scene point cloud are each compressed / decompressed using an octree-based point cloud compression method.

[0012] According to another aspect, the method further includes: after decompressing the first scene point cloud and the second scene point cloud, applying an overlap search algorithm to the first scene point cloud and the second scene point cloud via the third computer processor to identify the overlapping region between the first scene point cloud and the second scene point cloud.

[0013] According to another aspect, the method further includes: after identifying the overlapping region between the first scene point cloud and the second scene point cloud, applying, by the third computer processor, an iterative closest point-based point cloud alignment algorithm to the overlapping region between the first scene point cloud and the second scene point cloud.

[0014] According to several aspects of this disclosure, a system for creating a 3D volumetric scene includes: a first visual sensor located on a first vehicle to acquire a first visual image; a plurality of first motion sensors located on the first vehicle to acquire first motion data; a first computer processor located on the first vehicle to generate a first scene point cloud using the first visual image and the first motion data; a second visual sensor located on a second vehicle to acquire a second visual image; a plurality of second motion sensors located on the second vehicle to acquire second motion data; and a second computer processor located on the second vehicle to generate a second scene point cloud using the second visual image and the second motion data. The first computer processor is further configured to transmit the first scene point cloud to a third computer processor, and the second computer processor is further configured to transmit the second scene point cloud to the third computer processor. The third computer processor is located within an edge / cloud infrastructure and is configured to merge the first scene point cloud and the second scene point cloud to create a stitched point cloud.

[0015] According to another aspect, the first computer processor is further configured to: generate a first original point cloud using a first visual image, and generate a first coarsely transformed point cloud using first motion data to transform the first original point cloud; the second computer processor is further configured to: generate a second original point cloud using a second visual image; and generate a second coarsely transformed point cloud using second motion data to transform the second original point cloud.

[0016] According to another aspect, the first computer processor is also configured to: generate a first scene point cloud using a high-definition map, and apply a normal distribution transformation algorithm to the first coarsely transformed point cloud; the second computer processor is also configured to: generate a second scene point cloud using a high-definition map, and apply a normal distribution transformation algorithm to the second coarsely transformed point cloud.

[0017] According to another aspect, the first computer processor is further configured to: remove dynamic objects from the first coarsely transformed point cloud before applying the normal distribution transformation algorithm; the second computer processor is further configured to: remove dynamic objects from the second coarsely transformed point cloud before applying the normal distribution transformation algorithm.

[0018] According to another aspect, the first computer processor is further configured to: reuse the obtained first transformation matrix by inserting the obtained first transformation matrix back into the normal distribution transformation algorithm to improve the accuracy of the first scene point cloud; the second computer processor is further configured to: reuse the obtained second transformation matrix by inserting the obtained second transformation matrix back into the normal distribution transformation algorithm to improve the accuracy of the second scene point cloud.

[0019] According to another aspect, the first computer processor is further configured to: generate a first original point cloud using a first visual image; generate a first scene point cloud using first motion data to transform the first original point cloud; the second computer processor is further configured to: generate a second original point cloud using a second visual image; generate a second scene point cloud using second motion data to transform the second original point cloud.

[0020] According to another aspect, the first computer processor is further configured to: compress the first scene point cloud before sending it to the third computer processor, and the third computer processor is configured to: decompress the first scene point cloud after it has been received; the second computer processor is further configured to: compress the second scene point cloud before sending it to the third computer processor, and the third computer processor is configured to: decompress the second scene point cloud after it has been received.

[0021] According to another aspect, the first scene point cloud and the second scene point cloud are each compressed / decompressed by an octree-based point cloud compression method.

[0022] According to another aspect, the third computer processor is further configured to: after decompressing the first scene point cloud and the second scene point cloud, apply an overlap search algorithm to the first scene point cloud and the second scene point cloud to identify the overlapping region between the first scene point cloud and the second scene point cloud.

[0023] According to another aspect, the third computer processor is further configured to: after identifying the overlapping region between the first scene point cloud and the second scene point cloud, apply an iterative closest point-based point cloud alignment algorithm to the overlapping region between the first scene point cloud and the second scene point cloud.

[0024] Further areas of application will become apparent from the description provided herein. It should be understood that the description and specific examples are for illustrative purposes only and are not intended to limit the scope of this disclosure.

Brief Description of the Drawings

[0025] The accompanying drawings described herein are for illustrative purposes only and are not intended to limit the scope of this disclosure in any way.

[0026] Figure 1 is a schematic diagram of a system according to an exemplary embodiment of this disclosure;

[0027] Figure 2 is a diagram of a traffic intersection where multiple vehicles are present;

[0028] Figure 3 is a flowchart illustrating a method according to an exemplary embodiment;

[0029] Figure 4 is a flowchart illustrating a normal distribution transformation algorithm according to an exemplary embodiment; and

[0030] Figure 5 is a flowchart illustrating an iterative closest point-based point cloud alignment algorithm according to an exemplary embodiment.

Detailed Description

[0031] The following description is exemplary in nature and is not intended to limit this disclosure, application, or use.

[0032] Referring to Figure 1, a system 10 for creating a 3D volumetric scene includes: a first visual sensor 12 located on a first vehicle 14 to acquire a first visual image; and a plurality of first motion sensors 16 located on the first vehicle 14 to acquire first motion data. The system 10 also includes: a second visual sensor 18 located on a second vehicle 20 to acquire a second visual image; and a plurality of second motion sensors 22 located on the second vehicle 20 to acquire second motion data.

[0033] The first visual sensor 12 and the second visual sensor 18 can each consist of one or more different sensor types, including but not limited to cameras, radar, and LiDAR. Cameras and other sensors see and interpret objects on the road, much as a human driver does with their eyes. Typically, cameras are positioned at every angle around the vehicle to maintain a 360-degree field of view, providing a broader picture of the surrounding traffic conditions. The cameras capture highly detailed and realistic images and automatically detect, classify, and determine the distance from the vehicle to objects such as other cars, pedestrians, cyclists, traffic signs and signals, road markings, bridges, and guardrails.

[0034] Radar (Radio Detection and Ranging) sensors emit radio waves to detect objects and measure their distance and speed relative to the vehicle in real time. Both short-range and long-range radar sensors can be used. LiDAR (Light Detection and Ranging) sensors work similarly to radar sensors, the only difference being that they use lasers instead of radio waves. In addition to measuring the distance to various objects on the road, LiDAR can create 3D images of detected objects and map the surrounding environment. Furthermore, LiDAR can be configured to create a complete 360-degree map around the vehicle rather than relying on a narrow field of view.

[0035] The plurality of first motion sensors 16 and the plurality of second motion sensors 22 provide data related to the orientation and motion of the first vehicle 14 and the second vehicle 20, respectively. In an exemplary embodiment, the first motion sensors 16 and the second motion sensors 22 each include an inertial measurement unit (IMU) and a global positioning system (GPS). An IMU is an electronic device that uses a combination of accelerometers, gyroscopes, and magnetometers to measure and report the specific force, angular velocity, and sometimes the orientation of a body. IMUs are commonly used to maneuver aircraft (as part of an attitude and heading reference system), including unmanned aerial vehicles (UAVs), and spacecraft, including satellites and landers. Recent developments have allowed the production of GPS devices with integrated IMUs. An IMU enables a GPS receiver to keep operating when GPS signals are unavailable, such as in tunnels, inside buildings, or in the presence of electronic interference.

[0036] In land vehicles, an IMU can be integrated into a GPS-based automated navigation system or vehicle tracking system, giving the system dead reckoning capability and the ability to collect as much accurate data as possible about the vehicle's current speed, turning rate, heading, roll, and acceleration. In a navigation system, the data reported by the IMU is fed into a processor, which calculates attitude, velocity, and position. Angular rates from the gyroscopes are integrated to calculate angular position. This is fused with gravity vector measurements from the accelerometers in a Kalman filter to estimate attitude. The attitude estimate is used to transform acceleration measurements into an inertial reference frame (hence the term inertial navigation), where they are integrated once to obtain linear velocity and twice to obtain linear position. The Kalman filter applies an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables by estimating a joint probability distribution over the variables for each time frame. These estimates tend to be more accurate than estimates based on a single measurement alone.
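The Kalman filter behaviour described in this paragraph can be illustrated with a minimal scalar sketch. The function name `kalman_1d`, the noise variances, and the 30-degree heading are illustrative assumptions rather than values from this disclosure, and a vehicle IMU/GPS filter would track a multi-dimensional state instead of one number.

```python
import random


def kalman_1d(measurements, meas_var, process_var=1e-4, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying quantity (e.g. heading)."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant; uncertainty grows slightly.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


random.seed(0)
true_heading = 30.0  # degrees (hypothetical value)
noisy = [true_heading + random.gauss(0.0, 2.0) for _ in range(200)]
estimates = kalman_1d(noisy, meas_var=4.0, x0=noisy[0])
print(f"last raw reading: {noisy[-1]:.2f}, filtered: {estimates[-1]:.2f}")
```

Each update weights the new reading by the gain `k`, so the filtered value draws on the whole measurement history rather than on one noisy sample.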

[0037] A first computer processor 24 is located on the first vehicle 14 to generate a first scene point cloud using the first visual image and the first motion data. A second computer processor 26 is located on the second vehicle 20 to generate a second scene point cloud using the second visual image and the second motion data. The computer processors 24 and 26 described herein are non-general-purpose electronic control devices that have a pre-programmed digital computer or processor, memory or a non-transitory computer-readable medium for storing data, such as control logic, software applications, instructions, computer code, data, lookup tables, etc., and transceivers or input / output ports capable of sending / receiving data via WLAN, 4G, or 5G networks, etc. Computer-readable media include any type of media accessible by a computer, such as read-only memory (ROM), random access memory (RAM), hard disk drive, CD, DVD, or any other type of memory. "Non-transitory" computer-readable media do not include wired, wireless, optical, or other communication links that transmit transient electrical signals or other signals. Non-transitory computer-readable media include media that can permanently store data and media that can store data and subsequently overwrite it, such as rewritable optical discs or erasable storage devices. Computer code includes any type of program code, including source code, object code, and executable code.

[0038] A point cloud is a set of data points in space. These points can represent 3D shapes or objects. Each point has its own set of Cartesian coordinates (X, Y, Z). Point clouds have a wide range of applications, including creating 3D CAD models for manufactured parts, metrology and quality inspection, and various visualization, animation, rendering, and mass customization applications. In automotive applications, vehicles use data collected by motion and vision sensors to create point clouds, which are 3D representations of the vehicle's surroundings. 3D point clouds allow vehicles to "see" their environment, especially other vehicles nearby, enabling safe operation and navigation. This is especially important when the vehicle is an autonomous vehicle and its navigation is entirely controlled by its onboard systems.

[0039] The first computer processor 24 is further configured to send the first scene point cloud to a third computer processor 28, and the second computer processor 26 is further configured to send the second scene point cloud to the third computer processor 28. In an exemplary embodiment, the third computer processor 28 is located within the edge / cloud infrastructure 30 and is configured to merge the first and second scene point clouds to create a stitched point cloud. The stitched point cloud takes all data from the first and second scene point clouds, aligned and merged, to provide a more accurate 3D volumetric representation of the traffic scene.

[0040] Referring to Figure 2, an intersection 32 is shown, where a first vehicle 14 approaches the intersection from one direction, while a second vehicle 20 approaches from the opposite direction. The first vehicle 14 and the second vehicle 20 collect different data about the intersection 32 from visual sensors 12 and 18, and motion sensors 16 and 22 located on the first vehicle 14 and the second vehicle 20, respectively. Therefore, the first vehicle 14 and the second vehicle 20 will independently create different 3D volumetric representations of the intersection 32.

[0041] For example, the first vehicle 14 approaches intersection 32 from the north, the second vehicle 20 approaches intersection 32 from the south, and an emergency vehicle 34 is entering intersection 32 from the east. The visual sensor 12 and motion sensors 16 on the first vehicle 14 will easily detect the presence of the emergency vehicle 34. The first scene point cloud created by the first computer processor 24 will include the emergency vehicle 34, and the onboard system of the first vehicle 14 can react appropriately. However, a large tanker truck 36 passing through intersection 32 obstructs the visual sensor 18 on the second vehicle 20. The second scene point cloud created by the second computer processor 26 will not include the emergency vehicle 34. The second vehicle 20 will not be aware of the presence of the emergency vehicle 34 and therefore may not react appropriately to its presence. When the first and second scene point clouds are merged by the third computer processor 28, the resulting stitched point cloud will include features that are invisible to either the first vehicle 14 or the second vehicle 20 individually, such as the presence of the emergency vehicle 34. When the stitched point cloud is sent back to the first vehicle 14 and the second vehicle 20, both the first vehicle 14 and the second vehicle 20 will have a better 3D volumetric representation of the surrounding environment.

[0042] In one exemplary embodiment, the first computer processor 24 is further configured to: generate a first original point cloud using a first visual image; and generate a first coarsely transformed point cloud using first motion data to transform the first original point cloud. The first original point cloud is created in a coordinate system based on the first visual sensor 12, such as a LiDAR coordinate system. The first computer processor 24 uses position and orientation data of the first vehicle 14 collected by a plurality of first motion sensors 16 to transform the first original point cloud into a world coordinate system. The first coarsely transformed point cloud is based on the world coordinate system. Similarly, the second computer processor 26 is further configured to: generate a second original point cloud using a second visual image; and generate a second coarsely transformed point cloud based on the world coordinate system using second motion data to transform the second original point cloud.
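The coarse transformation from the sensor coordinate system to the world coordinate system amounts to a rigid transform built from the vehicle pose. The sketch below assumes, for brevity, a planar pose (position plus yaw only); the function name `sensor_to_world` and the pose values are hypothetical, and a full implementation would also use roll and pitch from the motion sensors.

```python
import math


def sensor_to_world(points, x, y, z, yaw):
    """Rotate sensor-frame points by the vehicle yaw, then translate by its position."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * px - s * py + x, s * px + c * py + y, pz + z)
            for px, py, pz in points]


# Original point cloud in the sensor (e.g. LiDAR) coordinate system.
raw_cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]

# Hypothetical pose from GPS/IMU: at (10, 5, 0), heading rotated 90 degrees.
coarse_cloud = sensor_to_world(raw_cloud, x=10.0, y=5.0, z=0.0, yaw=math.pi / 2)
print(coarse_cloud)
```

Because the pose comes from GPS and IMU readings, the result is only as accurate as those sensors, which is why the text calls this point cloud coarsely transformed.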

[0043] In another exemplary embodiment, the first computer processor 24 is further configured to: generate a first scene point cloud using a high-definition (HD) map, and apply a normal distribution transformation algorithm to the first coarsely transformed point cloud; the second computer processor 26 is further configured to: generate a second scene point cloud using the HD map, and apply a normal distribution transformation algorithm to the second coarsely transformed point cloud. Both the first computer processor 24 and the second computer processor 26 have access to HD maps of the vicinity in which the first vehicle 14 and the second vehicle 20 are traveling. The HD maps can be obtained by real-time download from a cloud-based source via a WLAN, 4G, or 5G network, or they can be stored in the memory of the first computer processor 24 and the second computer processor 26.

[0044] The first computer processor 24 uses the HD map to align the first coarsely transformed point cloud to create the first scene point cloud, which is more accurate than the first coarsely transformed point cloud. The second computer processor 26 uses the HD map to align the second coarsely transformed point cloud to create the second scene point cloud, which is more accurate than the second coarsely transformed point cloud. Furthermore, the HD map is aligned with the world coordinate system. Therefore, after the normal distribution transformation algorithm has been applied to the first and second coarsely transformed point clouds based on data from the HD map, the resulting first and second scene point clouds will be aligned with each other.
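To make the map alignment step more concrete, the sketch below shows only the scoring part of a normal-distribution-based alignment (often abbreviated NDT): the reference map is divided into voxels, a Gaussian is fitted per voxel, and a candidate scan pose is scored by how well its points fit those Gaussians. The function names, cell size, regularisation constant, and synthetic flat "map" are assumptions for illustration; the optimiser that searches for the best transformation matrix is omitted.

```python
from collections import defaultdict

import numpy as np


def build_ndt_cells(ref_points, cell_size, min_points=5):
    """Fit one Gaussian (mean, inverse covariance) per occupied voxel of the reference."""
    buckets = defaultdict(list)
    for p in ref_points:
        buckets[tuple(np.floor(p / cell_size).astype(int))].append(p)
    cells = {}
    for key, pts in buckets.items():
        if len(pts) < min_points:
            continue  # too few points for a stable covariance estimate
        pts = np.asarray(pts)
        cov = np.cov(pts.T) + 1e-3 * np.eye(3)  # regularise flat (planar) cells
        cells[key] = (pts.mean(axis=0), np.linalg.inv(cov))
    return cells


def ndt_score(scan_points, cells, cell_size):
    """Sum of Gaussian likelihood terms; higher means better agreement with the map."""
    score = 0.0
    for p in scan_points:
        cell = cells.get(tuple(np.floor(p / cell_size).astype(int)))
        if cell is None:
            continue  # point falls in a voxel the map knows nothing about
        mean, inv_cov = cell
        d = p - mean
        score += np.exp(-0.5 * d @ inv_cov @ d)
    return score


# Synthetic stand-in for HD map data: a flat 10 m x 10 m ground patch.
rng = np.random.default_rng(0)
hd_map = np.column_stack([rng.uniform(0, 10, 500), rng.uniform(0, 10, 500), np.zeros(500)])
cells = build_ndt_cells(hd_map, cell_size=2.0)

aligned_score = ndt_score(hd_map, cells, 2.0)
shifted_score = ndt_score(hd_map + np.array([0.0, 0.0, 0.5]), cells, 2.0)
print(aligned_score, shifted_score)
```

A scan that sits on the map surface scores far higher than the same scan lifted by 0.5 m, and a registration routine would adjust the transformation matrix to maximise this score.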

[0045] In one exemplary embodiment, the first computer processor 24 is further configured to: remove dynamic objects 38 from the first coarsely transformed point cloud before applying the normal distribution transformation algorithm; the second computer processor 26 is further configured to: remove dynamic objects 38 from the second coarsely transformed point cloud before applying the normal distribution transformation algorithm. The dynamic objects 38 in the first and second coarsely transformed point clouds become noise when the normal distribution transformation algorithm is applied. Therefore, when the normal distribution transformation algorithm is applied only to the static objects 40 in the first and second coarsely transformed point clouds, the resulting transformation matrix is more accurate.
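One simple way to picture the removal of dynamic objects is a filter over per-point class labels. The disclosure does not say how dynamic objects are identified, so the labels, the `DYNAMIC_LABELS` set, and the function name below are all hypothetical; they stand in for whatever upstream detector supplies that information.

```python
# Hypothetical class names for objects that move between scans.
DYNAMIC_LABELS = {"vehicle", "pedestrian", "cyclist"}


def split_static_dynamic(points, labels):
    """Separate points into static and dynamic lists based on their labels."""
    static, dynamic = [], []
    for point, label in zip(points, labels):
        (dynamic if label in DYNAMIC_LABELS else static).append(point)
    return static, dynamic


points = [(1.0, 2.0, 0.0), (4.0, 1.0, 0.5), (6.0, 3.0, 2.0)]
labels = ["road", "vehicle", "building"]
static_pts, dynamic_pts = split_static_dynamic(points, labels)
print(static_pts, dynamic_pts)
```

Only `static_pts` would be passed to the alignment step. Keeping `dynamic_pts` aside allows the same transformation to be applied to them afterwards so they can be returned to the scene point cloud.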

[0046] In yet another exemplary embodiment, the first computer processor 24 is further configured to: improve the accuracy of the first scene point cloud by reusing the obtained first transformation matrix, inserting it back into the normal distribution transformation algorithm; the second computer processor 26 is further configured to: improve the accuracy of the second scene point cloud by reusing the obtained second transformation matrix, inserting it back into the normal distribution transformation algorithm. The first and second transformation matrices that produce results satisfying the scoring threshold are reused as a baseline when the normal distribution transformation algorithm is reapplied. This results in fewer iterations of the normal distribution transformation algorithm, and the final first and second scene point clouds are more accurate. After the normal distribution transformation algorithm has been applied, the dynamic objects are reinserted into the first and second scene point clouds before they are sent to the third computer processor 28.

[0047] Finally, in yet another exemplary embodiment, the first computer processor 24 and the second computer processor 26 are configured to remove static data from the first scene point cloud and the second scene point cloud. This can be done to reduce the file size of the first and second scene point clouds wirelessly transmitted to the third computer processor 28. The third computer processor 28, like the first computer processor 24 and the second computer processor 26, has access to the HD map, and therefore static elements can be reinserted into the first and second scene point clouds after they have been transmitted to the third computer processor 28.

[0048] In an alternative exemplary embodiment of system 10, the first computer processor 24 is further configured to: generate a first original point cloud using a first visual image and generate a first scene point cloud using first motion data to transform the first original point cloud; the second computer processor 26 is configured to: generate a second original point cloud using a second visual image and generate a second scene point cloud using second motion data to transform the second original point cloud. The first and second original point clouds are created in a coordinate system based on the first visual sensor 12 and the second visual sensor 18, such as a LiDAR coordinate system. The first computer processor 24 uses position and orientation data of the first vehicle 14 collected by a plurality of first motion sensors 16, and the second computer processor 26 uses position and orientation data of the second vehicle 20 collected by a plurality of second motion sensors 22 to transform the first and second original point clouds to a world coordinate system. The first and second scene point clouds are based on a world coordinate system.

[0049] In one exemplary embodiment, the first computer processor 24 is further configured to: compress the first scene point cloud before sending it to the third computer processor 28, while the third computer processor 28 is configured to: decompress the first scene point cloud after it has been received. Similarly, the second computer processor 26 is further configured to: compress the second scene point cloud before sending it to the third computer processor 28, while the third computer processor 28 is configured to: decompress the second scene point cloud after it has been received. The first and second scene point clouds are compressed to reduce the file size wirelessly transmitted from the first computer processor 24 and the second computer processor 26 to the third computer processor 28. In one exemplary embodiment, the first and second scene point clouds are compressed / decompressed using an octree-based point cloud compression method.
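The idea behind octree-based point cloud compression can be sketched as follows: the bounding cube is split recursively into eight children, and each visited node is stored as a single byte whose bits mark which children contain points. This is a simplified, lossy illustration under assumed parameters (`origin`, `size`, `depth`, and all function names are hypothetical); practical codecs typically add entropy coding and per-point detail on top of the occupancy bytes.

```python
def _leaf_index(value, lo, size, n):
    """Quantise one coordinate to an integer leaf index in [0, n - 1]."""
    return min(n - 1, max(0, int((value - lo) / size * n)))


def _children(node):
    """Yield (bit, child_coordinates) for the eight children of a node."""
    nx, ny, nz = node
    for c in range(8):
        yield c, (2 * nx + (c & 1), 2 * ny + ((c >> 1) & 1), 2 * nz + ((c >> 2) & 1))


def octree_encode(points, origin, size, depth):
    """Encode points as breadth-first occupancy bytes, one byte per visited node."""
    n = 1 << depth
    leaves = {tuple(_leaf_index(p[a], origin[a], size, n) for a in range(3)) for p in points}
    out = bytearray()
    nodes = [(0, 0, 0)]
    for level in range(depth):
        shift = depth - level - 1
        occupied = {(x >> shift, y >> shift, z >> shift) for x, y, z in leaves}
        next_nodes = []
        for node in nodes:
            byte = 0
            for bit, child in _children(node):
                if child in occupied:
                    byte |= 1 << bit
                    next_nodes.append(child)
            out.append(byte)
        nodes = next_nodes
    return bytes(out)


def octree_decode(data, origin, size, depth):
    """Rebuild one point per occupied leaf voxel (the voxel centre)."""
    nodes, pos = [(0, 0, 0)], 0
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            byte = data[pos]
            pos += 1
            next_nodes.extend(child for bit, child in _children(node) if byte & (1 << bit))
        nodes = next_nodes
    cell = size / (1 << depth)
    return [tuple(origin[a] + (idx[a] + 0.5) * cell for a in range(3)) for idx in nodes]


cloud = [(0.1, 0.1, 0.1), (0.12, 0.1, 0.1), (7.9, 3.3, 1.0)]
packed = octree_encode(cloud, origin=(0.0, 0.0, 0.0), size=8.0, depth=4)
restored = octree_decode(packed, origin=(0.0, 0.0, 0.0), size=8.0, depth=4)
print(len(packed), restored)
```

The depth sets the trade-off: each extra level halves the voxel size, and so the worst-case position error, at the cost of more occupancy bytes. Points that fall into the same leaf voxel are merged into one.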

[0050] In another exemplary embodiment, the third computer processor 28 is further configured to: after decompressing the first scene point cloud and the second scene point cloud, apply an overlap search algorithm to the first scene point cloud and the second scene point cloud to identify overlapping regions between the first scene point cloud and the second scene point cloud. The first scene point cloud and the second scene point cloud include different data resulting from different fields of view provided by the first visual sensor 12 and the second visual sensor 18 within the first vehicle 14 and the second vehicle 20. The overlap search algorithm identifies data points appearing in the first scene point cloud and the second scene point cloud to identify overlapping regions between the first scene point cloud and the second scene point cloud.
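One simple way to picture an overlap search is to voxelize both scene point clouds and keep the points that fall in voxels occupied by both. The sketch below takes that approach; the disclosure does not state which overlap search algorithm is used, so the voxel-hash strategy, the function names, and the voxel size are assumptions.

```python
import numpy as np

def voxel_keys(points, voxel_size):
    """Set of integer voxel indices occupied by an (N, 3) point cloud."""
    return set(map(tuple, np.floor(points / voxel_size).astype(int)))

def overlap_region(cloud_a, cloud_b, voxel_size=1.0):
    """Return the points of each cloud that lie in voxels occupied by both clouds."""
    shared = voxel_keys(cloud_a, voxel_size) & voxel_keys(cloud_b, voxel_size)

    def in_shared(points):
        keys = np.floor(points / voxel_size).astype(int)
        mask = np.array([tuple(k) in shared for k in keys], dtype=bool)
        return points[mask]

    return in_shared(cloud_a), in_shared(cloud_b)

# Hypothetical example: only the voxel with index (1, 0, 0) is seen by both vehicles.
first_scene = np.array([[0.2, 0.2, 0.0], [1.5, 0.5, 0.0], [5.5, 5.5, 0.0]])
second_scene = np.array([[1.6, 0.4, 0.1], [9.0, 9.0, 0.0]])
first_overlap, second_overlap = overlap_region(first_scene, second_scene)
```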

[0051] The third computer processor 28 is further configured to: after identifying the overlapping region between the first scene point cloud and the second scene point cloud, apply an iterative nearest point alignment algorithm (commonly known as iterative closest point, ICP) to the overlapping region between the first scene point cloud and the second scene point cloud. The iterative nearest point alignment algorithm aligns the first scene point cloud and the second scene point cloud based on overlapping or common data points to orient the first scene point cloud and the second scene point cloud to a common coordinate system.

[0052] Referring to Figure 3, a method 100 for creating a 3D volumetric scene using the aforementioned system 10 includes: starting from block 102, acquiring a first visual image from a first visual sensor 12 on a first vehicle 14 and acquiring a second visual image from a second visual sensor 18 on a second vehicle 20; moving to block 104, acquiring first motion data from a plurality of first motion sensors 16 on the first vehicle 14 and acquiring second motion data from a plurality of second motion sensors 22 on the second vehicle 20; moving to block 106, generating a first scene point cloud using the first visual image and the first motion data via a first computer processor 24 on the first vehicle 14, and generating a second scene point cloud using the second visual image and the second motion data via a second computer processor 26 on the second vehicle 20; and, moving to block 108, generating a first original point cloud using the first visual image via the first computer processor 24, and generating a second original point cloud using the second visual image via the second computer processor 26.

[0053] Moving to blocks 110 and 112, the above method 100 includes: sending the first scene point cloud and the second scene point cloud to a third computer processor 28 located within the edge / cloud infrastructure 30. Moving to blocks 114 and 116, the first scene point cloud and the second scene point cloud are merged to create a stitched point cloud.

[0054] Starting from block 108, in an exemplary embodiment of method 100, the process moves to block 118. At block 106, a first scene point cloud is generated using a first visual image and first motion data via a first computer processor 24 on the first vehicle 14, and a second scene point cloud is generated using a second visual image and second motion data via a second computer processor 26 on the second vehicle 20. This includes: generating a first coarsely transformed point cloud using the first motion data via the first computer processor 24 to transform the first original point cloud, and generating a second coarsely transformed point cloud using the second motion data via the second computer processor 26 to transform the second original point cloud. This transformation aligns the first and second coarsely transformed point clouds with the world coordinate system.

[0055] Moving to block 120, the method further includes: generating the first scene point cloud via the first computer processor 24 by using a high-definition (HD) map and applying a normal distribution transformation algorithm (commonly known as the normal distributions transform, NDT) to the first coarsely transformed point cloud; and generating the second scene point cloud via the second computer processor 26 by using the HD map and applying the normal distribution transformation algorithm to the second coarsely transformed point cloud. In an exemplary embodiment, dynamic objects 38 are removed from the first and second coarsely transformed point clouds before the normal distribution transformation algorithm is applied. A portion of the normal distribution transformation algorithm includes: reusing the first transformation matrix obtained by applying the normal distribution transformation algorithm, by inserting it back into the normal distribution transformation algorithm, to improve the accuracy of the first scene point cloud; and reusing the second transformation matrix obtained by applying the normal distribution transformation algorithm, by inserting it back into the normal distribution transformation algorithm, to improve the accuracy of the second scene point cloud.

[0056] Referring to Figure 4, flowchart 122 illustrates the application of the normal distribution transformation algorithm, including: starting from block 124, voxelizing the first coarsely transformed point cloud and the second coarsely transformed point cloud; and, moving to block 126, performing probability distribution modeling on each voxel of the first coarsely transformed point cloud and the second coarsely transformed point cloud using a formula:

[0057]
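The voxelization and per-voxel probability modeling of blocks 124 and 126 can be sketched as follows. The sketch assumes the usual form used by normal distributions transform implementations, in which each voxel is summarized by the mean and covariance of its points and evaluated with a multivariate normal density; this is an assumption and is not asserted to be the formula of this disclosure. The function names and the minimum point count are hypothetical.

```python
import numpy as np
from collections import defaultdict

def voxel_gaussians(points, voxel_size, min_points=4):
    """Fit a mean and covariance to the points inside each occupied voxel."""
    buckets = defaultdict(list)
    for point in points:
        key = tuple(np.floor(point / voxel_size).astype(int))
        buckets[key].append(point)
    models = {}
    for key, members in buckets.items():
        if len(members) >= min_points:  # too few points give a degenerate covariance
            members = np.array(members)
            models[key] = (members.mean(axis=0), np.cov(members.T))
    return models

def gaussian_density(x, mean, cov):
    """Multivariate normal density of point x under N(mean, cov)."""
    diff = x - mean
    dim = len(mean)
    norm = np.sqrt((2.0 * np.pi) ** dim * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

# Hypothetical example: 200 random points that all fall in the voxel with index (0, 0, 0).
rng = np.random.default_rng(0)
sample_points = rng.uniform(0.0, 1.0, (200, 3))
models = voxel_gaussians(sample_points, voxel_size=1.0)
voxel_mean, voxel_cov = models[(0, 0, 0)]
```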

[0058] Moving to block 128, as described above, dynamic objects 38 are removed from the first and second coarsely transformed point clouds before applying the normal distribution transformation algorithm. Dynamic objects 38 in the first and second coarsely transformed point clouds become noise when the normal distribution transformation algorithm is applied. Therefore, the transformation matrix obtained is more accurate when the normal distribution transformation algorithm is applied only to static objects 40 in the first and second coarsely transformed point clouds.
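The removal of dynamic objects before registration, and their later reinsertion, can be pictured with the sketch below. It assumes that a per-point dynamic/static flag is already available (for example from an object detector) and that the transformation matrix found on the static points is also applied to the dynamic points when they are put back. Neither assumption is stated in this disclosure, and the function names are hypothetical.

```python
import numpy as np

def split_dynamic(points, is_dynamic):
    """Split an (N, 3) point cloud into static and dynamic points using a boolean flag per point."""
    is_dynamic = np.asarray(is_dynamic, dtype=bool)
    return points[~is_dynamic], points[is_dynamic]

def reinsert_dynamic(static_aligned, dynamic_points, transform):
    """Apply a 4x4 homogeneous transform to the dynamic points and merge them back."""
    homogeneous = np.hstack([dynamic_points, np.ones((len(dynamic_points), 1))])
    moved = (homogeneous @ transform.T)[:, :3]
    return np.vstack([static_aligned, moved])

# Hypothetical example: the second point belongs to a moving vehicle.
coarse_cloud = np.array([[0.0, 0.0, 0.0], [4.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 0.0, 1.0]])
static_points, dynamic_points = split_dynamic(coarse_cloud, [False, True, False, False])

# Stand-in for the matrix produced by registration on the static points: shift by +1 m in x.
transform = np.eye(4)
transform[0, 3] = 1.0
static_aligned = static_points + np.array([1.0, 0.0, 0.0])
scene_cloud = reinsert_dynamic(static_aligned, dynamic_points, transform)
```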

[0059] Moving to block 130, the normal distribution transformation algorithm is applied. Moving to block 132, the first transformation matrix obtained by applying the normal distribution transformation algorithm to the first coarsely transformed point cloud is scored by calculating the probability of each source point residing in the corresponding voxel, using the formula:

[0060]

[0061] At block 134, the scores obtained from each of the first and second transformation matrices are compared with a threshold. If the score is lower than the threshold, move to block 136 and iteratively repeat the process until the score of the resulting transformation matrix is higher than the threshold. When the score at block 134 is higher than the threshold, move to block 138 and re-insert the well-scoring first transformation matrix obtained by applying the normal distribution transformation algorithm into the normal distribution transformation algorithm to improve the accuracy of the first scene point cloud. Similarly, when the score at block 134 is higher than the threshold, move to block 138 and re-insert the well-scoring second transformation matrix obtained by applying the normal distribution transformation algorithm into the normal distribution transformation algorithm to improve the accuracy of the second scene point cloud.

[0062] Moving to block 140, the normal distribution transformation algorithm is applied again, using as a baseline the first and second transformation matrices that satisfied the scoring threshold. This results in fewer iterations of the normal distribution transformation algorithm and more accurate first and second scene point clouds. After the normal distribution transformation algorithm has been applied, and before the first and second scene point clouds are sent to the third computer processor 28, the dynamic objects are put back into the first and second scene point clouds.
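The score-and-repeat control flow of blocks 134 to 140 can be sketched independently of the registration details. In the sketch below, `refine_step` and `score_fn` are hypothetical callables standing in for one pass of the normal distribution transformation algorithm and for the scoring of block 132; the one-dimensional toy problem only illustrates why restarting from a well-scoring result needs fewer iterations.

```python
def register_until_good(initial_transform, refine_step, score_fn, threshold, max_iterations=100):
    """Repeat the refinement until the score reaches the threshold (blocks 134 and 136)."""
    transform = initial_transform
    score = score_fn(transform)
    iterations = 0
    while score < threshold and iterations < max_iterations:
        transform = refine_step(transform)
        score = score_fn(transform)
        iterations += 1
    return transform, score, iterations

# Toy stand-ins (assumptions): the "transform" is a 1-D offset whose true value is 3.0.
def refine_step(offset):
    return offset + 0.5 * (3.0 - offset)  # move halfway towards the true offset

def score_fn(offset):
    return 1.0 / (1.0 + abs(3.0 - offset))  # equals 1.0 for a perfect alignment

# First pass: start from the coarse estimate.
first_result, first_score, first_iterations = register_until_good(0.0, refine_step, score_fn, 0.9)

# Second pass (blocks 138 and 140): reuse the well-scoring result as the baseline.
second_result, second_score, second_iterations = register_until_good(
    first_result, refine_step, score_fn, 0.9
)
```

In this toy run the first pass needs five refinements, while the second pass starts above the threshold and needs none.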

[0063] Finally, in another exemplary embodiment, moving to block 142, the first computer processor 24 and the second computer processor 26 remove static data from the first scene point cloud and the second scene point cloud. This reduces the file size of the first and second scene point clouds wirelessly transmitted to the third computer processor 28. The third computer processor 28, like the first computer processor 24 and the second computer processor 26, has access to the HD map, and therefore static elements can be reinserted into the first and second scene point clouds after they have been transmitted to the third computer processor 28.
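A minimal sketch of this idea follows, assuming that the HD map is available as a point set in the world coordinate system and that a scene point counts as static when its voxel is also occupied in the HD map. The voxel test, the function names, and the voxel size are assumptions made for illustration.

```python
import numpy as np

def voxel_index(points, voxel_size):
    """Integer voxel index of every point in an (N, 3) array."""
    return np.floor(points / voxel_size).astype(int)

def strip_static(scene_points, hd_map_points, voxel_size=0.5):
    """On the vehicle: keep only scene points whose voxel is not occupied in the HD map."""
    map_voxels = set(map(tuple, voxel_index(hd_map_points, voxel_size)))
    keep = np.array(
        [tuple(v) not in map_voxels for v in voxel_index(scene_points, voxel_size)],
        dtype=bool,
    )
    return scene_points[keep]

def restore_static(reduced_points, hd_map_points):
    """In the edge / cloud infrastructure: put the static elements back from the HD map."""
    return np.vstack([reduced_points, hd_map_points])

# Hypothetical example: two map points, and a scene with one point not present in the map.
hd_map = np.array([[0.1, 0.1, 0.0], [2.1, 2.1, 0.0]])
scene = np.array([[0.2, 0.2, 0.1], [5.0, 5.0, 0.0], [2.3, 2.2, 0.2]])
reduced = strip_static(scene, hd_map)       # transmitted wirelessly
restored = restore_static(reduced, hd_map)  # rebuilt after transmission
```

Note that the restored static points come from the HD map rather than from the vehicle's own measurements, so this variant trades measurement detail for bandwidth.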

[0064] Starting again from block 108, in another exemplary embodiment of method 100, the process moves to block 144. At block 106, a first scene point cloud is generated using a first visual image and first motion data via a first computer processor 24 on the first vehicle 14, and a second scene point cloud is generated using a second visual image and second motion data via a second computer processor 26 on the second vehicle 20. This includes: generating the first scene point cloud using the first motion data via the first computer processor 24 to transform the first original point cloud, and generating the second scene point cloud using the second motion data via the second computer processor 26 to transform the second original point cloud. This transformation aligns the first and second scene point clouds with the world coordinate system.

[0065] Moving to block 146, at block 112, before sending the first scene point cloud and the second scene point cloud to the third computer processor 28, the first computer processor compresses the first scene point cloud, and the second computer processor compresses the second scene point cloud. Moving to block 148, after being received by the third computer processor 28, the third computer processor 28 decompresses the first scene point cloud and the second scene point cloud. In an exemplary embodiment, the compression / decompression of the first scene point cloud and the second scene point cloud is performed using an octree-based point cloud compression method.

[0066] Moving to block 150, the above method includes: after decompressing the first scene point cloud and the second scene point cloud, applying an overlap search algorithm to the first scene point cloud and the second scene point cloud using a third computer processor 28 to identify the overlapping region between the first scene point cloud and the second scene point cloud. The overlap search algorithm identifies data points appearing in the first scene point cloud and the second scene point cloud to identify the overlapping region between the first scene point cloud and the second scene point cloud.

[0067] Moving to block 152, the method described above includes: after identifying the overlapping region between the first scene point cloud and the second scene point cloud, applying an iterative nearest-point alignment algorithm to the overlapping region between the first scene point cloud and the second scene point cloud via a third computer processor 28. The iterative nearest-point alignment algorithm aligns the first scene point cloud and the second scene point cloud based on overlapping or common data points to orient the first scene point cloud and the second scene point cloud to a common coordinate system. It should be understood that the system 10 and the method 100 described herein are applicable to collecting data from any number of vehicles. Any such equipped vehicle can upload data to the third computer processor 28.

[0068] Referring to Figure 5, the flowchart illustrates the application of an iterative nearest-point point cloud alignment algorithm. Starting at block 154, first and second scene point clouds are obtained, and correspondence matching between the target and source scene point clouds begins. Moving to block 156, the transformation matrix is estimated, and at block 158, the transformation is applied. Moving to block 160, the transformation error is compared with an error threshold using a formula:

[0069]

[0070] If the error is greater than the threshold, return to block 152 and repeat the process until a transformation matrix with an error not exceeding the threshold is obtained at block 162, which means that the first point cloud and the second point cloud are aligned in a common coordinate system.
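The loop of correspondence matching, transformation estimation, transformation application, and error check can be sketched as a basic point-to-point iterative closest point routine. The sketch assumes brute-force nearest-neighbor matching, an SVD-based (Kabsch) rigid transform estimate, and the mean distance between matched points as the error; the error formula of this disclosure is not reproduced here, and the function names and threshold are hypothetical.

```python
import numpy as np

def best_rigid_transform(source, target):
    """Least-squares rotation and translation mapping matched source points onto target points."""
    src_mean, tgt_mean = source.mean(axis=0), target.mean(axis=0)
    h = (source - src_mean).T @ (target - tgt_mean)
    u, _, vt = np.linalg.svd(h)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0  # guard against reflections
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rotation, tgt_mean - rotation @ src_mean

def icp(source, target, error_threshold=1e-6, max_iterations=50):
    """Align source to target; returns the aligned source points and the final error."""
    aligned = source.copy()
    error = np.inf
    for _ in range(max_iterations):
        # Correspondence matching: nearest target point for every source point.
        dists = np.linalg.norm(aligned[:, None, :] - target[None, :, :], axis=2)
        nearest = target[dists.argmin(axis=1)]
        # Estimate the transformation and apply it.
        rotation, translation = best_rigid_transform(aligned, nearest)
        aligned = aligned @ rotation.T + translation
        # Compare the error with the threshold.
        error = np.mean(np.linalg.norm(aligned - nearest, axis=1))
        if error <= error_threshold:
            break
    return aligned, error

# Hypothetical example: the source is the target rotated by 2 degrees and slightly shifted.
target_cloud = np.array(
    [[x, y, z] for x in (0.0, 10.0) for y in (0.0, 10.0) for z in (0.0, 10.0)]
)
angle = np.deg2rad(2.0)
rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
source_cloud = target_cloud @ rz.T + np.array([0.3, -0.2, 0.1])
aligned_cloud, final_error = icp(source_cloud, target_cloud)
```

Brute-force matching is quadratic in the number of points; a spatial index such as a k-d tree would normally replace it for real scene point clouds.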

[0071] As described above, the method described herein is not only applicable to the first vehicle 14 and the second vehicle 20. When scene point clouds are obtained for multiple applicable vehicles, one scene point cloud is designated as the source point cloud and all other scene point clouds are designated as target point clouds. An iterative nearest-point point cloud algorithm aligns each target scene point cloud to the source scene point cloud. Upon completion, all received scene point clouds are aligned to the coordinate system of the source scene point cloud.

[0072] The methods and systems disclosed herein have the advantage of providing more accurate 3D volumetric point clouds of the vehicle environment, enabling vehicles to make more accurate navigation and safety decisions.

[0073] The description in this disclosure is exemplary in nature only, and variations thereof that do not depart from the spirit and scope of this disclosure are intended to fall within its scope. Such variations should not be considered as departing from the spirit and scope of this disclosure.

Claims

1. A method for creating a 3D volumetric scene, comprising: acquiring a first visual image from a first visual sensor on a first vehicle; acquiring first motion data from multiple first motion sensors on the first vehicle, and generating a first coarsely transformed point cloud based on the first motion data; generating a first scene point cloud using the first visual image and the first motion data via a first computer processor on the first vehicle, wherein the first scene point cloud is created by aligning the first coarsely transformed point cloud with an HD map, a normal distribution transformation algorithm being applied to the first coarsely transformed point cloud after dynamic objects have been deleted from the first coarsely transformed point cloud; acquiring a second visual image from a second visual sensor on a second vehicle; acquiring second motion data from multiple second motion sensors on the second vehicle, and generating a second coarsely transformed point cloud based on the second motion data; generating a second scene point cloud using the second visual image and the second motion data via a second computer processor on the second vehicle, wherein the second scene point cloud is created by aligning the second coarsely transformed point cloud with the HD map, the normal distribution transformation algorithm being applied to the second coarsely transformed point cloud after dynamic objects have been deleted from the second coarsely transformed point cloud; wherein, after the normal distribution transformation algorithm has been applied to the first scene point cloud and the second scene point cloud based on data from the HD map, the first scene point cloud and the second scene point cloud are aligned with each other; putting the dynamic objects back into the first scene point cloud and the second scene point cloud, and then sending the first scene point cloud and the second scene point cloud to a third computer processor located within an edge / cloud infrastructure; and merging, via the third computer processor, the first scene point cloud and the second scene point cloud to create a stitched point cloud.

2. The method according to claim 1, further comprising: generating, via the first computer processor, a first original point cloud using the first visual image; transforming, via the first computer processor, the first original point cloud to generate the first coarsely transformed point cloud; generating, via the second computer processor, a second original point cloud using the second visual image; and transforming, via the second computer processor, the second original point cloud to generate the second coarsely transformed point cloud.

3. The method according to claim 1, characterized in that: the step of generating the first scene point cloud using a high-definition map via the first computer processor and applying the normal distribution transformation algorithm to the first coarsely transformed point cloud further includes: reusing the obtained first transformation matrix by inserting it back into the normal distribution transformation algorithm to improve the accuracy of the first scene point cloud; and the step of generating the second scene point cloud using a high-definition map via the second computer processor and applying the normal distribution transformation algorithm to the second coarsely transformed point cloud further includes: reusing the obtained second transformation matrix by inserting it back into the normal distribution transformation algorithm to improve the accuracy of the second scene point cloud.

4. The method according to claim 1, further comprising: generating, via the first computer processor, a first original point cloud using the first visual image; generating, via the first computer processor, the first scene point cloud by using the first motion data to transform the first original point cloud; generating, via the second computer processor, a second original point cloud using the second visual image; and generating, via the second computer processor, the second scene point cloud by using the second motion data to transform the second original point cloud.

5. The method according to claim 4, characterized in that sending the first scene point cloud and the second scene point cloud to the third computer processor also includes: before sending the first scene point cloud to the third computer processor, compressing the first scene point cloud via the first computer processor, and after sending the first scene point cloud to the third computer processor, decompressing the first scene point cloud via the third computer processor; and before sending the second scene point cloud to the third computer processor, compressing the second scene point cloud via the second computer processor, and after sending the second scene point cloud to the third computer processor, decompressing the second scene point cloud via the third computer processor.

6. The method according to claim 5, characterized in that the first scene point cloud and the second scene point cloud are compressed / decompressed using an octree-based point cloud compression method.

7. The method according to claim 5, further comprising: after decompressing the first scene point cloud and the second scene point cloud, identifying the overlapping region between the first scene point cloud and the second scene point cloud by using the third computer processor to apply an overlap search algorithm to the first scene point cloud and the second scene point cloud.

8. The method according to claim 7, further comprising: after identifying the overlapping region between the first scene point cloud and the second scene point cloud, applying, via the third computer processor, an iterative nearest point alignment algorithm to the overlapping region between the first scene point cloud and the second scene point cloud.