A multi-sensor data fusion method and device
By employing the Hungarian algorithm and ID memory scheme for target matching and fusion across multiple sensors, the problems of data error and computational burden when using a single sensor are solved. This enables low-computational-load multi-camera fusion and target ID management, adapts to different camera configurations, and improves the engineering sophistication of the algorithm.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: WUHAN JIMU INTELLIGENT TECH CO LTD
- Filing Date: 2022-11-18
- Publication Date: 2026-06-23
AI Technical Summary
With a single sensor type (cameras only), ranging and velocity measurement errors are large, so existing Kalman-filter tracking fusion still produces large data errors while adding computational load; in addition, its complex track management logic hinders engineering implementation of the algorithm.
The Hungarian algorithm is used to perform target matching and fusion for multiple adjacent sensors. It uses fused ID tags and single-view ID tags, and solves the problem of target ID duplication and confusion through ID memory and update scheme, thereby reducing the amount of computation and adapting to different numbers of cameras.
It achieves low computational cost multi-sensor data fusion, solves the problems of disordered matching order and duplicate target IDs among multiple cameras, adapts to different camera configurations, and improves the engineering level of the algorithm.
Figure CN115795218B_ABST
Description
Technical Field
[0001] This invention belongs to the field of vehicle sensor data processing technology, and specifically relates to a multi-sensor data fusion method and apparatus.
Background Technology
[0002] Post-fusion techniques for vehicle sensors vary from company to company. Common approaches use the Hungarian algorithm together with Kalman filtering to fuse and track targets. The perception fusion module of Apollo (Baidu's autonomous driving platform) uses the Hungarian algorithm to match the objects extracted by different sensors; the matched data is then fused using Kalman filtering, assigned a corresponding track_id (tracking fusion ID), and categorized and stored. Tracking objects (track_ids) are created and saved for newly matched objects, and at the end of each frame, targets that failed to be tracked are removed. The data is divided into two main parts: track (tracked fusion targets) and sensor (single-sensor targets). The former are targets that have already been fused and tracked; the latter are targets still to be fused. The matching scores between all targets to be fused and all tracked targets are calculated and matched using the Hungarian algorithm; once the matching relationships are obtained, pairwise fusion is performed using Kalman filtering and the track is updated. Because these techniques apply Kalman filtering to data matched by the Hungarian algorithm, they suit fusion tasks involving multiple sensor types. They are less practical with a single sensor type (cameras only), where ranging and velocity measurement errors are large: the fused data still carries substantial error after Kalman-filter tracking, so tracking brings little benefit while increasing the computational load. Furthermore, the trajectory management logic of the existing technology is complex and requires extensive computation during matching, which hinders engineering implementation of the algorithm.
Summary of the Invention
[0003] In view of this, the present invention provides a multi-sensor data fusion method. It performs one-to-one sequential matching between adjacent lenses and updates the global fusion points, which solves the problem of disordered matching order among multiple cameras and adapts to different numbers of selected cameras; the method is self-contained and plug-and-play. An ID memory and update scheme solves the problem of duplicated and disordered target IDs, and a fusion memory function solves the problem of targets jumping in a single frame.
[0004] To achieve the above-mentioned technical objectives, the specific technical solution adopted by the present invention is as follows:
[0005] A multi-sensor data fusion method is provided for processing detection data of a target from multiple adjacent sensors, wherein the sensor group includes at least two sensors for sensing the target; the multi-sensor data fusion method is as follows:
[0006] Extract the target information of each target obtained by each sensor; based on the Hungarian algorithm, fuse the same target perceived by two sensors that share a sensing area, as recorded in the target information, to generate a fused target;
[0007] During the operation of the sensor group, real-time data fusion is performed on the fused target based on the relevant data of the two pieces of target information that make up the fused target;
[0008] Wherein: the fusion ID tag that marks the fused target is inherited from whichever of the two targets forming the fused target has the longer survival time in its sensor;
[0009] The method for marking single-view targets detected by the individual sensors is as follows:
[0010] When the single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID.
[0011] When the single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, to ensure that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat.
[0012] Furthermore, the method for determining whether the single-view target and the fused target before the timeline are the same target is as follows:
[0013] The fused target of the previous frame records the IDs of the single-view targets it was built from. The current single-view target ID is compared against these recorded IDs to find the previous-frame fused target corresponding to the single-view target; the IoU (intersection over union) and the position state error between the current single-view target and that previous-frame fused target are then calculated to confirm whether they are the same target.
[0014] Furthermore, the method to ensure that the fused ID tags and the single-view ID tags do not repeat is as follows: each time the sensor group senses a frame, all fused ID tags and single-view ID tags of the sensor group from the previous frame are extracted and carried forward; all ID tags in the same frame are then checked for duplicates. If duplicates exist, the tag rolls over to a new ID, which is assigned to the newly appearing target. If a fused ID tag and / or single-view ID tag was changed in the previous frame and its target still exists in the current frame, the change is carried forward.
[0015] Furthermore, the sensor is a vision sensor.
[0016] Furthermore, the target information includes time-calibrated image frame information.
[0017] Furthermore, the relevant data includes the sensor's sensing time, the target's ID, the target's category, the target's world coordinates, and the target's movement speed in those world coordinates.
[0018] Furthermore, the fused ID tags and the single-view ID tags are assigned on a rolling basis.
[0019] Furthermore, the rolling setting method is as follows: when all the sensors fail to detect a fused target or a single-view target, the fused target or the single-view target is set as a failed target and the fused ID tag or the single-view ID tag corresponding to the failed target is deleted.
[0020] Furthermore, the method for determining the order in which the target is perceived by each of the sensors is based on the survival time of the target during the perception of each of the sensors.
[0021] The present invention also provides a multi-sensor data fusion device for processing detection data of a target from multiple adjacent sensors, wherein the sensor group includes at least two sensors for sensing the target; the device includes:
[0022] The target fusion module is used to extract target information of each target obtained by each of the sensors; based on the Hungarian algorithm, it fuses the same targets perceived by two sensors with the same sensing area in the target information to generate a fused target;
[0023] The data fusion module is used to perform real-time data fusion of the fusion target based on the relevant data of the two target information that make up the fusion target during the operation of the sensor group;
[0024] Wherein: the fusion ID tag that marks the fused target is inherited from whichever of the two targets forming the fused target has the longer survival time in its sensor;
[0025] The marking module is used to mark single-view targets detected by the individual sensors through the following steps:
[0026] When the single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID.
[0027] When the single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, to ensure that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat.
[0028] By adopting the above technical solution, the present invention can bring the following beneficial effects:
[0029] This invention does not use Kalman filtering for tracking fusion, resulting in significantly lower computational cost compared to mainstream algorithms. Instead, it employs the Hungarian algorithm to perform pairwise target matching between two cameras with overlapping regions, reducing the number of targets in a single match. Furthermore, this invention only tracks targets that have already been matched and fused. Targets not simultaneously present in multiple cameras are added to the global target list by simply modifying their IDs. Since each individual camera already tracks the target, targets that do not require fusion are not tracked, avoiding redundant computation caused by secondary tracking.
Attached Figure Description
[0030] To more clearly illustrate the technical solutions of the embodiments of this disclosure, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0031] Figure 1 is a flowchart illustrating the multi-sensor target marking and output in a specific embodiment of the present invention.
[0032] Figure 2 is a flowchart illustrating the matching process for two sensors with overlapping areas in a specific embodiment of the present invention.
[0033] Figure 3 is a structural block diagram of a multi-sensor data fusion device in a specific embodiment of the present invention.
Detailed Implementation
[0034] The embodiments of this disclosure will now be described in detail with reference to the accompanying drawings.
[0035] The following specific examples illustrate the implementation of this disclosure. Those skilled in the art can easily understand other advantages and effects of this disclosure from the content disclosed in this specification. Obviously, the described embodiments are only a part of the embodiments of this disclosure, and not all of them. This disclosure can also be implemented or applied through other different specific embodiments, and the details in this specification can also be modified or changed based on different viewpoints and applications without departing from the spirit of this disclosure. It should be noted that, in the absence of conflict, the following embodiments and features in the embodiments can be combined with each other. Based on the embodiments in this disclosure, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this disclosure.
[0036] It should be noted that various aspects of embodiments within the scope of the appended claims are described below. It will be apparent that the aspects described herein can be embodied in a wide variety of forms, and any particular structure and / or function described herein is merely illustrative. Based on this disclosure, those skilled in the art will understand that one aspect described herein can be implemented independently of any other aspect, and two or more of these aspects can be combined in various ways. For example, any number of aspects set forth herein can be used to implement the device and / or practice the method. Additionally, this device and / or method can be implemented using other structures and / or functionalities besides one or more of the aspects set forth herein.
[0037] It should also be noted that the illustrations provided in the following embodiments are only schematic representations of the basic concept of this disclosure. The illustrations only show the components related to this disclosure and are not drawn according to the number, shape and size of the components in actual implementation. In actual implementation, the form, quantity and proportion of each component can be arbitrarily changed, and the layout of the components may also be more complex.
[0038] Furthermore, specific details are provided in the following description to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the described aspects can be practiced without these specific details.
[0039] In one embodiment of the present invention, a multi-sensor data fusion method is proposed for processing detection data of a target from multiple adjacent sensors. The sensor group includes at least two sensors for sensing the target. The multi-sensor data fusion method is as follows:
[0040] Extract target information of each target obtained by each sensor; based on the Hungarian algorithm, fuse the same targets perceived by two sensors with the same sensing area in the target information to generate fused targets;
[0041] During the operation of the sensor array, real-time data fusion of the fusion target is performed based on the relevant data of the two target information that make up the fusion target;
[0042] Wherein: the fusion ID tag for the fused target is inherited from whichever of the two targets forming the fused target has survived longer in its sensor;
[0043] The method for labeling single-view targets detected by a single sensor is as follows:
[0044] When a single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID.
[0045] When a single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, ensuring that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat.
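The ID rules in the steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary layout, the names `inherit_fusion_id` and `mark_single_view`, the `allocate` callback, and the tie-break when both survival times are equal are all assumptions made for the example.

```python
def inherit_fusion_id(a, b):
    """Pick the fusion ID from the constituent target that has survived longer.

    a, b: dicts with 'id' and 'age' (frames survived in its own camera).
    Ties go to `a`; the source does not specify a tie-break (assumption).
    """
    return a["id"] if a["age"] >= b["age"] else b["id"]


def mark_single_view(cam, raw_id, prev_fusion_id, id_memory, allocate):
    """Return the global ID for a target seen by one camera only.

    prev_fusion_id: fusion ID of an earlier fused target already confirmed
        to be this same target, or None if there is no such target.
    id_memory: {(cam, raw_id): global_id} remembered from earlier frames
        (hypothetical layout for the "ID memory").
    allocate: callable returning a fresh, unused global ID (hypothetical).
    """
    key = (cam, raw_id)
    if prev_fusion_id is not None:
        # Same target as an earlier fused target: keep its fusion ID.
        id_memory[key] = prev_fusion_id
    elif key not in id_memory:
        # Never fused and never seen before: remember a new single-view ID.
        id_memory[key] = allocate()
    return id_memory[key]
```

Keeping the remembered ID keyed by camera and per-camera ID is one way to make the label stay stable from frame to frame; other memory layouts would satisfy the same rule.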
[0046] In this embodiment, the method for determining whether a single-view target and a fused target from a previous timeline are the same target is as follows:
[0047] The fused target of the previous frame records the IDs of the single-view targets it was built from. The current single-view target ID is compared against these recorded IDs to find the previous-frame fused target corresponding to the single-view target; the IoU (intersection over union) and the position state error between the current single-view target and that previous-frame fused target are then calculated to confirm whether they are the same target.
[0048] In this embodiment, the method to ensure that the fused ID tags and the single-view ID tags do not repeat is as follows: each time the sensor group senses a frame, all fused ID tags and single-view ID tags of the sensor group from the previous frame are extracted and carried forward; all ID tags in the same frame are then checked for duplicates. If duplicates exist, the tag rolls over to a new ID, which is assigned to the newly appearing target. If a fused ID tag and / or single-view ID tag was changed in the previous frame and its target still exists in the current frame, the change is carried forward.
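As an illustration of the same-target check, the sketch below computes IoU (intersection over union) on axis-aligned boxes and a Euclidean position error. The source does not state which boxes the IoU is computed on or which thresholds are used, so the box format `(x1, y1, x2, y2)` and the values of `min_iou` and `max_pos_err` are assumptions.

```python
from math import hypot


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_same_target(cur_box, prev_box, cur_pos, prev_pos,
                   min_iou=0.3, max_pos_err=2.0):
    """Confirm a current single-view target against a previous-frame fused target.

    min_iou and max_pos_err are hypothetical thresholds that would need
    tuning on real data.
    """
    pos_err = hypot(cur_pos[0] - prev_pos[0], cur_pos[1] - prev_pos[1])
    return iou(cur_box, prev_box) >= min_iou and pos_err <= max_pos_err
```

Requiring both conditions is a conservative choice: a large box overlap alone can occur between neighbouring targets, and a small position error alone can occur between targets of very different size.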
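A minimal sketch of the duplicate check follows, assuming IDs live in a bounded range and "rolling" means advancing to the next unused value with wrap-around. The function names, the `max_id` bound, and the split into carried-over targets and newcomers are assumptions for illustration.

```python
def next_free_id(used, start, max_id=255):
    """Roll forward from `start`, wrapping at max_id, to the first unused ID."""
    for step in range(1, max_id + 2):
        cand = (start + step) % (max_id + 1)
        if cand not in used:
            return cand
    raise RuntimeError("ID space exhausted")


def assign_unique_ids(carried, newcomers, max_id=255):
    """Keep previous-frame IDs and re-roll any newcomer whose ID collides.

    carried: {target_key: id} carried forward from the previous frame
        (assumed already unique).
    newcomers: {target_key: proposed_id} for newly appearing targets.
    Returns {target_key: final_id} with no repeated IDs.
    """
    final = dict(carried)
    used = set(carried.values())
    for key, proposed in newcomers.items():
        if proposed in used:
            proposed = next_free_id(used, proposed, max_id)
        final[key] = proposed
        used.add(proposed)
    return final
```

For example, with `carried = {"a": 1, "b": 2}` and `newcomers = {"c": 2, "d": 3}`, target `c` collides with `b` and rolls to 3, after which `d` collides with `c` and rolls to 4.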
[0049] In this embodiment, the sensor is a vision sensor.
[0050] In this embodiment, the target information includes time-calibrated image frame information.
[0051] In this embodiment, the relevant data includes the sensor's sensing time, the target's ID, the target's category, the target's world coordinates, and the target's movement speed in the world coordinates.
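The relevant data listed above maps naturally onto a small record type. The sketch below is illustrative only: the class name `TargetInfo`, the field names, the two-dimensional world frame, and the units are assumptions, not details given in the source.

```python
from dataclasses import dataclass
from math import hypot
from typing import Tuple


@dataclass
class TargetInfo:
    """Per-target record carried into fusion (field names are hypothetical)."""
    timestamp: float                 # sensor sensing time, in seconds (assumed unit)
    target_id: int                   # ID assigned by the camera's own tracker
    category: str                    # target category, e.g. "car"
    position: Tuple[float, float]    # world coordinates (2D assumed)
    velocity: Tuple[float, float]    # velocity components in world coordinates

    def speed(self) -> float:
        """Magnitude of the velocity vector."""
        return hypot(self.velocity[0], self.velocity[1])
```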
[0052] In this embodiment, the fused ID tags and the single-view ID tags are assigned on a rolling basis.
[0053] In this embodiment, the rolling setting method is as follows: when all sensors cannot detect a fused target or a single-view target, the fused target or single-view target is set as a failed target and the fused ID tag or single-view ID tag corresponding to the failed target is deleted.
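The deletion rule can be sketched as below, assuming the global table is a dictionary keyed by ID and each sensor reports the set of global IDs it detected in the current frame. The names `prune_failed`, `id_table`, and `detections_per_sensor` are hypothetical.

```python
def prune_failed(id_table, detections_per_sensor):
    """Delete the ID tags of targets that no sensor detected in this frame.

    id_table: {global_id: target_record}, modified in place.
    detections_per_sensor: iterable of sets, one per sensor, holding the
        global IDs that sensor detected in the current frame.
    Returns the sorted list of deleted (failed) IDs.
    """
    seen = set()
    for ids in detections_per_sensor:
        seen |= set(ids)
    failed = sorted(i for i in id_table if i not in seen)
    for i in failed:
        del id_table[i]   # the freed ID can later be reused by the rolling scheme
    return failed
```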
[0054] In this embodiment, the method for determining the order in which the target is perceived by each sensor is based on the survival time of the target in the perception of each sensor.
[0055] Figure 1 is the overall flowchart for multi-sensor target labeling and output in this embodiment. After the target information from all cameras is input, the survival time of each fused target is first updated by calculating the target's survival time in each individual camera, which determines the camera in which the target first appeared. Then, the Hungarian algorithm is used to perform pairwise target matching for cameras with overlapping areas. Here, the parameters for calculating the correlation matrix are adjusted based on actual testing to optimize the matching effect. The information of the fused targets is updated, added, or deleted based on the matching results. Next, the IDs of the remaining unmatched targets (those never captured by more than one camera) are modified; ID memory ensures that all target IDs are temporally continuous and do not repeat. Finally, the fused targets are output according to the required data structure.
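The pairwise matching step can be sketched in pure Python as below. This is an assumption-laden illustration rather than the patented implementation: the cost matrix here is plain Euclidean distance between world positions, the `gate` threshold is hypothetical, and `hungarian` is a textbook potentials-based solver for the assignment problem. The source says only that the correlation matrix parameters are tuned by testing.

```python
from math import hypot


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list giving, for each row, the column assigned to it.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (m + 1)          # column potentials
    p = [0] * (m + 1)            # p[j]: 1-based row matched to column j, 0 if none
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col_of_row = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            col_of_row[p[j] - 1] = j - 1
    return col_of_row


def match_targets(cam_a, cam_b, gate=3.0):
    """Match targets of two overlapping cameras by world position.

    cam_a, cam_b: lists of (x, y) world positions.
    gate: hypothetical maximum distance for accepting a pair.
    Returns (index_in_a, index_in_b) pairs; unmatched targets are omitted.
    """
    if not cam_a or not cam_b:
        return []
    swap = len(cam_a) > len(cam_b)          # the solver needs rows <= columns
    rows, cols = (cam_b, cam_a) if swap else (cam_a, cam_b)
    cost = [[hypot(r[0] - c[0], r[1] - c[1]) for c in cols] for r in rows]
    pairs = []
    for i, j in enumerate(hungarian(cost)):
        if cost[i][j] <= gate:
            pairs.append((j, i) if swap else (i, j))
    return pairs
```

Running the matcher once per pair of adjacent cameras keeps each cost matrix small, which is consistent with the source's stated aim of reducing the number of targets in a single match. Targets left unmatched after gating would continue as single-view targets.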
[0056] Figure 2 is a flowchart illustrating the matching process for two cameras with overlapping areas in this embodiment. If the Hungarian algorithm finds a matching pair of targets in the two cameras, both targets are used to update the corresponding fused target. If a target has no match in the other camera, that single target alone is used to update the fused target. Each fused target records, for the previous frame, the original target IDs from the cameras that contributed to the fusion and which two cameras were fused. By comparing the original target IDs recorded in the previous frame with the current target, the corresponding fused target can be found. If only one camera contains a previously fused target, the IoU (intersection over union) and position state error between the current target and the fused target in the previous frame need to be calculated. After confirming that it is the same target, the fused target is updated.
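The bookkeeping described for Figure 2 can be sketched as a record that remembers which camera contributed which original ID in the previous frame. The class `FusedTarget`, its `sources` mapping, and the helper names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FusedTarget:
    fusion_id: int
    # camera name -> original per-camera target ID seen in the previous frame
    sources: Dict[str, int] = field(default_factory=dict)


def find_fused(fused: List[FusedTarget], cam: str, raw_id: int) -> Optional[FusedTarget]:
    """Look up the fused target that recorded (cam, raw_id) in the previous frame."""
    for ft in fused:
        if ft.sources.get(cam) == raw_id:
            return ft
    return None


def update_sources(ft: FusedTarget, observations: Dict[str, int]) -> None:
    """Record this frame's contributing cameras and IDs (one or two entries)."""
    ft.sources = dict(observations)
```

When a lookup returns `None` for a camera that previously contributed, the ID comparison alone is not enough, which is the case where the source falls back to the IoU and position state error check.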
[0057] Figure 3 is a flowchart illustrating the update process for both fused ID tags and single-view ID tags in this embodiment. Updating IDs primarily ensures that global IDs are not duplicated and that tracking is continuous. This step requires the global IDs from the previous frame, as well as a record of every ID that was changed in the previous frame. If a target's ID was changed in the previous frame and the target still exists in the next frame, the change is carried over. Generally, the fused target inherits the ID the target had before fusion, that is, its original ID. In special cases duplicates may occur, so it is also necessary to check the fused targets for ID duplication.
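The carry-over rule can be sketched as a mapping from original IDs to reassigned IDs that is filtered each frame. The representation of "changes" as a dictionary and the name `carry_over` are assumptions; the source does not describe the data layout.

```python
def carry_over(changes_prev, current_raw_ids):
    """Apply last frame's ID changes to targets that still exist.

    changes_prev: {raw_id: reassigned_id} recorded in the previous frame.
    current_raw_ids: iterable of original IDs present in the current frame.
    Returns (global_ids, changes_now):
        global_ids: {raw_id: id to output for this frame}
        changes_now: the subset of changes that remains active
    """
    current = list(current_raw_ids)
    present = set(current)
    # A change stays active only while its target is still being observed.
    changes_now = {r: g for r, g in changes_prev.items() if r in present}
    # Unchanged targets keep their original ID.
    global_ids = {r: changes_now.get(r, r) for r in current}
    return global_ids, changes_now
```

A duplicate check over the resulting global IDs would still be needed afterwards, since the source notes that duplicates can occur in special cases.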
[0058] This embodiment uses the Hungarian algorithm and a custom tracking scheme to solve the problem of the large amount of computation required for target fusion.
[0059] This embodiment uses a one-to-one sequential matching and updating of global fusion points between adjacent lenses to solve the problem of chaotic matching order among multiple cameras. It can also adapt to different numbers of selected cameras, has good independence, and is plug-and-play.
[0060] This embodiment employs a target ID memory and update scheme to solve the problem of duplicate and disordered target IDs.
[0061] This embodiment uses a fusion memory function to solve the problem of target jitter in a single frame.
[0062] Based on the same inventive concept, this invention also provides a multi-sensor data fusion apparatus, as described in the following embodiments. Since the principle of the multi-sensor data fusion apparatus in solving the problem is similar to that of the multi-sensor data fusion method, the implementation of the multi-sensor fusion apparatus can refer to the implementation of the multi-sensor fusion method, and repeated details will not be elaborated further. As used below, the terms "unit" or "module" can refer to a combination of software and / or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.
[0063] Figure 3 is a structural block diagram of a multi-sensor data fusion device according to an embodiment of the present invention. As shown in Figure 3, the device includes: a target fusion module, used to extract the target information of each target obtained by each sensor and, based on the Hungarian algorithm, to fuse the same targets perceived by two sensors that share a sensing area, as recorded in the target information, to generate a fused target.
[0064] The data fusion module is used to perform real-time data fusion of the fusion target based on the relevant data of the two target information that make up the fusion target during the operation of the sensor group.
[0065] Wherein: the fusion ID tag that marks the fused target is inherited from whichever of the two targets forming the fused target has the longer survival time in its sensor;
[0066] The labeling module is used to label single-view targets detected by a single sensor through the following steps:
[0067] When a single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID.
[0068] When a single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, to ensure that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat.
[0069] The above description is merely a specific embodiment of this disclosure, but the scope of protection of this disclosure is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this disclosure should be included within the scope of protection of this disclosure. Therefore, the scope of protection of this disclosure should be determined by the scope of the claims.
Claims
1. A multi-sensor data fusion method for processing target detection data from multiple adjacent sensors, employing one-to-one sequential matching and updating of global fusion points between adjacent lenses, wherein the sensor group includes at least two sensors for sensing the target; characterized in that the multi-sensor data fusion method includes: extracting the target information of each target obtained by each sensor; based on the Hungarian algorithm, fusing the same target perceived by two sensors with the same sensing area in the target information to generate a fused target; during the operation of the sensor group, performing real-time data fusion on the fused target based on the relevant data of the two pieces of target information that make up the fused target; wherein the fusion ID tag that marks the fused target is inherited from whichever of the two targets forming the fused target has the longer survival time in its sensor; the method for marking single-view targets detected by the individual sensors is as follows: when the single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID; when the single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, to ensure that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat; the method for determining whether the single-view target and the fused target from earlier in the timeline are the same target is as follows: based on the IDs of the single-view targets recorded by the fused target in the previous frame, the current single-view target ID is compared to find the previous-frame fused target of the single-view target, and the IoU (intersection over union) and position state error between the current single-view target and the previous-frame fused target are calculated to confirm whether they are the same target; the method also includes rolling settings for the fusion ID tag and the single-view ID tag, including: when all the sensors fail to detect a fused target or a single-view target, the fused target or the single-view target is set as a failed target and the fusion ID tag or the single-view ID tag corresponding to the failed target is deleted; and the order in which the target is perceived by each sensor is determined based on the survival time of the target in the perception of each sensor.
2. The multi-sensor data fusion method according to claim 1, characterized in that a method for ensuring that the fused ID tag and the single-view ID tag are not duplicated includes: when the sensor group senses a frame, all fused ID tags and single-view ID tags of the sensor group in the previous frame are extracted and carried forward; all ID tags in the same frame are checked for duplicates; if there are duplicates, the tag rolls over to a new ID tag, which is assigned to the newly appearing target; if a fused ID tag and / or single-view ID tag was changed in the previous frame and continues to exist in the current frame, the change is carried forward.
3. The multi-sensor data fusion method according to claim 1, characterized in that the sensor is a vision sensor.
4. The multi-sensor data fusion method according to claim 3, characterized in that the target information includes time-calibrated image frame information.
5. The multi-sensor data fusion method according to claim 4, characterized in that the relevant data includes the sensor's sensing time, the target's ID, the target's category, the target's world coordinates, and the target's movement speed in those world coordinates.
6. A multi-sensor data fusion apparatus for implementing the multi-sensor data fusion method as described in any one of claims 1-5, for processing detection data of a target from multiple adjacent sensors, wherein the sensor group includes at least two sensors for sensing the target; characterized in that the device includes: a target fusion module, used to extract the target information of each target obtained by each of the sensors and, based on the Hungarian algorithm, to fuse the same targets perceived by two sensors with the same sensing area in the target information to generate a fused target; a data fusion module, used to perform real-time data fusion of the fused target based on the relevant data of the two pieces of target information that make up the fused target during the operation of the sensor group; wherein the fusion ID tag that marks the fused target is inherited from whichever of the two targets forming the fused target has the longer survival time in its sensor; and a marking module, used to mark single-view targets detected by the individual sensors through the following steps: when the single-view target is the same target as a fused target from earlier in the timeline, the single-view target is marked with that fused target's fusion ID; when the single-view target is not the same target as any fused target from earlier in the timeline, the single-view target is marked with a single-view ID based on ID memory, to ensure that the single-view ID tags and the fusion ID tags are temporally continuous and do not repeat.