Target tracking method and apparatus

By introducing a target model with height and type information into radar and camera target tracking, the problem of miscorrelation in radar and camera target tracking is solved, and higher-precision target tracking is achieved.

CN114167404B · Active · Publication Date: 2026-06-12 · YINWANG INTELLIGENT TECHNOLOGIES CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
YINWANG INTELLIGENT TECHNOLOGIES CO LTD
Filing Date
2020-09-11
Publication Date
2026-06-12

Abstract

Embodiments of the present application provide a target tracking method and device, relate to the technical field of data processing, and can be used for security, assisted driving and automatic driving. The method comprises: obtaining a camera target tracking result and a radar target tracking result; and obtaining a target tracking result according to the camera target tracking result and a target model corresponding to the radar target tracking result, wherein the target model is used to indicate an association relationship between a target in the radar target tracking result and height information of the target. In the embodiments of the present application, because the target model includes the height information of the target, when the camera target tracking result and the radar target tracking result are associated, the target tracking result monitored by the radar can be combined with the height information of the target, so that the range of the target monitored by the radar is effectively expanded, and an accurate target tracking result is obtained by association.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a target tracking method and apparatus.

Background Technology

[0002] With societal development, intelligent transportation equipment, smart home devices, robots, and other intelligent terminals are gradually entering people's daily lives. Sensors play a crucial role in these intelligent terminals. Various sensors installed on intelligent terminals, such as millimeter-wave radar, lidar, imaging radar, ultrasonic radar, and cameras, enable them to perceive their surroundings, collect data, identify and track moving objects, recognize stationary scenes such as lane lines and signs, and perform route planning by combining navigation and map data. For example, in fields such as autonomous driving, security, and surveillance, sensors can be used for target tracking, and specific strategies can be implemented based on this tracking. For instance, in autonomous driving, driving strategies can be formulated based on target tracking; in security and surveillance, alerts can be issued for unsafe factors such as illegal intrusions based on target tracking.

[0003] Among related technologies, there are methods for target tracking based on radar and cameras. For example, the position and velocity of a target can be detected by both a camera and radar, and then an association algorithm can be used to identify targets with similar positions and velocities detected by both the camera and radar as the same target.

[0004] However, the aforementioned technologies are prone to false associations when identifying the same target, resulting in low accuracy in target tracking.

Summary of the Invention

[0005] This application provides a target tracking method and apparatus that can improve the accuracy of target tracking using radar and cameras.

[0006] In a first aspect, embodiments of this application provide a target tracking method, comprising: acquiring camera target tracking results and radar target tracking results; obtaining target tracking results based on target models corresponding to the camera target tracking results and radar target tracking results; wherein the target model is used to indicate the correlation between the target and the target's height information in the radar target tracking results. Thus, because the target model includes the target's height information, when associating the camera target tracking results and the radar target tracking results, the radar-monitored target tracking results can be combined with the target's height information, thereby effectively expanding the range of targets monitored by the radar and obtaining accurate target tracking results.

[0007] One possible implementation includes: obtaining the target's height information based on the target type information in the radar target tracking results; and fusing the target's height information with the target information in the radar target tracking results to obtain a target model. This yields a target model that can characterize the target's position and height, which can then be used to obtain accurate target tracking results.

[0008] In one possible implementation, a predefined or pre-set correspondence exists between the target's type information and its height information. This allows the target's height information to be easily obtained based on its type information.
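As an illustrative sketch only, the correspondence in [0008] and the fusion step in [0007] could be represented as a lookup table and a small data structure. The type names, the height values, and the names `TYPE_TO_HEIGHT_M`, `RadarTarget`, `TargetModel`, and `build_target_model` are assumptions introduced here for illustration; the application does not specify them.

```python
from dataclasses import dataclass

# Hypothetical predefined correspondence between target type and height (metres).
# The values are illustrative only and are not taken from the application.
TYPE_TO_HEIGHT_M = {
    "pedestrian": 1.7,
    "car": 1.5,
    "truck": 3.5,
}


@dataclass
class RadarTarget:
    """A target in the radar target tracking result (radar coordinates)."""
    x: float            # longitudinal position, metres
    y: float            # lateral position, metres
    vx: float           # longitudinal velocity, m/s
    vy: float           # lateral velocity, m/s
    target_type: str    # type information reported for the target


@dataclass
class TargetModel:
    """Radar target fused with the height implied by its type."""
    target: RadarTarget
    height_m: float


def build_target_model(target: RadarTarget) -> TargetModel:
    # Look up the height from the type, then attach it to the radar target.
    height_m = TYPE_TO_HEIGHT_M[target.target_type]
    return TargetModel(target=target, height_m=height_m)


model = build_target_model(RadarTarget(10.0, 2.0, 0.0, 0.0, "pedestrian"))
print(model.height_m)
```

A plain dictionary is used because the text only requires that the correspondence be predefined; an unknown type raises `KeyError` in this sketch rather than guessing a height.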

[0009] In one possible implementation, the target tracking result is obtained based on the target model corresponding to the camera target tracking result and the radar target tracking result. This includes: projecting the target model onto the camera coordinate system to obtain the projected radar target tracking result; and obtaining the target tracking result based on the camera target tracking result and the projected radar target tracking result. In this way, an accurate target tracking result can be subsequently obtained in the camera coordinate system based on the camera target tracking result and the projected radar target tracking result.

[0010] In one possible implementation, projecting the target model onto the camera coordinate system includes: transforming the target model into the camera coordinate system according to a pre-set or defined height transformation relationship; wherein different height information corresponds to different height transformation relationships, and the height transformation relationship is used to transform the target tracking results with height in the radar coordinate system to the camera coordinate system. This allows for convenient transformation of the target model into the camera coordinate system based on this height transformation relationship.

[0011] In one possible implementation, the height transformation relationship corresponding to the height information differs for different region types. Because different regions correspond to different horizons (for example, the visual height of the same target in a low-lying area is usually different from that in a flat area), setting different height transformation relationships for different regions allows for accurate conversion when transforming target tracking results with height information from the radar coordinate system to the camera coordinate system.

[0012] In one possible implementation, the region type includes one or more of the following: a region with undulating terrain, a region with slopes, or a region with flat terrain. This allows for accurate coordinate system transformations for common terrain types.

[0013] In one possible implementation, the target model is transformed into the camera coordinate system according to a pre-set or defined height transformation relationship, including: determining the target region type corresponding to the target model; and transforming the target model into the camera coordinate system according to the target height transformation relationship that matches the height information of the target model in the height transformation relationship corresponding to the target region type.
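The selection described in [0010] to [0013] can be sketched as a two-level lookup: first by region type, then by height. The sketch below assumes that each height transformation relationship is stored as a 3x3 homography from radar ground-plane coordinates to image pixels, and that "matching" means choosing the calibrated height closest to the model's height. Both are assumptions made for illustration, as are the names `select_transform` and `apply_homography`.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]  # 3x3 matrix stored as nested lists

# Hypothetical layout: region type -> calibrated height (m) -> transformation.
TransformTable = Dict[str, Dict[float, Matrix]]


def select_transform(table: TransformTable, region_type: str, height_m: float) -> Matrix:
    """Pick the transformation for the region whose calibrated height is nearest."""
    per_region = table[region_type]
    nearest_height = min(per_region, key=lambda h: abs(h - height_m))
    return per_region[nearest_height]


def apply_homography(h: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map a radar-plane point (x, y) to pixel coordinates (u, v)."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


# Toy table: on flat terrain, a point at height 1.0 m appears 100 px higher
# in the image than a ground-level point (illustrative values only).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
SHIFT_UP = [[1.0, 0.0, 0.0], [0.0, 1.0, -100.0], [0.0, 0.0, 1.0]]
table = {"flat": {0.0: IDENTITY, 1.0: SHIFT_UP}}

h = select_transform(table, "flat", 0.9)
print(apply_homography(h, 5.0, 10.0))
```

Projecting the target's position once with the ground-level transformation and once with the transformation for its height would give the two end points of the projected target model.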

[0014] In one possible implementation, the target tracking result is obtained based on the camera target tracking result and the projected radar target tracking result. This includes determining that the camera target tracking result and the projected radar target tracking result represent the same target based on the overlap ratio between them; wherein the overlap ratio is greater than a first value. In this way, the overlap ratio can be used to conveniently and accurately determine that the camera target tracking result and the projected radar target tracking result represent the same target.

[0015] In one possible implementation, determining that the camera target tracking result and the projected radar target tracking result represent the same target based on the overlap ratio includes: determining that the camera target tracking result and the projected radar target tracking result represent the same target when the overlap ratio is greater than a first value and the position and / or velocity of the overlapping target in the camera target tracking result and the projected radar target tracking result meet preset conditions. This allows for a more accurate determination by further combining the position and / or velocity of the overlapping target with the calculation of the overlap ratio.

[0016] In one possible implementation, the preset conditions include: the difference between the position and / or velocity of the overlapping target in the camera target tracking result and the position and / or velocity of the overlapping target in the radar target tracking result is less than a second value.
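The decision rule in [0014] to [0016] can be illustrated with a minimal sketch. The application does not define the overlap ratio in this passage, so intersection over union of two image-plane rectangles is assumed here. The sketch also checks both position and speed, whereas the text allows either or both, and it uses separate thresholds `max_position_diff` and `max_speed_diff` in place of the single "second value". All names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    left: float
    top: float
    right: float
    bottom: float


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (assumed definition)."""
    inter_w = max(0.0, min(a.right, b.right) - max(a.left, b.left))
    inter_h = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    inter = inter_w * inter_h
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


State = Tuple[float, float, float]  # (x, y, speed)


def is_same_target(cam_box: Box, radar_box: Box,
                   cam_state: State, radar_state: State,
                   first_value: float,
                   max_position_diff: float,
                   max_speed_diff: float) -> bool:
    # Condition 1: the overlap ratio must exceed the first value.
    if overlap_ratio(cam_box, radar_box) <= first_value:
        return False
    # Condition 2: position and speed differences must stay below the thresholds.
    position_diff = math.hypot(cam_state[0] - radar_state[0],
                               cam_state[1] - radar_state[1])
    speed_diff = abs(cam_state[2] - radar_state[2])
    return position_diff < max_position_diff and speed_diff < max_speed_diff
```

Checking the overlap first keeps the position and velocity comparison as a secondary gate, which mirrors the order in which the two conditions are introduced above.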

[0017] In one possible implementation, the radar target tracking result comes from the imaging radar; the target model also includes the target's size information. This allows the overlap ratio to be calculated from the visual bounding box together with the height information and the size information. When the overlap ratio is greater than or equal to a certain value, the two results are associated as the same target. Because the size is included, more accurate target association can be achieved compared to millimeter-wave radar, thus enabling more accurate target tracking.

[0018] In one possible implementation, the camera target tracking result includes the target bounding box; the radar target tracking result includes the target point cloud. Thus, by utilizing the target bounding box and the target point cloud, target tracking can be performed efficiently and accurately.

[0019] In a second aspect, embodiments of this application provide a target tracking device.

[0020] The target tracking device can be a vehicle with target tracking capabilities, or other components with target tracking capabilities. The target tracking device includes, but is not limited to, vehicle-mounted terminals, vehicle-mounted controllers, vehicle-mounted modules, vehicle-mounted components, vehicle-mounted chips, vehicle-mounted units, vehicle-mounted radar, or vehicle-mounted cameras, and other sensors. Vehicles can implement the methods provided in this application through these vehicle-mounted terminals, vehicle-mounted controllers, vehicle-mounted modules, vehicle-mounted components, vehicle-mounted chips, vehicle-mounted units, vehicle-mounted radar, or cameras.

[0021] The target tracking device can be a smart terminal, or it can be installed in other smart terminals with target tracking capabilities besides vehicles, or it can be installed in a component of the smart terminal. The smart terminal can be a smart transportation device, a smart home device, a robot, or other terminal device. The target tracking device includes, but is not limited to, the smart terminal or its controller, chip, radar or camera, other sensors, and other components.

[0022] The target tracking device can be a general-purpose device or a dedicated device. In specific implementations, the device can also be a desktop computer, laptop computer, web server, PDA (personal digital assistant), mobile phone, tablet computer, wireless terminal device, embedded device, or other device with processing capabilities. This application does not limit the type of target tracking device.

[0023] The target tracking device can also be a chip or processor with processing capabilities, and may include at least one processor. The processor can be a single-core CPU or a multi-core CPU. The chip or processor with processing capabilities may be located within the sensor, or it may be located at the receiving end of the sensor's output signal. The processor includes, but is not limited to, at least one of a central processing unit (CPU), graphics processing unit (GPU), microcontroller unit (MCU), microprocessor unit (MPU), and coprocessor.

[0024] The target tracking device can also be a terminal device, or a chip or chip system within the terminal device. The target tracking device may include a processing unit and a communication unit. When the target tracking device is a terminal device, the processing unit may be a processor. The target tracking device may also include a storage unit, which may be a memory. The storage unit is used to store instructions, and the processing unit executes the instructions stored in the storage unit to cause the terminal device to implement a target tracking method described in the first aspect or any possible implementation of the first aspect. When the target tracking device is a chip or chip system within the terminal device, the processing unit may be a processor. The processing unit executes the instructions stored in the storage unit to cause the terminal device to implement a target tracking method described in the first aspect or any possible implementation of the first aspect. The storage unit may be a storage unit within the chip (e.g., a register, cache, etc.), or a storage unit located outside the chip within the terminal device (e.g., read-only memory, random access memory, etc.).

[0025] For example, the communication unit is used to acquire the target tracking results from the camera and the target tracking results from the radar; the processing unit is used to obtain the target tracking results based on the target models corresponding to the target tracking results from the camera and the radar; wherein, the target model is used to indicate the correlation between the target and the target's height information in the radar target tracking results.

[0026] In one possible implementation, the processing unit is further configured to obtain the target's height information based on the target type information in the radar target tracking result; the processing unit is further configured to fuse the target's height information and the target in the radar target tracking result to obtain a target model.

[0027] In one possible implementation, there is a predefined or pre-set correspondence between the target's type information and its height information.

[0028] In one possible implementation, the processing unit is specifically used to project the target model onto the camera coordinate system to obtain the projected radar target tracking result; and to obtain the target tracking result based on the camera target tracking result and the projected radar target tracking result.

[0029] In one possible implementation, the processing unit is specifically used to transform the target model to the camera coordinate system according to a pre-set or defined height transformation relationship; wherein, different height information corresponds to different height transformation relationships, and the height transformation relationship is used to transform the target tracking results with height in the radar coordinate system to the camera coordinate system.

[0030] In one possible implementation, the height transformation relationship corresponding to the height information differs for different region types.

[0031] In one possible implementation, the region type includes one or more of the following: a region with undulating terrain, a region with slopes, or a region with flat terrain.

[0032] In one possible implementation, the processing unit is specifically used to determine the target region type corresponding to the target model; and to transform the target model to the camera coordinate system based on the target height transformation relationship that matches the height information of the target model in the height transformation relationship corresponding to the target region type.

[0033] In one possible implementation, the processing unit is specifically used to determine that the camera target tracking result and the projected radar target tracking result are the same target based on the overlap ratio of the camera target tracking result and the projected radar target tracking result; wherein the overlap ratio is greater than a first value.

[0034] In one possible implementation, the processing unit is specifically used to determine that the camera target tracking result and the projected radar target tracking result are the same target when the overlap ratio is greater than a first value and the position and / or velocity of the overlapping target in the camera target tracking result and the projected radar target tracking result meet preset conditions.

[0035] In one possible implementation, the preset conditions include: the difference between the position and / or velocity of the overlapping target in the camera target tracking result and the position and / or velocity of the overlapping target in the radar target tracking result is less than a second value.

[0036] In one possible implementation, the radar target tracking results come from the imaging radar; the target model also includes the target's size information.

[0037] In one possible implementation, the camera target tracking result includes the target bounding box; the radar target tracking result includes the target point cloud.

[0038] In a third aspect, embodiments of this application also provide a sensor system for providing target tracking functionality for a vehicle. It includes at least one target tracking device mentioned in the above embodiments of this application, as well as other sensors such as cameras and radar. The at least one sensor device within the system can be integrated into a single unit or device, or the at least one sensor device within the system can be independently configured as a component or device.

[0039] In a fourth aspect, this application also provides a system for use in autonomous driving or intelligent driving, which includes at least one of the target tracking device, camera, radar and other sensors mentioned in the above embodiments of this application. The at least one device in the system can be integrated into a whole machine or device, or the at least one device in the system can be set as an independent component or device.

[0040] Furthermore, any of the above systems can interact with the vehicle's central controller to provide detection and / or fusion information for the vehicle's driving decisions or control.

[0041] In a fifth aspect, embodiments of this application also provide a terminal, which includes at least one target tracking device or any of the systems mentioned in the above embodiments of this application. Further, the terminal can be a smart home device, smart manufacturing equipment, smart industrial equipment, smart transportation equipment (including drones, vehicles, etc.), etc.

[0042] In a sixth aspect, embodiments of this application also provide a chip, including at least one processor and an interface; the interface is used to provide program instructions or data to the at least one processor; the at least one processor is used to execute the program instructions to implement the method of the first aspect or any possible implementation of the first aspect.

[0043] In a seventh aspect, embodiments of this application provide a target tracking device, including at least one processor for calling a program in memory to implement any method in the first aspect or any possible implementation of the first aspect.

[0044] In an eighth aspect, embodiments of this application provide a target tracking device, comprising: at least one processor and an interface circuit, the interface circuit being configured to provide information input and / or information output to the at least one processor; the at least one processor being configured to execute code instructions to implement any method of the first aspect or any possible implementation thereof.

[0045] In a ninth aspect, embodiments of this application provide a computer-readable storage medium storing instructions that, when executed, implement any method of the first aspect or any possible implementation thereof.

[0046] It should be understood that the second to ninth aspects of this application correspond to the technical solutions of the first aspect of this application, and the beneficial effects achieved by each aspect and the corresponding feasible implementation are similar, and will not be repeated here.

Attached Figure Description

[0047] Figure 1 is a schematic diagram of target determination based on a visual bounding box and a radar point cloud;

[0048] Figure 2 is a schematic diagram of target determination based on a visual bounding box and a radar point cloud, provided in an embodiment of this application;

[0049] Figure 3 is a functional block diagram of the vehicle 100 provided in an embodiment of this application;

[0050] Figure 4 is a schematic diagram of the structure of the computer system in Figure 3;

[0051] Figure 5 is a schematic diagram of a chip hardware structure provided in an embodiment of this application;

[0052] Figure 6 is a schematic diagram of an application scenario provided in an embodiment of this application;

[0053] Figure 7 is a schematic flowchart of a target tracking method provided in an embodiment of this application;

[0054] Figure 8 is a schematic diagram of probability height provided in an embodiment of this application;

[0055] Figure 9 is a schematic diagram of height calibration provided in an embodiment of this application;

[0056] Figure 10 is a schematic diagram of different region types provided in an embodiment of this application;

[0057] Figure 11 is a schematic diagram of target association provided in an embodiment of this application;

[0058] Figure 12 is a schematic diagram of another target tracking method provided in an embodiment of this application;

[0059] Figure 13 is a schematic diagram of the structure of a target tracking device provided in an embodiment of this application;

[0060] Figure 14 is a schematic diagram of the structure of a chip provided in an embodiment of this application;

[0061] Figure 15 is a schematic diagram of another target tracking device provided in an embodiment of this application;

[0062] Figure 16 is a schematic diagram of the structure of a vehicle provided in an embodiment of this application.

Detailed Implementation

[0063] To facilitate a clear description of the technical solutions in the embodiments of this application, the terms "first" and "second" are used in the embodiments of this application to distinguish identical or similar items with essentially the same function and effect. For example, the first value and the second value are only used to distinguish different values and do not limit their order. Those skilled in the art will understand that the terms "first" and "second" do not limit the quantity or execution order, and that items qualified by "first" and "second" are not necessarily different.

[0064] It should be noted that, in this application, the terms "exemplary" or "for example" are used to indicate that something is being described as an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or design solutions. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a concrete manner.

[0065] In this application, "at least one" means one or more, and "more than one" means two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can mean: A alone, A and B simultaneously, or B alone, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.

[0066] Radar-based target tracking and / or camera-based target tracking are possible ways to achieve target tracking.

[0067] Radar can be a device based on radio detection. Radar can measure the position of targets in the air, on the ground, and at sea; it may also be called radio positioning. For example, radar uses a directional antenna to emit radio waves into the air. When these waves encounter a target, they are reflected back and received by the radar. By measuring the time it takes for the radio waves to travel through the air, the distance to the target is obtained. Based on the direction of the antenna beam, the angle of the target is determined, thus enabling target tracking. Typically, radar can obtain accurate velocity and position information and has a long detection range. However, in cluttered environments, radar performance can be negatively affected by clutter, leading to poor target tracking.
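The ranging principle described above can be written out as a short sketch. Because the wave travels to the target and back, the range is half the round-trip distance. The function names and the convention that azimuth is measured from the x axis are assumptions made for illustration.

```python
import math
from typing import Tuple

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_time_s: float) -> float:
    """Range to the target from the measured round-trip travel time."""
    # The wave covers the radar-to-target distance twice, hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def polar_to_xy(range_m: float, azimuth_rad: float) -> Tuple[float, float]:
    """Convert range and beam angle to a position in radar coordinates."""
    return range_m * math.cos(azimuth_rad), range_m * math.sin(azimuth_rad)


# A 1 microsecond round trip corresponds to a target roughly 150 m away.
print(range_from_round_trip(1e-6))
```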

[0068] A camera projects an optical image of a scene through its lens onto the surface of an image sensor. This image is then converted into an electrical signal, which is converted into a digital image signal after analog-to-digital conversion. This digital image signal can then be processed in a digital signal processing (DSP) chip. Images captured by a camera can be used for target classification and the detection of target position and / or velocity, thereby enabling target tracking. However, in low-light environments, the image quality captured by the camera may be poor, resulting in poor target tracking performance.

[0069] Fusing radar-based target tracking results with camera-based target tracking results (hereinafter referred to as radar-camera fusion) can leverage the strengths of both radar and cameras to achieve more accurate target tracking. The implementation of radar-camera fusion may include target-level (object-level) data fusion methods and measurement-level (data-level) data fusion methods.

[0070] Among possible implementations, the target-level radar-camera fusion method includes: obtaining the target's visual bounding box using the camera; transforming the visual bounding box using a transformation matrix between camera coordinates (also known as visual coordinates) and radar coordinates (also known as top-view coordinates) to obtain the target's position and velocity in radar coordinates; obtaining the target using radar-detected point clouds and obtaining the target's position and velocity in radar coordinates; using an association algorithm related to the target's position and velocity to associate the target detected by radar and the target detected by the camera to confirm that they are the same target; and performing state estimation on the target to obtain the fused target position and velocity.
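The association step of the target-level method can be sketched as follows, assuming both sensors already report targets as (x, y, vx, vy) in radar coordinates. The application only says "an association algorithm related to the target's position and velocity", so the greedy nearest-neighbour matching, the cost function, and the `gate` threshold below are assumptions chosen for illustration. State estimation is omitted.

```python
import math
from typing import Dict, List, Tuple

State = Tuple[float, float, float, float]  # (x, y, vx, vy) in radar coordinates


def association_cost(a: State, b: State, velocity_weight: float = 1.0) -> float:
    """Position distance plus weighted velocity distance between two targets."""
    position = math.hypot(a[0] - b[0], a[1] - b[1])
    velocity = math.hypot(a[2] - b[2], a[3] - b[3])
    return position + velocity_weight * velocity


def associate(camera: List[State], radar: List[State], gate: float) -> Dict[int, int]:
    """Greedily pair camera targets with radar targets, cheapest pairs first."""
    pairs = sorted(
        (association_cost(c, r), i, j)
        for i, c in enumerate(camera)
        for j, r in enumerate(radar)
    )
    matches: Dict[int, int] = {}
    used_radar = set()
    for cost, i, j in pairs:
        if cost > gate:
            break  # every remaining pair is even more dissimilar
        if i in matches or j in used_radar:
            continue  # each target is matched at most once
        matches[i] = j
        used_radar.add(j)
    return matches


camera_targets = [(10.0, 0.0, 1.0, 0.0), (30.0, 5.0, 0.0, 0.0)]
radar_targets = [(29.5, 5.0, 0.0, 0.0), (10.2, 0.1, 1.0, 0.0)]
print(associate(camera_targets, radar_targets, gate=2.0))
```

Because this matching depends only on position and velocity, an error in either sensor's position estimate can push a true pair outside the gate.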

[0071] Among possible implementations, the measurement-level radar-camera fusion method includes: using the point cloud (or radar point cloud or point cloud data, etc.) of the target monitored by radar, projecting the radar-detected point cloud onto the camera coordinate system; using the visual bounding box of the target obtained by the camera and the association algorithm, associating the projection of the radar point cloud and the visual bounding box obtained by the camera to confirm the same target; and performing state estimation on the target to obtain the fused target position and velocity.
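The projection and association steps of the measurement-level method can be sketched with a pinhole camera model. The sketch assumes the radar points have already been transformed into the camera frame (x right, y down, z forward), and that the intrinsics `fx`, `fy`, `cx`, `cy` are known from calibration. These assumptions, and the use of the fraction of points inside the box as the association score, are illustrative and are not specified in the application.

```python
from typing import Iterable, Optional, Tuple

Point3D = Tuple[float, float, float]      # point in the camera frame, metres
Pixel = Tuple[float, float]               # (u, v) image coordinates
Box = Tuple[float, float, float, float]   # (left, top, right, bottom) in pixels


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Pixel]:
    """Pinhole projection; returns None for points behind the camera."""
    x, y, z = p
    if z <= 0.0:
        return None
    return fx * x / z + cx, fy * y / z + cy


def fraction_in_box(points: Iterable[Point3D], box: Box,
                    fx: float, fy: float, cx: float, cy: float) -> float:
    """Share of the radar points whose projection falls inside the box."""
    left, top, right, bottom = box
    total = 0
    inside = 0
    for p in points:
        total += 1
        pixel = project_point(p, fx, fy, cx, cy)
        if pixel is None:
            continue
        u, v = pixel
        if left <= u <= right and top <= v <= bottom:
            inside += 1
    return inside / total if total else 0.0


cloud = [(1.0, 0.5, 10.0), (5.0, 0.0, 10.0), (0.0, 0.0, -1.0)]
print(fraction_in_box(cloud, (700.0, 380.0, 780.0, 440.0), 1000.0, 1000.0, 640.0, 360.0))
```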

[0072] However, in the above implementation methods, when the target-level fusion method and the measurement-level fusion method identify the same target, they need to use position information to associate the target obtained by the camera with the target obtained by the radar. The position information of the target obtained by the camera usually depends on the accuracy of the bottom edge of the visual bounding box. However, due to weather, environment and other reasons, the accuracy of the bottom edge of the visual bounding box may not be high. The position information of the target obtained by the radar usually depends on the target point cloud. However, in environments such as clutter or ground undulation, the accuracy of the target point cloud may not be high, which can easily lead to false association.

[0073] For example, Figure 1 is a schematic diagram illustrating target determination based on visual bounding boxes and radar point clouds. As shown in Figure 1, due to factors such as the similarity between the color of a person's legs and the color of the ground, when the visual bounding box 10 frames the person, the bottom edge of the visual bounding box 10 falls on the upper body of the person, while the radar point cloud 11 may detect the lower body of the person (e.g., feet). Therefore, when performing target fusion, because the positions of the visual bounding box 10 and the radar point cloud 11 are far apart, the target defined by the visual bounding box 10 and the target determined by the radar point cloud 11 may not be associated as the same target, resulting in false association.

[0074] Based on this, in the target tracking method of this application embodiment, when associating the results of camera target tracking and radar target tracking, the target's height information is introduced into the radar target tracking result. For example, a target model is obtained to indicate the association relationship between the target and the target's height information in the radar target tracking result. When associating the camera target tracking result and the radar target tracking result, the target tracking result can be obtained based on the camera target tracking result and the target model. Because the target model includes the target's height information, the range of the target monitored by the radar can be effectively expanded, thereby enabling the association to obtain an accurate target tracking result.

[0075] For example, Figure 2 is a schematic diagram illustrating target determination based on a visual bounding box and a radar point cloud according to an embodiment of this application. As shown in Figure 2, due to factors such as the similarity between the color of a person's legs and the ground, when the visual bounding box 20 frames the person, the bottom edge of the visual bounding box 20 falls on the upper body of the person, while the radar point cloud 21 may detect the lower body of the person (e.g., feet). However, in this embodiment, the height information of the person is introduced. For example, the line segment 23 used to represent the height information can be determined. Since there is a large overlap between the visual bounding box 20 and the line segment 23 used to represent the height information, it is very likely that the target defined by the visual bounding box 20 and the target determined by the radar point cloud 21 will be associated as the same target. Therefore, in this embodiment, the accuracy of the bottom edge of the visual bounding box and the accuracy of the radar point cloud are no longer relied upon. Whether the lighting is poor (e.g., at night), the bottom edge of the visual bounding box has low precision (e.g., the box frames only the upper body of the person), or the point cloud data detected by the radar in a cluttered environment is inaccurate, the correct target can be associated based on the height information and the visual bounding box, thereby improving the accuracy and stability of target association.
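The effect described for Figure 2 can be illustrated with a small numeric sketch. A radar detection at the feet lies outside a bounding box that frames only the upper body, but a vertical segment extending upward from that point by the projected height overlaps the box. The pixel values, the function names, and the use of "fraction of the segment inside the box" as the overlap measure are assumptions made for illustration. Image rows are taken to increase downward.

```python
def point_in_box(u: float, v: float,
                 left: float, top: float, right: float, bottom: float) -> bool:
    """True if pixel (u, v) lies inside the bounding box."""
    return left <= u <= right and top <= v <= bottom


def height_segment_overlap(left: float, top: float, right: float, bottom: float,
                           u: float, v_foot: float, v_head: float) -> float:
    """Fraction of the vertical segment from foot to head that lies in the box."""
    if not (left <= u <= right):
        return 0.0
    seg_top, seg_bottom = min(v_foot, v_head), max(v_foot, v_head)
    length = seg_bottom - seg_top
    if length <= 0.0:
        return 0.0
    inside = max(0.0, min(bottom, seg_bottom) - max(top, seg_top))
    return inside / length


# Bounding box around the upper body only: rows 200 to 300, columns 90 to 130.
box = (90.0, 200.0, 130.0, 300.0)
# Radar detection projected at the feet (row 400); projected head at row 200.
print(point_in_box(110.0, 400.0, *box))                   # foot point alone misses the box
print(height_segment_overlap(*box, 110.0, 400.0, 200.0))  # the height segment overlaps it
```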

[0076] In possible implementations, the target tracking method of this application embodiment can be applied to scenarios such as autonomous driving, security, or monitoring. For example, in an autonomous driving scenario, the target tracking method of this application embodiment can be used to track targets such as obstacles, and then autonomous driving strategies can be formulated based on the target tracking. For example, in a security or monitoring scenario, the target tracking method of this application embodiment can be used to track targets such as people, and then alarms can be issued based on the target tracking for unsafe factors such as illegal intrusion.

[0077] For example, in an autonomous driving scenario, the target tracking method of this application embodiment can be applied to a vehicle, or a chip within a vehicle, etc. For example, Figure 3 shows a functional block diagram of a vehicle 100 provided in an embodiment of this application. In one embodiment, the vehicle 100 is configured in a fully or partially automated driving mode. For example, when the vehicle 100 is configured in a partially automated driving mode, the vehicle 100 can still determine the current state of the vehicle and its surrounding environment through human operation while in the automated driving mode. For example, it can determine the possible behavior of at least one other vehicle in the surrounding environment and determine the confidence level corresponding to the probability that the other vehicle will perform the possible behavior, and control the vehicle 100 based on the determined information. For example, when the vehicle 100 is in a fully automated driving mode, the vehicle 100 can be set to automatically perform driving-related operations without human interaction.

[0078] Vehicle 100 may include various subsystems, such as a propulsion system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, a power source 110, a computer system 112, and a user interface 116. Optionally, vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple components. Furthermore, each subsystem and component of vehicle 100 may be interconnected via wired or wireless means.

[0079] The propulsion system 102 may include components that provide powered motion to the vehicle 100. In one embodiment, the propulsion system 102 may include an engine 118, an energy source 119, a transmission 120, and wheels / tires 121. The engine 118 may be an internal combustion engine, an electric motor, an air-compressed engine, or other types of engine combinations, such as a hybrid engine consisting of a gasoline engine and an electric motor, or a hybrid engine consisting of an internal combustion engine and an air-compressed engine. The engine 118 converts the energy source 119 into mechanical energy.

[0080] Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity. Energy source 119 may also provide energy to other systems of vehicle 100.

[0081] The transmission 120 can transmit mechanical power from the engine 118 to the wheels 121. The transmission 120 may include a gearbox, a differential, and a drive shaft. In one embodiment, the transmission 120 may also include other components, such as a clutch. The drive shaft may include one or more axles that can be coupled to one or more wheels 121.

[0082] Sensor system 104 may include several sensors for sensing information about the environment surrounding vehicle 100. For example, sensor system 104 may include a positioning system 122 (which may be a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU) 124, a radar 126, a laser rangefinder 128, and a camera 130. Sensor system 104 may also include sensors that monitor the internal systems of vehicle 100 (e.g., an in-vehicle air quality monitor, fuel gauge, oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, orientation, speed, etc.). This detection and identification is a key function for the safe operation of autonomous vehicle 100.

[0083] The positioning system 122 can be used to estimate the geographic location of the vehicle 100. The IMU 124 is used to sense changes in the position and orientation of the vehicle 100 based on inertial acceleration. In one embodiment, the IMU 124 can be a combination of an accelerometer and a gyroscope.

[0084] Radar 126 can use radio signals to sense objects in the surrounding environment of vehicle 100. In some embodiments, in addition to sensing objects, radar 126 can also be used to sense the speed and / or direction of travel of objects.

[0085] The laser rangefinder 128 can use lasers to sense objects in the environment in which the vehicle 100 is located. In some embodiments, the laser rangefinder 128 may include one or more laser sources, a laser scanner, and one or more detectors, as well as other system components.

[0086] Camera 130 can be used to capture multiple images of the surrounding environment of vehicle 100. Camera 130 can be a still camera or a video camera.

[0087] The control system 106 controls the operation of the vehicle 100 and its components. The control system 106 may include various elements, including a steering system 132, a throttle 134, a braking unit 136, a computer vision system 140, a route control system 142, and an obstacle avoidance system 144.

[0088] The steering system 132 is operable to adjust the forward direction of the vehicle 100. For example, in one embodiment, it may be a steering wheel system.

[0089] Throttle 134 is used to control the operating speed of engine 118 and thus the speed of vehicle 100.

[0090] Braking unit 136 is used to control the deceleration of vehicle 100. Braking unit 136 can use friction to slow down the wheels 121. In other embodiments, braking unit 136 can convert the kinetic energy of the wheels 121 into electric current. Braking unit 136 may also take other forms to slow down the rotational speed of the wheels 121 to control the speed of vehicle 100.

[0091] The computer vision system 140 is operable to process and analyze images captured by the camera 130 to identify objects and / or features in the environment surrounding the vehicle 100. The objects and / or features may include traffic signals, road boundaries, and obstacles. The computer vision system 140 may use object recognition algorithms, structure from motion (SFM) algorithms, video tracking, and other computer vision techniques. In some embodiments, the computer vision system 140 may be used to map the environment, track objects, estimate object velocities, and so on.

[0092] The route control system 142 is used to determine the driving route of the vehicle 100. In some embodiments, the route control system 142 may combine data from the global positioning system (GPS) 122 and one or more predetermined maps to determine the driving route of the vehicle 100.

[0093] The obstacle avoidance system 144 is used to identify, assess, and avoid or otherwise traverse potential obstacles in the environment of the vehicle 100.

[0094] Of course, in one example, the control system 106 may include additional or alternative components besides those shown and described. Alternatively, some of the components shown above may be omitted.

[0095] Vehicle 100 interacts with external sensors, other vehicles, other computer systems, or users via peripheral devices 108. Peripheral devices 108 may include a wireless communication system 146, an on-board computer 148, a microphone 150, and/or a speaker 152.

[0096] In some embodiments, peripheral device 108 provides a means for a user of vehicle 100 to interact with user interface 116. For example, on-board computer 148 may provide information to a user of vehicle 100. User interface 116 may also operate on-board computer 148 to receive user input. On-board computer 148 may be operated via a touchscreen. In other cases, peripheral device 108 may provide a means for vehicle 100 to communicate with other devices located within the vehicle. For example, microphone 150 may receive audio (e.g., voice commands or other audio input) from a user of vehicle 100. Similarly, speaker 152 may output audio to a user of vehicle 100.

[0097] In one possible implementation, the display screen of the on-board computer 148 may also display the target tracked by the target tracking algorithm according to the embodiments of this application, so that the user can perceive the environment around the vehicle on the display screen.

[0098] The wireless communication system 146 can communicate wirelessly with one or more devices directly or via a communication network. For example, the wireless communication system 146 can use 3G cellular communication, such as code division multiple access (CDMA), EVDO, or Global System for Mobile Communications (GSM) / General Packet Radio Service (GPRS); 4G cellular communication, such as LTE; or 5G cellular communication. The wireless communication system 146 can communicate with a wireless local area network (WLAN) using Wi-Fi. In some embodiments, the wireless communication system 146 can communicate directly with devices using an infrared link, Bluetooth, or ZigBee. Other wireless protocols, such as various vehicle communication systems, are also possible. For example, the wireless communication system 146 may include one or more dedicated short-range communications (DSRC) devices that can enable public and/or private data communication between vehicles and/or roadside stations.

[0099] Power source 110 can provide power to various components of vehicle 100. In one embodiment, power source 110 can be a rechargeable lithium-ion or lead-acid battery. One or more such battery packs can be configured to provide power to various components of vehicle 100. In some embodiments, power source 110 and energy source 119 can be implemented together, as is the case in some fully electric vehicles.

[0100] Some or all of the functions of vehicle 100 are controlled by computer system 112. Computer system 112 may include at least one processor 113, which executes instructions 115 stored in a non-transitory computer-readable medium such as data storage device 114. Computer system 112 may also be multiple computing devices that control individual components or subsystems of vehicle 100 in a distributed manner.

[0101] Processor 113 can be any conventional processor, such as a commercially available central processing unit (CPU). Alternatively, the processor can be a special-purpose device such as an application-specific integrated circuit (ASIC) or other hardware-based processor for a specific application. Although Figure 3 functionally illustrates the processor, memory, and other elements of the computer system 112 within the same block, those skilled in the art will understand that the processor, computer, or memory may actually include multiple processors, computers, or memories that may or may not be housed in the same physical enclosure. For example, memory may be a hard disk drive or other storage media located in an enclosure different from that of the computer. Therefore, references to processors or computers will be understood to include references to a collection of processors or computers or memories that may or may not operate in parallel. Rather than using a single processor to perform the steps described herein, some components, such as steering and deceleration components, may each have their own processor that performs only the calculations related to that component's specific function.

[0102] In the various aspects described herein, the processor may be located remotely from the vehicle and communicate wirelessly with the vehicle. In other aspects, some of the processes described herein are executed on a processor located within the vehicle, while others are executed by a remote processor, including taking the necessary steps to perform a single operation.

[0103] In some embodiments, the data storage device 114 may include instructions 115 (e.g., program logic) that can be executed by the processor 113 to perform various functions of the vehicle 100, including those described above. The data storage device 114 may also include additional instructions, including instructions for transmitting data to, receiving data from, interacting with, and/or controlling one or more of the propulsion system 102, sensor system 104, control system 106, and peripheral devices 108.

[0104] In addition to instructions 115, data storage device 114 may also store data such as road maps, route information, vehicle position, direction, speed, and other such vehicle data, as well as other information. This information can be used by vehicle 100 and computer system 112 during operation of vehicle 100 in autonomous, semi-autonomous, and/or manual modes.

[0105] User interface 116 is used to provide information to or receive information from users of vehicle 100. Optionally, user interface 116 may include one or more input/output devices within the set of peripheral devices 108, such as wireless communication system 146, on-board computer 148, microphone 150, and speaker 152.

[0106] Computer system 112 can control the functions of vehicle 100 based on input received from various subsystems (e.g., propulsion system 102, sensor system 104, and control system 106) and from user interface 116. For example, computer system 112 can utilize input from control system 106 to control steering system 132 to avoid obstacles detected by sensor system 104 and obstacle avoidance system 144. In some embodiments, computer system 112 is operable to provide control over many aspects of vehicle 100 and its subsystems.

[0107] Alternatively, one or more of these components may be installed separately from or associated with vehicle 100. For example, data storage device 114 may exist partially or completely separately from vehicle 100. The components may be communicatively coupled together in a wired and / or wireless manner.

[0108] Optionally, the components described above are merely examples. In actual applications, components in each of the above modules may be added or removed as needed. Figure 3 should not be construed as a limitation on the embodiments of this application.

[0109] An autonomous vehicle traveling on a road, such as vehicle 100 mentioned above, can track objects in its surrounding environment according to the target tracking method of this application embodiment to determine its own adjustments to its current speed or driving route. The object can be other vehicles, traffic control equipment, or other types of objects.

[0110] In addition to providing instructions to adjust the speed or route of the autonomous vehicle, the computing device can also provide instructions to modify the steering angle of the vehicle 100 so that the autonomous vehicle follows a given trajectory and / or maintains a safe lateral and longitudinal distance from obstacles near the autonomous vehicle (e.g., vehicles in adjacent lanes on the road).

[0111] The aforementioned vehicle 100 can be a car, truck, motorcycle, bus, ship, airplane, helicopter, lawnmower, recreational vehicle, amusement park vehicle, construction equipment, tram, golf cart, train, and handcart, etc., and this application embodiment does not impose any special limitations.

[0112] Figure 4 is a schematic diagram of the structure of the computer system 112 in Figure 3. As shown in Figure 4, computer system 112 includes processor 113, which is coupled to system bus 105. Processor 113 can be one or more processors, each of which can include one or more processor cores. A video adapter 107 drives a display 109, which is coupled to system bus 105. System bus 105 is coupled to an input/output (I/O) bus via bus bridge 111. I/O interface 145 is coupled to the I/O bus. I/O interface 145 communicates with various I/O devices, such as input devices 117 (e.g., keyboard, mouse, touchscreen), media tray 121 (e.g., CD-ROM, multimedia interface), transceiver 123 (capable of sending and/or receiving radio communication signals), camera 155 (capable of capturing still and moving digital video images), and external USB interface 125. Optionally, the interface connected to I/O interface 145 can be a universal serial bus (USB) interface.

[0113] The processor 113 can be any conventional processor, including a Reduced Instruction Set Computing (“RISC”) processor, a Complex Instruction Set Computing (“CISC”) processor, or a combination thereof. Optionally, the processor can be a special-purpose device such as an Application-Specific Integrated Circuit (“ASIC”). Optionally, the processor 113 can be a neural network processor or a combination of a neural network processor and the conventional processors described above.

[0114] Alternatively, in the various embodiments described herein, the computer system may be located remotely from the autonomous vehicle and may communicate wirelessly with the autonomous vehicle. In other aspects, some processes described herein are executed on a processor located within the autonomous vehicle, while others are executed by a remote processor, including taking actions necessary to perform a single manipulation.

[0115] Computer system 112 can communicate with software deployment server 149 via network interface 129. Network interface 129 is a hardware network interface, such as a network interface card (NIC). Network 127 can be an external network, such as the Internet, or an internal network, such as Ethernet or a Virtual Private Network (VPN). Optionally, network 127 can also be a wireless network, such as a WiFi network or a cellular network.

[0116] The hard disk drive interface 131 is coupled to the system bus 105. The hard disk drive interface 131 is connected to the hard disk drive 133. The system memory 135 is coupled to the system bus 105. The software running in the system memory 135 may include the operating system (OS) 137 and application programs 143 of the computer system 112.

[0117] The operating system 137 includes a shell 139 and a kernel 141. The shell 139 is an interface between the user and the operating system kernel. The shell is the outermost layer of the operating system. It manages the interaction between the user and the operating system: waiting for user input, interpreting user input for the operating system, and processing various operating system outputs.

[0118] The kernel 141 consists of the parts of the operating system used to manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel 141 typically runs processes and provides inter-process communication, CPU time-slice management, interrupts, memory management, I / O management, and so on.

[0119] Application 143 includes programs related to controlling autonomous driving, such as programs managing the interaction between the autonomous vehicle and obstacles on the road, programs controlling the autonomous vehicle's route or speed, and programs controlling the interaction between the autonomous vehicle and other autonomous vehicles on the road. Application 143 also resides on the system of the software deployment server 149. In one embodiment, when application 143 needs to be executed, the computer system can download application 143 from the deployment server 149.

[0120] Sensor 153 is associated with the computer system. Sensor 153 is used to detect the environment surrounding the computer system 112. For example, sensor 153 can detect animals, cars, obstacles, and pedestrian crossings. Furthermore, the sensor can also detect the environment around these objects, such as the environment around the animal (e.g., other animals nearby), weather conditions, and ambient light levels. Optionally, if the computer system 112 is located on an autonomous vehicle, the sensor can be a camera, infrared sensor, chemical detector, microphone, etc.

[0121] Figure 5 is a schematic diagram of a chip hardware structure provided in an embodiment of this application. As shown in Figure 5, the chip may include a neural network processor 50. This chip can be applied to the vehicle shown in Figure 3 or to the computer system shown in Figure 4.

[0122] The neural network processor 50 can be any processor suitable for large-scale XOR operations, such as a neural network processing unit (NPU), tensor processing unit (TPU), or graphics processing unit (GPU). Taking an NPU as an example: the NPU can be mounted as a coprocessor on the host CPU, which then assigns tasks to it. The core of the NPU is the arithmetic circuit 503, which, controlled by a controller 504, retrieves matrix data from the memory (input memory 501 and weight memory 502) and performs multiplication and addition operations.

[0123] In some implementations, the arithmetic circuit 503 internally includes multiple processing engines (PEs). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array. The arithmetic circuit 503 can also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.

[0124] For example, suppose there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit 503 retrieves the weight data of matrix B from the weight memory 502 and caches it in each PE of the arithmetic circuit 503. The arithmetic circuit 503 retrieves the input data of matrix A from the input memory 501, performs matrix operations based on the input data of matrix A and the weight data of matrix B, and stores the partial or final results of the resulting matrix in the accumulator 508.
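The multiply-and-accumulate flow described above can be sketched in software as follows. This is only an illustrative model, not the hardware design: the function name `matmul_accumulate` is hypothetical, and the nested list `acc` merely plays the role that the accumulator 508 plays in the description, holding partial results until the final result of C = A × B is complete.

```python
from typing import List

Matrix = List[List[float]]


def matmul_accumulate(a: Matrix, b: Matrix) -> Matrix:
    """Compute C = A x B by adding one partial product at a time."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    # Starts at zero and holds partial results, like an accumulator.
    acc = [[0.0] * cols for _ in range(rows)]
    for k in range(inner):  # one slice of the inner dimension per step
        for i in range(rows):
            for j in range(cols):
                acc[i][j] += a[i][k] * b[k][j]  # multiply, then add
    return acc


input_a = [[1.0, 2.0], [3.0, 4.0]]   # input matrix A
weight_b = [[5.0, 6.0], [7.0, 8.0]]  # weight matrix B
print(matmul_accumulate(input_a, weight_b))  # [[19.0, 22.0], [43.0, 50.0]]
```

After the first pass over `k`, `acc` holds a partial result; only after the last pass does it hold the final output matrix.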

[0125] Unified memory 506 is used to store input and output data. Weight data is directly transferred to weight memory 502 via direct memory access controller (DMAC) 505. Input data is also transferred to unified memory 506 via DMAC.

[0126] The bus interface unit (BIU) 510 is used for interaction between the DMAC and the instruction fetch buffer 509. The bus interface unit 510 is also used by the instruction fetch buffer 509 to fetch instructions from external memory, and by the direct memory access controller 505 to fetch the original data of the input matrix A or the weight matrix B from external memory.

[0127] The DMAC is mainly used to move input data from the external memory (DDR) to the unified memory 506, to move weight data to the weight memory 502, or to move input data to the input memory 501.

[0128] The vector computation unit 507 comprises multiple processing units that, when necessary, further process the output of the arithmetic circuit 503, for example through vector multiplication, vector addition, exponential operations, logarithmic operations, and magnitude comparisons. The vector computation unit 507 is primarily used for computations in non-convolutional layers or fully connected (FC) layers of neural networks, specifically handling calculations such as pooling and normalization. For example, the vector computation unit 507 can apply nonlinear functions to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate activation values. In some implementations, the vector computation unit 507 generates normalized values, merged values, or both.
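As a rough illustration of this kind of post-processing, the sketch below applies a nonlinear function to a vector of accumulated values and then normalizes the result. The choice of ReLU as the nonlinear function and of sum normalization is an assumption made for the example; this application does not specify which functions are used, and the helper names are hypothetical.

```python
from typing import List


def relu(values: List[float]) -> List[float]:
    """Apply a nonlinear function (ReLU, assumed here) element by element."""
    return [max(0.0, v) for v in values]


def normalize(values: List[float]) -> List[float]:
    """Scale the values so they sum to 1 (sum normalization, assumed here)."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)  # avoid dividing by zero
    return [v / total for v in values]


accumulated = [-2.0, 1.0, 3.0]      # e.g. one row of accumulated outputs
activated = relu(accumulated)       # activation values
print(activated)                    # [0.0, 1.0, 3.0]
print(normalize(activated))         # [0.0, 0.25, 0.75]
```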

[0129] In some implementations, the vector computation unit 507 stores the processed vector in the unified memory 506. In some implementations, the vector processed by the vector computation unit 507 can be used as the activation input of the arithmetic circuit 503.

[0130] The instruction fetch buffer 509 connected to the controller 504 is used to store the instructions used by the controller 504.

[0131] Unified memory 506, input memory 501, weight memory 502, and instruction fetch buffer 509 are all on-chip memories. External memory is independent of this NPU hardware architecture.

[0132] For example, in security or surveillance scenarios, the target tracking method of this application embodiment can be applied to electronic devices. The electronic device can be a terminal device with computing capabilities, a server, or a chip, etc. The terminal device can include a mobile phone, computer, or tablet, etc. For example, Figure 6 shows a scenario in which the target tracking method of this application embodiment is applied to security or monitoring.

[0133] As shown in Figure 6, a security or monitoring scenario may include a radar 601, a camera 602, and an electronic device 603. The radar 601 and camera 602 can be installed in locations such as utility poles, providing them with good fields of view. The radar 601 and camera 602 can communicate with the electronic device 603. The point cloud data measured by the radar 601 and the images captured by the camera 602 can be transmitted to the electronic device 603. The electronic device 603 can then use the target tracking method of this embodiment to track, for example, a person 604, based on the point cloud data from the radar 601 and the images captured by the camera 602.

[0134] In possible implementations, when electronic device 603 detects that person 604 has illegally entered an unsafe area, it can issue a warning through screen display, voice warning, and / or warning through a warning device, etc. This application embodiment does not specifically limit this.

[0135] The following description of the terminology used in the embodiments of this application is provided for clarity. It should be understood that this description is intended to provide a clearer explanation of the embodiments of this application and does not necessarily constitute a limitation thereof.

[0136] The camera target tracking results described in this application embodiment may include: target bounding boxes (or visual bounding boxes, etc.) obtained by target framing of images captured by the camera, or other data used for target identification, etc. In possible implementations, the camera target tracking results may also include one or more of the following: target position or velocity, etc. The number of camera target tracking results may be one or more. This application embodiment does not specifically limit the specific content and number of camera target tracking results.

[0137] The radar target tracking results described in this application embodiment may include: target point clouds acquired by radar, or other data used for target calibration, etc. In possible implementations, the radar target tracking results may also include one or more of the following: target position or velocity, etc. The number of radar target tracking results may be one or more. This application embodiment does not specifically limit the specific content and number of radar target tracking results.

[0138] The radar described in this application embodiment may include millimeter-wave radar or imaging radar, etc. Among them, imaging radar can obtain more point cloud data than millimeter-wave radar. Therefore, when using imaging radar for target tracking, the size of the target can be obtained based on the larger amount of point cloud data collected by the imaging radar. Then, the radar and camera are fused together with the target size to obtain more accurate target tracking than millimeter-wave radar.

[0139] The camera target tracking results described in the embodiments of this application can be target tracking results calibrated in the camera coordinate system. The radar target tracking results described in the embodiments of this application can be target tracking results calibrated in the radar coordinate system.

[0140] The camera coordinate system described in this application embodiment can be a coordinate system centered on the camera. For example, in the camera coordinate system, the camera is at the origin, the x-axis points to the right, the z-axis points forward (toward the screen, that is, in the direction the camera faces), and the y-axis points upward (up relative to the camera itself, not relative to the world). In possible implementations, the camera coordinate system may also be called the visual coordinate system.

[0141] The radar coordinate system described in the embodiments of this application can be a radar-centered coordinate system. In possible implementations, the radar coordinate system may also be called a top-view coordinate system or a bird's-eye view (BEV) coordinate system, etc.

[0142] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. The following specific embodiments can be implemented independently or in combination with each other. The same or similar concepts or processes may not be described again in some embodiments.

[0143] Figure 7 is a flowchart of a target tracking method provided in an embodiment of this application. As shown in Figure 7, the method includes:

[0144] S701: Obtain a camera target tracking result and a radar target tracking result.

[0145] In this embodiment, the camera can be used to capture images, and the radar can be used to detect point cloud data. The camera, radar, and device for performing the target tracking method can be combined in one device, or they can be installed independently, or they can be combined in pairs in one device. This embodiment does not specifically limit this.

[0146] In one possible implementation, the camera could have computing capabilities, allowing it to obtain target tracking results from captured images and send these results to a device that performs the target tracking method.

[0147] In one possible implementation, the radar could have computing capabilities, allowing it to obtain radar target tracking results from point cloud data and send these results to a device used to execute the target tracking method.

[0148] In one possible implementation, the device used to perform the target tracking method can acquire captured images from a camera and acquire point cloud data from a radar. The device can then obtain camera target tracking results based on the captured images and radar target tracking results based on the point cloud data.

[0149] In a possible understanding, the camera target tracking result can be the target tracking result obtained by using possible camera tracking algorithms, etc., and the radar target tracking result can be the target tracking result obtained by using possible radar tracking algorithms, etc. The embodiments of this application do not limit the specific method of obtaining the camera target tracking result and the radar target tracking result.

[0150] S702: Obtain the target tracking result based on the camera target tracking result and the target model corresponding to the radar target tracking result, wherein the target model is used to indicate the association relationship between the target in the radar target tracking result and the target's height information.

[0151] The target model described in this application's embodiments is used to indicate the association relationship between the target in the radar target tracking result and the target's height information. For example, the target model can be a model obtained in the radar coordinate system by fusing the target's height information and position information. In a possible interpretation, this application's embodiments can expand the scattered, limited point cloud data in the radar target tracking results into a target model with height information covering a larger area.

[0152] Camera target tracking results are usually related to the shape of the target. For example, the aforementioned camera target tracking results may include a target bounding box used to define the target. In this embodiment, the target model corresponding to the radar target tracking results is related to the height of the target, which can effectively expand the range of the target monitored by the radar. Therefore, when the target is associated based on the camera target tracking results and the target model corresponding to the radar target tracking results, the association range of the camera target tracking results and the target model corresponding to the radar target tracking results can be effectively expanded, thereby obtaining accurate target tracking results.

[0153] The target tracking results described in this application embodiment may include one or more of the following: target type, position or speed, etc., and the number of targets may be one or more. This application embodiment does not specifically limit the specific content and number of target tracking results.

[0154] In summary, in the target tracking method of this application embodiment, when associating the results of camera target tracking and radar target tracking, the target's height information is introduced into the radar target tracking result. Specifically, a target model can be obtained to indicate the association relationship between the target and the target's height information in the radar target tracking result. When associating the camera target tracking result and the radar target tracking result, the target tracking result can be obtained based on the camera target tracking result and the target model. Because the target model includes the target's height information, it can effectively expand the range of targets monitored by the radar, thereby enabling the association to obtain accurate target tracking results.

[0155] On the basis of the embodiment corresponding to Figure 7, in a possible implementation, the following may also be included before S702: obtaining the target's height information based on the target type information in the radar target tracking result; and fusing the target's height information with the target in the radar target tracking result to obtain the target model.

[0156] For example, targets in the radar target tracking results obtained by radar detection can be classified using common radar classification algorithms (such as algorithms based on the RD-map or the micro-Doppler spectrum; the embodiments of this application do not specifically limit the radar classification algorithm). The resulting target type information may include: car, pedestrian, animal, or bicycle, etc.

[0157] The height information of a target can be determined based on its type information. For example, the height information of a target can be estimated based on its type information. Alternatively, a correspondence between target type information and target height information can be predefined or pre-set, so that after determining the target type information, the corresponding height information can be matched within this correspondence. The height information can be a specific height value or a height range. For example, the correspondence could include: car height 0.8-1.2 meters (m), pedestrian height 1.0-1.8 meters, and animal height 0.4-1.0 meters.
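
The predefined correspondence described above can be sketched as a simple lookup table. This is a minimal illustration only: the height ranges are the example values given in the text, while the names `HEIGHT_RANGE_BY_TYPE` and `match_height_range` are hypothetical and not part of this application.

```python
from typing import Dict, Tuple

# Example height ranges in meters, taken from the correspondence in the text.
# The table name and the lookup function are hypothetical.
HEIGHT_RANGE_BY_TYPE: Dict[str, Tuple[float, float]] = {
    "car": (0.8, 1.2),
    "pedestrian": (1.0, 1.8),
    "animal": (0.4, 1.0),
}


def match_height_range(target_type: str) -> Tuple[float, float]:
    """Return the (min, max) height range for a classified radar target."""
    try:
        return HEIGHT_RANGE_BY_TYPE[target_type]
    except KeyError:
        # The type has no predefined height information in this sketch.
        raise ValueError(f"no height information for type {target_type!r}")
```

A type that is missing from the table raises an error here; a real system might instead fall back to a default range, which the text leaves open.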

[0158] The correspondence between target type information and target height information can be obtained based on a Gaussian distribution, statistics, or machine learning methods, and this application does not impose specific limitations. For example, Figure 8 is a schematic diagram of the correspondence between target types and probabilistic heights based on Gaussian distributions. As shown in Figure 8, height distribution 1, height distribution 2, and height distribution 3 represent the probabilistic height distributions corresponding to different target types.

[0159] Based on the target's height information, the height information and the target in the radar target tracking result can be fused to obtain the target model. For example, after the target's type information is determined, the height value with the highest probability can be selected from the correspondence between target type information and height information, and the height segment corresponding to this height value can be obtained. This height segment is then fused with the target's position in the radar target tracking result to obtain the target model. Because the target model can be a model containing a height segment, it can also be called a probabilistic height model or a probabilistic height segment model, etc.
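
The fusion step can be sketched as follows, assuming each target type has a Gaussian height distribution. The distribution parameters below are illustrative (means set to the midpoints of the example height ranges, standard deviations chosen arbitrarily), and the choice to represent the height segment as running from the ground to the most probable height is an assumption of this sketch, not something the text specifies. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class HeightDistribution:
    mean_m: float  # most probable height of a Gaussian is its mean
    std_m: float


@dataclass(frozen=True)
class TargetModel:
    position_xy: Tuple[float, float]        # target position in the radar coordinate system
    height_segment_m: Tuple[float, float]   # (bottom, top) of the height segment
    target_type: str


# Illustrative parameters only; not taken from the application.
HEIGHT_DISTRIBUTIONS: Dict[str, HeightDistribution] = {
    "car": HeightDistribution(mean_m=1.0, std_m=0.1),
    "pedestrian": HeightDistribution(mean_m=1.4, std_m=0.2),
    "animal": HeightDistribution(mean_m=0.7, std_m=0.15),
}


def build_target_model(target_type: str, position_xy: Tuple[float, float]) -> TargetModel:
    """Fuse the most probable height for the type with the radar position."""
    distribution = HEIGHT_DISTRIBUTIONS[target_type]
    most_probable_height = distribution.mean_m
    # Assumption: the segment starts at ground level (0 m) and ends at the
    # most probable height.
    return TargetModel(position_xy, (0.0, most_probable_height), target_type)
```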

[0160] On the basis of the embodiment corresponding to Figure 7, in a possible implementation, S702 includes: projecting the target model onto the camera coordinate system to obtain the projected radar target tracking result; and obtaining the target tracking result based on the camera target tracking result and the projected radar target tracking result.

[0161] In this embodiment, since the target model contains the target's height information, projecting the target model onto the camera coordinate system can be understood as introducing one-dimensional height information into the two-dimensional projection plane of the camera coordinate system. Then, based on the camera target tracking result (e.g., the target bounding box) and the projected radar target tracking result (e.g., a line segment representing height and position), the target jointly determined by the camera target tracking result and the projected radar target tracking result can be identified, and the target tracking result can be obtained.

[0162] One possible implementation involves projecting the target model onto the camera coordinate system, including: transforming the target model into the camera coordinate system according to a pre-set or defined height transformation relationship. In this implementation, the height transformation relationship can be pre-set or defined based on experiments, etc. After obtaining the target model, the corresponding height transformation relationship can be matched to transform the target model into the camera coordinate system.

[0163] The height transformation relationships described in this application are used to transform target tracking results with height in the radar coordinate system to the camera coordinate system. Different height information corresponds to different height transformation relationships. In possible implementations, the height transformation relationship may include a height transformation matrix or a set of height transformation matrices, etc., and this application does not specifically limit the height transformation relationship.

[0164] For example, Figure 9 shows a schematic diagram of the calibration of the height transformation relationship of a target model. As shown in Figure 9, the target model can be a line segment with height. Assuming the target model is set on the ground, in the camera coordinate system, different positions of the line segment (such as the two ends or any position in the middle) can correspond to different height transformation matrices. For example, the height transformation matrix can be related to the distance d from the target to the origin of the camera coordinate system and the included angle φ. Height transformation matrices can be constructed for multiple positions of the line segment. Then, the set of height transformation matrices (or height matrix sequence) composed of multiple height transformation matrices can be used to transform the line segment to the camera coordinate system.
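
One way to picture a set of height transformation matrices is as one 3x3 matrix per height, each mapping a ground position to a pixel. The sketch below derives such matrices from an idealized pinhole camera rather than from calibration; this is a substituted technique for illustration, not the calibration procedure of this application. It assumes the camera looks horizontally along the forward axis, the radar reports lateral offset `x_m` and forward distance `z_m` in a frame aligned with the camera, and the intrinsics and camera height are made-up values.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def height_transformation_matrix(
    height_m: float,
    fx: float = 1000.0,            # hypothetical focal lengths in pixels
    fy: float = 1000.0,
    cx: float = 640.0,             # hypothetical principal point
    cy: float = 360.0,
    camera_height_m: float = 2.0,  # hypothetical mounting height
) -> Matrix3:
    """Matrix mapping (x, z, 1) at the given height to homogeneous pixels."""
    return [
        [fx, cx, 0.0],
        [0.0, cy, fy * (camera_height_m - height_m)],
        [0.0, 1.0, 0.0],
    ]


def project_point(matrix: Matrix3, x_m: float, z_m: float) -> Tuple[float, float]:
    """Apply the matrix to (x, z, 1) and normalize to pixel coordinates."""
    u, v, w = (row[0] * x_m + row[1] * z_m + row[2] for row in matrix)
    return (u / w, v / w)


def project_segment(
    x_m: float, z_m: float, bottom_m: float, top_m: float
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Project both ends of a height segment, each with its own matrix."""
    bottom_px = project_point(height_transformation_matrix(bottom_m), x_m, z_m)
    top_px = project_point(height_transformation_matrix(top_m), x_m, z_m)
    return bottom_px, top_px
```

With these illustrative parameters, a segment from 0 m to 1 m located 10 m straight ahead projects to pixel rows 560 (bottom) and 460 (top) in column 640, so the projected radar result becomes a vertical line segment in the image.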

[0165] In a possible implementation, different region types correspond to different height transformation relationships.

[0166] The region types described in this application can be used to describe the ground type of the area where the target is located. For example, the region type may include one or more of the following: undulating ground (e.g., grass or undulating road surface), sloping ground (e.g., ramp), or flat ground (e.g., flat road surface). In different region types, the ground plane where the target is located may be different, and the height of the target relative to the origin of the camera coordinate system may be different in different regions. Therefore, if the same height transformation relationship is used for the same target located in different regions, the transformed height may not match the height of the target relative to the origin of the camera coordinate system, which may lead to inaccuracies in subsequent radar-camera fusion.

[0167] Based on this, in the embodiments of this application, different region types correspond to different height transformation relationships, so that the target model can be accurately transformed according to the height transformation relationship of each region type.

[0168] For example, Figure 10 This diagram illustrates a scene comprising multiple region types. For example, region 1 represents grassland, region 2 represents a slope, and region 3 represents a flat road. The same height information corresponds to different height transformation relationships in regions 1, 2, and 3. When transforming the target model to the camera coordinate system, the target region type (e.g., region 1, region 2, or region 3) can be determined. Based on the height transformation relationships corresponding to the target region type, the target height transformation relationship that matches the height information of the target model is used to transform the target model to the camera coordinate system. This allows for accurate transformation of the target model to the camera coordinate system using the height transformation relationships of each region.
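
The two-level selection described above (first by region type, then by height) can be sketched as a nested lookup. The region names follow the example of Figure 10; the height bins and the matrix labels such as `H_slope_high` are hypothetical placeholders standing in for calibrated height transformation matrices.

```python
from typing import Dict, List, Tuple

# (min height in m, max height in m, label of a calibrated transformation)
HeightEntry = Tuple[float, float, str]

# Hypothetical calibration table: one list of height bins per region type.
CALIBRATION: Dict[str, List[HeightEntry]] = {
    "grass": [(0.0, 1.0, "H_grass_low"), (1.0, 2.0, "H_grass_high")],
    "slope": [(0.0, 1.0, "H_slope_low"), (1.0, 2.0, "H_slope_high")],
    "flat_road": [(0.0, 1.0, "H_flat_low"), (1.0, 2.0, "H_flat_high")],
}


def select_transformation(region_type: str, height_m: float) -> str:
    """Pick the transformation matching the region type and the model height."""
    for low, high, name in CALIBRATION[region_type]:
        if low <= height_m < high:
            return name
    raise LookupError(f"no transformation for {height_m} m in region {region_type!r}")
```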

[0169] One possible implementation of obtaining the target tracking result based on the camera target tracking result and the projected radar target tracking result includes: using any association algorithm to calculate the degree of association between the camera target tracking result and the projected radar target tracking result, and identifying a camera target tracking result and a projected radar target tracking result with a high degree of association as the same target. Association algorithms may include one or more of the following: global nearest neighbor (GNN), probabilistic data association (PDA), joint probabilistic data association (JPDA), or intersection over union (IoU), etc.

[0170] For example, the camera target tracking result and the projected radar target tracking result can be determined to be the same target based on the overlap ratio (or intersection over union) between them, where the overlap ratio is greater than a first value.

[0171] The greater the overlap between the camera target tracking result and the projected radar target tracking result (or the larger the overlap ratio), the more it indicates that the camera target tracking result and the projected radar target tracking result are pointing to the same target. Therefore, when the overlap ratio between the camera target tracking result and the projected radar target tracking result is greater than or equal to a first value, it can be determined that the camera target tracking result and the projected radar target tracking result are the same target and associated. For example, the first value can be any value between 0.5 and 1, and this application embodiment does not specifically limit the first value. It can be understood that in the usual IoU calculation, the first value has a confidence distribution and is stable. Therefore, when using IoU calculation for association, the first value does not need to be manually adjusted, improving the versatility of the association calculation in this application embodiment.

[0172] In one possible implementation, when there is one camera target tracking result and one projected radar target tracking result, the camera target tracking result and the projected radar target tracking result can be determined to be the same target when the overlap ratio between the camera target tracking result and the projected radar target tracking result is greater than a first value.

[0173] In another possible implementation, when there are multiple camera target tracking results and multiple projected radar target tracking results, a camera target tracking result and a projected radar target tracking result can be paired, and the overlap ratio of each pair of camera target tracking results and projected radar target tracking results can be calculated. Each pair of camera target tracking results and projected radar target tracking results with an overlap ratio greater than or equal to a first value is identified as the same target.

[0174] In possible implementations, if the overlap ratio between the camera target tracking result and the projected radar target tracking result is less than or equal to a first value, then it is considered that the camera target tracking result and the projected radar target tracking result do not correspond to the same target.

[0175] It is understood that when the overlap ratio is equal to the first value, it can be set according to the actual application scenario to determine that the camera target tracking result and the radar target tracking result after projection are the same target, or it can be set according to the actual application scenario to determine that the camera target tracking result and the radar target tracking result after projection are not the same target. This application embodiment does not make specific limitations on this.
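
The threshold test, including the configurable handling of the equality case, can be sketched as follows. The text does not define how the overlap ratio between a bounding box and a projected line segment is computed, so this sketch assumes a one-dimensional IoU along the vertical axis, counted only when the segment's column lies inside the box. The function names, the default first value of 0.5, and the `equal_counts_as_match` flag are all hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels, rows grow downward
Segment = Tuple[float, float, float]     # (column, top row, bottom row) in pixels


def overlap_ratio(box: Box, segment: Segment) -> float:
    """Vertical intersection over union between a box and a vertical segment."""
    left, top, right, bottom = box
    column, seg_top, seg_bottom = segment
    if not (left <= column <= right):
        return 0.0  # the segment does not fall inside the box horizontally
    intersection = max(0.0, min(bottom, seg_bottom) - max(top, seg_top))
    union = max(bottom, seg_bottom) - min(top, seg_top)
    return intersection / union if union > 0 else 0.0


def is_same_target(
    ratio: float, first_value: float = 0.5, equal_counts_as_match: bool = True
) -> bool:
    """Compare the overlap ratio with the first value; equality is configurable."""
    if ratio == first_value:
        return equal_counts_as_match
    return ratio > first_value
```

Making the equality case an explicit parameter mirrors the statement that either outcome can be chosen according to the application scenario.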

[0176] In possible implementations, there may be situations where multiple camera target tracking results overlap with a single projected radar target tracking result (hereinafter referred to as multi-CR association), or where a single camera target tracking result overlaps with multiple projected radar target tracking results (hereinafter referred to as multi-RC association). If, in multi-CR association or multi-RC association, the calculated overlap ratios are both greater than or equal to a first value, then multiple camera target tracking results or multiple projected radar target tracking results may be mistakenly associated with the same target. In such cases, further determination can be made based on the position and / or velocity of the overlapping target in the camera target tracking result and the position and / or velocity of the overlapping target in the radar target tracking result to determine whether the camera target tracking result and the projected radar target tracking result are the same target.

[0177] For example, if the overlap ratio is greater than a first value, and the positions and / or velocities of the overlapping targets in the camera target tracking result and the projected radar target tracking result meet preset conditions, it can be determined that the camera target tracking result and the projected radar target tracking result are the same target. For example, the preset conditions include: the difference between the position and / or velocity of the overlapping target in the camera target tracking result and the position and / or velocity of the overlapping target in the radar target tracking result is less than a second value.

[0178] For example, Figure 11 shows a schematic diagram of multi-RC association and multi-CR association.

[0179] As shown in Figure 11, in multi-RC association, the projected radar target tracking results 1001 and 1002 both overlap with the camera target tracking result 1003.

[0180] Therefore, if the overlap ratio of the projected radar target tracking result 1002 and the camera target tracking result 1003 is greater than or equal to the first value, and the overlap ratio of the projected radar target tracking result 1001 and the camera target tracking result 1003 is less than the first value, it can be determined that the projected radar target tracking result 1002 and the camera target tracking result 1003 are the same target, and it can be determined that the projected radar target tracking result 1001 and the camera target tracking result 1003 are not the same target.

[0181] If the overlap ratio between the projected radar target tracking result 1002 and the camera target tracking result 1003 is greater than or equal to the first value, and the overlap ratio between the projected radar target tracking result 1001 and the camera target tracking result 1003 is also greater than or equal to the first value, it can be further determined, for each of the projected radar target tracking results 1001 and 1002, whether the distance between its target position and the target position in the camera target tracking result 1003 is greater than a distance threshold, and / or whether the difference between its target velocity and the target velocity in the camera target tracking result 1003 is greater than a velocity difference threshold. When the distance between the target position in the projected radar target tracking result 1001 and the target position in the camera target tracking result 1003 is less than or equal to the distance threshold, and / or the difference between the target velocity in the projected radar target tracking result 1001 and the target velocity in the camera target tracking result 1003 is less than or equal to the velocity difference threshold, it can be determined that the projected radar target tracking result 1001 and the camera target tracking result 1003 are the same target. Similarly, when the distance between the target position in the projected radar target tracking result 1002 and the target position in the camera target tracking result 1003 is less than or equal to the distance threshold, and / or the difference between the target velocity in the projected radar target tracking result 1002 and the target velocity in the camera target tracking result 1003 is less than or equal to the velocity difference threshold, it can be determined that the projected radar target tracking result 1002 and the camera target tracking result 1003 are the same target. In other cases, it can be determined that the projected radar target tracking result 1001 and / or the projected radar target tracking result 1002 is not the same target as the camera target tracking result 1003.
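
The multi-RC decision logic can be sketched as a two-stage filter: first the overlap ratio, then position and velocity gating when more than one radar result passes. The text allows the position and velocity checks to be combined with "and / or"; this sketch assumes both must pass. The `Track` class, the threshold defaults, and the function name are hypothetical, and positions are assumed to be expressed in a common coordinate system.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Track:
    track_id: int
    position: Tuple[float, float]  # meters, in a common coordinate system
    speed: float                   # meters per second


def resolve_multi_rc(
    camera: Track,
    radar_candidates: List[Tuple[Track, float]],  # (projected radar track, overlap ratio)
    first_value: float = 0.5,
    distance_threshold: float = 2.0,
    speed_threshold: float = 1.0,
) -> List[int]:
    """Return ids of radar tracks judged to be the same target as the camera track."""
    passing = [track for track, ratio in radar_candidates if ratio >= first_value]
    if len(passing) <= 1:
        # No ambiguity: the overlap ratio alone decides.
        return [track.track_id for track in passing]
    kept = []
    for track in passing:
        distance = hypot(
            track.position[0] - camera.position[0],
            track.position[1] - camera.position[1],
        )
        speed_difference = abs(track.speed - camera.speed)
        # Assumption: both conditions are required.
        if distance <= distance_threshold and speed_difference <= speed_threshold:
            kept.append(track.track_id)
    return kept
```

The multi-CR case can reuse the same function with the roles of the camera and radar tracks swapped.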

[0182] Similarly, as shown in Figure 11, in multi-CR association, both camera target tracking results 1004 and 1005 overlap with the projected radar target tracking result 1006. A method similar to that described for multi-RC association can be used to determine whether camera target tracking result 1004 or 1005 and the projected radar target tracking result 1006 represent the same target; this will not be elaborated upon here.

[0183] For example, taking the case where the device that performs the target tracking method of this application embodiment (hereinafter referred to as the target tracking device), the camera, and the radar are three independent devices, the target tracking method of the embodiments of this application is described in detail with reference to Figure 12. Figure 12 is a flowchart of another target tracking method provided in an embodiment of this application. As shown in Figure 12, the method includes:

[0184] S1201: The target tracking device acquires the target tracking results from the camera.

[0185] In one possible scenario, a camera can be set up in the location where target tracking is required. The camera can capture images, and the target tracking device can acquire these images from the camera.

[0186] The target tracking device can perform image recognition and other processing on the images acquired from the camera to achieve bounding box tracking. The result of bounding box tracking is used as the target tracking result of the camera.

[0187] In one possible interpretation, the camera target tracking result can refer to the target bounding box used to define the target in the camera coordinate system. The number of target bounding boxes can be one or more.

[0188] S1202: The target tracking device acquires the radar target tracking results.

[0189] In one possible scenario, radar can be set up in locations where target tracking is required. The radar can detect targets, and the target tracking device can obtain the data detected by the radar.

[0190] The target tracking device can process the data detected by the radar to obtain point cloud data of the target, which can then be used as the radar target tracking result.

[0191] In one possible interpretation, radar target tracking results could refer to point clouds used to identify targets. The number of point clouds corresponding to a single target could be related to radar performance, and the number of targets could be one or more.

[0192] S1203: The target tracking device obtains the target type information in the radar target tracking results through a point cloud classification algorithm.

[0193] For the classification algorithm and target type information of the embodiments of this application, refer to the description in the glossary section; details are not repeated here.

[0194] In this embodiment, the target tracking device can determine the type information of the target in the radar tracking results based on the analysis of the radar target tracking results. For example, it may determine that the target in the radar tracking results is a person and / or a vehicle, etc. This embodiment does not limit the number of targets or the type information of the targets.

[0195] S1204: The target tracking device matches the target's height information based on the target's type information.

[0196] For example, taking two targets, one a person and the other a vehicle, the height information of the person can be 1.0-1.8m and the height information of the vehicle can be 0.8-1.2m, consistent with the example correspondence between target type and height given earlier.

[0197] S1205: The target tracking device performs RC calibration on the image domain at different heights to obtain the transformation matrix corresponding to different heights.

[0198] In this embodiment, the image domain can be a region in the image within the camera coordinate system. Different regions correspond to different height transformation matrices. In this way, by identifying the specific region of the target in the image, the corresponding height transformation matrix can be selected for the target, achieving a more accurate tracking effect.

[0199] It should be noted that S1205 can be a pre-executed step, or S1205 can be placed at any position before, during, or after S1201-S1204. The execution order of S1205 is not specifically limited in this embodiment.

[0200] S1206: The target tracking device projects the target model containing height information onto the image domain (which can be understood as projecting onto the camera coordinate system) through the transformation matrix corresponding to different heights.

[0201] For S1206 of this application embodiment, refer to the description of the foregoing embodiments; details are not repeated here.

[0202] S1207: The target tracking device associates the target tracking results from the camera with the projected radar target tracking results.

[0203] For the specific association methods in this application embodiment, refer to the description of the foregoing embodiments; details are not repeated here.

[0204] In a possible implementation, the target jointly determined by the camera target tracking result and the radar target tracking result can be obtained through the association of S1207.

[0205] For example, with reference to Figure 2, at location A, tracking the target with the camera yields camera target tracking result 20, and tracking the target with the radar yields radar target tracking result 21. Based on the target model in this embodiment, when the radar target tracking result is projected onto the camera coordinate system, it is projected as line segment 23. Because the overlap between the camera target tracking result 20 and line segment 23 is relatively large, the camera target tracking result and the radar target tracking result at location A can be considered to represent the same target. The camera target tracking result and the radar target tracking result corresponding to this target can therefore be fused to obtain a more accurate and complete tracking result. Alternatively, for example, the bottom edge of the camera target tracking result 20 can be extended downward according to the length of line segment 23 to achieve more accurate target determination.

[0206] It can be understood that if there are multiple camera target tracking results and multiple radar target tracking results, the above method can be used to associate any camera target tracking result with any radar target tracking result, thereby obtaining the targets tracked by both the camera and the radar. The number of such targets can be one or more. It can also be understood that if the overlap ratio between a camera target tracking result and a projected radar target tracking result is small, the target tracked by the camera target tracking result and the target tracked by the radar target tracking result can be considered not to be the same target. In scenarios with multiple camera target tracking results and multiple radar target tracking results, if the overlap ratio between one camera target tracking result and every projected radar target tracking result is small, that camera target tracking result can be determined to be erroneous, and subsequent tracking of its target can be discontinued. Similarly, if the overlap ratio between one projected radar target tracking result and every camera target tracking result is small, that projected radar target tracking result can be determined to be erroneous, and subsequent tracking of its target can be discontinued.
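
The pairwise association and the handling of unmatched results can be sketched with a matrix of overlap ratios. This is a minimal illustration, assuming the ratios have already been computed and that "small" means below the first value; the function name and the default threshold are hypothetical.

```python
from typing import List, Tuple


def associate(
    ratios: List[List[float]], first_value: float = 0.5
) -> Tuple[List[Tuple[int, int]], List[int], List[int]]:
    """Associate camera results (rows) with projected radar results (columns).

    ratios[i][j] is the overlap ratio between camera result i and projected
    radar result j. Returns the matched pairs plus the indices of camera and
    radar results that matched nothing and may be dropped from tracking.
    """
    matches = [
        (i, j)
        for i, row in enumerate(ratios)
        for j, ratio in enumerate(row)
        if ratio >= first_value
    ]
    matched_cameras = {i for i, _ in matches}
    matched_radars = {j for _, j in matches}
    radar_count = len(ratios[0]) if ratios else 0
    unmatched_cameras = [i for i in range(len(ratios)) if i not in matched_cameras]
    unmatched_radars = [j for j in range(radar_count) if j not in matched_radars]
    return matches, unmatched_cameras, unmatched_radars
```

This sketch does not resolve cases where one result matches several others; those multi-RC and multi-CR cases need the additional position and velocity checks described earlier.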

[0207] S1208: The target tracking device tracks the target based on the associated results.

[0208] For example, one or more targets obtained from the above association can be tracked separately. The specific implementation of target tracking is not limited in the embodiments of this application.

[0209] In possible interpretations, although similar feature-level association methods (e.g., utilizing bounding boxes and height information) are used in the embodiments of this application, the basic framework uses a target-level fusion framework that is more efficient and stable (because tracking is performed at the target granularity), thus achieving higher computational efficiency.

[0210] Furthermore, in this embodiment, the height information of the target is incorporated, which greatly reduces the dependence on the accuracy of the midpoint of the bottom edge of the bounding box and the positional accuracy in the radar point cloud. Even at night, in low light conditions, in complex clutter environments, or on undulating ground, a high target tracking accuracy can still be achieved.

[0211] With reference to Figure 12, in one possible implementation, the radar target tracking results in Figure 12 can come from an imaging radar. Compared with millimeter-wave radar, imaging radar produces more point cloud data. Therefore, when imaging radar is used for target tracking, the target size can be obtained from the larger amount of point cloud data it collects. Then, in S1206, both the target size and the target model can be projected onto the camera coordinate system, where a three-dimensional-like data relationship comprising the visual bounding box, height information, and size can be obtained. S1207 can then be replaced by target association using the visual bounding box, height information, and size. For example, the overlap ratio among the visual bounding box, height information, and size can be calculated jointly, and when the overlap ratio is greater than or equal to a certain value, the results are associated as the same target. In this embodiment, because size is added, more accurate target association can be achieved than with millimeter-wave radar, thereby achieving more accurate target tracking.
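
When size is available, the projected radar result can be treated as a rectangle rather than a line segment, and the overlap ratio becomes an ordinary two-dimensional IoU. The sketch below shows the standard box IoU; treating the projected height and size as an axis-aligned rectangle in the image is an assumption of this sketch, and the function name is hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_width * inter_height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```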

[0212] From the above description of the present application, it can be understood that each of the above-described devices includes corresponding hardware structures and / or software units for performing each function in order to achieve the above-described functions. Those skilled in the art should readily recognize that, based on the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein, the present application can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed in hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0213] As shown in Figure 13, an embodiment of this application provides a target tracking device, which includes a processor 1300, a memory 1301, and a transceiver 1302.

[0214] Processor 1300 is responsible for managing the bus architecture and general processing, while memory 1301 stores data used by processor 1300 during operation. Transceiver 1302 is used to receive and send data under the control of processor 1300 and communicate with memory 1301.

[0215] The bus architecture can include any number of interconnected buses and bridges, specifically linking together various circuits of one or more processors, represented by processor 1300, and of memory, represented by memory 1301. The bus architecture can also link various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. The bus interface provides the interface.

[0216] The processes disclosed in this application can be applied to or implemented by the processor 1300. During implementation, each step of the target tracking process can be completed by integrated logic circuits in the hardware of the processor 1300 or by instructions in software form. The processor 1300 can be a general-purpose processor, digital signal processor, application-specific integrated circuit, field-programmable gate array (FPGA), or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly implemented by the hardware processor, or implemented by a combination of hardware and software modules in the processor. The software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory 1301. The processor 1300 reads the information in memory 1301 and, in conjunction with its hardware, completes the steps of the signal processing process.

[0217] In one optional embodiment of this application, the processor 1300 is configured to read a program from the memory 1301 and execute the method flow of S701-S702 shown in Figure 7, or the method flow of S1201-S1208 shown in Figure 12.

[0218] Figure 14 is a schematic diagram of a chip structure provided in an embodiment of this application. Chip 1400 includes one or more processors 1401 and interface circuitry 1402. Optionally, chip 1400 may further include a bus 1403. Wherein:

[0219] Processor 1401 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed through integrated logic circuits in the hardware of processor 1401 or through software instructions. Processor 1401 may be one or more of a general-purpose processor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, MCU, MPU, CPU, or coprocessor. It can implement or execute the methods and steps disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor.

[0220] The interface circuit 1402 can be used to send or receive data, instructions, or information. The processor 1401 can process the data, instructions, or other information received by the interface circuit 1402, and can send the processed information out through the interface circuit 1402.

[0221] Optionally, the chip may also include memory, which may include read-only memory and random access memory, and provide operation instructions and data to the processor. A portion of the memory may also include non-volatile random access memory (NVRAM).

[0222] Optionally, the memory stores executable software modules or data structures, and the processor can perform corresponding operations by calling the operation instructions stored in the memory (which may be stored in the operating system).

[0223] Optionally, the chip can be used in the target tracking device involved in the embodiments of this application. Optionally, the interface circuit 1402 can be used to output the execution result of the processor 1401. For the target tracking methods provided by one or more embodiments of this application, please refer to the foregoing embodiments, which will not be repeated here.

[0224] It should be noted that the functions of the processor 1401 and the interface circuit 1402 can be implemented through hardware design, software design, or a combination of hardware and software; no restrictions are imposed here.

[0225] As shown in Figure 15, this application provides a target tracking device, which includes a transceiver module 1500 and a processing module 1501.

[0226] The transceiver module 1500 is used to acquire target tracking results from the camera and target tracking results from the radar.

[0227] The processing module 1501 is used to obtain a target tracking result based on the target model corresponding to the camera target tracking result and the radar target tracking result; wherein the target model is used to indicate the association between the target in the radar target tracking result and the height information of the target.

[0228] In one possible implementation, the processing module is further configured to obtain the target's height information based on the target type information in the radar target tracking result; the processing module is further configured to fuse the target's height information and the target in the radar target tracking result to obtain a target model.

[0229] In one possible implementation, there is a predefined or pre-set correspondence between the target's type information and its height information.
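
The lookup and fusion described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration, not the implementation of this application: the table `TYPE_TO_HEIGHT_RANGE`, its numeric values, the `TargetModel` class, and the choice of the midpoint of the range as the representative height are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical predefined correspondence: target type -> height range (metres).
# The numbers are illustrative placeholders, not values from this application.
TYPE_TO_HEIGHT_RANGE: Dict[str, Tuple[float, float]] = {
    "vehicle": (1.4, 2.0),
    "pedestrian": (1.0, 1.8),
    "bicycle": (1.0, 1.6),
    "animal": (0.2, 1.0),
}


@dataclass(frozen=True)
class TargetModel:
    """Radar target fused with height information (hypothetical structure)."""
    x: float            # position in the radar coordinate system (m)
    y: float
    target_type: str
    height: float       # height assigned from the type lookup (m)


def build_target_model(x: float, y: float, target_type: str) -> TargetModel:
    """Look up the height for the target type and fuse it with the position."""
    low, high = TYPE_TO_HEIGHT_RANGE[target_type]
    # Assumption: take the midpoint of the range as the representative height.
    height = (low + high) / 2.0
    return TargetModel(x=x, y=y, target_type=target_type, height=height)


model = build_target_model(12.0, -1.5, "pedestrian")
print(model)
```

A fixed table keeps the lookup trivial; a correspondence obtained from statistics or a learned model could replace the dictionary without changing `build_target_model`'s interface.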

[0230] In one possible implementation, the processing module is specifically used to project the target model onto the camera coordinate system to obtain the projected radar target tracking result; and to obtain the target tracking result based on the camera target tracking result and the projected radar target tracking result.

[0231] In one possible implementation, the processing module is specifically used to transform the target model to the camera coordinate system according to a pre-set or defined height transformation relationship; wherein, different height information corresponds to different height transformation relationships, and the height transformation relationship is used to transform the target tracking results with height in the radar coordinate system to the camera coordinate system.

[0232] In one possible implementation, the height conversion relationship corresponding to the height information differs for different region types.

[0233] In one possible implementation, the region type includes one or more of the following: a region with undulating terrain, a region with slopes, or a region with flat terrain.

[0234] In one possible implementation, the processing module is specifically used to determine the target region type corresponding to the target model; based on the target region type, the target model is transformed into the camera coordinate system according to the target height transformation relationship that matches the height information of the target model.
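
One way to picture the selection of a height transformation relationship by region type and height is a calibration table keyed by both. The sketch below is an assumption-laden illustration: it models each height transformation relationship as a 3x3 matrix mapping a radar-plane point to homogeneous pixel coordinates, and the table `HEIGHT_TRANSFORMS`, its matrices, and the nearest-height matching rule are hypothetical and not taken from this application.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]

# Hypothetical calibration table: (region type, height in m) -> 3x3 matrix that
# maps a radar point (x, y, 1) at that height to homogeneous pixel coordinates.
# All matrices are made-up placeholders.
HEIGHT_TRANSFORMS: Dict[Tuple[str, float], Matrix] = {
    ("flat", 0.0): [[100.0, 0.0, 640.0], [0.0, -20.0, 700.0], [0.0, 0.0, 1.0]],
    ("flat", 1.5): [[100.0, 0.0, 640.0], [0.0, -20.0, 550.0], [0.0, 0.0, 1.0]],
    ("slope", 0.0): [[100.0, 0.0, 640.0], [0.0, -25.0, 680.0], [0.0, 0.0, 1.0]],
    ("slope", 1.5): [[100.0, 0.0, 640.0], [0.0, -25.0, 530.0], [0.0, 0.0, 1.0]],
}


def select_transform(region_type: str, height: float) -> Matrix:
    """Pick the transform for the region type whose height is closest to `height`."""
    candidates = [
        (abs(h - height), m)
        for (r, h), m in HEIGHT_TRANSFORMS.items()
        if r == region_type
    ]
    if not candidates:
        raise KeyError(f"no height transform for region type {region_type!r}")
    return min(candidates, key=lambda c: c[0])[1]


def project(matrix: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Apply the 3x3 transform to (x, y, 1) and normalise to pixel coordinates."""
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in matrix)
    return (u / w, v / w)


# A 1.4 m tall target in a flat region uses the 1.5 m calibration entry.
pixel = project(select_transform("flat", 1.4), 1.0, 10.0)
print(pixel)
```

Projecting the same radar position with the 0.0 m entry and with the matched-height entry would give the bottom and top of the target in the image, which is how a point-like radar detection could be extended vertically before association.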

[0235] In one possible implementation, the processing module is specifically used to determine that the camera target tracking result and the projected radar target tracking result are the same target based on the overlap ratio of the camera target tracking result and the projected radar target tracking result; wherein the overlap ratio is greater than a first value.

[0236] In one possible implementation, the processing module is specifically used to determine that the camera target tracking result and the projected radar target tracking result are the same target when the overlap ratio is greater than a first value and the position and/or velocity of the overlapping target in the camera target tracking result and the projected radar target tracking result meet preset conditions.

[0237] In one possible implementation, the preset conditions include: the difference between the position and/or velocity of the overlapping target in the camera target tracking result and the position and/or velocity of the overlapping target in the radar target tracking result is less than a second value.
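
The association test in the three preceding paragraphs can be sketched as below. The text does not define how the overlap ratio is computed, so intersection-over-union of axis-aligned boxes is used here as an assumption; the function names, the default thresholds, and the use of separate position and speed thresholds in place of the single "second value" are likewise hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (assumed definition)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def is_same_target(
    cam_box: Box,
    radar_box: Box,
    cam_pos: Tuple[float, float],
    radar_pos: Tuple[float, float],
    cam_speed: float,
    radar_speed: float,
    first_value: float = 0.5,        # overlap threshold (placeholder)
    max_position_diff: float = 2.0,  # stands in for the "second value" (m)
    max_speed_diff: float = 1.0,     # stands in for the "second value" (m/s)
) -> bool:
    """Associate the two results only if overlap, position and speed all agree."""
    if overlap_ratio(cam_box, radar_box) <= first_value:
        return False
    position_diff = math.hypot(cam_pos[0] - radar_pos[0], cam_pos[1] - radar_pos[1])
    speed_diff = abs(cam_speed - radar_speed)
    return position_diff < max_position_diff and speed_diff < max_speed_diff


same = is_same_target(
    (0.0, 0.0, 10.0, 10.0), (1.0, 0.0, 10.0, 10.0),
    (10.0, 2.0), (10.5, 2.0), 5.0, 5.4,
)
print(same)
```

Checking the overlap first keeps the test cheap for the common case of unrelated boxes; the position and speed gates then reject pairs that overlap in the image but are clearly different objects.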

[0238] In one possible implementation, the radar target tracking results come from the imaging radar; the target model also includes the target's size information.

[0239] In one possible implementation, the camera target tracking result includes the target bounding box; the radar target tracking result includes the target point cloud.

[0240] In one possible implementation, the functions of the transceiver module 1500 and the processing module 1501 shown in Figure 15 above can be executed by the processor 1300 running the program in the memory 1301, or by the processor 1300 alone.

[0241] As shown in Figure 16, this application provides a vehicle, which includes at least one camera 1601, at least one memory 1602, at least one transceiver 1603, at least one processor 1604, and a radar 1605.

[0242] The camera 1601 is used to acquire images, and the images are used to obtain the target tracking results of the camera.

[0243] The radar 1605 is used to acquire target point clouds, which are then used to obtain radar target tracking results.

[0244] The memory 1602 is used to store one or more programs and data information; wherein the one or more programs include instructions.

[0245] The transceiver 1603 is used for data transmission with communication devices in the vehicle and for data transmission with the cloud.

[0246] The processor 1604 is used to acquire the camera target tracking result and the radar target tracking result, and to obtain a target tracking result based on the target model corresponding to the camera target tracking result and the radar target tracking result; wherein the target model is used to indicate the association between the target in the radar target tracking result and the height information of the target.

[0247] In some possible implementations, various aspects of the target tracking method provided in the embodiments of this application can also be implemented in the form of a program product, which includes program code that, when run on a computer device, causes the computer device to perform the steps in the target tracking method according to various exemplary embodiments of this application as described in this specification.

[0248] The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example—but not limited to—an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of readable storage media (a non-exhaustive list) include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0249] The target tracking program product according to embodiments of this application may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a server device. However, the program product of this application is not limited thereto. In this document, the readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by, or in combination with, a communication transmission, apparatus, or device.

[0250] A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. This propagated data signal may take many forms, including—but not limited to—electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium, capable of sending, propagating, or transmitting a program for use by or in conjunction with a periodic network operating system, apparatus, or device.

[0251] The program code contained on the readable medium may be transmitted using any suitable medium, including—but not limited to—wireless, wired, optical fiber, RF, or any suitable combination thereof.

[0252] Program code for performing the operations of this application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as C or similar languages. The program code can execute entirely on the user's computing device, partially on the user's computing device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device.

[0253] This application also provides a computing-device-readable storage medium for the target tracking method, that is, a medium whose content is not lost after power failure. The storage medium stores a software program including program code. When the software program is read and executed by one or more processors of a computing device, it can implement any of the target tracking schemes described in this application.

[0254] This application also provides an electronic device. When each functional module is divided according to its corresponding functions, the electronic device includes: a processing module for supporting the target tracking device to perform the steps in the above embodiments, such as performing the operations of S701 to S702, or other processes of the technology described in this application.

[0255] All relevant content of each step involved in the above method embodiments can be referenced from the functional description of the corresponding functional module, and will not be repeated here.

[0256] Of course, the target tracking device includes, but is not limited to, the unit modules listed above. Furthermore, the specific functions that the aforementioned functional units can achieve include, but are not limited to, the functions corresponding to the method steps described in the above examples. For detailed descriptions of other units of the electronic device, please refer to the detailed descriptions of their corresponding method steps; these will not be repeated here in the embodiments of this application.

[0257] When using integrated units, the electronic device involved in the above embodiments may include: a processing module, a storage module, and a communication module. The storage module is used to store the electronic device's program code and data. The communication module is used to support communication between the electronic device and other network entities to realize functions such as voice calls, data interaction, and Internet access.

[0258] The processing module is used to control and manage the operation of the electronic device. The processing module can be a processor or a controller. The communication module can be a transceiver, RF circuit, or communication interface, etc. The storage module can be a memory.

[0259] Furthermore, the electronic device may also include an input module and a display module. The display module may be a screen or a monitor. The input module may be a touchscreen, a voice input device, or a fingerprint sensor, etc.

[0260] The present application has been described above with reference to block diagrams and/or flowcharts illustrating methods, apparatus (systems), and/or computer program products according to embodiments of the present application. It should be understood that a block of a block diagram and/or flowchart, as well as combinations of blocks of block diagrams and/or flowcharts, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, and/or other programmable data processing means to produce a machine such that the instructions, executable via the computer processor and/or other programmable data processing means, create methods for implementing the functions/actions specified in the blocks of the block diagrams and/or flowcharts.

[0261] Accordingly, this application can also be implemented using hardware and/or software (including firmware, resident software, microcode, etc.). Furthermore, this application can take the form of a computer program product on a computer-usable or computer-readable storage medium, having computer-usable or computer-readable program code implemented in the medium for use by or in conjunction with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium can be any medium that can contain, store, communicate, transmit, or deliver a program for use by or in conjunction with an instruction execution system, apparatus, or device.

[0262] This application describes several embodiments in detail with reference to multiple flowcharts. However, it should be understood that these flowcharts and their corresponding descriptions are merely illustrative for ease of understanding and should not constitute any limitation on this application. Not every step in each flowchart is necessarily required; for example, some steps can be skipped. Furthermore, the execution order of each step is not fixed and is not limited to what is shown in the figures. The execution order of each step should be determined by its function and internal logic.

[0263] The various embodiments described in this application can be arbitrarily combined or their steps can be executed in an overlapping manner. The execution order of each embodiment and the execution order of the steps in each embodiment are not fixed and are not limited to those shown in the figures. The execution order of each embodiment and the overlapping execution order of the steps in each embodiment should be determined by their functions and internal logic.

[0264] Although this application has been described in conjunction with specific features and embodiments, it is obvious that various modifications and combinations can be made thereto without departing from the spirit and scope of this application. Accordingly, this specification and drawings are merely illustrative descriptions of the application as defined by the appended claims, and are considered to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. Clearly, those skilled in the art can make various alterations and modifications to this application without departing from its scope. Thus, if such alterations and modifications fall within the scope of the claims and their equivalents, this application is also intended to include them.

Claims

1. A target tracking method, characterized in that it comprises:
acquiring a camera target tracking result and a radar target tracking result from an imaging radar, wherein the radar target tracking result includes a target point cloud, target type information, and target position information;
obtaining, based on the target type information, height information of the target from a predefined or preset correspondence between target type information and target height information, wherein the correspondence is obtained based on a Gaussian distribution, statistics, or machine learning, the target type is a vehicle, pedestrian, animal, or bicycle, and different target types correspond to different height ranges;
fusing the height information of the target and the position information of the target to obtain a target model corresponding to the radar target tracking result, wherein the target model is used to expand the range of the target monitored by the radar;
determining the target region type corresponding to the target model, wherein different region types and/or different height information correspond to different height transformation relationships;
determining, among the height transformation relationships corresponding to the target region type, a target height transformation relationship that matches the height information of the target model and is associated with the distance and angle between the target and the origin of the camera coordinate system;
transforming the target model to the camera coordinate system according to the target height transformation relationship to obtain a projected radar target tracking result; and
obtaining the target tracking result based on the camera target tracking result and the projected radar target tracking result.

2. The method according to claim 1, characterized in that the region type includes one or more of the following: undulating terrain, sloping terrain, or flat terrain.

3. The method according to claim 1, characterized in that obtaining the target tracking result based on the camera target tracking result and the projected radar target tracking result includes: determining, based on the overlap ratio between the camera target tracking result and the projected radar target tracking result, that the camera target tracking result and the projected radar target tracking result are the same target, wherein the overlap ratio is greater than a first value.

4. The method according to claim 3, characterized in that determining, based on the overlap ratio between the camera target tracking result and the projected radar target tracking result, that the camera target tracking result and the projected radar target tracking result are the same target includes: when the overlap ratio is greater than the first value, and the position and/or velocity of the overlapping target in the camera target tracking result and in the projected radar target tracking result meet preset conditions, determining that the camera target tracking result and the projected radar target tracking result are the same target.

5. The method according to claim 4, characterized in that the preset conditions include: the difference between the position and/or velocity of the overlapping target in the camera target tracking result and the position and/or velocity of the overlapping target in the radar target tracking result is less than a second value.

6. The method according to claim 1, characterized in that the target model also includes size information of the target.

7. The method according to any one of claims 1, 3-5, characterized in that the camera target tracking result includes a target bounding box.

8. A target tracking device, characterized in that it comprises:
a communication unit, configured to acquire a camera target tracking result and a radar target tracking result from an imaging radar, wherein the radar target tracking result includes a target point cloud, target type information, and target position information; and
a processing unit, configured to:
obtain, based on the target type information, height information of the target from a predefined or preset correspondence between target type information and target height information, wherein the correspondence is obtained based on a Gaussian distribution, statistics, or machine learning, the target type is a vehicle, pedestrian, animal, or bicycle, and different target types correspond to different height ranges;
fuse the height information of the target and the position information of the target to obtain a target model corresponding to the radar target tracking result, wherein the target model is used to expand the range of the target monitored by the radar;
determine the target region type corresponding to the target model, wherein different region types and/or different height information correspond to different height transformation relationships;
determine, among the height transformation relationships corresponding to the target region type, a target height transformation relationship that matches the height information of the target model and is associated with the distance and angle between the target and the origin of the camera coordinate system;
transform the target model to the camera coordinate system according to the target height transformation relationship to obtain a projected radar target tracking result; and
obtain the target tracking result based on the camera target tracking result and the projected radar target tracking result.

9. The apparatus according to claim 8, characterized in that the region type includes one or more of the following: undulating terrain, sloping terrain, or flat terrain.

10. The apparatus according to claim 8, characterized in that the processing unit is specifically configured to determine that the camera target tracking result and the projected radar target tracking result are the same target based on the overlap ratio between the camera target tracking result and the projected radar target tracking result, wherein the overlap ratio is greater than a first value.

11. The apparatus according to claim 10, characterized in that the processing unit is specifically configured to determine that the camera target tracking result and the projected radar target tracking result are the same target when the overlap ratio is greater than the first value and the position and/or velocity of the overlapping target in the camera target tracking result and in the projected radar target tracking result meet preset conditions.

12. The apparatus according to claim 11, characterized in that the preset conditions include: the difference between the position and/or velocity of the overlapping target in the camera target tracking result and the position and/or velocity of the overlapping target in the radar target tracking result is less than a second value.

13. The apparatus according to claim 8, wherein the target model further includes the size information of the target.

14. The apparatus according to any one of claims 8, 10-12, characterized in that the camera target tracking result includes a target bounding box.

15. A target tracking device, characterized in that it comprises: at least one processor configured to invoke a program in memory to perform the method according to any one of claims 1-7.

16. A target tracking device, characterized in that it comprises: at least one processor and an interface circuit, the interface circuit being configured to provide information input and/or information output to the at least one processor, and the at least one processor being configured to perform the method according to any one of claims 1-7.

17. A chip, characterized in that it comprises at least one processor and an interface; the interface is used to provide program instructions or data to the at least one processor; and the at least one processor is used to execute the program instructions to implement the method according to any one of claims 1-7.

18. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed, cause a computer to perform the method according to any one of claims 1-7.