Water surface target tracking method and device based on multiple cameras, computer device and storage medium

By establishing a world coordinate system on an unmanned vessel, the target detection results from multiple cameras are transformed and merged with appearance features, thus solving the stability problem of target tracking in multi-camera systems and achieving stable tracking and path planning support for multiple targets.

CN115760930B · Active · Publication Date: 2026-06-19 · ORCA-TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ORCA-TECH
Filing Date: 2022-12-21
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Traditional target tracking algorithms are mainly based on a single camera and cannot effectively solve the target tracking problem in multi-camera systems. In particular, when cameras with different focal lengths and fields of view are used, processing complexity is high and erroneous tracking is likely to occur.

Method used

A multi-camera water surface target tracking method is adopted. By establishing a world coordinate system, the target detection results of multiple cameras are transformed, merged and the appearance features are extracted, and then input into a multi-target tracker for iterative tracking to achieve stable tracking of multi-camera targets.

Benefits of technology

It achieves stable target tracking in multi-camera systems, avoids erroneous tracking, keeps the target ID unchanged, and provides excellent path planning support.

Generated by Eureka AI based on patent content.


Abstract

This invention relates to a method, apparatus, computer device, and storage medium for tracking water surface targets based on multiple cameras. The method includes: acquiring image data from multiple cameras to obtain target detection results and image features from the multiple cameras; converting the target detection results into detection results in a world coordinate system; merging the detection results in the world coordinate system to obtain a fused position result, and extracting target appearance features from the image features; inputting the fused position result and target appearance features into a multi-target tracker, iterating the tracker parameters to obtain the multi-target tracker result; and outputting the multi-camera target tracking result in real time at the next moment. This invention achieves multi-target tracking results from multiple cameras. Furthermore, through appearance feature correlation, it enables stable tracking even when the detected target is occluded. The provided tracking results have good usability for subsequent path planning by unmanned surface vessels.

Description

Technical Field

[0001] This invention relates to the field of water surface target tracking technology, and in particular to a multi-camera-based water surface target tracking method, device, computer equipment, and storage medium.

Background Technology

[0002] In recent years, unmanned surface vessels (USVs) have been widely used in scientific research and surface operations, and they are gradually developing towards intelligence. With significant breakthroughs in deep learning technology in image processing, USVs, equipped with cameras, can intelligently perceive their surrounding aquatic environment. Multiple cameras can be used to perform 360-degree omnidirectional detection of surrounding vessels and obstacles, acquiring detection results, which is crucial for the safe navigation of USVs. In USV target detection systems, in addition to acquiring the target, a target tracking algorithm is needed to obtain the target ID. The target position and target ID are then provided to the planning module for decision-making and control.

[0003] Traditional target tracking algorithms are usually based on a single camera and often cannot solve the problem of target tracking across multiple cameras. In addition, the multiple cameras carried by unmanned surface vessels (USVs) may have different characteristics, such as different focal lengths and fields of view, so USV target tracking algorithms require more complex processing.

Summary of the Invention

[0004] The purpose of this invention is to overcome the shortcomings of the prior art and provide a method, apparatus, computer equipment and storage medium for tracking water surface targets based on multiple cameras.

[0005] To solve the above-mentioned technical problems, the present invention adopts the following technical solution:

[0006] In a first aspect, this embodiment provides a water surface target tracking method based on multiple cameras, including the following steps:

[0007] Acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0008] Convert the target detection results into detection results in the world coordinate system;

[0009] The detection results of the world coordinate system are merged to obtain the fused position result, and the target appearance features are extracted from the image features based on the fused position result;

[0010] The fused location results and target appearance features are input into the multi-target tracker, and the tracker parameters are iterated to obtain the multi-target tracker results;

[0011] Based on the results of the multi-target tracker, the multi-camera target tracking results are output in real time at the next moment.

[0012] The further technical solution is as follows: acquiring image data from multiple cameras to obtain target detection results and image features from multiple cameras includes the following steps:

[0013] A world coordinate system is established with the location of the unmanned ship at the moment of power-up as the origin and due north and due east as the positive directions;

[0014] At a set time, acquire image data from n cameras;

[0015] Image data from n cameras are input into the target detection model to obtain target detection results and image features.

[0016] The further technical solution is as follows: the conversion of the target detection result into the detection result in the world coordinate system includes the following steps:

[0017] The extrinsic and intrinsic parameter matrices are obtained in advance through calibration.

[0018] A coordinate system for the unmanned vessel is established with its position at a set time as the origin and its orientation and rightward direction as the positive directions.

[0019] Based on the extrinsic and intrinsic parameter matrices, the target detection results are transformed into the unmanned vessel coordinate system;

[0020] The target detection results are transferred from the unmanned vessel coordinate system to the world coordinate system to obtain the detection results in the world coordinate system.

[0021] The further technical solution is as follows: merging the detection results in the world coordinate system to obtain a fused position result, and extracting target appearance features from image features based on the fused position result, includes the following steps:

[0022] The detection results from the world coordinate system are merged to obtain the fused position result;

[0023] Based on the index of the fusion location result, obtain the image region corresponding to the original target detection result;

[0024] Based on the image features, the corresponding image region is cropped out as an image set feature, and the image set feature is then extracted by dimensionality reduction to obtain the corresponding target appearance feature.

[0025] In a second aspect, this embodiment provides a water surface target tracking device based on multiple cameras, including: an acquisition unit, a conversion unit, a merging and extraction unit, an input iteration unit, and an output unit;

[0026] The acquisition unit is used to acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0027] The conversion unit is used to convert the target detection results into detection results in the world coordinate system;

[0028] The merging and extraction unit is used to merge the detection results of the world coordinate system to obtain the fused position result, and extract the target appearance features from the image features based on the fused position result;

[0029] The input iteration unit is used to input the fused position result and target appearance feature into the multi-target tracker, iterate the tracker parameters, and obtain the multi-target tracker result.

[0030] The output unit is used to output the multi-camera target tracking results in real time at the next moment based on the results of the multi-target tracker.

[0031] The further technical solution is as follows: the acquisition unit includes: a first establishment module, a first acquisition module, and an input module;

[0032] The first establishment module is used to establish a world coordinate system with the location of the unmanned ship at the time of power-on as the origin and the due north and due east directions as positive directions;

[0033] The first acquisition module is used to acquire image data from n cameras at a set time.

[0034] The input module is used to input image data from n cameras into the target detection model to obtain target detection results and image features.

[0035] The further technical solution is as follows: the conversion unit includes: a calibration module, a second establishment module, a conversion module, and a transfer module;

[0036] The calibration module is used to obtain the extrinsic parameter matrix and intrinsic parameter matrix in advance through calibration;

[0037] The second establishment module is used to establish an unmanned vessel coordinate system with the unmanned vessel's position at a set time as the origin and the unmanned vessel's orientation direction and right-hand direction as the positive direction.

[0038] The conversion module is used to convert the target detection results into the unmanned vessel coordinate system based on the extrinsic and intrinsic parameter matrices.

[0039] The transfer module is used to transfer the target detection results from the unmanned vessel coordinate system to the world coordinate system to obtain the detection results in the world coordinate system.

[0040] The further technical solution is as follows: the merging and extraction unit includes: a merging module, a second acquisition module, and a cropping and dimensionality reduction extraction module;

[0041] The merging module is used to merge the detection results of the world coordinate system to obtain the fused position result;

[0042] The second acquisition module is used to acquire the image region corresponding to the original target detection result based on the index of the fusion location result;

[0043] The cropping and dimensionality reduction extraction module is used to crop out the image set features of the corresponding image region based on the image features, and then perform dimensionality reduction extraction on the image set features to obtain the corresponding target appearance features.

[0044] In a third aspect, this embodiment provides a computer device, which includes a memory and a processor. The memory stores a computer program, and when the processor executes the computer program, it implements the water surface target tracking method based on multiple cameras as described above.

[0045] In a fourth aspect, this embodiment provides a storage medium storing a computer program, the computer program including program instructions which, when executed by a processor, implement the multi-camera-based water surface target tracking method described above.

[0046] The advantages of this invention compared to existing technologies are as follows: by using multiple cameras mounted on an unmanned surface vessel to perform 360-degree omnidirectional target detection on the water surface, the target detection results are obtained, enabling multi-target tracking results for multiple cameras. Furthermore, when a large target spans multiple cameras, multiple detection results can be fused to avoid erroneous tracking of multiple targets. In addition, by associating appearance features, the detected target can still be stably tracked even when it is occluded, and the target ID does not change. The tracking results provided have good usability for subsequent path planning of the unmanned surface vessel.

[0047] The present invention will be further described below with reference to the accompanying drawings and specific embodiments.

Description of the Drawings

[0048] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0049] Figure 1 is the first flowchart of the multi-camera-based water surface target tracking method provided in an embodiment of the present invention;

[0050] Figure 2 is the second flowchart of the multi-camera-based water surface target tracking method provided in an embodiment of the present invention;

[0051] Figure 3 is a schematic diagram of an application scenario of the multi-camera-based water surface target tracking method provided in an embodiment of the present invention;

[0052] Figure 4 is the third flowchart of the multi-camera-based water surface target tracking method provided in an embodiment of the present invention;

[0053] Figure 5 is the fourth flowchart of the multi-camera-based water surface target tracking method provided in an embodiment of the present invention;

[0054] Figure 6 is a block diagram of the sampling multilayer perceptron provided in an embodiment of the present invention;

[0055] Figure 7 is a schematic diagram of the tracker optimization iteration process provided in an embodiment of the present invention;

[0056] Figure 8 is the first schematic block diagram of the multi-camera-based water surface target tracking device provided in an embodiment of the present invention;

[0057] Figure 9 is the second schematic block diagram of the multi-camera-based water surface target tracking device provided in an embodiment of the present invention;

[0058] Figure 10 is the third schematic block diagram of the multi-camera-based water surface target tracking device provided in an embodiment of the present invention;

[0059] Figure 11 is the fourth schematic block diagram of the multi-camera-based water surface target tracking device provided in an embodiment of the present invention;

[0060] Figure 12 is a schematic block diagram of a computer device provided in an embodiment of the present invention.

Detailed Implementation

[0061] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0062] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

[0063] It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms unless the context clearly indicates otherwise.

[0064] It should also be further understood that the term "and / or" as used in this specification and the appended claims refers to any combination of one or more of the associated listed items and all possible combinations, and includes such combinations.

[0065] Referring to the specific embodiment shown in Figure 1, the present invention discloses a water surface target tracking method based on multiple cameras, comprising the following steps:

[0066] S1, acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0067] In one embodiment, as shown in Figures 2 to 3, acquiring image data from multiple cameras to obtain target detection results and image features from multiple cameras includes the following steps:

[0068] S11, with the position of the unmanned ship at the time of power-on as the origin, and the due north and due east directions as positive directions, establish a world coordinate system;

[0069] Specifically, the position of the unmanned ship at the time of power-on is taken as the origin, due north is taken as the positive direction of one axis, and due east is taken as the positive direction of the other axis, and the world coordinate system is established accordingly.

[0070] S12, at a set time, acquire image data from n cameras;

[0071] Specifically, at the set time t, the unmanned surface vessel acquires image data from n cameras. Acquiring the n images at the same moment can be achieved either by hardware-synchronized triggering or by matching adjacent frames in software.
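
The software matching option can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the representation of each camera's frames as a sorted list of timestamps in seconds, and the tolerance `max_skew` are all assumptions introduced for illustration.

```python
from bisect import bisect_left
from typing import List, Optional


def nearest_index(timestamps: List[float], t: float) -> int:
    """Return the index of the timestamp closest to t (list sorted ascending)."""
    pos = bisect_left(timestamps, t)
    if pos == 0:
        return 0
    if pos == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[pos - 1], timestamps[pos]
    return pos if (after - t) < (t - before) else pos - 1


def match_frames(per_camera_ts: List[List[float]], t: float,
                 max_skew: float = 0.02) -> Optional[List[int]]:
    """Pick, for each camera, the frame nearest to time t.

    Returns one frame index per camera, or None when any camera has no
    frame within max_skew seconds of t (max_skew is a hypothetical tolerance).
    """
    picked = []
    for ts in per_camera_ts:
        if not ts:
            return None
        i = nearest_index(ts, t)
        if abs(ts[i] - t) > max_skew:
            return None
        picked.append(i)
    return picked
```

Rejecting the whole set when one camera is too far out of step keeps the later fusion from mixing detections taken at noticeably different vessel poses; a real system might instead drop only the late camera.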

[0072] S13, input the image data from n cameras into the target detection model to obtain the target detection results and image features.

[0073] Specifically, the camera image data are input into the target detection model to obtain the target detection results. The image features F are obtained from the backbone feature extraction network of the target detection model.

[0074] Furthermore, when processing image data from the camera, each image can be sequentially input into the object detection model to obtain a single image result, or all images can be simultaneously input into a large object detection model to obtain all object detection results at the same time.

[0075] The target detection results, which include the category and the position of each detected bounding box, undergo tracking preprocessing. First, each bounding box is normalized to obtain the new bounding box position (x, y, a, b):

[0076] $x = \frac{x_1 + x_2}{2w}$;

[0077] $y = \frac{y_1 + y_2}{2h}$;

[0078] $a = \frac{x_2 - x_1}{w}$;

[0079] $b = \frac{y_2 - y_1}{h}$;

[0080] where $(x_1, y_1)$ and $(x_2, y_2)$ are the top-left and bottom-right coordinates of the target detection box, respectively, and w and h are the width and height of the image. x, y, a, and b are the normalized x-axis center, y-axis center, x-axis width, and y-axis height, respectively.
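
The normalization can be written as a short sketch. It follows the definitions of x, y, a, and b given in the text (box center and box size divided by the image width and height); the function name is illustrative only.

```python
from typing import Tuple


def normalize_box(x1: float, y1: float, x2: float, y2: float,
                  w: float, h: float) -> Tuple[float, float, float, float]:
    """Convert a corner-format box to normalized (x, y, a, b).

    (x1, y1) is the top-left corner, (x2, y2) the bottom-right corner,
    and (w, h) the image width and height in pixels.
    """
    x = (x1 + x2) / (2.0 * w)  # normalized x-axis center
    y = (y1 + y2) / (2.0 * h)  # normalized y-axis center
    a = (x2 - x1) / w          # normalized width
    b = (y2 - y1) / h          # normalized height
    return x, y, a, b
```

Because every value is divided by the image size, boxes from cameras with different resolutions end up on the same 0 to 1 scale before further processing.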

[0081] S2, convert the target detection results into detection results in the world coordinate system;

[0082] In one embodiment, as shown in Figure 4, converting the target detection results into detection results in the world coordinate system includes the following steps:

[0083] S21, the extrinsic and intrinsic parameter matrices are obtained in advance through calibration;

[0084] Specifically, the extrinsic parameter matrix and the intrinsic parameter matrix of the unmanned surface vessel's cameras are obtained in advance through calibration.

[0085] Specifically, the unmanned vessel is placed indoors and kept stationary, calibration checkerboards are placed around it, and checkerboard data are collected with the cameras. The extrinsic and intrinsic parameter matrices of the unmanned vessel's cameras are then obtained with existing conventional intrinsic and extrinsic calibration algorithms.

[0086] S22, establish the unmanned vessel coordinate system with the unmanned vessel's position at the set time as the origin and the unmanned vessel's orientation and right-hand direction as the positive direction;

[0087] Specifically, because the images acquired by the unmanned surface vessel (USV) are expressed relative to the USV's own coordinate system, the results from all cameras must be unified into the world coordinate system to facilitate subsequent data integration and fusion. First, a USV coordinate system is established: the USV's position at the set time is taken as the origin, the direction in which the USV is facing is taken as the positive direction of one axis, and the right-hand side of the USV is taken as the positive direction of the other axis.

[0088] S23, based on the extrinsic and intrinsic parameter matrices, transform the target detection results into the unmanned vessel coordinate system;

[0089] Specifically, the target detection results are transformed into the unmanned surface vessel coordinate system:

[0090] ;

[0091] where the result denotes the i-th target detection box in the coordinate system of the unmanned vessel.

[0092] S24, transfer the target detection results from the unmanned vessel coordinate system to the world coordinate system to obtain the detection results in the world coordinate system.

[0093] Specifically, to transfer the target detection results from the unmanned surface vessel's coordinate system to the world coordinate system, the unmanned surface vessel acquires its GPS position and yaw angle at the set time. The target detection results are rotated in the direction opposite to the yaw angle and then position-compensated to obtain the target detection results in the world coordinate system:

[0094] .
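
The rotate-then-compensate step can be sketched as follows. The conventions here are assumptions and are not stated in the source: the yaw angle is taken as the heading measured clockwise from due north, in radians; the vessel frame is (forward, right); the world frame is (north, east); and the GPS position is assumed to have already been converted to metres relative to the world origin. With a different sign convention for yaw, the sine terms change sign.

```python
import math
from typing import Tuple


def usv_to_world(forward: float, right: float,
                 usv_north: float, usv_east: float,
                 yaw_rad: float) -> Tuple[float, float]:
    """Map a point from the vessel frame (forward, right) to the world frame (north, east).

    Assumes yaw_rad is the heading measured clockwise from due north and that
    (usv_north, usv_east) is the vessel position relative to the world origin.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Rotate by the heading, then add the vessel position (position compensation).
    north = forward * c - right * s + usv_north
    east = forward * s + right * c + usv_east
    return north, east
```

For example, under these assumptions a target 10 m dead ahead of a vessel heading due east ends up 10 m east of the vessel, and a target on its right-hand side ends up to the south.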

[0095] S3, merge the detection results in the world coordinate system to obtain the fused position result, and extract the target appearance features from the image features based on the fused position result;

[0096] In one embodiment, as shown in Figure 5, merging the detection results in the world coordinate system to obtain a fused position result, and extracting target appearance features from image features based on the fused position result, includes the following steps:

[0097] S31, merge the detection results of the world coordinate system to obtain the fused position result;

[0098] Specifically, the detection results in the world coordinate system are merged to obtain the fused position result D. The merging rule is to calculate the intersection over union (IOU) of the position of one target box with the positions of the other n-1 target boxes in turn. If the IOU is greater than 0.2, the two boxes are considered to overlap and are treated as one target; otherwise, they are treated as two targets.

[0099] Furthermore, strategies can be designed based on the target bounding box category, such as fusing only targets of the same category, for example, fusing only detected manned cruise ship targets, or selecting multiple target categories to form a large category for IOU fusion.
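
A minimal sketch of the IOU-based merging rule follows. The 0.2 threshold comes from the text; everything else is an assumption made for illustration: boxes are axis-aligned and given as (x, y, a, b) with a center and a size, merging is done greedily in input order, and the fused box is taken as the enclosing rectangle of the merged boxes (the source does not say how the fused position is computed). The indices of the original detections are kept so that the corresponding image regions can be looked up later.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, a, b): center x, center y, width, height


def corners(box: Box) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of a center-format box."""
    x, y, a, b = box
    return x - a / 2, y - b / 2, x + a / 2, y + b / 2


def iou(p: Box, q: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    px1, py1, px2, py2 = corners(p)
    qx1, qy1, qx2, qy2 = corners(q)
    iw = max(0.0, min(px2, qx2) - max(px1, qx1))
    ih = max(0.0, min(py2, qy2) - max(py1, qy1))
    inter = iw * ih
    union = p[2] * p[3] + q[2] * q[3] - inter
    return inter / union if union > 0 else 0.0


def merge_detections(boxes: List[Box],
                     threshold: float = 0.2) -> List[Tuple[Box, List[int]]]:
    """Greedily merge boxes whose IOU exceeds the threshold.

    Returns (fused_box, source_indices) pairs. Using the enclosing rectangle
    as the fused box is an assumption of this sketch.
    """
    fused: List[Tuple[Box, List[int]]] = []
    for i, box in enumerate(boxes):
        for k, (fbox, members) in enumerate(fused):
            if iou(box, fbox) > threshold:
                x1, y1, x2, y2 = corners(fbox)
                bx1, by1, bx2, by2 = corners(box)
                nx1, ny1 = min(x1, bx1), min(y1, by1)
                nx2, ny2 = max(x2, bx2), max(y2, by2)
                merged = ((nx1 + nx2) / 2, (ny1 + ny2) / 2, nx2 - nx1, ny2 - ny1)
                fused[k] = (merged, members + [i])
                break
        else:
            # No existing fused target overlaps enough: start a new one.
            fused.append((box, [i]))
    return fused
```

A category-based strategy such as the one described above could be added by skipping the IOU test whenever two detections belong to categories that should not be fused.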

[0100] S32, based on the index of the fusion location result, obtain the image region corresponding to the original target detection result;

[0101] Specifically, based on the index j of the fused position result D, the image region corresponding to the original target detection result is obtained.

[0102] S33, based on the image features, crop out the image set features of the corresponding image region, and perform dimensionality reduction extraction on the image set features to obtain the corresponding target appearance features.

[0103] Specifically, the image set features of the corresponding image region are cropped from the image features F, and dimensionality reduction extraction is then performed on the image set features to obtain the corresponding target appearance features;

[0104] .

[0105] Further, as shown in Figure 6, the dimensionality reduction extraction network is a sampling multilayer perceptron, which obtains reduced-dimensional image appearance features through 3x3 convolutional layers, 4x upsampling layers, and activation layers. Applying this to every index j, up to the number of target bounding boxes after fusion, yields all target appearance features.
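
The cropping part of step S33 can be illustrated with a small sketch. It assumes the feature map is stored as nested lists indexed `[channel][row][column]` and that the image region is given as a normalized box (x, y, a, b). The reduction shown here is plain per-channel averaging, used only as a stand-in; it is not the sampling multilayer perceptron described above, whose weights and layer sizes are not given in the source.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def cell_span(lo: float, hi: float, size: int) -> Tuple[int, int]:
    """Map a normalized interval [lo, hi] to a non-empty index range on a grid axis."""
    start = min(size - 1, max(0, int(lo * size)))
    stop = min(size, max(start + 1, int(round(hi * size))))
    return start, stop


def crop_features(fmap: FeatureMap, x: float, y: float,
                  a: float, b: float) -> FeatureMap:
    """Crop the part of a feature map covered by a normalized box (x, y, a, b)."""
    rows, cols = len(fmap[0]), len(fmap[0][0])
    c0, c1 = cell_span(x - a / 2, x + a / 2, cols)
    r0, r1 = cell_span(y - b / 2, y + b / 2, rows)
    return [[row[c0:c1] for row in channel[r0:r1]] for channel in fmap]


def reduce_features(crop: FeatureMap) -> List[float]:
    """Stand-in reduction: average each channel of the crop to a single value."""
    out = []
    for channel in crop:
        values = [v for row in channel for v in row]
        out.append(sum(values) / len(values))
    return out
```

The result is one fixed-length vector per fused target regardless of the box size, which is the property the tracker needs in order to compare appearance between frames.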

[0106] S4, input the fused position results and target appearance features into the multi-target tracker, iterate the tracker parameters, and obtain the multi-target tracker results.

[0107] Specifically, as shown in Figure 7, the tracker state space X is first constructed. In the world coordinate system, it includes the target bounding box's x-coordinate, y-coordinate, horizontal width a, and vertical length b, together with the horizontal velocity $\dot{x}$, the vertical velocity $\dot{y}$, the rate of change of the horizontal width $\dot{a}$, and the rate of change of the vertical length $\dot{b}$:

[0108] $X = [x, y, a, b, \dot{x}, \dot{y}, \dot{a}, \dot{b}]^{T}$;

[0109] The state transition matrix is constructed based on the unmanned vessel dynamics model and is then used to predict the state space at the next moment:

[0110] ;

[0111] Specifically, the unmanned vessel's dynamics model is a common Newtonian mechanics model, where the new position = original position + velocity * time difference. The state transition matrix represents the state transition from time t to time t+1. dt is the time difference from time t to time t+1.

[0112] According to the state space at time t and the state transition matrix, the state space at time t+1 is predicted:

[0113] ;
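
The prediction step can be sketched from the rule stated above (new position = original position + velocity * time difference). The sketch assumes an eight-element state ordered as the four box quantities followed by their four rates of change, so that the transition matrix has the block form [[I, dt·I], [0, I]]. The function names are illustrative, and the covariance and noise handling of the full tracker is not shown.

```python
from typing import List


def transition_matrix(dt: float, dim: int = 4) -> List[List[float]]:
    """Build the (2*dim) x (2*dim) constant-velocity matrix [[I, dt*I], [0, I]]."""
    n = 2 * dim
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(dim):
        m[i][i + dim] = dt  # each quantity advances by its own rate times dt
    return m


def predict(state: List[float], dt: float) -> List[float]:
    """Propagate the state one step: quantities += rates * dt, rates unchanged."""
    n = len(state)
    m = transition_matrix(dt, n // 2)
    return [sum(m[i][j] * state[j] for j in range(n)) for i in range(n)]
```

With dt equal to the interval between two processed frames, calling `predict` on each track gives the expected box position against which the new detections are matched.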

[0114] Further, the covariance matrix L of the state space is calculated, and then the optimal estimate of the state space at time t+1 is calculated:

[0115] ;

[0116] Here, the noise estimation matrix defaults to normally distributed noise.

[0117] The position matching result is obtained by calculating the Euclidean distance between the optimal estimate of the state space at time t+1 and the state space X (Euclidean distance is a commonly used calculation and is not elaborated here);

[0118] The position matching result and the appearance feature matching result are combined by a weighted sum to obtain the multi-target tracker result:

[0119] ;

[0120] where the weighting coefficients are hyperparameters, and the multi-target tracker result includes the position of each tracked target bounding box and the category of each tracked target at the corresponding moment.
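
The weighted combination of position and appearance matching can be sketched as follows. Only the Euclidean distance for position and the weighted sum are taken from the text. The cosine distance for appearance, the weights `alpha` and `beta`, the gating value `max_cost`, and the greedy assignment are assumptions chosen to keep the example short; a production tracker might use an optimal assignment method instead.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two positions."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))


def cosine_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """1 - cosine similarity; an assumed appearance metric, not from the source."""
    dot = sum(u * v for u, v in zip(p, q))
    norm = math.sqrt(sum(u * u for u in p)) * math.sqrt(sum(v * v for v in q))
    return 1.0 - dot / norm if norm > 0 else 1.0


def cost_matrix(track_pos, track_feat, det_pos, det_feat,
                alpha: float = 0.5, beta: float = 0.5) -> List[List[float]]:
    """Weighted sum of position cost and appearance cost for every track/detection pair."""
    return [[alpha * euclidean(tp, dp) + beta * cosine_distance(tf, df)
             for dp, df in zip(det_pos, det_feat)]
            for tp, tf in zip(track_pos, track_feat)]


def greedy_match(cost: List[List[float]], max_cost: float) -> List[Tuple[int, int]]:
    """Pair tracks with detections, cheapest pairs first, each used at most once."""
    pairs = sorted((c, i, j) for i, row in enumerate(cost) for j, c in enumerate(row))
    used_tracks, used_dets, matches = set(), set(), []
    for c, i, j in pairs:
        if c <= max_cost and i not in used_tracks and j not in used_dets:
            used_tracks.add(i)
            used_dets.add(j)
            matches.append((i, j))
    return matches
```

A matched detection inherits the ID of its track, which is how the ID can stay unchanged across frames; the appearance term helps when the position term alone is ambiguous, for instance after a brief occlusion.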

[0121] S5, based on the results of the multi-target tracker, output the multi-camera target tracking results in real time at the next moment.

[0122] Specifically, in the next moment, the unmanned vessel will output the multi-camera target tracking results in real time through the above steps S1-S4.

[0123] Specifically, the unmanned surface vessel (USV) obtains pre-trained target detection model weights from a deep learning cloud server and loads them onto the USV. The USV then acquires raw image data from multiple cameras, raw GPS location data, and raw IMU data in real time from sensors. This data is then pre-processed and parsed to obtain image data, GPS data, and yaw angle data from the multiple cameras. Further, the pre-processed sensor data is used for real-time inference. Through the steps S1-S4 described above, the USV outputs the multi-camera target tracking result R in real time.

[0124] This invention achieves target detection results by using multiple cameras mounted on an unmanned surface vessel to perform 360-degree omnidirectional target detection on the water surface. It enables multi-target tracking results from multiple cameras, and when a large target spans multiple cameras, it can fuse multiple detection results to avoid erroneous tracking of multiple targets. In addition, by associating appearance features, it can still stably track the detected target even when it is occluded, and the tracked target ID does not change. The tracking results provided have good usability for the subsequent path planning of the unmanned surface vessel.

[0125] As shown in Figure 8, the present invention also discloses a water surface target tracking device based on multiple cameras, including: an acquisition unit 10, a conversion unit 20, a merging and extraction unit 30, an input iteration unit 40, and an output unit 50;

[0126] The acquisition unit 10 is used to acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0127] The conversion unit 20 is used to convert the target detection result into a detection result in the world coordinate system;

[0128] The merging and extraction unit 30 is used to merge the detection results of the world coordinate system to obtain the fused position result, and extract the target appearance features from the image features based on the fused position result.

[0129] The input iteration unit 40 is used to input the fused position result and target appearance features into the multi-target tracker, iterate the tracker parameters, and obtain the multi-target tracker result.

[0130] The output unit 50 is used to output the multi-camera target tracking results in real time at the next moment based on the results of the multi-target tracker.

[0131] In one embodiment, as shown in Figure 9, the acquisition unit 10 includes: a first establishment module 11, a first acquisition module 12, and an input module 13;

[0132] The first establishment module 11 is used to establish a world coordinate system with the position of the unmanned ship at the time of power-on as the origin and the due north and due east directions as positive directions;

[0133] The first acquisition module 12 is used to acquire image data from n cameras at a set time.

[0134] The input module 13 is used to input image data from n cameras into the target detection model to obtain target detection results and image features.

[0135] In one embodiment, as shown in Figure 10, the conversion unit 20 includes: a calibration module 21, a second establishment module 22, a conversion module 23, and a transfer module 24;

[0136] The calibration module 21 is used to obtain the extrinsic parameter matrix and intrinsic parameter matrix in advance through calibration;

[0137] The second establishment module 22 is used to establish an unmanned vessel coordinate system with the unmanned vessel's position at a set time as the origin and the unmanned vessel's orientation direction and right-hand direction as the positive direction;

[0138] The conversion module 23 is used to convert the target detection results into the unmanned vessel coordinate system based on the external parameter matrix and the internal parameter matrix.

[0139] The transfer module 24 is used to transfer the target detection result from the unmanned vessel coordinate system to the world coordinate system to obtain the detection result in the world coordinate system.

[0140] In one embodiment, as shown in Figure 11, the merging and extraction unit 30 includes: a merging module 31, a second acquisition module 32, and a cropping and dimensionality reduction extraction module 33;

[0141] The merging module 31 is used to merge the detection results of the world coordinate system to obtain the fused position result;

[0142] The second acquisition module 32 is used to acquire the image region corresponding to the original target detection result based on the index of the fusion position result;

[0143] The cropping and dimensionality reduction extraction module 33 is used to crop out the image set features of the corresponding image region according to the image features, and perform dimensionality reduction extraction on the image set features to obtain the corresponding target appearance features.

[0144] It should be noted that those skilled in the art can clearly understand that the specific implementation process of the above-mentioned multi-camera water surface target tracking device and each unit can be referred to the corresponding description in the foregoing method embodiments. For the sake of convenience and brevity, it will not be repeated here.

[0145] The aforementioned multi-camera-based water surface target tracking device can be implemented in the form of a computer program, which can run on a computer device such as the one shown in Figure 12.

[0146] Referring to Figure 12, which is a schematic block diagram of a computer device 500 provided in an embodiment of this application: the computer device 500 can be a terminal or a server, wherein the terminal can be an electronic device with communication functions such as a smartphone, tablet computer, laptop computer, desktop computer, personal digital assistant, or wearable device. The server can be a standalone server or a server cluster composed of multiple servers.

[0147] Referring to Figure 12, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected via a system bus 501. The memory may include a non-volatile storage medium 503 and internal memory 504.

[0148] The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a multi-camera-based water surface target tracking method.

[0149] The processor 502 provides computing and control capabilities to support the operation of the entire computer device 500.

[0150] The internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, the processor 502 can execute a water surface target tracking method based on multiple cameras.

[0151] The network interface 505 is used for network communication with other devices. Those skilled in the art will understand that the structure shown in Figure 12 is merely a block diagram of the portion of the structure related to the present application and does not constitute a limitation on the computer device 500 to which the present application is applied. A specific computer device 500 may include more or fewer components than those shown in the figure, or combine certain components, or have a different component arrangement.

[0152] The processor 502 is used to run a computer program 5032 stored in the memory to perform the following steps:

[0153] Step S1: Acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0154] Step S2: Convert the target detection results into detection results in the world coordinate system;

[0155] Step S3: Merge the detection results of the world coordinate system to obtain the fused position result, and extract the target appearance features from the image features based on the fused position result;

[0156] Step S4: Input the fused position result and target appearance features into the multi-target tracker, iterate the tracker parameters, and obtain the multi-target tracker result;

[0157] Step S5: Based on the results of the multi-target tracker, output the multi-camera target tracking results in real time at the next moment.
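The data flow of steps S1 to S5 can be sketched as a single function that is called once per set of synchronized frames. The Python sketch below is only an outline under stated assumptions: the detector, coordinate conversion, merging, and appearance extraction are passed in as callables, and the tracker is a toy position-only nearest-neighbour matcher standing in for the multi-target tracker, which in the source also uses appearance features. All names are hypothetical.

```python
import math


class NearestNeighbourTracker:
    """Toy stand-in for the multi-target tracker (position-only matching)."""

    def __init__(self, gate=5.0):
        self.gate = gate      # maximum distance in metres for a match
        self.tracks = {}      # track id -> last fused position
        self.next_id = 1

    def update(self, positions, appearances):
        unmatched = dict(self.tracks)
        results = []
        for pos, feat in zip(positions, appearances):
            best = min(unmatched, key=lambda t: math.dist(unmatched[t], pos), default=None)
            if best is not None and math.dist(unmatched[best], pos) <= self.gate:
                track_id = best          # keep the existing ID
                del unmatched[best]
            else:
                track_id = self.next_id  # start a new track
                self.next_id += 1
            self.tracks[track_id] = pos
            results.append({"id": track_id, "position": pos, "appearance": feat})
        return results


def tracking_step(frames, detect, to_world, merge, extract_appearance, tracker):
    # S1: run the detector on every camera frame.
    detections, features = [], []
    for cam_id, frame in enumerate(frames):
        boxes, feat = detect(frame)
        features.append(feat)
        detections.extend((cam_id, box) for box in boxes)
    # S2: convert every detection to the world coordinate system.
    world_points = [to_world(cam_id, box) for cam_id, box in detections]
    # S3: merge duplicates and extract one appearance feature per fused target.
    fused = merge(world_points)
    appearances = [extract_appearance(features, detections, p) for p in fused]
    # S4: update the tracker state with the fused observations.
    tracks = tracker.update(fused, appearances)
    # S5: the returned tracks are the output for this time step.
    return tracks


# Toy stubs: each "frame" already lists world-frame points for its camera.
def detect(frame):
    return frame, None

def to_world(cam_id, box):
    return box

def merge(points):
    return sorted({(round(n), round(e)) for n, e in points})

def extract_appearance(features, detections, fused_point):
    return None

tracker = NearestNeighbourTracker()
first = tracking_step([[(10.0, 5.0)], [(10.2, 5.1), (40.0, -3.0)]],
                      detect, to_world, merge, extract_appearance, tracker)
second = tracking_step([[(10.6, 5.2)], [(40.4, -2.8)]],
                       detect, to_world, merge, extract_appearance, tracker)
```

In this toy run the two cameras report three detections at the first time step, which merge into two targets; at the second time step both targets keep their IDs even though each is now seen by a different single camera.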

[0158] It should be understood that in the embodiments of this application, the processor 502 may be a central processing unit (CPU), or it may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor.

[0159] It will be understood by those skilled in the art that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program includes program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the process steps of the embodiments of the above methods.

[0160] Therefore, the present invention also provides a storage medium. This storage medium can be a computer-readable storage medium. The storage medium stores a computer program, wherein the computer program includes program instructions that, when executed by a processor, can implement the above-described multi-camera-based water surface target tracking method. The program instructions include the following steps:

[0161] Step S1: Acquire image data from multiple cameras to obtain target detection results and image features from multiple cameras;

[0162] Step S2: Convert the target detection results into detection results in the world coordinate system;

[0163] Step S3: Merge the detection results of the world coordinate system to obtain the fused position result, and extract the target appearance features from the image features based on the fused position result;

[0164] Step S4: Input the fused position result and target appearance features into the multi-target tracker, iterate the tracker parameters, and obtain the multi-target tracker result;

[0165] Step S5: Based on the results of the multi-target tracker, output the multi-camera target tracking results in real time at the next moment.

[0166] The storage medium can be any computer-readable storage medium capable of storing program code, such as a USB flash drive, portable hard drive, read-only memory (ROM), magnetic disk, or optical disk.

[0167] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0168] In the several embodiments provided by this invention, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For example, the division of each unit is merely a logical functional division, and there may be other division methods in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.

[0169] The steps in the method of this invention can be reordered, merged, or omitted according to actual needs. The units in the device of this invention can be merged, divided, or omitted according to actual needs. Furthermore, the functional units in the various embodiments of this invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0170] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a terminal, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention.

[0171] The above embodiments are preferred implementations of the present invention. In addition, the present invention can be implemented in other ways. Any obvious substitutions without departing from the concept of the present technical solution are within the protection scope of the present invention.

Claims

1. A water surface target tracking method based on multiple cameras, characterized in that it comprises the following steps:

acquiring image data from multiple cameras to obtain target detection results and image features from the multiple cameras;

converting the target detection results into detection results in a world coordinate system;

merging the detection results in the world coordinate system to obtain a fused position result, and extracting target appearance features from the image features based on the fused position result;

inputting the fused position result and the target appearance features into a multi-target tracker, and iterating the tracker parameters to obtain a multi-target tracker result;

outputting, based on the multi-target tracker result, the multi-camera target tracking result at the next moment in real time;

wherein acquiring image data from multiple cameras to obtain target detection results and image features from the multiple cameras comprises the following steps:

establishing the world coordinate system with the position of the unmanned vessel at power-on as the origin and due north and due east as the positive directions; specifically, the position of the unmanned vessel at power-on is taken as the origin, due north is taken as the positive direction of one axis, and due east is taken as the positive direction of the other axis;

acquiring image data from n cameras at a set time; specifically, at the set time the unmanned vessel acquires image data from the n cameras, wherein the n images are acquired at the same time either by hardware-synchronized triggering or by software matching of adjacent-frame image data;

inputting the image data from the n cameras into a target detection model to obtain the target detection results and the image features; specifically, the camera image data are input into the target detection model to obtain the target detection results, and the image features F are obtained from the backbone feature extraction network of the target detection model; when the camera image data are processed, either each image is input into the target detection model in turn to obtain a single-image result, or all images are input simultaneously into a large target detection model to obtain all target detection results at once;

performing tracking preprocessing on the target detection results, which include the category and the position of each detection box; first, the detection box is normalized to obtain the new detection box position (x, y, a, b): x = (x1 + x2) / (2w); y = (y1 + y2) / (2h); a = (x2 - x1) / w; b = (y2 - y1) / h; where (x1, y1) and (x2, y2) are the coordinates of the top-left and bottom-right corners of the target detection box, respectively; w and h are the width and height of the image; and x, y, a, b are the normalized horizontal center, vertical center, width, and height, respectively;

wherein converting the target detection results into detection results in the world coordinate system comprises the following steps:

obtaining the extrinsic parameter matrix and the intrinsic parameter matrix in advance through calibration; specifically, the unmanned vessel is placed indoors and kept stationary, calibration checkerboards are placed around the unmanned vessel, checkerboard data are collected with the cameras, and the extrinsic and intrinsic parameter matrices of the unmanned vessel's cameras are obtained through intrinsic and extrinsic parameter calibration algorithms;

establishing an unmanned vessel coordinate system with the position of the unmanned vessel at the set time as the origin and its heading direction and right-hand direction as the positive directions; specifically, the images acquired by the unmanned vessel are expressed relative to the unmanned vessel coordinate system, and the results from all cameras need to be unified into the world coordinate system to facilitate subsequent data integration and fusion; the unmanned vessel coordinate system is first established by taking the position of the unmanned vessel at the set time as the origin, the heading direction of the unmanned vessel as the positive direction of one axis, and the right-hand side of the unmanned vessel as the positive direction of the other axis;

converting the target detection results into the unmanned vessel coordinate system based on the extrinsic and intrinsic parameter matrices [formula not reproduced], the converted result being the i-th target detection box in the unmanned vessel coordinate system;

transferring the target detection results from the unmanned vessel coordinate system to the world coordinate system to obtain the detection results in the world coordinate system; specifically, the unmanned vessel acquires its GPS position and yaw angle data at the set time, and the target detection results are rotated in the direction opposite to the yaw angle and then position-compensated to obtain the target detection results in the world coordinate system [formula not reproduced].

2. The water surface target tracking method based on multiple cameras according to claim 1, characterized in that merging the detection results in the world coordinate system to obtain a fused position result, and extracting target appearance features from the image features based on the fused position result, comprises the following steps:

merging the detection results in the world coordinate system to obtain the fused position result;

obtaining, based on the index of the fused position result, the image region corresponding to the original target detection result;

cropping, from the image features, the image set features of the corresponding image region, and performing dimensionality reduction extraction on the image set features to obtain the corresponding target appearance features.

3. A water surface target tracking device based on multiple cameras, characterized in that it comprises: an acquisition unit, a conversion unit, a merging and extraction unit, an input iteration unit, and an output unit;

the acquisition unit is used to acquire image data from multiple cameras to obtain target detection results and image features from the multiple cameras;

the conversion unit is used to convert the target detection results into detection results in a world coordinate system;

the merging and extraction unit is used to merge the detection results in the world coordinate system to obtain a fused position result, and to extract target appearance features from the image features based on the fused position result;

the input iteration unit is used to input the fused position result and the target appearance features into a multi-target tracker and to iterate the tracker parameters to obtain a multi-target tracker result;

the output unit is used to output, based on the multi-target tracker result, the multi-camera target tracking result at the next moment in real time;

the acquisition unit includes: a first establishment module, a first acquisition module, and an input module;

the first establishment module is used to establish the world coordinate system with the position of the unmanned vessel at power-on as the origin and due north and due east as the positive directions; specifically, the position of the unmanned vessel at power-on is taken as the origin, due north is taken as the positive direction of one axis, and due east is taken as the positive direction of the other axis;

the first acquisition module is used to acquire image data from n cameras at a set time; specifically, at the set time the unmanned vessel acquires image data from the n cameras, wherein the n images are acquired at the same time either by hardware-synchronized triggering or by software matching of adjacent-frame image data;

the input module is used to input the image data from the n cameras into a target detection model to obtain the target detection results and the image features; specifically, the camera image data are input into the target detection model to obtain the target detection results, and the image features F are obtained from the backbone feature extraction network of the target detection model; when the camera image data are processed, either each image is input into the target detection model in turn to obtain a single-image result, or all images are input simultaneously into a large target detection model to obtain all target detection results at once;

the target detection results, which include the category and the position of each detection box, undergo tracking preprocessing; first, the detection box is normalized to obtain the new detection box position (x, y, a, b): x = (x1 + x2) / (2w); y = (y1 + y2) / (2h); a = (x2 - x1) / w; b = (y2 - y1) / h; where (x1, y1) and (x2, y2) are the coordinates of the top-left and bottom-right corners of the target detection box, respectively; w and h are the width and height of the image; and x, y, a, b are the normalized horizontal center, vertical center, width, and height, respectively;

the merging and extraction unit includes: a calibration module, a second establishment module, a conversion module, and a transfer module;

the calibration module is used to obtain the extrinsic parameter matrix and the intrinsic parameter matrix in advance through calibration; specifically, the unmanned vessel is placed indoors and kept stationary, calibration checkerboards are placed around the unmanned vessel, checkerboard data are collected with the cameras, and the extrinsic and intrinsic parameter matrices of the unmanned vessel's cameras are obtained through intrinsic and extrinsic parameter calibration algorithms;

the second establishment module is used to establish an unmanned vessel coordinate system with the position of the unmanned vessel at the set time as the origin and its heading direction and right-hand direction as the positive directions; specifically, the images acquired by the unmanned vessel are expressed relative to the unmanned vessel coordinate system, and the results from all cameras need to be unified into the world coordinate system to facilitate subsequent data integration and fusion; the unmanned vessel coordinate system is first established by taking the position of the unmanned vessel at the set time as the origin, the heading direction of the unmanned vessel as the positive direction of one axis, and the right-hand side of the unmanned vessel as the positive direction of the other axis;

the conversion module is used to convert the target detection results into the unmanned vessel coordinate system based on the extrinsic and intrinsic parameter matrices [formula not reproduced], the converted result being the i-th target detection box in the unmanned vessel coordinate system;

the transfer module is used to transfer the target detection results from the unmanned vessel coordinate system to the world coordinate system to obtain the detection results in the world coordinate system; specifically, the unmanned vessel acquires its GPS position and yaw angle data at the set time, and the target detection results are rotated in the direction opposite to the yaw angle and then position-compensated to obtain the target detection results in the world coordinate system [formula not reproduced].

4. The water surface target tracking device based on multiple cameras according to claim 3, characterized in that the input iteration unit includes: a merging module, a second acquisition module, and a cropping and dimensionality reduction extraction module;

the merging module is used to merge the detection results in the world coordinate system to obtain the fused position result;

the second acquisition module is used to acquire the image region corresponding to the original target detection result based on the index of the fused position result;

the cropping and dimensionality reduction extraction module is used to crop, from the image features, the image set features of the corresponding image region, and to perform dimensionality reduction extraction on the image set features to obtain the corresponding target appearance features.

5. A computer device, characterized in that the computer device includes a memory and a processor, wherein the memory stores a computer program, and when the processor executes the computer program, it implements the water surface target tracking method based on multiple cameras as described in any one of claims 1-2.

6. A storage medium, characterized in that the storage medium stores a computer program, which includes program instructions that, when executed by a processor, can implement the water surface target tracking method based on multiple cameras as described in any one of claims 1-2.

Citation Information

Patent Citations

  • Pedestrian movement speed intelligent sensing method based on video stream

    CN112598709A

  • Target tracking method, device and equipment and storage medium

    CN113256691A