A method of ship identification and tracking

By converting the pixel position of the ship's center point into latitude and longitude information in the camera image and comparing it with the AIS system, the problem of associating ships in the video with AIS data is solved, achieving efficient ship identification and tracking, and improving identification efficiency and accuracy.

CN122200560A · Pending · Publication Date: 2026-06-12 · SICHUAN DATACELL-BORUI SCI & TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SICHUAN DATACELL-BORUI SCI & TECH CO LTD
Filing Date
2024-10-16
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing technologies cannot effectively correlate ships detected in videos with AIS data, limiting their value in practical applications.

Method used

By analyzing the pixel position of the ship's center point in the camera image and converting it into latitude and longitude information in the real world, and comparing it with the data in the AIS system, the mapping between the target in the image and the real ship is realized. The ship is detected using the YOLOv7 model, and the Gaussian plane and geodetic coordinates of the ship are calculated by forward and inverse Gaussian transform, and then associated with the ship information in the AIS system.

🎯Benefits of technology

It improves the efficiency and accuracy of ship identification and tracking, realizes video content enhancement, can quickly adapt to training and identify ship targets in real time, shortens training time, and improves the model's learning efficiency and information accuracy.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122200560A_ABST
Patent Text Reader

Abstract

The application relates to a method for automatically identifying and tracking a ship by using a camera, which comprises the following steps: model training evaluation, inputting a video frame picture as a parameter into a model, obtaining a position of a ship in an image detected by the model, and calculating a pixel coordinate of a center point of the ship according to a length and a width of the detected ship position; determining a space coordinate corresponding to a pixel point on the image according to a focal length, a space coordinate of a camera center and an angle, and then obtaining a coordinate in an actual position Gaussian plane; and the like. Based on the space coordinate of the camera, the geodetic coordinate and the Gaussian conversion method, the application can quickly identify the ship in a single camera video, and improves the ship identification efficiency.

Description

Technical Field

[0001] This invention application is a divisional application of the invention patent application filed on October 16, 2024, with application number 202411443002.8 and titled "A method for automatically identifying and tracking ships using a camera".

[0002] This invention relates to the fields of artificial intelligence and geographic information processing, specifically to automatic identification and tracking technology for vessels in the Yangtze River waterway, and belongs to the field of vessel monitoring.

Background Technology

[0003] Automatic Identification System (AIS) is a new type of navigation aid system used for maritime safety and communication between ships and shore, and between ships themselves. It typically consists of a communication controller connected to VHF radios, GPS locators, shipboard displays, and sensors, enabling automatic exchange of important information such as ship position, heading, and name. While transmitting this information outwards, the AIS installed on a ship also receives information from other vessels within its VHF coverage area, thus achieving automatic response.

[0004] Maritime vessel detection and identification mainly consists of two steps: detection and identification. Detection refers to locating all vessels and their bounding boxes in an image at sea. Identification mainly involves determining the type and other characteristics of each vessel (such as length, width, name, MMSI, etc.).

[0005] Object tracking algorithms are a class of computer vision algorithms used to track objects or targets in video sequences. Their main objective is to continuously determine and predict the position, motion, and other attributes of a target at different points in time by analyzing and processing pixel-level data across consecutive frames of a video. They are mainly divided into two categories: generative methods and discriminative methods.

[0006] 1. Generative algorithms model a given target region in the initial frame and then search for the part most similar to the model in subsequent frames as the predicted target location. These methods include Kalman filtering, particle filtering, and mean-shift, but their tracking accuracy is relatively low.

[0007] 2. Discriminative algorithms treat target tracking as a target detection task within each frame. They train a classifier using image features of the target, treating target regions as positive samples and background regions as negative samples. The trained classifier is then used in subsequent frames to find the optimal solution. Discriminative methods continuously update the classifier using the tracking results from each frame during the tracking process.

[0008] The existing technologies described above cannot effectively correlate vessels detected in videos with AIS data, limiting their value in practical applications. In the real world, there is an urgent need for a method to link detected vessel objects in video footage with AIS information, thereby providing video content enhancement capabilities.

Summary of the Invention

[0009] The purpose of this invention is to develop an automatic ship identification and tracking method that combines target detection and an AIS system. By analyzing the pixel position of the ship's center point in the camera image and converting it into real-world latitude and longitude information, and comparing it with data from the AIS system, the mapping between the target in the image and the real ship can be achieved.

[0010] To achieve the above objectives, the present invention proposes the following technical solution, the steps of which include:

[0011] Step 1: Filter and label the data to form a dataset, and divide the dataset into training set, validation set, and test set;

[0012] Step 2: Use the training set and validation set to train the YOLOv7 model and obtain the ship detection model and weight file based on YOLOv7.

[0013] Step 3: Input the test set into the model for detection and output the detection results;

[0014] Step 4: Model evaluation; if the evaluation result is unsatisfactory, repeat steps 1-3; if the result is satisfactory, proceed to the next step.

[0015] Step 5: Input the video frame as a parameter into the model to obtain the position of the ship in the image detected by the model, and calculate the pixel coordinates of the ship's center point based on the length and width of the detected ship position;

[0016] Step 6: Convert the geodetic coordinates of the camera's location into Gaussian plane coordinates using a Gaussian forward transform. Here, the camera's geographical location refers to the location of the camera's center.

[0017] Step 7: Determine the spatial coordinates of the pixels on the image based on the focal length, the spatial coordinates of the camera center, and the angle, and then obtain the coordinates of the actual position in the Gaussian plane;

[0018] Step 8: Obtain the corresponding geodetic coordinates from the Gaussian coordinates of the actual location obtained in Step 7 through inverse Gaussian transformation;

[0019] Step 9: From the AIS system, obtain the latitude and longitude of all ships within a 10-meter range of the position obtained in Step 7. Compare them one by one to find the ship whose position is closest to the latitude and longitude calculated in Step 8, and draw that ship's information onto the video frame in code.

[0020] Step 10: Repeat steps 5, 7, 8, and 9 to achieve the function of tracking ship targets in the frame sequence.
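The matching in Step 9 can be sketched as a nearest-neighbour search over AIS reports. This is a minimal illustration rather than the patent's implementation: the `AisReport` record, the `match_ais` function, and the use of the haversine distance on a spherical Earth of radius 6,371,000 m are assumptions; only the 10-meter radius comes from the text.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

EARTH_RADIUS_M = 6_371_000.0  # assumed mean spherical Earth radius


@dataclass
class AisReport:
    """Hypothetical AIS record; field names are illustrative."""
    mmsi: str
    name: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def match_ais(lat: float, lon: float, reports: Iterable[AisReport],
              radius_m: float = 10.0) -> Optional[AisReport]:
    """Return the closest report within radius_m of (lat, lon), or None."""
    best, best_d = None, radius_m
    for report in reports:
        d = haversine_m(lat, lon, report.lat, report.lon)
        if d <= best_d:
            best, best_d = report, d
    return best
```

Returning `None` when no report lies within the radius leaves it to the caller to decide whether to draw the detection without AIS information.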

[0021] Further, in step 1, the ship video frames are selected and labeled, and divided into training, validation, test, and evaluation sets in a 10:2:2:1 ratio. Training starts with one-third of the training set. If the evaluation after training does not meet the expected target, another one-third of the training set is added and training is repeated. If the evaluation still does not meet the expected target, the entire training set is used, with the data from the evaluation set mixed in and trained together. This process continues until the model meets the expected evaluation results.
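The split and the staged use of the training set can be sketched as follows. The function names `split_dataset` and `training_stages`, the fixed shuffle seed, and the choice to give rounding leftovers to the last set are assumptions made for illustration; the 10:2:2:1 ratio and the one-third increments come from the paragraph above.

```python
import random
from typing import Dict, List, Sequence

# Ratio from the text: training : validation : test : evaluation
RATIO = {"train": 10, "val": 2, "test": 2, "eval": 1}


def split_dataset(items: Sequence, seed: int = 0) -> Dict[str, List]:
    """Shuffle items and split them 10:2:2:1; leftovers go to the last set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(RATIO.values())
    n = len(shuffled)
    names = list(RATIO)
    splits, start = {}, 0
    for i, name in enumerate(names):
        end = n if i == len(names) - 1 else start + n * RATIO[name] // total
        splits[name] = shuffled[start:end]
        start = end
    return splits


def training_stages(splits: Dict[str, List]) -> List[List]:
    """Data used at each stage: 1/3, then 2/3, then all training data plus the evaluation set."""
    train = splits["train"]
    third = len(train) // 3
    return [train[:third], train[:2 * third], train + splits["eval"]]
```

A training loop would move to the next stage only when the evaluation result for the current stage misses its target.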

[0022] Furthermore, step 2 obtains the coordinates of the target's distance from the top left corner of the screen on the horizontal axis (x1) and vertical axis (y1), as well as the target's horizontal span (x2) and vertical span (y2). Therefore, the horizontal coordinate of the target's center point is x = x1 + x2 / 2, and the vertical coordinate is y = y1 + y2 / 2.
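As a small sketch of the center-point formula above: `bbox_center` takes the top-left offset (x1, y1) and the spans (x2, y2) exactly as the paragraph defines them. The helper `xyxy_to_xywh` is a hypothetical convenience for detectors that report two corner points instead; it is not part of the described method.

```python
from typing import Tuple


def bbox_center(x1: float, y1: float, x2: float, y2: float) -> Tuple[float, float]:
    """Center of a box given its top-left corner (x1, y1) and spans (x2, y2)."""
    return x1 + x2 / 2, y1 + y2 / 2


def xyxy_to_xywh(left: float, top: float, right: float, bottom: float
                 ) -> Tuple[float, float, float, float]:
    """Convert corner coordinates to (top-left x, top-left y, width, height)."""
    return left, top, right - left, bottom - top


# A 40 x 20 pixel box whose top-left corner is at (100, 50)
center = bbox_center(100, 50, 40, 20)
```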

[0023] Furthermore, step 5 is usually represented by a three-dimensional coordinate system, which indicates the position of the camera's optical center in three-dimensional space.

[0024] First, the latitude and longitude coordinates, altitude, and pitch angle of the camera need to be obtained. These parameters are measured when the camera is installed and then saved to the database.

[0025] Secondly, using the Gaussian projection forward calculation formula, given the geodetic coordinates (B, L) and the longitude of the central meridian L0, the Gaussian plane coordinates (x, y) are calculated as follows:

[0026]

[0027]

[0028] Where B is the latitude and l = L - L0, in radians; N is the radius of curvature in the prime vertical; t = tan(B); η² = e′² cos²(B), where e′ is the second eccentricity; a is the semi-major axis of the ellipsoid of revolution; b is the semi-minor axis; and X is the meridian arc length.
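The equations of paragraphs [0026] and [0027] are not reproduced in this text. For reference, the widely used series form of the Gauss-Krüger forward projection is given below, using the conventional symbols N for the radius of curvature in the prime vertical and η² = e′² cos² B. It is an assumption that the omitted equations take this form and are truncated at the same order.

```latex
\begin{aligned}
x &= X + \frac{N}{2}\,t\cos^{2}B\,l^{2}
     + \frac{N}{24}\,t\,(5 - t^{2} + 9\eta^{2} + 4\eta^{4})\cos^{4}B\,l^{4}
     + \frac{N}{720}\,t\,(61 - 58t^{2} + t^{4})\cos^{6}B\,l^{6} \\
y &= N\cos B\,l
     + \frac{N}{6}\,(1 - t^{2} + \eta^{2})\cos^{3}B\,l^{3}
     + \frac{N}{120}\,(5 - 18t^{2} + t^{4} + 14\eta^{2} - 58\eta^{2}t^{2})\cos^{5}B\,l^{5} \\
N &= \frac{a}{\sqrt{1 - e^{2}\sin^{2}B}}, \qquad
e^{2} = \frac{a^{2} - b^{2}}{a^{2}}, \qquad
e'^{2} = \frac{a^{2} - b^{2}}{b^{2}}, \qquad
\eta^{2} = e'^{2}\cos^{2}B
\end{aligned}
```

Here x is the northing measured along the central meridian and y is the easting measured from it, matching the (x, y) convention used in the text.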

[0029] Finally, by substituting the camera parameters into the formula, we can obtain the Gaussian plane coordinates of the camera, and by adding the altitude, we can construct its three-dimensional spatial coordinates.

[0030] Further, step 7 calculates the coordinates of the Gaussian plane.

[0031] Assume the camera's focal length is f and that the spatial coordinates of the camera's center point have been determined as described above.

[0032] For each pixel (u, v) in the image, assuming that u and v in the camera coordinate system represent the horizontal and vertical directions of the pixel respectively, the vector from the pixel to the optical center can be represented as (u, v, f).

[0033] Step 71: Calculate the rotation matrix R

[0034] Given the camera's attitude (pitch, yaw, and roll), the camera's orientation can be described by a rotation matrix R, which is constructed through the following steps:

[0035] Calculate the rotation matrix Rx about the x-axis (roll), where roll represents the roll angle:

[0036]

[0037] Calculate the rotation matrix Ry about the y-axis (pitch), where pitch represents the pitch angle:

[0038]

[0039] Calculate the rotation matrix Rz about the z-axis (yaw), where yaw represents the yaw angle:

[0040]

[0041] Multiplying the three rotation matrices together yields the final rotation matrix R:

[0042]
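The matrices of paragraphs [0036] to [0042] are not reproduced in this text. The standard elemental rotation matrices about the stated axes are given below for reference. The multiplication order R = R_z R_y R_x is an assumption, since the text only states that the three matrices are multiplied together.

```latex
R_x(\mathrm{roll}) =
\begin{pmatrix}
1 & 0 & 0 \\
0 & \cos(\mathrm{roll}) & -\sin(\mathrm{roll}) \\
0 & \sin(\mathrm{roll}) & \cos(\mathrm{roll})
\end{pmatrix},
\quad
R_y(\mathrm{pitch}) =
\begin{pmatrix}
\cos(\mathrm{pitch}) & 0 & \sin(\mathrm{pitch}) \\
0 & 1 & 0 \\
-\sin(\mathrm{pitch}) & 0 & \cos(\mathrm{pitch})
\end{pmatrix},
\quad
R_z(\mathrm{yaw}) =
\begin{pmatrix}
\cos(\mathrm{yaw}) & -\sin(\mathrm{yaw}) & 0 \\
\sin(\mathrm{yaw}) & \cos(\mathrm{yaw}) & 0 \\
0 & 0 & 1
\end{pmatrix}

R = R_z(\mathrm{yaw})\, R_y(\mathrm{pitch})\, R_x(\mathrm{roll})
```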

[0043] Step 72: Calculate the spatial coordinates of the pixel.

[0044] By multiplying the vector (u, v, f) of a pixel in the image by the rotation matrix R and then adding the spatial coordinates of the camera center, the spatial coordinates of the pixel are obtained.

[0045] The specific steps are as follows:

[0046] 1) For a pixel (u,v), construct a vector (u,v,f).

[0047] 2) Multiply the vector (u,v,f) by the rotation matrix R to obtain the vector in the camera coordinate system.

[0048] 3) Add the spatial coordinates of the camera center to the vector in the camera coordinate system to obtain the world coordinates (X, Y, Z) of the pixel.

[0049] 4) The height information remains unchanged, and the Gaussian plane coordinates of the pixel are (X, Y).
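The four steps above can be sketched in plain Python as follows. The sketch follows the listed steps literally: it rotates (u, v, f) and adds the camera center, and it assumes that (u, v, f) are already expressed in the same length unit as the plane coordinates. The rotation order R = Rz · Ry · Rx and all function names are assumptions for illustration; the text does not describe pixel-to-meter scaling or a ray-to-water-surface intersection, so neither is included.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def rot_x(roll: float) -> Matrix:
    c, s = math.cos(roll), math.sin(roll)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(pitch: float) -> Matrix:
    c, s = math.cos(pitch), math.sin(pitch)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(yaw: float) -> Matrix:
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(roll: float, pitch: float, yaw: float) -> Matrix:
    """Combined rotation, assuming the order R = Rz * Ry * Rx (angles in radians)."""
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))


def pixel_to_world(u: float, v: float, f: float, R: Matrix,
                   camera_center: Sequence[float]) -> Tuple[float, float, float]:
    """Steps 1-3: rotate the vector (u, v, f) by R, then add the camera center."""
    vec = (u, v, f)
    rotated = [sum(R[i][k] * vec[k] for k in range(3)) for i in range(3)]
    return (rotated[0] + camera_center[0],
            rotated[1] + camera_center[1],
            rotated[2] + camera_center[2])


def gaussian_plane_xy(world: Sequence[float]) -> Tuple[float, float]:
    """Step 4: keep (X, Y) as the Gaussian plane coordinates."""
    return world[0], world[1]
```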

[0050] Furthermore, in step 8, the Gaussian coordinates of the actual location obtained in step 7 are transformed by an inverse Gaussian transform to obtain their corresponding geodetic coordinates. The calculation method is as follows:

[0051] Given the Gaussian plane coordinates (x, y) and the specified central meridian longitude L0, calculate the geodetic coordinates (B, L):

[0052]

[0053]

[0054] Here, the latitude of the base point (the footpoint latitude) is calculated inversely from the arc length of the meridian.
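The equations of paragraphs [0052] and [0053] are not reproduced in this text. For reference, the commonly used series form of the Gauss-Krüger inverse projection is given below. The subscript f marks quantities evaluated at the footpoint latitude B_f, which satisfies X(B_f) = x; the symbols M_f, N_f, t_f and η_f are conventional notation, and it is an assumption that the omitted equations take this form.

```latex
\begin{aligned}
B &= B_f - \frac{t_f}{2 M_f N_f}\,y^{2}
     + \frac{t_f}{24 M_f N_f^{3}}\,(5 + 3t_f^{2} + \eta_f^{2} - 9\eta_f^{2}t_f^{2})\,y^{4}
     - \frac{t_f}{720 M_f N_f^{5}}\,(61 + 90t_f^{2} + 45t_f^{4})\,y^{6} \\
l &= \frac{y}{N_f\cos B_f}
     - \frac{1 + 2t_f^{2} + \eta_f^{2}}{6 N_f^{3}\cos B_f}\,y^{3}
     + \frac{5 + 28t_f^{2} + 24t_f^{4} + 6\eta_f^{2} + 8\eta_f^{2}t_f^{2}}{120 N_f^{5}\cos B_f}\,y^{5} \\
L &= L_0 + l, \qquad
M_f = \frac{a(1 - e^{2})}{(1 - e^{2}\sin^{2}B_f)^{3/2}}, \qquad
N_f = \frac{a}{\sqrt{1 - e^{2}\sin^{2}B_f}}, \qquad
t_f = \tan B_f, \qquad
\eta_f^{2} = e'^{2}\cos^{2}B_f
\end{aligned}
```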

[0055] By substituting the Gaussian plane coordinates obtained in step 6 into the formula, we can obtain the geodetic coordinates, which are latitude and longitude information.
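A minimal round-trip sketch of the forward and inverse Gaussian transforms is shown below. It assumes the WGS84 ellipsoid, the standard Gauss-Krüger series truncated at sixth order, Helmert's series for the meridian arc length, and no false easting or zone number; the text does not state which ellipsoid or truncation the method uses, and all function names are illustrative.

```python
import math

A = 6378137.0                      # WGS84 semi-major axis a (assumed ellipsoid)
F = 1 / 298.257223563              # WGS84 flattening
B_AX = A * (1 - F)                 # semi-minor axis b
E2 = (A**2 - B_AX**2) / A**2       # first eccentricity squared
EP2 = (A**2 - B_AX**2) / B_AX**2   # second eccentricity squared
N3 = F / (2 - F)                   # third flattening n


def meridian_arc(B: float) -> float:
    """Meridian arc length X from the equator to latitude B (Helmert series)."""
    n = N3
    h0 = 1 + n**2 / 4 + n**4 / 64
    h2 = -1.5 * n + 3 * n**3 / 16
    h4 = 15 * n**2 / 16 - 15 * n**4 / 64
    h6 = -35 * n**3 / 48
    h8 = 315 * n**4 / 512
    return A / (1 + n) * (h0 * B + h2 * math.sin(2 * B) + h4 * math.sin(4 * B)
                          + h6 * math.sin(6 * B) + h8 * math.sin(8 * B))


def gauss_forward(B: float, L: float, L0: float):
    """Geodetic (B, L) in radians -> Gaussian plane (x north, y east) in meters."""
    l = L - L0
    s, c, t = math.sin(B), math.cos(B), math.tan(B)
    N = A / math.sqrt(1 - E2 * s * s)
    eta2 = EP2 * c * c
    x = (meridian_arc(B) + N / 2 * t * c**2 * l**2
         + N / 24 * t * (5 - t**2 + 9 * eta2 + 4 * eta2**2) * c**4 * l**4
         + N / 720 * t * (61 - 58 * t**2 + t**4) * c**6 * l**6)
    y = (N * c * l + N / 6 * (1 - t**2 + eta2) * c**3 * l**3
         + N / 120 * (5 - 18 * t**2 + t**4 + 14 * eta2 - 58 * eta2 * t**2) * c**5 * l**5)
    return x, y


def footpoint_latitude(x: float) -> float:
    """Latitude whose meridian arc length equals x, found by Newton iteration."""
    Bf = x / (A / (1 + N3) * (1 + N3**2 / 4 + N3**4 / 64))  # initial guess
    for _ in range(20):
        s = math.sin(Bf)
        M = A * (1 - E2) / (1 - E2 * s * s) ** 1.5  # dX/dB
        Bf += (x - meridian_arc(Bf)) / M
    return Bf


def gauss_inverse(x: float, y: float, L0: float):
    """Gaussian plane (x, y) in meters -> geodetic (B, L) in radians."""
    Bf = footpoint_latitude(x)
    s, c, t = math.sin(Bf), math.cos(Bf), math.tan(Bf)
    N = A / math.sqrt(1 - E2 * s * s)
    M = A * (1 - E2) / (1 - E2 * s * s) ** 1.5
    eta2 = EP2 * c * c
    B = (Bf - t * y**2 / (2 * M * N)
         + t * y**4 / (24 * M * N**3) * (5 + 3 * t**2 + eta2 - 9 * eta2 * t**2)
         - t * y**6 / (720 * M * N**5) * (61 + 90 * t**2 + 45 * t**4))
    l = (y / (N * c) - y**3 / (6 * N**3 * c) * (1 + 2 * t**2 + eta2)
         + y**5 / (120 * N**5 * c)
         * (5 + 28 * t**2 + 24 * t**4 + 6 * eta2 + 8 * eta2 * t**2))
    return B, L0 + l
```

Projecting a point with `gauss_forward` and feeding the result to `gauss_inverse` with the same central meridian should recover the original latitude and longitude to well below the 10-meter matching radius.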

[0056] In the above implementation process, it is important to note that in order to ensure accuracy, all calculation results must be retained to at least 10 decimal places; otherwise, the error will be too large and the correct result cannot be obtained.
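The effect of limited precision can be estimated with a short calculation. The sketch below assumes the stored values are angles in radians and uses a spherical Earth radius of 6,371,000 m; both are assumptions for illustration, and `rounding_error_m` is a hypothetical helper.

```python
EARTH_RADIUS_M = 6_371_000.0  # assumed mean spherical Earth radius


def rounding_error_m(decimals: int) -> float:
    """Worst-case ground error from rounding an angle in radians to `decimals` places."""
    half_step = 0.5 * 10 ** (-decimals)
    return half_step * EARTH_RADIUS_M


# Compare a coarse rounding with the 10 decimal places required by the text
coarse = rounding_error_m(5)
fine = rounding_error_m(10)
```

Under these assumptions, rounding to 5 decimal places can shift a position by more than the 10-meter matching radius, while 10 decimal places keeps the rounding error below a millimeter.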

[0057] The present invention discloses a method for automatically identifying and tracking ships using a camera, which has the following advantages:

[0058] (1) This invention divides the dataset into categories and specifies how the training set is used, which helps the model adapt quickly to the intended training procedure, shortens the required training time, and improves the model's training and learning efficiency.

[0059] (2) The present invention uses methods such as spatial coordinates, geodetic coordinates, and Gaussian transformation based on cameras to achieve rapid identification of ships in a single camera video, thereby improving the efficiency of ship identification.

[0060] (3) In this invention, the tracking function of ship targets in frame sequence is realized. Since the position of the ship is changing while it is sailing on the river, but the position of the camera is fixed, after calculating the relevant data of a single camera, the subsequent ships entering the camera area can be identified more quickly.

[0061] (4) The present invention establishes a model evaluation loop procedure and a real-time identification procedure for ships based on cameras, which fully considers the complex influence of various factors and improves the information accuracy based on the training model.

Attached Figure Description

[0062] Figure 1 is a flowchart of a specific embodiment of the present invention.

Detailed Implementation

[0063] The purpose of this invention is to develop an automatic ship identification and tracking method that combines target detection and an AIS system. By analyzing the pixel position of the ship's center point in the camera image and converting it into real-world latitude and longitude information, and comparing it with data from the AIS system, the mapping between the target in the image and the real ship can be achieved.

[0064] To achieve the above objectives, the present invention proposes the following technical solution, the steps of which include:

[0065] Step 1: Filter and label the data to form a dataset, and divide the dataset into training set, validation set, and test set;

[0066] Step 2: Use the training set and validation set to train the YOLOv7 model and obtain the ship detection model and weight file based on YOLOv7.

[0067] Step 3: Input the test set into the model for detection and output the detection results;

[0068] Step 4: Model evaluation; if the evaluation result is unsatisfactory, repeat steps 1-3; if the result is satisfactory, proceed to the next step.

[0069] Step 5: Input the video frame as a parameter into the model to obtain the position of the ship in the image detected by the model, and calculate the pixel coordinates of the ship's center point based on the length and width of the detected ship position;

[0070] Step 6: Convert the geodetic coordinates of the camera's location into Gaussian plane coordinates using a Gaussian forward transform. Here, the camera's geographical location refers to the location of the camera's center.

[0071] Step 7: Determine the spatial coordinates of the pixels on the image based on the focal length, the spatial coordinates of the camera center, and the angle, and then obtain the coordinates of the actual position in the Gaussian plane;

[0072] Step 8: Obtain the corresponding geodetic coordinates from the Gaussian coordinates of the actual location obtained in Step 7 through inverse Gaussian transformation;

[0073] Step 9: From the AIS system, obtain the latitude and longitude of all ships within a 10-meter range of the position obtained in Step 7. Compare them one by one to find the ship whose position is closest to the latitude and longitude calculated in Step 8, and draw that ship's information onto the video frame in code.

[0074] Step 10: Repeat steps 5, 7, 8, and 9 to achieve the function of tracking ship targets in the frame sequence.

[0075] Further, in step 1, the ship video frames are selected and labeled, and divided into training, validation, test, and evaluation sets in a 10:2:2:1 ratio. Training starts with one-third of the training set. If the evaluation after training does not meet the expected target, another one-third of the training set is added and training is repeated. If the evaluation still does not meet the expected target, the entire training set is used, with the data from the evaluation set mixed in and trained together. This process continues until the model meets the expected evaluation results.

[0076] Furthermore, step 2 obtains the coordinates of the target's distance from the top left corner of the screen on the horizontal axis (x1) and vertical axis (y1), as well as the target's horizontal span (x2) and vertical span (y2). Therefore, the horizontal coordinate of the target's center point is x = x1 + x2 / 2, and the vertical coordinate is y = y1 + y2 / 2.

[0077] Furthermore, step 5 is usually represented by a three-dimensional coordinate system, which indicates the position of the camera's optical center in three-dimensional space.

[0078] First, the latitude and longitude coordinates, altitude, and pitch angle of the camera need to be obtained. These parameters are measured when the camera is installed and then saved to the database.

[0079] Secondly, using the Gaussian projection forward calculation formula, given the geodetic coordinates (B, L) and the longitude of the central meridian L0, the Gaussian plane coordinates (x, y) are calculated as follows:

[0080]

[0081]

[0082] Where B is the latitude and l = L - L0, in radians; N is the radius of curvature in the prime vertical; t = tan(B); η² = e′² cos²(B), where e′ is the second eccentricity; a is the semi-major axis of the ellipsoid of revolution; b is the semi-minor axis; and X is the meridian arc length.

[0083] Finally, by substituting the camera parameters into the formula, we can obtain the Gaussian plane coordinates of the camera, and by adding the altitude, we can construct its three-dimensional spatial coordinates.

[0084] Further, step 6 calculates the coordinates of the camera's center position.

[0085] Assume the camera's focal length is f and that the spatial coordinates of the camera's center point have been determined as described above.

[0086] For each pixel (u, v) in the image, assuming that u and v in the camera coordinate system represent the horizontal and vertical directions of the pixel respectively, the vector from the pixel to the optical center can be represented as (u, v, f).

[0087] Step 61: Calculate the rotation matrix R

[0088] Given the camera's attitude (pitch, yaw, and roll), the camera's orientation can be described by a rotation matrix R, which is constructed through the following steps:

[0089] Calculate the rotation matrix Rx about the x-axis (roll), where roll represents the roll angle:

[0090]

[0091] Calculate the rotation matrix Ry about the y-axis (pitch), where pitch represents the pitch angle:

[0092]

[0093] Calculate the rotation matrix Rz about the z-axis (yaw), where yaw represents the yaw angle:

[0094]

[0095] Multiplying the three rotation matrices together yields the final rotation matrix R:

[0096]

[0097] Step 62: Calculate the spatial coordinates of the pixel.

[0098] By multiplying the vector (u, v, f) of a pixel in the image by the rotation matrix R and then adding the spatial coordinates of the camera center, the spatial coordinates of the pixel are obtained.

[0099] The specific steps are as follows:

[0100] 1) For a pixel (u,v), construct a vector (u,v,f).

[0101] 2) Multiply the vector (u,v,f) by the rotation matrix R to obtain the vector in the camera coordinate system.

[0102] 3) Add the spatial coordinates of the camera center to the vector in the camera coordinate system to obtain the world coordinates (X, Y, Z) of the pixel.

[0103] 4) The height information remains unchanged, and the Gaussian plane coordinates of the pixel are (X, Y).

[0104] Furthermore, in step 8, the Gaussian coordinates of the actual location obtained in step 7 are transformed by an inverse Gaussian transform to obtain their corresponding geodetic coordinates. The calculation method is as follows:

[0105] Given the Gaussian plane coordinates (x, y) and the specified central meridian longitude L0, calculate the geodetic coordinates (B, L):

[0106]

[0107]

[0108] Here, the latitude of the base point (the footpoint latitude) is calculated inversely from the arc length of the meridian.

[0109] By substituting the Gaussian plane coordinates obtained in step 6 into the formula, we can obtain the geodetic coordinates, which are latitude and longitude information.

[0110] In the above implementation process, it is important to note that in order to ensure accuracy, all calculation results must be retained to at least 10 decimal places; otherwise, the error will be too large and the correct result cannot be obtained.

[0111] The present invention discloses a method for automatically identifying and tracking ships using a camera, which has the following advantages:

[0112] (1) This invention divides the dataset into categories and specifies how the training set is used, which helps the model adapt quickly to the intended training procedure, shortens the required training time, and improves the model's training and learning efficiency.

[0113] (2) The present invention uses methods such as spatial coordinates, geodetic coordinates, and Gaussian transformation based on cameras, which can achieve rapid identification of ships in a single camera video and improve the efficiency of ship identification.

[0114] (3) In this invention, the tracking function of ship targets in frame sequence is realized. Since the position of the ship is changing while it is sailing on the river, but the position of the camera is fixed, after calculating the relevant data of a single camera, the subsequent ships entering the camera area can be identified more quickly.

[0115] (4) The present invention establishes a model evaluation loop procedure and a real-time identification procedure for ships based on cameras, which fully considers the complex influence of various factors and improves the information accuracy based on the training model.

[0116] The present invention has been described above. Obviously, the implementation of the present invention is not limited to the above-described manner. Any improvements made using the inventive concept and technical solution of the present invention, or the direct application of the inventive concept and technical solution to other situations without modification, are all within the protection scope of the present invention.

Claims

1. A method for identifying and tracking ships, comprising the following steps:

Step 1: Filter and label the data to form a dataset, and divide the dataset into training set, validation set, and test set. In step 1, the ship video frames are screened and labeled, and divided into training set, validation set, test set, and evaluation set in a ratio of 10:2:2:1. At the beginning of training, 1/3 of the training set is used for training. After training, if the evaluation does not meet the expected goal, another 1/3 of the training set is added for training. If the evaluation still does not meet the expected goal after that training, the entire training set is used, with the data of the evaluation set mixed in and trained together, until the model meets the expected evaluation effect.

Step 2: Use the training set and validation set to train the YOLOv7 model and obtain the ship detection model and weight file based on YOLOv7. Step 2 obtains the coordinates of the target's distance from the top left corner of the screen on the horizontal axis x1 and vertical axis y1, as well as the target's horizontal span x2 and vertical span y2. Therefore, the horizontal coordinate of the target's center point is x = x1 + x2 / 2, and the vertical coordinate is y = y1 + y2 / 2.

Step 3: Input the test set into the model for detection and output the detection results.

Step 4: Model evaluation; if the evaluation result is unsatisfactory, repeat steps 1-3; if the result is satisfactory, proceed to the next step.

Step 5: Input the video frame as a parameter into the model to obtain the position of the ship in the image detected by the model. Based on the detected length and width of the ship, calculate the pixel coordinates of the ship's center point. Step 5 is typically represented by a three-dimensional coordinate system, indicating the position of the camera's optical center in three-dimensional space. First, the latitude and longitude coordinates, altitude, and pitch angle of the camera need to be obtained. These parameters are measured when the camera is installed and then saved to the database. Secondly, using the Gaussian projection forward calculation formula, given the geodetic coordinates (B, L) and the longitude of the central meridian L0, the Gaussian plane coordinates (x, y) are calculated, where B is the latitude and l = L - L0, in radians; N is the radius of curvature in the prime vertical; t = tan(B); η² = e′² cos²(B), where e′ is the second eccentricity; a is the semi-major axis of the ellipsoid of revolution; b is the semi-minor axis; and X is the meridian arc length. Finally, by substituting the camera parameters into the formula, the Gaussian plane coordinates of the camera are obtained, and by adding the altitude, its three-dimensional spatial coordinates are constructed.

Step 6: Convert the geodetic coordinates of the camera's location into Gaussian plane coordinates using a Gaussian forward transform. Here, the camera's geographical location refers to the location of the camera's center. Step 6 calculates the coordinates of the camera center position. Step 61 calculates the rotation matrix R. Given the camera's pitch angle, yaw angle, and roll angle, the camera's orientation can be described by a rotation matrix R, which is constructed through the following steps: calculate the rotation matrix Rx about the x-axis (roll), where roll represents the roll angle; calculate the rotation matrix Ry about the y-axis (pitch), where pitch represents the pitch angle; calculate the rotation matrix Rz about the z-axis (yaw), where yaw represents the yaw angle; multiplying the three rotation matrices together yields the final rotation matrix R. Step 62: Calculate the spatial coordinates of the pixel. By multiplying the vector (u, v, f) of a pixel in the image by the rotation matrix R and then adding the spatial coordinates of the camera center, the spatial coordinates of the pixel are obtained.

Step 7: Determine the spatial coordinates of the pixels on the image based on the focal length, the spatial coordinates of the camera center, and the camera's attitude, including pitch, yaw, and roll angles. Then, obtain the coordinates of the actual position in the Gaussian plane.

Step 8: Obtain the corresponding geodetic coordinates from the Gaussian coordinates of the actual location obtained in Step 7 through inverse Gaussian transformation. The calculation method is as follows: given the Gaussian plane coordinates (x, y) and the specified central meridian longitude L0, calculate the geodetic coordinates (B, L), where the latitude of the base point (the footpoint latitude) is calculated inversely from the arc length of the meridian. By substituting the Gaussian plane coordinates obtained in step 6 into the formula, the geodetic coordinates, which are latitude and longitude information, are obtained. To ensure accuracy, all calculation results must be retained to at least 10 decimal places; otherwise, the error will be too large and the correct result cannot be obtained.

Step 9: From the AIS system, obtain the latitude and longitude of all ships within a 10-meter range of the position obtained in Step 7. Compare them one by one to find the ship whose position is closest to the latitude and longitude calculated in Step 8, and draw that ship's information onto the video frame in code.

Step 10: Repeat steps 5, 7, 8, and 9 to achieve the function of tracking ship targets in the frame sequence.