A long-short focal camera target fusion method

Target detection and image similarity calculation are used to fuse targets between long and short focal length cameras. This avoids costly manual calibration, improves the accuracy of target fusion and image precision, and widens the range of applications.

CN116188770B · Active · Publication Date: 2026-06-12 · CHONGQING CHANGAN TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHONGQING CHANGAN TECH CO LTD
Filing Date
2023-02-28
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In existing target fusion methods for long and short focal length cameras, the calculation of intrinsic and extrinsic parameter conversion relationships requires manual calibration, which is costly.

Method used

The system obtains bounding boxes in the telephoto and short-focus images through object detection, locates the common field-of-view rectangle by scoring rectangular regions of interest with similarity measures (SSIM and grayscale histogram features), maps the telephoto bounding box onto the short-focus image, selects the short-focus bounding box with the maximum intersection over union (IoU), and fuses the two detections.

Benefits of technology

At low cost, the method improves target fusion accuracy and combines the wide coverage of the short-focus image with the long-distance clarity of the telephoto image, which increases flexibility and broadens the engineering scenarios in which it can be applied.

Generated by Eureka AI based on patent content.


Abstract

The application belongs to the technical field of data processing and provides a long-short-focus camera target fusion method, which comprises the following steps: S1: acquiring a long-focus picture and a short-focus picture shot by a long-focus camera and a short-focus camera at the same time; S2: obtaining a first perception box of a target in the long-focus picture and a second perception box of the target in the short-focus picture through target detection; S3: performing similarity calculation on the long-focus picture and the short-focus picture, and obtaining the common field-of-view (rect) region of the long-focus camera and the short-focus camera in the short-focus picture; S4: if the second perception box of the target is located in the rect region, mapping the first perception box of the target to the short-focus picture and performing an IoU (intersection over union) calculation to obtain the maximum-IoU perception box; S5: completing target fusion according to information of the first perception box and information of the maximum-IoU perception box, and outputting fused target box information. The application addresses the high cost of the manual calibration required to calculate the conversion relationship between intrinsic and extrinsic parameters, as described in the background art.

Description

Technical Field

[0001] This invention belongs to the field of data processing technology, specifically relating to a target fusion method for long and short focal length cameras.

Background Technology

[0002] With the development of vision technology, the improvement of the performance of chips, lenses, and industrial cameras, and the vigorous development and application of new energy and artificial intelligence, the demand for multi-camera collaborative engineering applications is becoming increasingly urgent. As a result, there are more and more industrial or intelligent business scenarios where long and short focal length cameras collaborate to collect data, and the application of long (narrow angle) and short (wide angle) focal length cameras has emerged.

[0003] Among existing long and short focal length camera technologies, patent application CN202011288888.5, for example, claims a target detection and fusion method based on multiple cameras with long and short focal lengths in a vehicle environment. This method includes the following steps: 1. Using a convolutional neural network to perform target detection on images acquired by long and short focal length binocular cameras, obtaining the target bounding box positions in images captured at the same time by cameras with different focal lengths. 2. Based on the camera imaging principle and the intrinsic and extrinsic parameters K, R, T obtained from camera calibration, obtaining the mapping relationship f of a spatial target point P between the pixel coordinate systems of the long and short focal length cameras. 3. Mapping the target bounding box positions in the long focal length camera image using the mapping relationship f, etc.

[0004] However, calculating the conversion relationship between the intrinsic and extrinsic parameters in this method requires manual calibration, which is costly. Therefore, it is necessary to develop a low-cost method that meets the requirements in terms of speed and performance to achieve target fusion of long and short focal length cameras.

Summary of the Invention

[0005] The purpose of this invention is to provide a target fusion method for long and short focal length cameras that solves the problem described in the background art: calculating the conversion relationship between intrinsic and extrinsic parameters requires manual calibration, which is costly.

[0006] To achieve the above-mentioned technical objectives, the technical solution adopted by the present invention is as follows:

[0007] A target fusion method for long-focus and short-focus cameras is provided, the method comprising:

[0008] S1: Acquire telephoto and short-focus images taken by the telephoto and short-focus cameras at the same time.

[0009] S2: By target detection, obtain a first perception box of the target in the telephoto image and a second perception box of the target in the short-focus image;

[0010] S3: Calculate the similarity between the telephoto and the short-focus images, and obtain the common field of view (rect) region of the telephoto and short-focus cameras in the short-focus image;

[0011] S4: If the second perception box of the target is located within the rect region, map the first perception box of the target onto the short-focus image and perform an IoU (intersection over union) calculation to obtain the perception box with the maximum IoU;

[0012] S5: Complete target fusion based on the information of the first perception box and the information of the maximum-IoU perception box, and output the fused target box information.

[0013] Furthermore, the method of step S3 is as follows:

[0014] S31: Compress the telephoto image to obtain a compressed image;

[0015] S32: Create several rectangular regions of interest in the short-focus image, wherein the rectangular regions of interest have the same field of view as the long-focus image;

[0016] S33: Calculate the similarity between the compressed image and several rectangular regions of interest;

[0017] S34: Select the rectangular region of interest with the highest similarity as the rect region.

[0018] Further, in step S32, the size of the telephoto image is obtained by acquiring the field of view width and height of the telephoto camera based on the factory parameters of the telephoto camera, and the size of the short-focus image is obtained by acquiring the field of view width and height of the short-focus camera based on the factory parameters of the short-focus camera. The width and height ratios of the short-focus image to the telephoto image (referred to below as the aspect ratio) are calculated from the field of view width and height of the telephoto camera and the field of view width and height of the short-focus camera. The width and height of the rectangular region of interest are calculated from the field of view width and height of the telephoto camera and the aspect ratio.

[0019] Furthermore, the similarity of each rectangular region of interest is measured using SSIM and grayscale histogram features.

[0020] Furthermore, in step S33, cosine distance is used to calculate the similarity between the compressed image and several rectangular regions of interest.

[0021] Furthermore, in step S4, the method for mapping the first perception box of the target onto the short-focus image is as follows:

[0022] S411: Scale the first perception box of the target according to the aspect ratio to obtain a scaled image box;

[0023] S412: Translate the scaled image box by the offset given by the rect region information to complete the coordinate mapping.

[0024] Furthermore, in step S4, the method for performing the IoU calculation and obtaining the maximum-IoU perception box is as follows:

[0025] Calculate the IoU (intersection over union) between the mapped (scaled and translated) image box and the second perception box of each target within the rect region, and obtain the perception box with the maximum IoU.

[0026] Furthermore, in step S5, the method for outputting the fused target box information is as follows:

[0027] S51: Preset an IoU threshold, and compare the IoU of the maximum-IoU perception box with the threshold.

[0028] S52: If the IoU of the maximum-IoU perception box is less than the preset threshold, a new target is created based on the detection result of the second perception box of the target and stored in the output list;

[0029] S53: If the IoU of the maximum-IoU perception box is greater than the preset threshold, a new target of the data type required by the business is constructed based on the maximum-IoU perception box information and stored in the output list.

[0030] The invention employing the above technical solution has the following advantages:

[0031] 1. By using long and short focal length cameras together, this invention improves both the breadth (wide coverage) and the depth (long-distance clarity) with which targets or images are captured, laying a solid foundation for the subsequent calculation of target state attributes.

[0032] 2. While ensuring speed and practicality, this invention improves the accuracy, clarity, and breadth of the image at the software level through algorithms. On given hardware, it increases flexibility and expands the range of engineering application scenarios and industries.

Attached Figure Description

[0033] The present invention can be further illustrated by the non-limiting embodiments given in the accompanying drawings;

[0034] Figure 1 is a flowchart illustrating the target fusion method for long and short focal length cameras in an embodiment of the present invention.

[0035] Figure 2 is a schematic diagram of short-focus image acquisition in an embodiment of the present invention;

[0036] Figure 3 is a system block diagram of the data parsing and optimization device in an embodiment of the present invention.

Detailed Implementation

[0037] The present application will be described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that similar or identical parts are referred to by the same reference numerals in the drawings or description. Implementations not shown or described in the drawings are forms known to those skilled in the art. In the description of this application, terms such as "first" and "second" are used only to distinguish descriptions and should not be construed as indicating or implying relative importance.

[0038] As shown in Figure 1, this application provides a target fusion method for long and short focal length cameras, the method including:

[0039] In this embodiment, before executing step S1, the algorithm flow for long and short focal lengths is configured through the project's JSON configuration file. If configured as a single-camera model, the subsequent target state attribute calculation flow of a single camera can also be followed.

[0040] S1: Acquire telephoto and short-focus images taken by the telephoto and short-focus cameras at the same time.

[0041] S2: By object detection, a first perception box of the target in the telephoto image and a second perception box of the target in the short-focus image are obtained; in this embodiment, deep learning object detection is used, and a lightweight convolutional neural network is used to perform object detection to obtain the positions of the first perception box and the second perception box.

[0042] S3: Calculate the similarity between the telephoto and the short-focus images, and obtain the common field of view (rect) region of the telephoto and short-focus cameras in the short-focus image;

[0043] In this embodiment, the method of step S3 is as follows:

[0044] S31: Compress the telephoto image to obtain a compressed image. In this embodiment, the compressed image has the same resolution as the rectangular region of interest.

[0045] S32: Create several rectangular regions of interest in the short-focus image, wherein the rectangular regions of interest have the same field of view as the long-focus image;

[0046] S33: Calculate the similarity between the compressed image and several rectangular regions of interest;

[0047] S34: Select the rectangular region of interest with the highest similarity as the rect region.

[0048] In this embodiment, in step S32, the size of the telephoto image is obtained by acquiring the field of view width and height of the telephoto camera based on the factory parameters of the telephoto camera, and the size of the short-focus image is obtained by acquiring the field of view width and height of the short-focus camera based on the factory parameters of the short-focus camera. The width and height ratios of the short-focus image to the telephoto image (referred to below as the aspect ratio) are calculated from the field of view width and height of the telephoto camera and the field of view width and height of the short-focus camera. The width and height of the rectangular region of interest are calculated from the field of view width and height of the telephoto camera and the aspect ratio.
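The size calculation in step S32 can be illustrated with a minimal sketch. It assumes that the region of interest scales linearly with the ratio of the two fields of view, which is a simplification (a pinhole model would use the tangent of the half-angles). The `CameraSpec` type, the function names, and all numeric values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CameraSpec:
    """Hypothetical container for factory parameters (not from the patent)."""
    fov_w_deg: float  # horizontal field of view, degrees
    fov_h_deg: float  # vertical field of view, degrees
    img_w: int        # image width, pixels
    img_h: int        # image height, pixels


def fov_ratio(tele: CameraSpec, short: CameraSpec) -> Tuple[float, float]:
    """Width and height ratios of the telephoto FOV to the short-focus FOV."""
    return tele.fov_w_deg / short.fov_w_deg, tele.fov_h_deg / short.fov_h_deg


def roi_size(tele: CameraSpec, short: CameraSpec) -> Tuple[int, int]:
    """ROI width and height in short-focus pixels (assumed linear scaling)."""
    ratio_w, ratio_h = fov_ratio(tele, short)
    return round(short.img_w * ratio_w), round(short.img_h * ratio_h)


# Illustrative values only.
tele = CameraSpec(30.0, 17.0, 1920, 1080)
short = CameraSpec(120.0, 68.0, 1920, 1080)
print(roi_size(tele, short))  # (480, 270)
```

With these assumed values the telephoto view covers a quarter of the short-focus view in each dimension, so every candidate region of interest is 480 by 270 pixels.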

[0049] In this embodiment, the similarity of each rectangular region of interest is measured using SSIM and grayscale histogram features.
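For orientation, the SSIM measure named above can be sketched as follows. This is a simplified single-window (global) SSIM over flattened grayscale pixels, whereas practical implementations usually average SSIM over local windows. The patent does not state how SSIM and the histogram features are combined, so this sketch covers SSIM only. The function name is hypothetical, and the constants 0.01 and 0.03 are the conventional SSIM defaults.

```python
from typing import Sequence


def global_ssim(a: Sequence[float], b: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM of two equal-length grayscale pixel sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    # Stabilizing constants with the conventional K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


pixels = [10, 50, 90, 130, 170, 210]
inverted = [255 - p for p in pixels]
print(global_ssim(pixels, pixels))    # close to 1 for identical inputs
print(global_ssim(pixels, inverted))  # negative for inverted contrast
```

Unlike a histogram, SSIM is sensitive to spatial structure, which is one plausible reason for using the two measures together.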

[0050] In this embodiment, in step S33, cosine distance is used to calculate the similarity between the compressed image and several rectangular regions of interest.
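Steps S32 to S34 can be sketched as a sliding-window search. This is a minimal illustration that assumes grayscale images stored as nested lists, a histogram feature with 16 bins, and a window stride of one pixel, none of which is specified by the patent. Cosine similarity is used as the score, and the corresponding cosine distance is one minus this value. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows with values 0..255


def gray_histogram(pixels: List[int], bins: int = 16) -> List[int]:
    """Count pixels per intensity bin."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    return hist


def cosine_similarity(u: List[int], v: List[int]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def crop(img: Image, x: int, y: int, w: int, h: int) -> List[int]:
    """Flattened pixels of the w-by-h window whose top-left corner is (x, y)."""
    return [p for row in img[y:y + h] for p in row[x:x + w]]


def find_rect(short_img: Image, compressed_tele: Image, roi_w: int, roi_h: int,
              stride: int = 1) -> Tuple[float, Tuple[int, int, int, int]]:
    """Return (score, (x, y, w, h)) of the most similar region of interest."""
    target = gray_histogram([p for row in compressed_tele for p in row])
    best = None
    for y in range(0, len(short_img) - roi_h + 1, stride):
        for x in range(0, len(short_img[0]) - roi_w + 1, stride):
            window = gray_histogram(crop(short_img, x, y, roi_w, roi_h))
            score = cosine_similarity(target, window)
            if best is None or score > best[0]:
                best = (score, (x, y, roi_w, roi_h))
    if best is None:
        raise ValueError("region of interest is larger than the image")
    return best


# Toy data: a bright 3x2 patch placed at (x=2, y=1) in a dark 6x6 image.
patch = [[200, 220, 240], [210, 230, 250]]
short_img = [[0] * 6 for _ in range(6)]
for dy, row in enumerate(patch):
    for dx, value in enumerate(row):
        short_img[1 + dy][2 + dx] = value

score, rect = find_rect(short_img, patch, roi_w=3, roi_h=2)
print(rect)  # (2, 1, 3, 2)
```

A histogram ignores where pixels sit inside the window, so on real images several windows can score almost equally. A coarser stride reduces cost at the expense of localization accuracy.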

[0051] S4: If the second perception box of the target is located within the rect region, map the first perception box of the target onto the short-focus image and perform an IoU (intersection over union) calculation to obtain the perception box with the maximum IoU;

[0052] In this embodiment, the second perception box of the target can be in one of three states, as shown in Figure 3: within the telephoto frame range, defined as state 1; outside the telephoto frame range, defined as state 2; and on the boundary of the telephoto frame, defined as state 3.
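The three states can be expressed as a small classification function. This is a sketch under assumptions: boxes are given as `(x1, y1, x2, y2)` corner coordinates in short-focus pixels, the telephoto frame range is taken to be the rect region, and a box that merely touches the rect edge from outside counts as outside. The function name is hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def box_state(box: Box, rect: Box) -> int:
    """Return 1 if box lies inside rect, 2 if outside, 3 if it straddles the boundary."""
    bx1, by1, bx2, by2 = box
    rx1, ry1, rx2, ry2 = rect
    if bx1 >= rx1 and by1 >= ry1 and bx2 <= rx2 and by2 <= ry2:
        return 1  # fully within the telephoto frame range
    if bx2 <= rx1 or bx1 >= rx2 or by2 <= ry1 or by1 >= ry2:
        return 2  # no overlap with the telephoto frame range
    return 3      # partially overlapping, on the boundary


rect = (100, 100, 300, 200)
print(box_state((120, 110, 160, 150), rect))  # 1
print(box_state((10, 10, 50, 50), rect))      # 2
print(box_state((280, 150, 340, 220), rect))  # 3
```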

[0053] In this embodiment, the method for mapping the first perception box of the target onto the short-focus image in step S4 is as follows:

[0054] S411: Scale the first perception box of the target according to the aspect ratio to obtain a scaled image box;

[0055] S412: Translate the scaled image box by the offset given by the rect region information to complete the coordinate mapping.
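Steps S411 and S412 amount to a scale followed by a translation. The sketch below assumes corner-format boxes, scale factors equal to the rect region size divided by the telephoto image size, and a translation equal to the top-left corner of the rect region. The function name and the numbers are illustrative only.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def map_tele_box_to_short(box: Box, scale_x: float, scale_y: float,
                          rect_x: float, rect_y: float) -> Box:
    """Scale a telephoto box (S411), then shift it by the rect origin (S412)."""
    x1, y1, x2, y2 = box
    return (x1 * scale_x + rect_x, y1 * scale_y + rect_y,
            x2 * scale_x + rect_x, y2 * scale_y + rect_y)


# Assumed setup: a 1920x1080 telephoto image that corresponds to a 480x270
# rect region whose top-left corner is at (720, 405) in the short-focus image.
scale_x, scale_y = 480 / 1920, 270 / 1080
mapped = map_tele_box_to_short((400, 200, 800, 600), scale_x, scale_y, 720, 405)
print(mapped)  # (820.0, 455.0, 920.0, 555.0)
```

Because no intrinsic or extrinsic parameters are involved, the mapping is only as accurate as the rect region located in step S3.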

[0056] In this embodiment, the method for performing the IoU calculation and obtaining the maximum-IoU perception box in step S4 is as follows:

[0057] Calculate the IoU (intersection over union) between the mapped (scaled and translated) image box and the second perception box of each target within the rect region, and obtain the perception box with the maximum IoU.
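The IoU computation and the selection of the best-matching box can be sketched as follows, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form. The helper names are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def best_match(mapped: Box, candidates: List[Box]) -> Tuple[Optional[int], float]:
    """Index and IoU of the candidate with the largest IoU; (None, 0.0) if empty."""
    if not candidates:
        return None, 0.0
    scores = [iou(mapped, c) for c in candidates]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


mapped = (0, 0, 10, 10)
second_boxes = [(20, 20, 30, 30), (5, 0, 15, 10), (1, 1, 10, 10)]
idx, score = best_match(mapped, second_boxes)
print(idx, score)  # 2 0.81
```

Here the third candidate overlaps 81 of the 100 units of union area, so it is selected as the maximum-IoU perception box.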

[0058] S5: Complete target fusion based on the information of the first perception box and the information of the maximum-IoU perception box, and output the fused target box information.

[0059] In this embodiment, the method for outputting the fused target box information in step S5 is as follows:

[0060] S51: Preset an IoU threshold, and compare the IoU of the maximum-IoU perception box with the threshold.

[0061] S52: If the IoU of the maximum-IoU perception box is less than the preset threshold, a new target is created based on the detection result of the second perception box of the target and stored in the output list;

[0062] S53: If the IoU of the maximum-IoU perception box is greater than the preset threshold, a new target of the data type required by the business is constructed based on the maximum-IoU perception box information and stored in the output list.
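The threshold logic of S51 to S53 can be sketched as below. Several points are assumptions rather than details from the patent: the threshold value of 0.5, the `FusedTarget` record standing in for the data type required by the business, the choice to attach the telephoto box to a fused target, and the treatment of an IoU exactly equal to the threshold as a match (the text only covers the less-than and greater-than cases).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class FusedTarget:
    """Hypothetical output record; the patent leaves this type to the business."""
    box: Box                        # box in short-focus image coordinates
    source: str                     # "short_only" or "fused"
    tele_box: Optional[Box] = None  # mapped telephoto box, when fused


def fuse(tele_box: Box, short_box: Box, max_iou: float,
         output: List[FusedTarget], threshold: float = 0.5) -> None:
    """Append one target to the output list according to S52 or S53."""
    if max_iou < threshold:
        # S52: keep the short-focus detection as a new, unfused target.
        output.append(FusedTarget(box=short_box, source="short_only"))
    else:
        # S53: build the fused target from the maximum-IoU box information.
        output.append(FusedTarget(box=short_box, source="fused", tele_box=tele_box))


targets: List[FusedTarget] = []
fuse((820, 455, 920, 555), (815, 450, 925, 560), 0.83, targets)  # illustrative IoU
fuse((820, 455, 920, 555), (300, 300, 360, 380), 0.0, targets)
print([t.source for t in targets])  # ['fused', 'short_only']
```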

[0063] The above provides a detailed description of a target fusion method for long and short focal length cameras provided by the present invention. The specific embodiments described are merely for the purpose of helping to understand the method and its core ideas. It should be noted that those skilled in the art can make various improvements and modifications to the present invention without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims

1. A target fusion method for long and short focal length cameras, characterized in that the method includes: S1: Acquire telephoto and short-focus images taken by the telephoto and short-focus cameras at the same time; S2: By target detection, obtain a first perception box of the target in the telephoto image and a second perception box of the target in the short-focus image; S3: Calculate the similarity between the telephoto and the short-focus images, and obtain the common field of view (rect) region of the telephoto and short-focus cameras in the short-focus image; S4: If the second perception box of the target is located in the rect region, map the first perception box of the target onto the short-focus image, calculate the IoU (intersection over union) between the mapped first perception box and the second perception box of each target in the rect region, and select the second perception box with the largest IoU as the maximum-IoU perception box; S5: Complete target fusion based on the information of the first perception box and the information of the maximum-IoU perception box, and output the fused target box information; in step S5, the method for outputting the fused target box information is as follows: S51: Preset an IoU threshold, and compare the IoU of the maximum-IoU perception box with the IoU threshold; S52: If the IoU of the maximum-IoU perception box is less than the preset threshold, a new target is created based on the detection result of the second perception box of the target and stored in the output list; S53: If the IoU of the maximum-IoU perception box is greater than the preset threshold, a new target of the data type required by the business is constructed based on the maximum-IoU perception box information and stored in the output list.

2. The target fusion method for long and short focal length cameras according to claim 1, characterized in that the method for step S3 is as follows: S31: Compress the telephoto image to obtain a compressed image; S32: Create several rectangular regions of interest in the short-focus image, wherein the rectangular regions of interest have the same field of view as the long-focus image; S33: Calculate the similarity between the compressed image and several rectangular regions of interest; S34: Select the rectangular region of interest with the highest similarity as the rect region.

3. The target fusion method for long and short focal length cameras according to claim 2, characterized in that, in step S32, the size of the telephoto image is obtained by acquiring the field of view width and height of the telephoto camera based on the factory parameters of the telephoto camera; the size of the short-focus image is obtained by acquiring the field of view width and height of the short-focus camera based on the factory parameters of the short-focus camera; the width and height ratios of the short-focus image to the telephoto image (the aspect ratio) are calculated based on the field of view width and height of the telephoto camera and the field of view width and height of the short-focus camera; and the width and height of the rectangular region of interest are calculated based on the field of view width and height of the telephoto camera and the aspect ratio.

4. The target fusion method for long and short focal length cameras according to claim 3, characterized in that the similarity of each rectangular region of interest is measured using SSIM and grayscale histogram features.

5. The target fusion method for long and short focal length cameras according to claim 2, characterized in that, in step S33, cosine distance is used to calculate the similarity between the compressed image and several rectangular regions of interest.

6. The target fusion method for long and short focal length cameras according to claim 3, characterized in that, in step S4, the method for mapping the first perception box of the target onto the short-focus image is as follows: S411: Scale the first perception box of the target according to the aspect ratio to obtain a scaled image box; S412: Translate the scaled image box by the offset given by the rect region information to complete the coordinate mapping.