Image recognition device, image recognition method, and program

The human detection unit determines the human's position and size, and the frame determination unit adjusts the frame size according to the human's detection position and distance. This solves the problem of misjudgment caused by fixed frame size and improves the accuracy of position recognition.

CN115280359B · Active · Publication Date: 2026-06-30 · JVC KENWOOD CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
JVC KENWOOD CORP
Filing Date
2021-08-20
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In existing technologies, a frame of fixed size cannot accommodate people of different heights, while a frame matched to the person's size becomes small for shorter individuals. A viewer may then mistakenly believe that a short person is located farther away than their actual position, which increases the possibility of misjudgment.

Method used

The detection unit determines the position and size of the person being detected, and the frame determination unit determines a frame line size based on the position and distance of the person being detected. A frame line whose size differs from the detected size is generated and displayed. In particular, the frame line size is increased for short people to reduce misjudgment.

Benefits of technology

This reduces the possibility of misjudging the location of short individuals and improves the accuracy of location recognition.

✦ Generated by Eureka AI based on patent content.


Abstract

The image recognition device (10) includes: a person detection unit (14) that detects people in a captured image and determines the detection position and detection size of the people in the captured image; a frame determination unit (16) that determines the size of a frame that is different from the detection size based on the determined detection position of the people; an image generation unit (18) that generates a display image by superimposing the frame of the determined size onto the detection position of the captured image; and a display control unit (20) that causes the display device (28) to display the generated display image.

Description

Technical Field

[0001] This invention relates to an image recognition device, an image recognition method, and a program.

Background Technology

[0002] A known technique involves detecting people within a captured image, then generating and displaying an image with a rectangular frame overlaid on each detected person. The rectangular frame overlaid on the captured image is sometimes set to an aspect ratio of 2:1 so that it encloses the area of the detected person. Furthermore, as disclosed in Patent Document 1, a technique has been used to display a frame of a standard size (e.g., equivalent to a height of 170cm) determined based on the distance from the detected person.

[0003] Prior art literature

[0004] Patent documents

[0005] Patent document 1: Japanese Patent Application Publication No. 2019-204374.

Summary of the Invention

[0006] The problem that the invention aims to solve

[0007] When the frame size is set to a fixed value regardless of the person's size, it may be difficult to accurately determine whether a person is an adult or a child. On the other hand, when the frame size is set to correspond to the person's size, the frame size superimposed on shorter individuals such as children becomes smaller. When the frame size is small, when viewing an image with the frame superimposed, it may give the impression that the person is located at a greater distance than their actual position, potentially leading to a misjudgment of the location of nearby shorter individuals.

[0008] The present invention was made in view of the above circumstances, and its object is to provide a technique that reduces the possibility of erroneously determining the location of a detected person.

[0009] Means for solving the problems

[0010] An image recognition apparatus according to one aspect of the present invention includes: an image acquisition unit for acquiring a captured image; a person detection unit for detecting a person contained in the captured image acquired by the image acquisition unit and determining the detection position and detection size of the person in the captured image; a frame determination unit for determining the size of a frame line different from the detection size based on the detection position of the person determined by the person detection unit; an image generation unit for generating a display image, which is obtained by superimposing a frame line of the size determined by the frame determination unit onto the detection position of the captured image; and a display control unit for causing a display device to display the display image generated by the image generation unit.

[0011] Another aspect of the present invention is an image recognition method performed by an image recognition device. The method includes the following steps: acquiring a captured image; detecting a person contained in the acquired captured image, and determining the detection position and detection size of the person in the captured image; determining the size of a frame line different from the detection size based on the determined detection position of the person; generating a display image by superimposing the determined frame line onto the detection position of the captured image; and displaying the generated display image on a display device.

[0012] Another aspect of the invention is a computer-readable storage medium storing a program. The program causes a computer to perform the following functions: acquire a captured image; detect a person contained in the acquired captured image and determine the detection position and detection size of the person in the captured image; based on the determined detection position of the person, determine the size of a frame line different from the detection size; generate a display image by superimposing a frame line of the determined size onto the detection position of the captured image; and cause a display device to display the generated display image.

[0013] The effects of the invention

[0014] According to the present invention, the possibility of erroneously determining the location of a detected person can be reduced.

Attached Figure Description

[0015] Figure 1 is a block diagram schematically illustrating the functional configuration of the image recognition device according to the first embodiment;

[0016] Figure 2 is an example of a captured image;

[0017] Figure 3 is a schematic diagram showing the detection position and detection size of a person;

[0018] Figure 4 is an example of a display image with superimposed frame lines;

[0019] Figure 5 is a flowchart illustrating the process of the image recognition method according to the first embodiment;

[0020] Figures 6(a) to 6(d) are diagrams showing other examples of the frame lines;

[0021] Figure 7 is a block diagram schematically illustrating the functional configuration of the image recognition device according to the second embodiment;

[0022] Figure 8 is a flowchart illustrating the process of the image recognition method according to the second embodiment.

Detailed Implementation

[0023] Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. The specific numerical values and other details shown in these embodiments are merely illustrative examples for ease of understanding the invention and are not intended to limit the invention, unless specifically stated otherwise. Furthermore, elements not directly related to the present invention are omitted from the drawings.

[0024] (First Implementation)

[0025] Before detailing the first embodiment, an outline is provided. In the first embodiment, a person is detected in an image, and a display image is generated and displayed by overlaying a frame surrounding the detected person onto the image. In this embodiment, a frame having a size corresponding to the detection size of the detected person is overlaid. If the height of the detected person is less than a predetermined value, for example, less than 120 cm, a display image is generated by overlaying a frame with a size larger in the vertical direction than the detection size of the person. According to this embodiment, by overlaying a frame with a size larger than the detection size onto people shorter than the predetermined value, the possibility of mistakenly identifying the detected person as being at a position farther than their actual location due to the small vertical dimension of the frame is reduced, and the possibility of incorrectly determining the location of the detected person is also reduced.

[0026] Figure 1 is a block diagram schematically illustrating the functional configuration of the image recognition device 10 according to the first embodiment. The image recognition device 10 includes: an image acquisition unit 12, a person detection unit 14, a frame determination unit 16, an image generation unit 18, a display control unit 20, and a storage unit 22. In this embodiment, the image recognition device 10 is described as being mounted in a vehicle. The image recognition device 10 is composed of, for example, a control device such as a CPU (Central Processing Unit) provided in the vehicle. The image recognition device 10 may also be composed of a navigation system provided in the vehicle. In addition, the image recognition device 10 may be implemented by a portable device such as a camera device or a smartphone. The image recognition device 10 may also include at least one of a camera 26 and a display device 28.

[0027] The functional blocks shown in this embodiment can be implemented in hardware using components such as a computer's CPU and memory, or mechanical devices, and in software using computer programs, but here they are described as functional blocks implemented through their cooperation. Therefore, those skilled in the art should understand that these functional blocks can be implemented in various forms through a combination of hardware and software.

[0028] The image acquisition unit 12 acquires images captured by the camera 26. The camera 26 is mounted on the vehicle and captures images of the area surrounding the vehicle. For example, the camera 26 captures images of the front of the vehicle. The camera 26 can also capture images of the rear or sides of the vehicle. The camera 26 is configured to capture visible light. The camera 26 can be configured to capture color images (red, green, and blue) or monochrome images of visible light.

[0029] Camera 26 can also be configured to capture infrared light. Camera 26 can be a so-called infrared thermal imager, which can image the temperature distribution around the vehicle to identify heat sources present around the vehicle. Camera 26 can be configured to detect mid-infrared light with a wavelength of about 2μm to 5μm, or it can be configured to detect far-infrared light with a wavelength of about 8μm to 14μm.

[0030] The captured images, taken by camera 26 and acquired by image acquisition unit 12, are, for example, continuous motion images at 30 frames per second. The processing described below is performed continuously for motion images.

[0031] Figure 2 shows an example of a captured image 30 obtained by the image acquisition unit 12. In the example of Figure 2, multiple people 32a, 32b, 32c, and 32d are included in the captured image 30. The first person 32a and the second person 32b are located at a first distance L1, relatively close to the vehicle. The third person 32c and the fourth person 32d are located at a second distance L2, relatively far from the vehicle. The first distance L1 is, for example, approximately 15m to 20m from the vehicle. The second distance L2 is, for example, approximately 40m to 50m from the vehicle.

[0032] The first person 32a and the third person 32c are approximately 170cm tall, while the second person 32b and the fourth person 32d are approximately 100cm tall and are considered short individuals. In the illustrated example, the size of the short second person 32b in the captured image 30, i.e., the detection size, is similar to that of the third person 32c. The detection size refers to the size of a person on the captured image.

[0033] The person detection unit 14 detects people contained in the captured images obtained by the image acquisition unit 12. The person detection unit 14 uses a person recognition dictionary to search for pedestrians, cyclists, etc. For each captured image, the person detection unit 14 uses the person recognition dictionary to search for people and calculates a person score value indicating the probability that a person is present in the searched region. For example, the person detection unit 14 detects a person if the person score value in the searched region is above a predetermined threshold. The person recognition dictionary used by the person detection unit 14 is generated by machine learning of a model that takes images of people as input and outputs person score values. Convolutional neural networks (CNNs) and the like can be used as the model for machine learning. In the example of Figure 2, the person detection unit 14 performs person detection processing on the captured image 30 acquired by the image acquisition unit 12, and detects multiple people 32a, 32b, 32c, and 32d.
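The thresholding step in paragraph [0033] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Candidate` type, the `detect_people` function, and the threshold value of 0.5 are all assumptions introduced here, and the score is assumed to come from some upstream model. "Above the threshold" is interpreted as "greater than or equal to", which the source does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A searched region and its person score (hypothetical structure)."""
    left: float
    top: float
    width: float
    height: float
    score: float  # probability-like person score from the recognition dictionary


# Hypothetical value; the source only says "predetermined threshold".
SCORE_THRESHOLD = 0.5


def detect_people(candidates: List[Candidate],
                  threshold: float = SCORE_THRESHOLD) -> List[Candidate]:
    """Keep only the regions whose person score reaches the threshold."""
    return [c for c in candidates if c.score >= threshold]


candidates = [
    Candidate(100, 200, 40, 80, score=0.92),
    Candidate(300, 210, 20, 40, score=0.31),
]
people = detect_people(candidates)
```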

[0034] The shape of the region that the person detection unit 14 uses during the search is predetermined by the person recognition dictionary used. In the person recognition dictionary of this embodiment, the search region is rectangular, and the ratio of the vertical to horizontal image size of the region is approximately 2:1. The shape of the region used for the search corresponds, for example, to the vertical and horizontal image sizes of the training images used in the machine learning that generates the person recognition dictionary.

[0035] The person detection unit 14 determines the detection position and size of the detected person. The person detection unit 14 assigns a tag number to the detected person and stores the detection position and size in the storage unit 22 for each tag number. The detection position is the position coordinate of the area in the captured image 30 where the person is detected, for example, determined by the position coordinates of the center of the lower end of the area. The lower end of the area where the person is detected corresponds to the position under the person's feet, or the ground contact point. The ground contact point can be used to estimate the distance from the person. The detection size is the size of the area where the person is detected, for example, determined by the vertical image size of the detected area. The detection size can be used to estimate the person's height.
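As a small sketch of paragraph [0035], the detection position (center of the lower edge of the region) and the detection size (vertical image size) can be derived from a rectangular region and stored per tag number. The `Region` type, the function names, and the dictionary used as storage are illustrative assumptions; image coordinates are assumed to have the y axis pointing downward.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Region:
    """Rectangular region in which a person was detected (pixels, y grows downward)."""
    left: float
    top: float
    width: float
    height: float


def detection_position(region: Region) -> Tuple[float, float]:
    """Center of the lower edge of the region, i.e. the ground contact point."""
    return (region.left + region.width / 2.0, region.top + region.height)


def detection_size(region: Region) -> float:
    """Vertical image size of the region."""
    return region.height


def store_detection(storage: Dict[int, dict], tag: int, region: Region) -> None:
    """Record position and size under the tag number (stand-in for storage unit 22)."""
    storage[tag] = {
        "position": detection_position(region),
        "size": detection_size(region),
    }


storage: Dict[int, dict] = {}
store_detection(storage, tag=1, region=Region(left=100, top=200, width=50, height=100))
```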

[0036] Figure 3 schematically shows the detection positions and detection sizes of the people 32a to 32d detected in the captured image 30 of Figure 2. Figure 3 shows the regions 34a, 34b, 34c, and 34d in which the people 32a to 32d were detected in the captured image 30 of Figure 2, and the ground positions 36a, 36b, 36c, and 36d of the people 32a to 32d. The person detection unit 14 stores at least the vertical dimensions ha to hd of the regions 34a to 34d in the storage unit 22 as the detection sizes of the people 32a to 32d. The vertical dimensions ha to hd of the regions 34a to 34d correspond to the vertical image dimensions of the people 32a to 32d in the captured image 30, and are determined by the vertical image dimensions from the feet to the head of the people 32a to 32d. The person detection unit 14 stores the coordinates of the ground positions 36a to 36d, located at the lower center of the regions 34a to 34d, in the storage unit 22 as the detection positions of the people 32a to 32d.

[0037] The frame line determination unit 16 determines whether to overlay a frame line onto the captured image 30 based on the detection results of the person detection unit 14. If a frame line is to be overlaid, the unit determines the position and size of the frame line to be overlaid. The image generation unit 18 overlays the frame line determined by the frame line determination unit 16 onto the captured image 30 to generate a display image. The display control unit 20 causes the display device 28 to display the display image generated by the image generation unit 18. The display device 28 is, for example, a display mounted in a vehicle.

[0038] The frame line determination unit 16 calculates the distance to the person detected by the person detection unit 14 and stores the calculated distance in the storage unit 22 according to each tag number. The frame line determination unit 16 can also calculate the distance to the person based on the longitudinal position coordinates of the grounding position of the person detected in the captured image. The frame line determination unit 16 can also use a table or formula showing the correlation between the longitudinal position coordinates of the captured image and the distance to calculate the distance to the person. In this case, the distance to the person refers to the distance from the vehicle equipped with the camera 26 to the person.
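One way to realize the "table showing the correlation between the longitudinal position coordinates and the distance" is linear interpolation over a calibration table. The sketch below assumes such a table; the rows and distances in `CALIBRATION` are made-up placeholder values, not data from the source, and the function name is hypothetical. Rows lower in the image (larger y) are assumed to be closer to the vehicle.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical calibration: (image row of the ground position, distance in meters),
# sorted by row. A real table would come from camera calibration.
CALIBRATION: List[Tuple[float, float]] = [
    (400.0, 50.0),
    (420.0, 40.0),
    (470.0, 20.0),
    (500.0, 15.0),
    (600.0, 8.0),
]


def distance_from_row(y: float, table: List[Tuple[float, float]] = CALIBRATION) -> float:
    """Estimate the distance to a person from the row of their ground position."""
    rows = [row for row, _ in table]
    if y <= rows[0]:
        return table[0][1]   # clamp above the first calibrated row
    if y >= rows[-1]:
        return table[-1][1]  # clamp below the last calibrated row
    i = bisect_left(rows, y)
    (y0, d0), (y1, d1) = table[i - 1], table[i]
    t = (y - y0) / (y1 - y0)
    return d0 + t * (d1 - d0)
```

With this placeholder table, a ground position at row 445 lies halfway between the 40m and 20m entries and is mapped to 30m.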

[0039] The frame line determination unit 16 determines whether to overlay a frame line based on the distance from the vehicle to the person. The frame line determination unit 16 determines as follows: if the distance from the vehicle to the person is less than a threshold (e.g., 40m), a frame line is overlaid; if the distance is greater than or equal to the threshold (e.g., 40m), no frame line is overlaid. The frame line determination unit 16 can also determine the color of the frame line to be overlaid based on the distance from the vehicle to the person. For example, if the distance to the person is less than a first threshold (e.g., 20m), a red frame line is used; if the distance is greater than or equal to the first threshold (e.g., 20m) but less than a second threshold (e.g., 40m), a yellow frame line is used. Furthermore, if the distance to the person is greater than or equal to the second threshold, no frame line may be used. The frame line determination unit 16 stores the determined need for a frame line and the color of the frame line in the storage unit 22 according to each tag number.
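The overlay decision and color selection in paragraph [0039] reduce to two threshold comparisons. A minimal sketch using the example thresholds from the text (20m and 40m); the function name and the use of `None` for "no frame line" are choices made here for illustration.

```python
from typing import Optional


def frame_color(distance_m: float,
                first_threshold_m: float = 20.0,
                second_threshold_m: float = 40.0) -> Optional[str]:
    """Return the frame color for a person at the given distance, or None for no frame."""
    if distance_m < first_threshold_m:
        return "red"
    if distance_m < second_threshold_m:
        return "yellow"
    return None  # at or beyond the second threshold: no frame line is overlaid


def needs_frame(distance_m: float, threshold_m: float = 40.0) -> bool:
    """A frame line is overlaid only when the person is closer than the threshold."""
    return distance_m < threshold_m
```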

[0040] The frame line determination unit 16 calculates the height of the person detected by the person detection unit 14 and stores the calculated height in the storage unit 22 according to each tag number. The frame line determination unit 16 calculates the height of the person based on the ground position of the detected person and the longitudinal detection dimension. The frame line determination unit 16 may also use a table or formula that represents the ratio of the detection dimension determined according to the longitudinal position coordinates of the captured image 30 to the height to calculate the height.
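The source describes the height calculation only as using a table or formula. As one possible formula, the sketch below assumes a simple pinhole camera model in which the image height of a person is proportional to their real height and inversely proportional to their distance. The pinhole model, the focal length value, and the function name are all assumptions made here and are not stated in the source.

```python
# Hypothetical focal length expressed in pixels; a real value comes from calibration.
FOCAL_LENGTH_PX = 1000.0


def estimate_height_cm(detection_height_px: float,
                       distance_m: float,
                       focal_length_px: float = FOCAL_LENGTH_PX) -> float:
    """Estimate a person's height from their vertical detection size and distance.

    Pinhole assumption: detection_height_px = focal_length_px * height_m / distance_m.
    """
    if focal_length_px <= 0:
        raise ValueError("focal_length_px must be positive")
    return detection_height_px * distance_m * 100.0 / focal_length_px
```

Under these assumed numbers, a detection 85 pixels tall at 20m corresponds to 170cm, and a detection 50 pixels tall at the same distance corresponds to 100cm.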

[0041] The frame determination unit 16 determines the frame size for a person whose distance is less than the threshold according to that person's height. When the person's height is at or above a predetermined value, such as 120cm or more, the frame determination unit 16 sets the vertical dimension of the frame to be the same as the person's detection dimension. When the person's height is below the predetermined value, such as less than 120cm, i.e., the person is short, the frame determination unit 16 sets the vertical dimension of the frame to be larger than the person's detection dimension. The vertical dimension of the frame for a person whose height is below the predetermined value is, for example, a size equivalent to a height of 150cm to 170cm at the detection position, which is larger than the person's detection dimension.

[0042] The size of the frame for a person whose height is at or above the predetermined value follows the person's detection dimensions and therefore reflects both height and distance. Specifically, the frame size is proportional to the person's height and inversely proportional to their distance. If the distance is constant, the frame size varies with height. For example, a frame superimposed on a 180cm tall person at a distance of 20m is vertically larger than a frame superimposed on a 160cm tall person at the same distance. Conversely, if the height is constant, the frame size varies with distance. For example, a frame superimposed on a 180cm tall person at a distance of 10m is vertically larger than a frame superimposed on a 180cm tall person at a distance of 20m.
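The proportionality stated in paragraph [0042] can be written compactly. The symbols are introduced here for illustration and do not appear in the source: h_frame is the vertical frame size in the image, H the person's height, D the distance, and k an unspecified camera-dependent constant. The two ratios apply the relation to the examples in the text.

```latex
h_{\mathrm{frame}} = k\,\frac{H}{D}
\qquad\Rightarrow\qquad
\frac{h_{\mathrm{frame}}(180\,\mathrm{cm},\ 20\,\mathrm{m})}{h_{\mathrm{frame}}(160\,\mathrm{cm},\ 20\,\mathrm{m})}
  = \frac{180}{160} = 1.125,
\qquad
\frac{h_{\mathrm{frame}}(180\,\mathrm{cm},\ 10\,\mathrm{m})}{h_{\mathrm{frame}}(180\,\mathrm{cm},\ 20\,\mathrm{m})}
  = \frac{20}{10} = 2
```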

[0043] The frame size for people shorter than a predetermined value does not necessarily follow the person's detection dimensions. For example, a fixed frame size can be set for people shorter than a predetermined value. In this case, for example, if a person with a height of 100cm and a person with a height of 90cm are detected, a frame size equivalent to a height of 180cm is set for both. Alternatively, for people shorter than a predetermined value, the frame size can be set as follows: the vertical dimension of the frame is a value obtained by multiplying the detected person's height by a coefficient such as 1.5. In this case, for example, a frame size equivalent to a height of 150cm is set for a person with a height of 100cm, and a frame size equivalent to a height of 135cm is set for a person with a height of 90cm.
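The two options in paragraph [0043] for people below the predetermined height can be sketched as a single function. The function name and the `policy` parameter are illustrative assumptions; the 120cm threshold, the 180cm fixed size, and the 1.5 coefficient are the example values from the text.

```python
SHORT_HEIGHT_CM = 120.0  # example predetermined value from the text


def frame_equivalent_height_cm(person_height_cm: float,
                               policy: str = "fixed",
                               fixed_height_cm: float = 180.0,
                               coefficient: float = 1.5) -> float:
    """Return the real-world height that the frame's vertical size should represent."""
    if person_height_cm >= SHORT_HEIGHT_CM:
        # Not short: the frame follows the person's own size.
        return person_height_cm
    if policy == "fixed":
        return fixed_height_cm
    if policy == "scaled":
        return person_height_cm * coefficient
    raise ValueError(f"unknown policy: {policy!r}")


fixed_100 = frame_equivalent_height_cm(100.0, policy="fixed")
scaled_100 = frame_equivalent_height_cm(100.0, policy="scaled")
scaled_90 = frame_equivalent_height_cm(90.0, policy="scaled")
```

The fixed policy gives every short person the same frame, while the scaled policy preserves some difference between a 100cm and a 90cm person.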

[0044] Figure 4 shows an example of a display image 40 with frame lines 42 and 44 superimposed. In Figure 4, the height of the first person 32a of Figure 2 is determined to be at or above the predetermined value based on the ground position 36a and the vertical size ha, and therefore a frame line 42 of the detection size of the first person 32a is superimposed in the display image 40. That is, a frame line 42 with an aspect ratio of 2:1 whose vertical dimension is the vertical detection size ha of the first person 32a is superimposed. The height of the second person 32b is determined to be below the predetermined value based on the ground position 36b and the vertical size hb, and therefore a frame line 44 that is larger in the vertical direction than the detection size of the second person 32b is superimposed as a frame line of a size different from the detection size of the second person 32b. In Figure 4, the frame lines 42 and 44 contained in the display image 40 are drawn with thick lines. The outlines of the first region 34a and the second region 34b, drawn with dashed lines in Figure 4, are not rendered in the display image 40. In Figure 4, frame lines 42 and 44 are superimposed only on the first person 32a and the second person 32b located at the first distance L1, and no frame lines are superimposed on the third person 32c and the fourth person 32d located at the second distance L2.

[0045] The frame line 42 superimposed on the first person 32a, whose height is at or above the predetermined value, has a size corresponding to the detection size of the first person 32a; for example, it has the same size as the first region 34a in which the first person 32a was detected. The size of the frame line 42 may differ slightly from the size of the first region 34a, for example by about 5% to 10%. Therefore, the vertical size Ha of the frame line 42 may be the same as the vertical size ha of the first region 34a, or slightly smaller or slightly larger. The aspect ratio of the frame line 42 is the same as the aspect ratio of the detection size of the first person 32a, which is 2:1. The frame line 42 is superimposed with the ground position 36a of the first person 32a as a reference, and the lower center of the frame line 42 is aligned with the ground position 36a.

[0046] The frame line 44 superimposed on the second person 32b, whose height is less than the predetermined value, is larger than the detection size of the second person 32b. The vertical size Hb of the frame line 44 is significantly larger than the vertical size hb of the second region 34b in which the second person 32b was detected, for example, more than 10% larger. In the example of Figure 4, the vertical size Hb of the frame line 44 is approximately 1.5 times the vertical size hb of the second region 34b. The aspect ratio of the frame line 44 is the same as that of the detection size of the second person 32b, which is 2:1. That is, the aspect ratio of the frame line 44 is the same as that of the frame line 42. The frame line 44 is superimposed with the ground position 36b of the second person 32b as a reference, and the lower center of the frame line 44 is aligned with the ground position 36b. As a result, there is a gap 46 between the head of the second person 32b and the upper end of the frame line 44. The vertical dimension of the gap 46 is, for example, more than 10% of the vertical size hb of the second region 34b, for example, about 20% to 50%.
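The geometry of paragraphs [0045] and [0046] (2:1 aspect ratio, lower center aligned with the ground position, gap above the head) can be checked with a short sketch. The function names and the pixel values in the example are assumptions for illustration; image coordinates are assumed to have the y axis pointing downward.

```python
from typing import Tuple


def frame_rect(ground_x: float, ground_y: float,
               frame_height_px: float,
               aspect: float = 2.0) -> Tuple[float, float, float, float]:
    """Return (left, top, width, height) of a frame whose lower center is the ground position.

    `aspect` is the vertical-to-horizontal ratio (2.0 means 2:1).
    """
    width = frame_height_px / aspect
    left = ground_x - width / 2.0
    top = ground_y - frame_height_px
    return (left, top, width, frame_height_px)


def head_gap_ratio(detection_height_px: float, frame_height_px: float) -> float:
    """Gap between the head and the top of the frame, relative to the detection height."""
    return (frame_height_px - detection_height_px) / detection_height_px


# Hypothetical short person: detection height hb = 60 px, frame height Hb = 1.5 * hb.
hb = 60.0
Hb = 1.5 * hb
rect = frame_rect(ground_x=300.0, ground_y=480.0, frame_height_px=Hb)
gap = head_gap_ratio(hb, Hb)
```

With a factor of 1.5 the gap equals 50% of hb, the upper end of the 20% to 50% range given in the text.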

[0047] In the display image 40 of Figure 4, a frame line 44 larger than the actual height of the second person 32b is superimposed on the second person 32b. Therefore, the second person 32b appears larger than in the case where a frame line of the detection size of the second person 32b, i.e., the size corresponding to the second region 34b shown by the dashed line, is superimposed. Furthermore, the size of the frame line 44 superimposed on the second person 32b is close to the size of the frame line 42 superimposed on the first person 32a, making it easier to understand that the first person 32a and the second person 32b are at approximately the same distance. Additionally, since the ground positions 36a and 36b of the first person 32a and the second person 32b coincide with the lower ends of the frame lines 42 and 44, the lower ends of the frame lines 42 and 44 also make it easier to understand that the first person 32a and the second person 32b are at the same distance. As a result, the possibility of mistakenly believing that the second person 32b is farther away than they actually are because of their smaller appearance is reduced. For example, it reduces the possibility of mistakenly believing that the second person 32b is located near the second distance L2, where the third person 32c, who has the same apparent size as the second person 32b, is located.

[0048] Figure 5 is a flowchart illustrating the process of the image recognition method according to the first embodiment. When the image recognition device 10 is installed in a vehicle, the processing of Figure 5 may start and end, for example, when use of the vehicle starts or ends, or when the engine or power supply is started or stopped. Alternatively, it may be started and stopped by user operation.

[0049] When processing begins, the image acquisition unit 12 acquires a captured image 30 from the camera 26 (S10), and the person detection unit 14 detects people contained in the acquired captured image (S12). If a person is detected (S12 "Yes"), the person detection unit 14 determines the detection position and detection size of the person (S14). If the distance to the detected person is less than a threshold (S16 "Yes") and the height of the detected person is less than a predetermined value (S18 "Yes"), the frame determination unit 16 determines the size of the frame line superimposed on the person's detection position to be different from the person's detection size, specifically larger than the person's detection size, and the image generation unit 18 generates a display image on which the frame line of the determined size is superimposed (S20). If the height of the detected person is not less than the predetermined value (S18 "No"), a display image is generated in which a frame line of a size corresponding to the person's detection size is superimposed on the person's detection position (S22). The generated display image is displayed on the display device 28 via the display control unit 20 (S24). If the distance to the person is not less than the threshold (S16 "No"), the processing of S18 to S22 is skipped and no frame line is superimposed on the person. If no person is detected (S12 "No"), the processing of S14 to S22 is skipped and the captured image 30 without any frame line is displayed as the display image (S24).
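The branch structure of steps S16 to S22 can be sketched as below. This is only an outline under stated assumptions: the `Detection` type and function names are hypothetical, distance and height are taken as already estimated, the thresholds are the example values from the text (40m, 120cm), and the 1.5 enlargement factor is the example from Figure 4. The four sample detections loosely mirror people 32a to 32d of Figure 2 with made-up pixel sizes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Detection:
    """Result of S14 plus the estimated distance and height (hypothetical structure)."""
    tag: str
    detection_height_px: float
    distance_m: float
    height_cm: float


def frame_height_px(det: Detection,
                    distance_threshold_m: float = 40.0,
                    short_height_cm: float = 120.0,
                    enlargement: float = 1.5) -> Optional[float]:
    """Return the vertical frame size for one person, or None when no frame is overlaid."""
    if det.distance_m >= distance_threshold_m:   # S16 "No": skip S18 to S22
        return None
    if det.height_cm < short_height_cm:          # S18 "Yes" -> S20: enlarged frame
        return det.detection_height_px * enlargement
    return det.detection_height_px               # S18 "No" -> S22: detection-size frame


def decide_frames(detections: List[Detection]) -> Dict[str, Optional[float]]:
    """Apply the decision to every detected person in one captured image."""
    return {d.tag: frame_height_px(d) for d in detections}


people = [
    Detection("32a", detection_height_px=90.0, distance_m=18.0, height_cm=170.0),
    Detection("32b", detection_height_px=54.0, distance_m=18.0, height_cm=100.0),
    Detection("32c", detection_height_px=36.0, distance_m=45.0, height_cm=170.0),
    Detection("32d", detection_height_px=22.0, distance_m=45.0, height_cm=100.0),
]
frames = decide_frames(people)
```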

[0050] In the above process, when multiple people are detected in the captured image 30, the processing in S12 to S22 determines whether a frame line needs to be superimposed on each of the detected individuals and the size of the superimposed frame line.

[0051] One embodiment of this method may also be a program. This program may be configured to enable a computer to perform the following functions: acquire a captured image; detect a person contained in the acquired captured image, and determine the detection position and detection size of the person in the captured image; calculate the height of the person based on the detection position and detection size; determine the size of a frame line different from the detection size based on the calculated height of the person; generate a display image in which a frame line of the determined size is superimposed onto the detection position of the captured image; and cause a display device to display the generated display image.

[0052] The present invention has been described above with reference to the embodiments described above, but the present invention is not limited to the embodiments described above. Structures with appropriate combinations or substitutions of the structures shown in the embodiments are also included in the present invention.

[0053] In a variation of the first embodiment, the frame line for a person whose height is less than the predetermined value may be displayed in a manner different from that of Figure 4. Figures 6(a) to 6(d) show examples of other frame lines 44a to 44d corresponding to the frame line 44 shown in Figure 4. The person 32 shown in Figure 6 is shorter than the predetermined height.

[0054] Compared to the size of the region 34 in which the person 32 was detected, the frame line 44a shown in Figure 6(a) is enlarged only in the vertical direction, while the horizontal size remains unchanged. Therefore, the aspect ratio of the frame line 44a of Figure 6(a) is greater than the aspect ratio of the detection size of the person 32, which is approximately 2:1. The vertical size of the frame line 44a of Figure 6(a) is, for example, more than twice the horizontal size, approximately 2.2 to 3.5 times.

[0055] Compared to the size of the region 34 in which the person 32 was detected, the frame line 44b shown in Figure 6(b) is enlarged only in the horizontal direction, while the vertical size remains unchanged. Therefore, the aspect ratio of the frame line 44b of Figure 6(b) is smaller than the aspect ratio of the region 34 in which the person 32 was detected, which is approximately 2:1. The horizontal size of the frame line 44b of Figure 6(b) may be smaller than the vertical size of the frame line 44b, for example, about 0.6 to 1 times the vertical size.

[0056] The frame line 44c shown in Figure 6(c) has the same shape and size as the frame line 44 of Figure 4, but the position at which the frame line 44c is superimposed is different. The lower end of the frame line 44c of Figure 6(c) is located below the ground position 36 of the person 32. The center position of the frame line 44c of Figure 6(c) coincides, for example, with the center position of the region 34 in which the person 32 was detected. By offsetting the frame line 44c downward when superimposing it, the image can give the impression that the person 32 is at a closer position, thereby further emphasizing the presence of the short person.

[0057] The frame line 44d shown in Figure 6(d) has the same shape and size as the frame line 44 of Figure 4, but is superimposed at a different position, offset in the left-right direction. The frame line 44d is offset toward the direction 38 of movement or gaze of the detected person 32. In the example of Figure 6(d), the direction 38 of movement or gaze of the person 32 is to the right, and the center of the frame line 44d is located to the right of the grounding position 36. Offsetting the frame line 44d toward the direction 38 of movement or gaze indicates that direction and thus suggests the likely actions of the short person. Furthermore, if the direction 38 of movement or gaze of the person 32 is not left or right but downward or oblique, the superimposed position of the frame line 44d may likewise be offset downward or obliquely.
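The offsets of Figures 6(c) and 6(d) amount to shifting an unchanged frame line along a direction vector. The sketch below is illustrative only: `offset_frame`, the tuple layout, and the `shift_px` parameter are hypothetical, and the text does not specify how far the frame line is shifted.

```python
from typing import Tuple

# (left, top, width, height) in image pixels; y grows downward.
Rect = Tuple[float, float, float, float]


def offset_frame(frame: Rect, direction: Tuple[float, float], shift_px: float) -> Rect:
    """Shift a frame line by shift_px pixels along `direction`.

    direction is (dx, dy) in image coordinates, e.g. (1, 0) for movement or
    gaze to the right, (0, 1) for downward. The size is left unchanged.
    """
    dx, dy = direction
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0:
        return frame  # no direction known: leave the frame line in place
    left, top, width, height = frame
    return (left + shift_px * dx / norm, top + shift_px * dy / norm, width, height)
```

Passing `(0, 1)` reproduces the downward offset of Figure 6(c), `(1, 0)` the rightward offset of Figure 6(d), and a mixed vector an oblique offset.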

[0058] The direction of movement of person 32 can also be determined based on the shift of the person's detection position in each frame of the captured image, the orientation of the person's hands and feet, etc. Furthermore, the gaze direction 38 of person 32 can be determined based on the orientation of person 32's face, or the orientation of person 32's face can be considered as the gaze direction 38. The orientation of person 32's face is determined based on the person detection results of person detection unit 14.

[0059] In the above embodiments, the case where the lower end of the detected area is set as the grounding position is shown. In another embodiment, the grounding position can also be detected based on the image content of the detected area. For example, if the feet of a person included in the detected area are hidden and cannot be seen, the height of the person can be estimated based on the position and size of the person's head included in the detected area, and the grounding position can be detected based on the estimated height.
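One way to picture the head-based estimate is shown below. This is a rough sketch under a loudly stated assumption: the body height is taken as a fixed multiple of the head height (`heads_per_body`, a hypothetical parameter with an arbitrary default), which the text does not specify and which in practice varies with age and posture.

```python
def estimate_ground_row(head_top_row: float, head_height_px: float,
                        heads_per_body: float = 7.0) -> float:
    """Estimate the image row of the grounding position when the feet are hidden.

    head_top_row   : row of the top of the detected head (y grows downward)
    head_height_px : head height in pixels
    heads_per_body : ASSUMED ratio of body height to head height
    """
    if head_height_px <= 0:
        raise ValueError("head_height_px must be positive")
    estimated_body_height_px = head_height_px * heads_per_body
    return head_top_row + estimated_body_height_px
```

The returned row can then stand in for the lower end of the detected area when the frame line is positioned.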

[0060] In the above embodiments, whether to superimpose the frame line, and its color, are determined based on the distance to the person. In another embodiment, they may instead be determined based on the detection position of the person. For example, the frame line determination unit 16 may hold, as thresholds, the vertical position coordinates in the captured image that correspond to given distances (e.g., 20 m or 40 m), and determine whether to superimpose the frame line and its color by comparing the detection position against those coordinate thresholds.
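The position-coordinate thresholds can be illustrated with a simple camera model. The sketch assumes a pinhole camera with a horizontal optical axis over a flat road, which the text does not state; `focal_px`, `cam_height_m`, `horizon_row`, and the category labels are all hypothetical.

```python
def row_for_distance(distance_m: float, focal_px: float,
                     cam_height_m: float, horizon_row: float) -> float:
    """Image row at which a point on the ground at distance_m appears.

    ASSUMES a flat road and a pinhole camera with a horizontal optical axis.
    """
    return horizon_row + focal_px * cam_height_m / distance_m


def classify_by_row(ground_row: float, near_row: float, far_row: float) -> str:
    """Classify a detection by its grounding row instead of by distance.

    Rows grow downward, so a larger row means a closer person.
    """
    if ground_row > near_row:
        return "near"    # closer than the nearer distance threshold
    if ground_row > far_row:
        return "mid"     # between the two distance thresholds
    return "beyond"      # at or beyond the farther threshold


# Precompute the row thresholds once for 20 m and 40 m (illustrative camera values).
NEAR_ROW = row_for_distance(20.0, focal_px=1000.0, cam_height_m=1.0, horizon_row=360.0)
FAR_ROW = row_for_distance(40.0, focal_px=1000.0, cam_height_m=1.0, horizon_row=360.0)
```

Because the thresholds are fixed rows, each detection needs only a coordinate comparison and no per-person distance calculation.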

[0061] In the above embodiments, it is shown that the size of the frame line for people whose height is less than a predetermined value is set to a size different from the detection size of the person, for example, by enlarging it vertically, and a frame line of a size different from the detection size of the person is superimposed. In another embodiment, for people whose height is less than the predetermined value, a first frame line corresponding to the detected size and a second frame line that is enlarged vertically compared to the first frame line may also be superimposed. In this case, the frame line surrounding the detected person becomes a double frame line. Furthermore, an outline line may also be superimposed on the frame line to depict the shape of a person whose height is less than the predetermined value who exists within the vertically enlarged frame line.

[0062] In the above embodiments, no frame line is superimposed on a person at a distance equal to or greater than the threshold. In another embodiment, a frame line may be superimposed on a person at a distance equal to or greater than the threshold. For example, a frame line may be superimposed on such a person when no person at a distance less than the threshold is detected. Alternatively, a frame line may be superimposed on a person at a distance equal to or greater than the threshold regardless of whether a person at a distance less than the threshold is detected.

[0063] In the above embodiments, a case is shown in which the frame line for a person whose height is less than the predetermined value is given a size different from the detection size of the person, for example by enlarging it vertically, and is superimposed at that size. In another embodiment, whether to perform this processing may be determined based on the distance between a person whose height is less than the predetermined value and a person whose height is greater than or equal to the predetermined value. For example, if a person whose height is greater than or equal to the predetermined value is present near a person whose height is less than the predetermined value, for example within a range equivalent to 2 meters, the presence of the taller person reduces the possibility of misjudging the position of the shorter person. Therefore, the frame line for a person whose height is less than the predetermined value may be set to a size different from the detection size only when no person whose height is greater than or equal to the predetermined value is present nearby, for example within a predetermined range of 2 meters.
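The neighborhood condition can be sketched as follows. The `Person` type, its ground-plane coordinates, and the function name are hypothetical; the text gives only the 2-meter example for the range and leaves the predetermined height to the caller, so it is a required parameter here.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterable


@dataclass(frozen=True)
class Person:
    x_m: float        # lateral position on the ground plane (assumed available)
    z_m: float        # distance ahead of the camera
    height_cm: float  # estimated height


def needs_enlarged_frame(target: Person, others: Iterable[Person],
                         height_limit_cm: float, radius_m: float = 2.0) -> bool:
    """Return True if the target's frame line should differ from its detection size.

    The frame line is enlarged only for a person shorter than height_limit_cm
    who has no person of at least that height within radius_m.
    """
    if target.height_cm >= height_limit_cm:
        return False
    for other in others:
        is_tall = other.height_cm >= height_limit_cm
        is_near = hypot(other.x_m - target.x_m, other.z_m - target.z_m) <= radius_m
        if is_tall and is_near:
            return False  # a taller person nearby already anchors the position
    return True
```

`others` is expected to exclude the target itself; another short person nearby does not suppress the enlargement.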

[0064] In the above embodiment, the frame line determination unit 16 calculates the distance to the detected person and calculates the height of the detected person. In another embodiment, instead of the frame line determination unit 16, the person detection unit 14 may calculate the distance to the detected person and calculate the height of the detected person. In this case, the person detection unit 14 may also detect people contained in the captured image acquired by the image acquisition unit 12, determine the detection position and detection size of the person in the captured image, and calculate the height of the person based on the detection position and detection size. The frame line determination unit 16 may also determine the size of the frame line, which is different from the detection size, based on the height of the person calculated by the person detection unit 14. If the height of the person detected by the person detection unit 14 is less than a predetermined value, the frame line determination unit 16 may also make the size of the frame line larger than the detection size, at least in the vertical direction.
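A height calculation from the detection position and detection size might look like the sketch below. The formula is an assumption (flat road, pinhole camera with a horizontal optical axis) and is not taken from the text; `focal_px`, `cam_height_m`, and `horizon_row` are hypothetical calibration values.

```python
def estimate_distance_m(ground_row: float, horizon_row: float,
                        focal_px: float, cam_height_m: float) -> float:
    """Distance to the grounding position under the ASSUMED flat-road model."""
    rows_below_horizon = ground_row - horizon_row
    if rows_below_horizon <= 0:
        raise ValueError("grounding position must lie below the horizon row")
    return focal_px * cam_height_m / rows_below_horizon


def estimate_height_m(box_height_px: float, ground_row: float, horizon_row: float,
                      focal_px: float, cam_height_m: float) -> float:
    """Person height from the detection size (box height) and detection position."""
    distance_m = estimate_distance_m(ground_row, horizon_row, focal_px, cam_height_m)
    # Similar triangles: real height / distance = pixel height / focal length.
    return box_height_px * distance_m / focal_px
```

With these assumed values, a 50-pixel-tall region grounded 50 rows below the horizon corresponds to a person about 1 m tall at 20 m, which would then be compared against the predetermined height.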

[0065] (Second Embodiment)

[0066] Next, a second embodiment of the present invention will be described with reference to the accompanying drawings. In the second embodiment, the height of the person is not calculated; instead, the size of the frame line, which differs from the detection size, is determined based on a predetermined size corresponding to the person's detection position. Hereinafter, the second embodiment will be described focusing on its differences from the first embodiment; for commonalities with the first embodiment, accompanying drawings or descriptions will be omitted as appropriate.

[0067] Figure 7 is a block diagram schematically illustrating the functional configuration of the image recognition device 10a according to the second embodiment. The image recognition device 10a includes an image acquisition unit 12, a person detection unit 14, a frame line determination unit 16a, an image generation unit 18, a display control unit 20, and a storage unit 22. In the second embodiment, the image acquisition unit 12, the person detection unit 14, the image generation unit 18, the display control unit 20, and the storage unit 22 are configured in the same way as in the first embodiment.

[0068] The frame line determination unit 16a determines the size of the frame line, which differs from the detection size, based on the detection position of the person determined by the person detection unit 14. Unlike in the first embodiment, the frame line determination unit 16a determines the frame line size not based on the person's height, but based on the detection position and a predetermined size corresponding to that position. Specifically, if the detection size of the person detected by the person detection unit 14 is smaller than the predetermined size corresponding to the detection position, the frame line size is determined so that it is larger than the detection size at least in the vertical direction. The predetermined size corresponding to the detection position is, for example, equal to the detection size of a person with a height of 170 cm located at the detection position. The predetermined size may vary depending on the detection position; for example, the lower the detection position in the captured image, the larger the predetermined size, and the higher the detection position in the captured image, the smaller the predetermined size. The predetermined size corresponding to the detection position is, for example, stored in advance in the storage unit 22. It can be determined using a table, a mathematical formula, or another method that represents the correlation between the detection position and the predetermined size. Apart from the frame line size determination processing that depends on the person's height, the frame line determination unit 16a can be configured in the same way as the frame line determination unit 16 in the first embodiment.

[0069] The processing of the frame line determination unit 16a will be explained with reference to Figures 3 and 4. The person detection unit 14 detects a first person 32a and determines the detection size and detection position (i.e., grounding position 36a) of the region 34a in which the first person 32a is detected. The frame line determination unit 16a compares the detection size of the first person 32a with the predetermined size corresponding to the grounding position 36a of the first person 32a. If the predetermined size is equivalent to the detection size of a person with a height of 170 cm, and the height of the first person 32a is 170 cm or more, then the detection size of the first person 32a is greater than or equal to the predetermined size corresponding to the grounding position 36a. In this case, the frame line determination unit 16a determines the detection size of the region 34a in which the first person 32a is detected as the size of the frame line. As a result, for example, as shown in Figure 4, a frame line 42 with the same size as the detection size of the first person 32a is superimposed.

[0070] The person detection unit 14 detects the second person 32b and determines the detection size and detection position (i.e., grounding position 36b) of the region 34b in which the second person 32b is detected. The frame line determination unit 16a compares the detection size of the second person 32b with the predetermined size corresponding to the grounding position 36b of the second person 32b. If the predetermined size is equivalent to the detection size of a person with a height of 170 cm, and the height of the second person 32b is approximately 100 cm, the detection size of the second person 32b is smaller than the predetermined size corresponding to the grounding position 36b. In this case, the frame line determination unit 16a determines the frame line size as a size that is larger, at least in the vertical direction, than the detection size of the region 34b in which the second person 32b is detected. The frame line determination unit 16a may also use the predetermined size corresponding to the grounding position 36b itself as the frame line size for the second person 32b. As a result, for example, as shown in Figure 4, a frame line 44 that is larger than the detection size of the second person 32b at least in the vertical direction is superimposed.
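The comparison against a position-dependent predetermined size can be sketched with a lookup table. The table values, the linear interpolation between entries, and both function names are assumptions for illustration; the text says only that a table, formula, or other method may be used.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def predetermined_height_px(ground_row: float,
                            table: Sequence[Tuple[float, float]]) -> float:
    """Predetermined vertical size for a grounding row.

    table holds (ground_row, height_px) pairs sorted by row, e.g. the pixel
    height of a 170 cm person at each row. Values between entries are
    linearly interpolated (an assumption); values outside are clamped.
    """
    rows = [row for row, _ in table]
    if ground_row <= rows[0]:
        return table[0][1]
    if ground_row >= rows[-1]:
        return table[-1][1]
    i = bisect_left(rows, ground_row)
    (r0, h0), (r1, h1) = table[i - 1], table[i]
    return h0 + (h1 - h0) * (ground_row - r0) / (r1 - r0)


def frame_height_px(det_height_px: float, ground_row: float,
                    table: Sequence[Tuple[float, float]]) -> float:
    """Frame line height: the predetermined size if the detection is smaller."""
    reference = predetermined_height_px(ground_row, table)
    return reference if det_height_px < reference else det_height_px


# Hypothetical table: lower rows in the image (larger row index) map to larger sizes.
EXAMPLE_TABLE = [(400.0, 80.0), (500.0, 200.0), (600.0, 320.0)]
```

With this table, a detection 90 pixels tall grounded at row 450 receives a frame line 140 pixels tall, while a detection 150 pixels tall at the same row keeps its own size.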

[0071] Figure 8 is a flowchart illustrating the flow of the image recognition method according to the second embodiment. The image acquisition unit 12 acquires a captured image 30 from the camera 26 (S50), and the person detection unit 14 detects people contained in the acquired captured image (S52). If a person is detected ("Yes" in S52), the person detection unit 14 determines the detection position and detection size of the person (S54). If the distance to the detected person is less than a threshold ("Yes" in S56) and the detection size of the detected person is less than the predetermined size corresponding to the detection position ("Yes" in S58), the frame line determination unit 16a sets the size of the frame line to be superimposed at the detection position of the person to a size different from the detection size, specifically a size larger than the detection size, and the image generation unit 18 generates a display image on which the frame line of the determined size is superimposed (S60). If the detection size of the detected person is not less than the predetermined size corresponding to the detection position ("No" in S58), a display image is generated in which a frame line of a size corresponding to the detection size is superimposed at the detection position of the person (S62). The generated display image is displayed on the display device 28 via the display control unit 20 (S64). If the distance to the person is not less than the threshold ("No" in S56), the processing of S58 to S62 is skipped, and no frame line is superimposed on the person. If no person is detected ("No" in S52), the processing of S54 to S62 is skipped, and the captured image 30 without a superimposed frame line is displayed as the display image (S64).
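The branching of steps S56 to S62 can be summarized in code. This is a sketch only: the `Detection` type, the function name, and the use of the predetermined size as the enlarged height are assumptions, and image acquisition, detection, and display (S50 to S54, S64) are left to the caller.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Detection(NamedTuple):
    ground_row: float  # detection position (grounding row in the image)
    height_px: float   # detection size (vertical)
    distance_m: float  # distance to the person


def decide_frame_heights(
    detections: Sequence[Detection],
    distance_threshold_m: float,
    predetermined_height_px: Callable[[float], float],
) -> List[Optional[float]]:
    """Return the frame line height per detection, or None for no frame line."""
    heights: List[Optional[float]] = []
    for det in detections:
        if det.distance_m >= distance_threshold_m:  # "No" in S56
            heights.append(None)                    # no frame line superimposed
            continue
        reference = predetermined_height_px(det.ground_row)
        if det.height_px < reference:               # "Yes" in S58
            heights.append(reference)               # S60: enlarged frame line
        else:                                       # "No" in S58
            heights.append(det.height_px)           # S62: frame line at detection size
    return heights
```

An empty `detections` sequence corresponds to "No" in S52: the result is an empty list, and the captured image is displayed without any frame line.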

[0072] In the second embodiment, the same effect as in the first embodiment can be achieved. Furthermore, variations of the first embodiment can also be applied to the second embodiment.

[0073] In the embodiments described above, the case of calculating the distance to a person based on captured images is shown. In another embodiment, a sensor different from camera 26 can also be used to measure the distance to a person. For example, any ranging sensor such as an ultrasonic sensor, radar sensor, or LIDAR (Light Detection and Ranging) can also be used.

[0074] In the embodiments described above, the image recognition device 10 is shown as being mounted on a vehicle. In another embodiment, the location of the image recognition device 10 is not particularly limited, and it can be used for any purpose.

[0075] Industrial applicability

[0076] According to the present invention, the possibility of erroneously determining the location of a detected person can be reduced.

[0077] 10…Image recognition device, 12…Image acquisition unit, 14…Person detection unit, 16, 16a…Frame line determination unit, 18…Image generation unit, 20…Display control unit, 28…Display device, 30…Captured image, 32…Person, 34…Region, 36…Grounding position, 40…Display image, 42…Frame line, 44…Frame line.

Claims

1. An image recognition device, characterized by comprising: an image acquisition unit that acquires a captured image; a person detection unit that detects a person contained in the captured image acquired by the image acquisition unit and determines a detection position and a detection size of the person in the captured image; a frame line determination unit that determines, based on the detection position of the person determined by the person detection unit, a size of a frame line that is different from the detection size; an image generation unit that generates a display image in which the frame line of the size determined by the frame line determination unit is superimposed at the detection position in the captured image; and a display control unit that causes a display device to display the display image generated by the image generation unit, wherein, if the detection size of the person determined by the person detection unit is smaller than a predetermined size corresponding to the detection position, the frame line determination unit sets the size of the frame line to be larger than the detection size at least in the vertical direction.

2. An image recognition device, characterized by comprising: an image acquisition unit that acquires a captured image; a person detection unit that detects a person contained in the captured image acquired by the image acquisition unit and determines a detection position and a detection size of the person in the captured image; a frame line determination unit that determines, based on the detection position of the person determined by the person detection unit, a size of a frame line that is different from the detection size; an image generation unit that generates a display image in which the frame line of the size determined by the frame line determination unit is superimposed at the detection position in the captured image; and a display control unit that causes a display device to display the display image generated by the image generation unit, wherein the frame line determination unit calculates the height of the person based on the detection position and the detection size determined by the person detection unit, and determines the size of the frame line that is different from the detection size based on the calculated height of the person.

3. The image recognition device according to claim 2, characterized in that, if the height of the person detected by the person detection unit is less than a predetermined value, the frame line determination unit sets the size of the frame line to be larger than the detection size at least in the vertical direction.

4. The image recognition device according to claim 1 or 2, characterized in that the person detection unit determines a grounding position of the person in the captured image, and the image generation unit generates the display image with the frame line superimposed such that the lower end of the frame line of the size determined by the frame line determination unit is located at the grounding position.

5. The image recognition device according to claim 1 or 2, characterized in that the person detection unit determines a grounding position of the person in the captured image, and the image generation unit generates the display image with the frame line superimposed such that the lower end of the frame line of the size determined by the frame line determination unit is located below the grounding position.

6. An image recognition method executed by an image recognition device, the method comprising: an acquisition step of acquiring a captured image; a determination step of detecting a person contained in the acquired captured image and determining a detection position and a detection size of the person in the captured image; a decision step of determining, based on the determined detection position of the person, a size of a frame line that is different from the detection size; a generation step of generating a display image in which the frame line of the determined size is superimposed at the detection position in the captured image; and a display step of causing a display device to display the generated display image, wherein, in the decision step, if the detection size of the person determined in the determination step is smaller than a predetermined size corresponding to the detection position, the size of the frame line is set to be larger than the detection size at least in the vertical direction.

7. An image recognition method executed by an image recognition device, the method comprising: an acquisition step of acquiring a captured image; a determination step of detecting a person contained in the acquired captured image and determining a detection position and a detection size of the person in the captured image; a decision step of determining, based on the determined detection position of the person, a size of a frame line that is different from the detection size; a generation step of generating a display image in which the frame line of the determined size is superimposed at the detection position in the captured image; and a display step of causing a display device to display the generated display image, wherein, in the decision step, the height of the person is calculated based on the detection position and the detection size determined in the determination step, and the size of the frame line that is different from the detection size is determined based on the calculated height of the person.

8. A computer-readable storage medium storing a program that causes a computer to implement the following functions: acquiring a captured image; detecting a person contained in the acquired captured image and determining a detection position and a detection size of the person in the captured image; determining, based on the determined detection position of the person, a size of a frame line that is different from the detection size; generating a display image in which the frame line of the determined size is superimposed at the detection position in the captured image; and causing a display device to display the generated display image, wherein, if the determined detection size of the person is smaller than a predetermined size corresponding to the detection position, the size of the frame line is set to be larger than the detection size at least in the vertical direction.

9. A computer-readable storage medium storing a program that causes a computer to implement the following functions: acquiring a captured image; detecting a person contained in the acquired captured image and determining a detection position and a detection size of the person in the captured image; determining, based on the determined detection position of the person, a size of a frame line that is different from the detection size; generating a display image in which the frame line of the determined size is superimposed at the detection position in the captured image; and causing a display device to display the generated display image, wherein the height of the person is calculated based on the determined detection position and detection size, and the size of the frame line that is different from the detection size is determined based on the calculated height of the person.