Information processing apparatus, information processing method, and computer program product

Using a fisheye camera and deep learning, the apparatus identifies human bodies by comparing a distance measured within each moving object region against a position-dependent threshold range. This reduces the processing load and improves the accuracy of human body detection in factory automation, enabling real-time detection.

CN115803780B · Active · Publication Date: 2026-06-26 · OMRON CORP

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: OMRON CORP
Filing Date: 2021-06-17
Publication Date: 2026-06-26


Abstract

An information processing apparatus includes a moving object detection section that detects a moving object from a captured image captured by a fisheye camera; a human body determination section that determines whether the moving object is a human body by comparing a distance between two prescribed points on an outline of a moving object region including the moving object with a threshold range set based on a height of a human body measured based on a position of the moving object within the captured image; and a human body detection section that detects a human body from the moving object region including the moving object determined by the human body determination section to be a human body.

Description

Technical Field

[0001] This invention relates to an information processing apparatus and an information processing method.

Background Technology

[0002] In recent years, the factory automation (FA) market has seen applications that use human body information detected by image sensors to optimize factory operations and improve safety by analyzing workers' working times or movement patterns. Deep learning can be used for human body detection, but detection takes time, which can make real-time analysis difficult. Patent Document 1 discloses a technique that reduces the processing load of detecting objects in a moving image by using, as the detection target region, the region of a moving object that changes between the frames constituting the moving image.

[0003] Existing technical documents

[0004] Patent documents

[0005] Patent Document 1: Japanese Patent Application Publication No. 2018-128885

Summary of the Invention

[0006] The problem that the invention aims to solve

[0007] Even when the detection area is limited to moving objects, moving objects other than humans, such as cardboard boxes on a conveyor belt in a factory, may still be included in the analysis target, so the processing load of human body detection is not sufficiently reduced. Furthermore, when object shape information is used for detection, a human body, whose shape changes with posture, may not be detected with good accuracy.

[0008] In one aspect, the present invention aims to provide a technique for real-time and highly accurate detection of the human body.

[0009] Methods for solving problems

[0010] To achieve the above objectives, the present invention employs the following structure.

[0011] A first aspect of this disclosure is an information processing apparatus comprising: a moving object detection unit for detecting moving objects from captured images taken by a fisheye camera; a human body determination unit for determining whether the moving object is a human body by comparing a distance between two predetermined points on the outline of a moving object region containing the moving object with a range of a threshold value set based on the height of a human body measured from the position of the moving object in the captured image; and a human body detection unit for detecting human bodies from the moving object region containing the moving object determined by the human body determination unit to be a human body.

[0012] "The distance between two points on the outline of the region containing the moving object" is equivalent to the height of a human body when the moving object is a human. Hereinafter, it is also referred to as the length of the moving object. "The threshold range" can be set to the range of values that the height of a human can take at a given position within the captured image when a human is photographed. The information processing device limits the target region for human body detection through moving object detection, and further restricts human body detection to the moving objects determined to be human. Therefore, the processing load of human body detection is reduced, and the information processing device can detect a human body with good accuracy in real time.

[0013] The distance between two specified points on the outline of the moving object region containing the moving object can also be the distance between a first coordinate and a second coordinate, which is different from the first coordinate. The first coordinate is the point within the moving object region that is closest to or farthest from the center coordinate of the captured image, and the second coordinate is the intersection of a straight line passing through the center coordinate and the first coordinate with the outline of the moving object region. The information processing device can calculate the length of the moving object using a simple method.

[0014] The distance between two points on the outline of a moving object region containing a moving object can also be the distance between two points where the line drawn from the center of gravity coordinates of the moving object region and the center coordinates of the captured image intersects the outline of the moving object region. Even though the shape of the moving object region, which is a human body, changes according to postures such as reaching out, the center of gravity of the moving object region lies on the human body because the hand is thinner than the torso. Therefore, by calculating the distance between two points where the line drawn from the center of gravity coordinates and the center coordinates of the captured image intersects the outline of the moving object region, the information processing device can obtain the height of a human body with relatively high accuracy.

[0015] The threshold range can also be set for each of the multiple regions into which the captured image is divided. Since the way a person appears varies depending on their position within the captured image, the information processing device sets the threshold range based on the perceived length of the human body for each of the divided regions. Thus, the information processing device can determine with relatively high accuracy whether a detected moving object is a human body.

[0016] The moving object detection unit can also detect moving objects based on background subtraction or inter-frame subtraction. Furthermore, the moving object detection unit can also detect moving objects based on the motion and direction of movement of objects commonly reflected in consecutive frames of the captured image. By detecting moving objects and limiting the human detection target area to the moving object area that includes the detected moving object, the overhead caused by unnecessary human detection can be reduced.

[0017] The information processing device may also include an output unit that outputs information about the human body detected by the human body detection unit. The information processing device can output the detection results of the human body based on the human body detection unit to a display or similar device in real time and provide prompts to the user.

[0018] The information processing device may also include a camera unit for capturing images. By integrating the information processing device with the camera unit, a simple structure can be achieved.

[0019] A second aspect of this disclosure is an information processing method comprising the following steps performed by a computer: a moving object detection step, detecting a moving object from an image captured by a fisheye camera; a human body determination step, determining whether the moving object is a human body by comparing a distance between two predetermined points on the outline of a moving object region containing the moving object with a range of a threshold set based on the height of a human body measured from the position of the moving object in the captured image; and a human body detection step, detecting a human body from the moving object region of the moving object that was determined to be a human body in the human body determination step.

[0020] Invention Effects

[0021] According to the present invention, a human body can be detected in real time with good accuracy.

Brief Description of the Drawings

[0022] Figure 1 is a diagram illustrating an application example of the information processing apparatus according to the embodiment.

[0023] Figure 2 is a diagram illustrating the hardware structure of the information processing device.

[0024] Figure 3 is a diagram illustrating the functional structure of the information processing device.

[0025] Figure 4 is a flowchart illustrating the human body detection processing.

[0026] Figure 5 is a diagram illustrating the detection of moving objects.

[0027] Figure 6 is a diagram illustrating the first example of calculating the length of a moving object.

[0028] Figure 7(A) and Figure 7(B) are diagrams illustrating the second example of calculating the length of a moving object.

[0029] Figure 8 is a graph showing the length of the human body as a function of the distance from the center.

[0030] Figure 9(A) and Figure 9(B) are diagrams showing examples of thresholds set for each area of the camera's field of view.

[0031] Figure 10 is a diagram illustrating an example of determining whether a moving object is a human body.

[0032] Figure 11A and Figure 11B are diagrams illustrating a method for detecting a human body from a moving object region.

Detailed Implementation

[0033] Hereinafter, embodiments of one aspect of the present invention will be described with reference to the accompanying drawings.

[0034] <Application Example>

[0035] Figure 1 is a diagram illustrating an application example of the information processing apparatus according to the embodiment. The information processing apparatus 1 acquires a camera image (captured image) captured by the camera 10 (camera unit). The camera 10 is, for example, an ultra-wide-angle camera equipped with a fisheye lens capable of acquiring image information over a wide range. Cameras equipped with fisheye lenses are also referred to as fisheye cameras, omnidirectional cameras, spherical cameras, etc., and the term "fisheye camera" is used in this specification.

[0036] In images captured by a fisheye camera, the appearance of the subject is distorted depending on its position within the frame. For example, when a fisheye camera is installed so as to look down from the ceiling, the feet of a person in the image point toward the center of the image, while the top of the head points outward. At the periphery of the image the human body appears as a front, back, or side view, while at the center of the image it appears as a top view.

[0037] The information processing device 1 detects moving objects from the captured images obtained by the camera 10 and determines whether they are human bodies. Because the human body captured by the fisheye camera is distorted, the distance between the feet and the top of the head (the height of the human body) varies depending on the position within the captured image.

[0038] The information processing device 1 stores in advance the distance between the feet and the top of the head that is assumed for each position within the captured image, as a threshold range for determining whether a detected moving object is a human body. The information processing device 1 can determine whether the moving object is a human body by comparing the distance between two predetermined points on the outline of the moving object region containing the detected moving object (the length of the moving object) with the threshold range preset according to the position within the captured image.

[0039] The information processing device 1 analyzes the moving object regions that are determined to be human bodies and detects the human body in them. The information processing device 1 can use general object recognition algorithms to detect the human body. For example, the human body detection algorithm can use a classifier that combines image features such as HoG or Haar-like features with boosting. Alternatively, the human body detection algorithm can use human body recognition based on deep learning (such as R-CNN, Faster R-CNN, YOLO, SSD, etc.).

[0040] As described above, the information processing device 1 detects moving objects from the captured image and compares their lengths with threshold ranges preset according to their positions within the captured image, thereby determining whether each moving object is likely to be a human body. The information processing device 1 then performs human body detection only within the regions of the captured image that contain moving objects determined to be human bodies. Therefore, the processing load of human body detection is reduced.

[0041] <Implementation Method>

[0042] (Hardware structure)

[0043] An example of the hardware structure of the information processing device 1 will be described with reference to Figure 2. Figure 2 is a diagram illustrating the hardware structure of the information processing device 1. The information processing device 1 includes a processor 101, a main storage device 102, an auxiliary storage device 103, a communication interface (I/F) 104, and an output device 105. The processor 101 reads a program stored in the auxiliary storage device 103 into the main storage device 102 and executes it, thereby realizing the functions of each functional unit described with reference to Figure 3. The communication interface 104 is an interface for wired or wireless communication. The output device 105 is a device for output, for example a display.

[0044] The information processing device 1 can be a general-purpose computer such as a personal computer, server computer, tablet terminal, or smartphone, or an embedded computer such as an onboard computer. For example, the information processing device 1 can be implemented through distributed computing based on multiple computer devices, or by implementing parts of each functional unit through a cloud server. Furthermore, parts of each functional unit of the information processing device 1 can also be implemented using dedicated hardware devices such as FPGAs or ASICs.

[0045] The information processing device 1 is connected to the camera 10 via wired (USB cable, LAN cable, etc.) or wireless (WiFi, etc.) connection to receive image data captured by the camera 10. The camera 10 is a camera device having an optical system including a lens and an image sensor (CCD, CMOS, etc.).

[0046] Alternatively, the information processing device 1 can be integrated with the camera 10 (camera unit). Furthermore, some of the processing performed by the information processing device 1, such as moving object detection and human body identification of the captured images, can also be performed by the camera 10. Moreover, the results of human body detection performed by the information processing device 1 can be sent to an external device and displayed to the user.

[0047] (Functional Structure)

[0048] An example of the functional structure of the information processing device 1 will be described with reference to Figure 3. Figure 3 is a diagram illustrating the functional structure of the information processing device 1. The information processing device 1 includes a moving object detection unit 11, a human body determination unit 12, a human body detection unit 13, an output unit 14, and a determination information database 15 (determination information DB 15).

[0049] The moving object detection unit 11 detects moving objects from the captured images obtained by the camera 10. For example, the moving object detection unit 11 can use background subtraction to detect regions that change between the captured image and a pre-prepared background image, and inter-frame subtraction to detect regions that change between frames. Moving objects can also be detected based on the difference between background subtraction and inter-frame subtraction. Furthermore, the method for detecting moving objects can also utilize optical flow to estimate the motion and direction of movement of the object from portions that are commonly reflected within consecutive frames.
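As an illustration of the subtraction-based detection described above, the following minimal sketch marks the pixels that differ from a reference image and encloses them in a single bounding box. The function names, the `min_diff` value of 30, and the toy 6×5 frame are assumptions made for the example and do not come from the specification; a practical implementation would also separate the changed pixels into one region per moving object.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale frame as rows of 0-255 intensities


def difference_mask(frame: Image, reference: Image, min_diff: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the reference by at least min_diff."""
    return [
        [abs(p - q) >= min_diff for p, q in zip(row, ref_row)]
        for row, ref_row in zip(frame, reference)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the changed pixels, or None if nothing changed."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Toy data (assumed): a uniform background and a frame with one bright object.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for y in range(1, 4):
    frame[y][2] = 200
    frame[y][3] = 200

mask = difference_mask(frame, background)
box = bounding_box(mask)
```

Passing a pre-prepared background image as `reference` corresponds to background subtraction, while passing the previous frame corresponds to inter-frame subtraction.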

[0050] The human body determination unit 12 determines whether the moving object detected by the moving object detection unit 11 is a human body. For example, the human body determination unit 12 can determine whether the moving object is a human body by comparing the length of the detected moving object with a range of thresholds set based on the height of a human body measured at the position of the moving object.

[0051] The human body detection unit 13 detects (identifies) the human body in the region of a moving object determined by the human body determination unit 12 to be a human body. Human body detection can be achieved using general object recognition technologies such as deep learning.

[0052] The output unit 14 outputs (displays) the information of the detected human body to an output device 105 such as a display. The output unit 14 can display the human body detected by the human body detection unit 13 by surrounding it with a frame, or extract it from the captured image for display.

[0053] The determination information database 15 stores information used by the human body determination unit 12 to determine whether a moving object detected from a captured image is a human body. This information includes, for example, the length (height) of a human body that is assumed for each distance from the center of the image captured by the camera 10. The human body determination unit 12 can determine whether a moving object is a human body by comparing its length with the human body length stored in the determination information database 15 as a threshold range.

[0054] (Human body detection and processing)

[0055] The overall flow of the human body detection processing will be described with reference to Figure 4. Figure 4 is a flowchart illustrating the human body detection processing. The human body detection processing begins, for example, when the camera 10 is powered on and the information processing device 1 receives a captured image from the camera 10. The human body detection processing shown in Figure 4 is performed for each frame of the captured image. In the description of the flowchart of Figure 4, "captured image" refers to one frame contained in the captured image.

[0056] In S101, the moving object detection unit 11 acquires the captured image. The moving object detection unit 11 acquires the captured image from the camera 10 via the communication interface 104. In addition, when the information processing device 1 and the camera (camera unit) are integrated, the moving object detection unit 11 acquires the captured image taken by the camera unit.

[0057] In S102, the moving object detection unit 11 detects moving objects within the captured image obtained in S101. The detection of moving objects within the captured image is described with reference to Figure 5. The information processing device 1 stores, in the auxiliary storage device 103 or the like, a background image 501 captured with no moving objects such as humans present. The moving object detection unit 11 extracts the regions in which the captured image 502 differs from the background image 501 as moving object regions. In the output image 503 shown in Figure 5, each extracted moving object region is indicated by a bounding box. The output image 503 also shows an example in which a structure other than a human is detected as a moving object because of a positional shift or misdetection.

[0058] Furthermore, the method for detecting moving objects is not limited to the example described with reference to Figure 5. It may also be a method that uses optical flow to estimate the motion and direction of movement of an object from the parts that appear in common across consecutive frames.

[0059] When multiple moving objects are detected in S102, the processing from S103 to S105 is repeated for each moving object.

[0060] In S103, the human body determination unit 12 calculates the length of the moving object to be determined. Two examples of calculating the length of a moving object detected in an image captured by a fisheye camera (camera 10) are described with reference to Figure 6, Figure 7(A), and Figure 7(B).

[0061] In the example of Figure 6, assuming the moving object is a person, the human body determination unit 12 calculates the distance between the coordinates of the position assumed to be the feet and the coordinates of the position assumed to be the top of the head, and uses it as the length of the moving object. In the example of Figure 7(A), the human body determination unit 12 calculates the distance between the two points where the straight line connecting the center of gravity coordinates of the moving object and the center coordinates of the captured image intersects the outline of the moving object region, and uses this distance as the length of the moving object.

[0062] Figure 6 is a diagram illustrating the first example of calculating the length of a moving object. Image 600A shows the moving object regions 601 to 605 containing the moving objects detected in S102. The center of the captured image is indicated by an × symbol. As shown in image 600A, when a human body is captured by a fisheye camera, the feet point toward the center and the top of the head points outward.

[0063] In image 600B, the coordinates of the position assumed to be the feet when the moving object is a person (hereinafter referred to as the foot coordinates) are indicated by circular markers. The human body determination unit 12 can, for example, obtain as the foot coordinates the coordinates within the moving object region that are closest to the center of the captured image (hereinafter referred to as the center coordinates).

[0064] In image 600C, the coordinates of the position assumed to be the top of the head when the moving object is a person (hereinafter referred to as the top-of-head coordinates) are indicated by triangular markers. The human body determination unit 12 can, for example, obtain as the top-of-head coordinates the other point at which the straight line passing through the foot coordinates and the center coordinates intersects the outline of the moving object region. Alternatively, the human body determination unit 12 can obtain as the top-of-head coordinates the coordinates within the moving object region that are farthest from the center coordinates.

[0065] The human body determination unit 12 calculates the distance between the obtained foot coordinates and top-of-head coordinates as the length of the moving object (the height of the human body). The example of Figure 6 shows the method of obtaining the foot coordinates first, but the human body determination unit 12 can also obtain the top-of-head coordinates first. That is, the human body determination unit 12 can obtain the coordinates within the moving object region that are farthest from the center coordinates as the top-of-head coordinates, and then obtain, as the foot coordinates, the other point at which the straight line passing through the top-of-head coordinates and the center coordinates intersects the outline of the moving object region.
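The first example can be sketched as follows, under the assumption that the moving object region is available as a list of pixel coordinates. The sketch uses the variant in which the foot coordinates are the region point closest to the center coordinates and the top-of-head coordinates are the region point farthest from them; the line-intersection variant is not shown. The function name and the sample region are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def length_from_extremes(region: Iterable[Point], center: Point) -> Tuple[Point, Point, float]:
    """Return (foot, head, length) for a moving object region.

    foot is the region point closest to the image center and head is the
    farthest one; length is the distance between the two.
    """
    pts = list(region)
    foot = min(pts, key=lambda p: math.dist(p, center))
    head = max(pts, key=lambda p: math.dist(p, center))
    return foot, head, math.dist(foot, head)


# Center of a 1600x1200 image, as in the example resolution given in the text.
center = (800.0, 600.0)
# Hypothetical region: a radial strip from x=900 to x=1100, three pixels thick.
region = [(float(x), float(y)) for x in range(900, 1101) for y in (599, 600, 601)]
foot, head, length = length_from_extremes(region, center)
```

For this strip the measured length is close to 200 px, the extent of the strip along the radial direction.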

[0066] Figure 7(A) and Figure 7(B) are diagrams illustrating the second example of calculating the length of a moving object. Image 700 in Figure 7(A) shows the moving object regions 601 to 605 containing the moving objects detected in S102. The center of the captured image is indicated by the × symbol. The coordinates of the center of gravity of each moving object region (hereinafter referred to as the center of gravity coordinates) are indicated by an asterisk in image 700.

[0067] The human body determination unit 12 can calculate the distance between two points where a straight line passing through the center of gravity coordinates and the center coordinates intersects the outline of the moving object region, and use this distance as the length of the moving object (the height of the human body). In the second example, even when a person is reaching out, the human body determination unit 12 can calculate the height of the human body with good accuracy.

[0068] For example, as shown in Figure 7(B), when the tip of the hand is the point closest to the center of the captured image, the method of the first example may misidentify the tip of the hand as the feet. In this case, the straight line connecting the center coordinates and the coordinates of the tip of the hand, shown by the dashed line 701, may not pass through the top of the person's head.

[0069] In contrast, even when a person is extending their hand, the hand and arm are thinner than the torso, so the center of gravity of the moving object region usually lies in the torso. In this case, the straight line 702 connecting the center coordinates and the center of gravity coordinates of the moving object region passes through the top of the person's head. Thus, in the second method using the center of gravity of the moving object region, the human body determination unit 12 can calculate the height of the human body with good accuracy, regardless of the person's posture.
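A minimal sketch of the second example follows. It assumes the region is given as pixel coordinates and approximates the two intersections with the outline by taking the region pixels that lie within a small tolerance of the line through the center coordinates and the center of gravity, then measuring their extent along that line. The tolerance `tol`, the function name, and the torso-plus-arm test region are assumptions, and if the line crosses the region more than once the sketch returns the outermost span.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def length_along_centroid_line(region: Iterable[Point], center: Point, tol: float = 0.75) -> float:
    """Extent of the region along the line from the image center through the centroid."""
    pts = list(region)
    gx = sum(x for x, _ in pts) / len(pts)
    gy = sum(y for _, y in pts) / len(pts)
    dx, dy = gx - center[0], gy - center[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("centroid coincides with the image center; direction is undefined")
    ux, uy = dx / norm, dy / norm  # unit vector from the center toward the centroid
    along = [
        (x - center[0]) * ux + (y - center[1]) * uy   # position along the line
        for x, y in pts
        if abs((x - center[0]) * uy - (y - center[1]) * ux) <= tol  # distance from the line
    ]
    if not along:
        return 0.0
    return max(along) - min(along)


center = (800.0, 600.0)
# Hypothetical person: a 200 px long torso plus a thin arm reaching toward the center.
torso = [(float(x), float(y)) for x in range(900, 1101) for y in range(590, 611)]
arm = [(float(x), float(y)) for x in range(820, 900) for y in (588, 589)]
length = length_along_centroid_line(torso + arm, center)
```

In this toy region the arm tip is the point closest to the center, yet the measured length stays near the 200 px torso length because the thin arm barely shifts the center of gravity.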

[0070] In S104 of Figure 4, the human body determination unit 12 determines whether the moving object is a human body by comparing the length of the moving object obtained in S103 with a threshold range preset according to the position of the moving object within the captured image. The threshold range used to determine whether a moving object is a human body is explained here with reference to Figures 8 to 10.

[0071] The length of the human body as it appears in images captured by a fisheye camera is explained with reference to Figure 8. Figure 8 is a graph showing the length of the human body as a function of the distance from the center. The horizontal axis is the distance from the center of the captured image. The distance from the center to the moving object can be, for example, the distance between the center of gravity of the moving object region and the center of the captured image. The vertical axis is the length (height) of the human body within the captured image.

[0072] When a person stands directly beneath a ceiling-mounted fisheye camera, the feet and the top of the head both fall at the center of the camera's field of view, so the person appears with a length of 0 in the captured image. As the person moves away from the center of the field of view, the length increases. In the example shown in Figure 8, the length of the human body decreases once the distance from the center exceeds r. Thus, in images captured by a fisheye camera, the length of the human body initially increases with distance from the center but tends to decrease gradually once that distance exceeds a certain value.
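The rise and fall of the apparent length can be reproduced with a simple geometric model. The sketch below assumes an equidistant fisheye projection, in which the image radius is proportional to the angle from the optical axis; the specification does not state which projection the camera uses, so this model, the person height of 1.7 m, and the scale factor `focal_px` are all assumptions. Only the 3 m camera height matches the example given in the text.

```python
import math


def apparent_length_px(ground_dist_m: float,
                       camera_height_m: float = 3.0,
                       person_height_m: float = 1.7,
                       focal_px: float = 400.0) -> float:
    """Image-space length of a standing person under an assumed model r = focal_px * angle.

    The camera looks straight down from the ceiling; the person stands at
    ground_dist_m from the point directly below it. Requires
    camera_height_m > person_height_m.
    """
    foot_angle = math.atan2(ground_dist_m, camera_height_m)
    head_angle = math.atan2(ground_dist_m, camera_height_m - person_height_m)
    return focal_px * (head_angle - foot_angle)


# Apparent length at increasing distances from the point below the camera.
distances_m = (0.0, 1.0, 2.0, 4.0, 8.0)
samples = [apparent_length_px(d) for d in distances_m]
```

Under this assumed model the length is 0 directly below the camera, grows with distance, and then shrinks again, which matches the qualitative shape described for Figure 8.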

[0073] The threshold range for the length of the human body, which is preset according to the position of the moving object in the captured image, is explained with reference to Figure 9(A) and Figure 9(B). Figure 9(A) and Figure 9(B) are diagrams showing examples of thresholds set for each area of the camera's field of view. The example of Figure 9(A) shows the spherical imaging range of the fisheye camera represented as a plane. The imaging range is divided into multiple regions, which are categorized into Groups 1 to 5 according to their distance from the center. For each group, a range of the assumed human body length is defined.

[0074] Figure 9(A) shows an example of a threshold range set based on the measured length of the human body in a 1600×1200 pixel (px) image captured by a fisheye camera positioned at a height of 3m.

[0075] In Group 1, located at the center of the camera's field of view, the length of the human body is assumed to be 0px to 100px. In Group 2, adjacent to Group 1, the length of the human body is assumed to be 100px to 200px, longer than in Group 1. In Group 3, adjacent to Group 2 and located further out, the length of the human body is assumed to be 200px to 300px, longer than in Group 2.

[0076] Beyond the boundary of Group 3, the length of the human body decreases. In Group 4, which is adjacent to Group 3 and located further out, the length of the human body is assumed to be 100px to 200px, shorter than in Group 3. In Group 5, which is adjacent to Group 4 and located further out, the length of the human body is assumed to be 10px to 100px, shorter than in Group 4.

[0077] In this way, the camera range is divided into multiple areas, and the length of the human body in each area is preset based on the camera 10's position and the pixel value of the captured image. The preset human body length (threshold range) information is stored in the determination information database 15. The human body determination unit 12 can determine whether the detected moving object is a human body by comparing the threshold range information stored in the determination information database 15 with the length of the moving object obtained in S103.

[0078] Furthermore, if no objects larger than a human body are present within the camera's field of view, an upper limit need not be set for the threshold range of each group. In this case, the human body determination unit 12 can determine that a moving object whose length is at or above the lower limit of the threshold range shown in Figure 9(A) is a human body.

[0079] In addition, Figure 9(A) shows an example of dividing the imaging range into multiple rectangles and setting a threshold for each region, but the division is not limited to this. As shown in Figure 9(B), the imaging range represented by a circle can also be divided by multiple concentric circles, and a threshold range for the length of the human body can be set for each resulting region.

[0080] A method of determining whether a moving object is a human body using the threshold ranges of Figure 9(A) is described with reference to Figure 10. Figure 10 is a diagram illustrating an example of determining whether a moving object is a human body. As illustrated in image 600C of Figure 6, the human body determination unit 12 calculates the length of each detected moving object region by obtaining its foot coordinates and top-of-head coordinates.

[0081] Furthermore, the human body determination unit 12 determines which group of regions the moving object area belongs to within the camera's field of view. For example, the human body determination unit 12 can determine which group of regions the moving object area belongs to based on the coordinates of the top of the head. Alternatively, the human body determination unit 12 is not limited to the top of the head coordinates; it can also determine which group of regions the moving object belongs to based on the foot coordinates, center of gravity coordinates, the midpoint between the foot coordinates and the top of the head coordinates, etc.

[0082] The human body determination unit 12 obtains the threshold range of the group to which the moving object region belongs from the determination information database 15. The human body determination unit 12 compares the length of the moving object calculated in S103 with the threshold range obtained from the determination information database 15. If the length of the moving object is within the threshold range, the human body determination unit 12 determines that the detected moving object is a human body.
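The lookup and comparison can be sketched as below. The per-group length ranges are the ones given for the 1600×1200 px example; the concentric ring radii used to assign a point to a group are hypothetical placeholders, since the specification does not give the region boundaries, and the function names are likewise assumptions. The `use_upper` flag illustrates the case in which no upper limit is set.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

# Length ranges in px per group, from the 1600x1200 px example in the text.
THRESHOLDS: Dict[int, Tuple[float, float]] = {
    1: (0, 100), 2: (100, 200), 3: (200, 300), 4: (100, 200), 5: (10, 100),
}
# Hypothetical outer radii (px) of concentric rings for Groups 1 to 5.
RING_OUTER_RADII: List[Tuple[int, float]] = [(1, 150), (2, 300), (3, 450), (4, 600), (5, 800)]


def group_for(point: Point, center: Point) -> Optional[int]:
    """Return the group whose ring contains the point, or None if outside all rings."""
    r = math.dist(point, center)
    for group, outer in RING_OUTER_RADII:
        if r <= outer:
            return group
    return None


def is_human(length: float, point: Point, center: Point, use_upper: bool = True) -> bool:
    """Compare a moving object's length with the threshold range of its group."""
    group = group_for(point, center)
    if group is None:
        return False
    lower, upper = THRESHOLDS[group]
    if length < lower:
        return False
    return (not use_upper) or length <= upper


center = (800.0, 600.0)
head = (1200.0, 600.0)  # e.g. top-of-head coordinates, 400 px from the center
```

With these assumed radii, a point 400 px from the center falls in Group 3, so a length of 250 px is accepted and a length of 40 px is rejected.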

[0083] In the example of Figure 10, the moving object region 605 in image 600C has a calculated length that does not fall within the threshold range and is therefore determined not to be a human body. Image 1000 uses an × symbol to indicate that moving object region 605 was not determined to be a human body. Furthermore, image 1000 encloses moving object regions 601 to 604 with rectangles to indicate that these moving object regions were determined to be human bodies.

[0084] In step S104 of Figure 4, if the detected moving object is determined to be a human body (S104: Yes), the process proceeds to S105. If the detected moving object is not determined to be a human body (S104: No), the process proceeds to S106.

[0085] In S105, the human body detection unit 13 identifies and detects a human body from the moving object region that was determined to be a human body in S104. The human body detection unit 13 can detect a human body using a general object recognition algorithm.

[0086] Here, with reference to Figures 11A and 11B, a method of using a CNN to detect a human body from a moving object region is explained. Figure 11A shows an example in which a moving object is detected in S102 by taking the difference between multiple frames. The human body detection unit 13 can detect a human body by inputting the moving object region detected by this differencing into the CNN as is.

[0087] However, when moving objects are detected by frame differencing, the detected moving object region is derived from regions spanning multiple frames, so, as shown in Figure 11A, the human body is sometimes detected as larger than it actually is. Therefore, as shown in Figure 11B, the human body detection unit 13 can also detect a human body by inputting into the CNN the sub-regions obtained by sequentially placing a window within the moving object region. By searching the moving object region with a window, the human body detection unit 13 can perform detection with good accuracy at a size suited to the length of the human body.
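The window search described above can be sketched as follows. This is only an illustration under assumptions: the names `sliding_windows` and `detect_in_region`, the fixed window size, and the stride are hypothetical, and `classify` is a caller-supplied stand-in for the CNN, which the text does not specify further.

```python
def sliding_windows(region, window, stride):
    """Yield (x, y, w, h) windows placed sequentially inside region.

    region: (x, y, w, h) bounding box of the moving object region.
    window: (w, h) of the search window.
    stride: step in pixels between consecutive window positions.
    No window is yielded if the window does not fit inside the region.
    """
    rx, ry, rw, rh = region
    ww, wh = window
    for y in range(ry, ry + rh - wh + 1, stride):
        for x in range(rx, rx + rw - ww + 1, stride):
            yield (x, y, ww, wh)


def detect_in_region(region, window, stride, classify):
    """Return the windows that classify() reports as containing a human body.

    classify: callable taking an (x, y, w, h) window and returning bool;
    a hypothetical stand-in for cropping the image and running the CNN.
    """
    return [w for w in sliding_windows(region, window, stride) if classify(w)]
```

In practice the window size would presumably be chosen to suit the expected length of the human body at that image position, so that an oversized moving object region is narrowed down to the part that actually contains the person.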

[0088] Alternatively, the human body detection unit 13 can identify human bodies from moving object regions using a recognizer that combines image features, such as HoG or Haar-like features, with boosting. In this case as well, it can either determine whether the moving object region as a whole is a human body or, as in Figure 11B, detect a human body of arbitrary length within the moving object region by searching the region with a window.

[0089] In step S106 of Figure 4, the human body determination unit 12 determines whether, among the moving objects detected in S102, there are other moving objects for which the human body determination has not yet been made. If there are other undetermined moving objects (S106: Yes), the process returns to S103. If there are no other undetermined moving objects (S106: No), the human body detection process shown in Figure 4 ends.
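The loop over steps S103 to S106 can be sketched as follows. This is a minimal outline under assumptions: the function name `detect_humans` and its three callable parameters are hypothetical placeholders for the length calculation, the threshold lookup, and the recognizer, none of which is defined in this form in the text.

```python
def detect_humans(moving_regions, measure_length, threshold_range_for, detect_human):
    """Run the determination and detection steps for each moving object region.

    moving_regions: regions detected in S102.
    measure_length: callable(region) -> length of the moving object (S103).
    threshold_range_for: callable(region) -> (low, high) threshold range.
    detect_human: callable(region) -> list of detections (S105).
    All three callables are hypothetical stand-ins supplied by the caller.
    """
    detections = []
    for region in moving_regions:              # S106: repeat until none remain
        length = measure_length(region)        # S103: calculate the length
        low, high = threshold_range_for(region)
        if low <= length <= high:              # S104: is it a human body?
            detections.extend(detect_human(region))  # S105: run the recognizer
    return detections
```

The recognizer is called only for regions that pass the threshold check, which is where the reduction in processing load comes from.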

[0090] When the human body detection process is finished, the output unit 14 superimposes a rectangle or the like representing each detected human body on the captured image and outputs the result to a display or similar device.

[0091] (Effects)

[0092] In the above embodiment, the information processing device 1 detects moving objects from captured images and determines whether each detected moving object is a human body. If a moving object is determined to be a human body, the information processing device 1 detects the human body from the moving object region containing that moving object using methods such as deep learning. By limiting the region in which human body detection is performed to moving object regions determined to be human bodies, the information processing device 1 can reduce the processing load of human body recognition by deep learning and similar methods, and can detect human bodies in real time with good accuracy.

[0093] Furthermore, when determining whether a detected moving object is a human body, the information processing device 1 compares the length of the moving object with a preset threshold range corresponding to the position of the moving object within the captured image. In images captured by a fisheye camera, the photographed human body is distorted depending on its position within the captured image. Since the apparent length of the human body varies with its position within the captured image, the threshold range used to determine whether a moving object is a human body is set to a range corresponding to that position. Thus, by taking the characteristics of fisheye camera images into account and setting a threshold range corresponding to a position or region within the captured image, the information processing device 1 can determine whether a moving object is a human body with relatively high accuracy.

[0094] <Other>

[0095] The above embodiments are merely illustrative examples of the structure of the present invention. The present invention is not limited to the specific forms described above, and various modifications can be made within the scope of its technical concept.

[0096] In the above embodiments, an example is shown where the threshold range for determining whether something is a human body is divided into multiple regions and preset for each region, but this is not a limitation. For example, the threshold range for determining whether something is a human body can also be calculated using a prescribed formula based on the distance from the center of the captured image to the center of gravity of the moving object region.
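A formula-based threshold range of this kind might look like the following sketch. The text does not give the prescribed formula, so the linear model below is purely an illustrative assumption: the function name `threshold_range`, the parameters `len_center`, `len_edge`, and `tol`, and all default values are hypothetical.

```python
import math


def threshold_range(centroid, center, radius,
                    len_center=20.0, len_edge=80.0, tol=0.25):
    """Return a (low, high) threshold range for a moving object region.

    centroid: (x, y) center of gravity of the moving object region.
    center: (cx, cy) center of the captured image.
    radius: radius in pixels of the circular camera range.
    Assumption (not from the source text): the expected length changes
    linearly with the normalized distance from the image center, and the
    range is the expected length plus or minus the fraction tol.
    """
    d = math.hypot(centroid[0] - center[0], centroid[1] - center[1])
    r = min(d / radius, 1.0)  # normalized distance, clamped to [0, 1]
    expected = len_center + (len_edge - len_center) * r
    return expected * (1.0 - tol), expected * (1.0 + tol)
```

Compared with preset per-region ranges, a formula avoids discontinuities at region boundaries, at the cost of having to calibrate the model to the actual lens and mounting height.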

[0097] Furthermore, the threshold range used to determine whether something is a human body can be set to different ranges depending on the gender or age group of the human body that is the main subject of the photograph.

[0098] <Postscript 1>

[0099] (1) An information processing device (1), comprising:

[0100] a moving object detection unit (11) that detects a moving object from a captured image captured by a fisheye camera;

[0101] a human body determination unit (12) that determines whether the moving object is a human body by comparing the distance between two predetermined points on the outline of the moving object region containing the moving object with a threshold range set based on the height of a human body measured based on the position of the moving object in the captured image; and

[0102] a human body detection unit (13) that detects a human body from the moving object region containing the moving object determined to be a human body by the human body determination unit.

[0103] (2) An information processing method, comprising:

[0104] The following steps are performed by the computer:

[0105] a moving object detection step (S102) of detecting a moving object from a captured image captured by a fisheye camera;

[0106] a human body determination step (S103, S104) of determining whether the moving object is a human body by comparing the distance between two predetermined points on the outline of the moving object region containing the moving object with a threshold range set based on the height of a human body measured based on the position of the moving object in the captured image; and

[0107] a human body detection step (S105) of detecting a human body from the moving object region containing the moving object determined to be a human body in the human body determination step.

[0108] Explanation of reference numerals in the attached figures

[0109] 1: Information processing device; 10: Camera; 11: Moving object detection unit; 12: Human body determination unit; 13: Human body detection unit; 14: Output unit; 15: Determination information database

Claims

1. An information processing device, comprising: a moving object detection unit that detects a moving object from a captured image captured by a fisheye camera; a human body determination unit that determines whether the moving object is a human body by comparing the distance between two predetermined points on the outline of the moving object region containing the moving object with a threshold range set based on the height of a human body measured based on the position of the moving object in the captured image; and a human body detection unit that detects a human body from the moving object region containing the moving object determined to be a human body by the human body determination unit, wherein the distance between the two predetermined points on the outline of the moving object region containing the moving object is the distance between the two points at which a straight line passing through the centroid coordinates of the moving object region and the center coordinates of the captured image intersects the outline of the moving object region.

2. The information processing device as described in claim 1, wherein the distance between the two predetermined points on the outline of the moving object region containing the moving object is the distance between a first coordinate and a second coordinate different from the first coordinate, the first coordinate being the point within the moving object region that is closest to or farthest from the center coordinates of the captured image, and the second coordinate being the intersection of a straight line passing through the center coordinates and the first coordinate with the outline of the moving object region.

3. The information processing device as described in claim 1 or 2, wherein the threshold range is set for each region into which the captured image is divided.

4. The information processing device as described in claim 1 or 2, wherein the moving object detection unit detects the moving object based on background subtraction or inter-frame subtraction.

5. The information processing device as described in claim 1 or 2, wherein the moving object detection unit detects the moving object based on the motion and direction of movement of an object that commonly appears in consecutive frames of the captured image.

6. The information processing device as described in claim 1 or 2, further comprising an output unit that outputs information about the human body detected by the human body detection unit.

7. The information processing device as described in claim 1 or 2, further comprising a camera unit that captures the captured image.

8. An information processing method, comprising the following steps performed by a computer: a moving object detection step of detecting a moving object from a captured image captured by a fisheye camera; a human body determination step of determining whether the moving object is a human body by comparing the distance between two predetermined points on the outline of the moving object region containing the moving object with a threshold range set based on the height of a human body measured based on the position of the moving object in the captured image; and a human body detection step of detecting a human body from the moving object region containing the moving object determined to be a human body in the human body determination step, wherein the distance between the two predetermined points on the outline of the moving object region containing the moving object is the distance between the two points at which a straight line passing through the centroid coordinates of the moving object region and the center coordinates of the captured image intersects the outline of the moving object region.

9. A computer program product comprising a program for causing a computer to perform the steps of the method of claim 8.