Monitoring and control device, monitoring and control system, monitoring and control method and program
The monitoring and control device addresses the limitations of conventional image sensors by integrating image and sound recognition to provide immediate alarms and notifications, improving detection accuracy and reducing device relocation needs.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- KK TOSHIBA
- Filing Date
- 2024-12-09
- Publication Date
- 2026-06-19
AI Technical Summary
Conventional image sensors lack a reporting function and cannot detect information not reflected in images, requiring frequent relocation of devices like speakers and microphones, and they do not provide immediate notification of critical information.
A monitoring and control device with an image recognition unit that detects moving objects and determines their shape, an information processing unit that issues alarms based on the region in which the object is detected, a microphone for sound detection, and a speaker for immediate notification.
Enhances detection accuracy and responsiveness by minimizing device components, allowing immediate notification of potential dangers or suspicious activities through integrated image and sound analysis.
Abstract
Description
Technical Field
[0001] Embodiments of the present invention relate to a monitoring control device, a monitoring control system, a monitoring control method, and a program.
Background Art
[0002] Conventionally, an image sensor has determined the movement and shape of an object by performing image processing on an image or video captured by a camera, and has output data such as presence / absence, estimated number of people, estimated illuminance, and walking / staying to external devices. Based on the output data and determination results, an upper-level system has executed processing for providing a safe and secure service. When providing such a safe and secure service, it is required to process information in real time and notify the target person of the information.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Patent Document 2
Summary of the Invention
Problems to be Solved by the Invention
[0004] However, since a safe and secure service is built on a highly critical system, the number of constituent devices must be minimized in order to reduce the frequency of failures. In addition, conventional image sensors do not have a reporting function and cannot detect information that is not captured in the image. Furthermore, whenever the spatial layout changed, reporting devices such as the speaker, microphone, and patrol light (registered trademark) had to be relocated.
[0005] Embodiments of the present invention have been made in view of the above circumstances, and an object thereof is to provide a monitoring control device, a monitoring control system, a monitoring control method, and a program that can immediately notify a target person of information.
Means for Solving the Problem
[0006] A monitoring and control device according to one embodiment includes: an image recognition unit that acquires multiple images from an imaging device, detects a moving object using the difference between the multiple images, and determines the shape of the moving object by comparing the multiple images with a dictionary containing information about the shape; an information processing unit that determines the region to which the moving object belongs when the image recognition unit detects a moving object in a region identified using detection region information and determines that the shape of the moving object is a specific shape; and an output unit that issues an alarm according to the region to which the moving object belongs as determined by the information processing unit.
Brief Description of the Drawings
[0007]
[Figure 1] Figure 1 is a block diagram showing an example of the configuration of a monitoring and control system according to the first to fourth embodiments.
[Figure 2] Figure 2 is a block diagram showing an example of the configuration of an image sensor according to the first to fourth embodiments.
[Figure 3] Figure 3 shows an example of the operation of the image sensor according to the first embodiment.
[Figure 4] Figure 4 is a flowchart showing an example of the operation of the image sensor according to the first embodiment.
[Figure 5] Figure 5 shows an example of the operation of a monitoring and control system including an image sensor according to the second embodiment.
[Figure 6] Figure 6 is a flowchart showing an example of the operation of a monitoring and control system including an image sensor according to the second embodiment.
[Figure 7] Figure 7 shows an example of the operation of a monitoring and control system including an image sensor according to the third embodiment.
[Figure 8] Figure 8 is a flowchart showing an example of the operation of a monitoring and control system including an image sensor according to the third embodiment.
[Figure 9] Figure 9 shows an example of the operation of a monitoring and control system including an image sensor according to the fourth embodiment.
[Figure 10] Figure 10 is a flowchart showing an example of the operation of a monitoring and control system including an image sensor according to the fourth embodiment.
Modes for Carrying Out the Invention
[0008] The monitoring and control device, monitoring and control system, monitoring and control method, and program will be described in detail below with reference to the attached drawings. In the following descriptions of embodiments and modified examples, parts given the same reference numerals have essentially the same function, and overlapping explanations will be omitted as appropriate.
[0009] [First Embodiment] Figure 1 is a block diagram showing an example of the configuration of a monitoring and control system according to the first embodiment. The monitoring and control system issues an alarm in any area, among the areas monitored by multiple image sensors, in which an abnormality or the like is detected. The monitoring and control system comprises multiple image sensors 100, a GW (Gateway) device 112 connected to the multiple image sensors 100, a higher-level system 120 connected to the GW device 112, and a security room alarm device 130 connected to the higher-level system 120.
[0010] The multiple image sensors 100, the GW device 112, the higher-level system 120, and the security room alarm device 130 are connected to each other via wires, enabling communication between them. Alternatively, the multiple image sensors 100, the GW device 112, the higher-level system 120, and the security room alarm device 130 may be connected to each other wirelessly, enabling communication between them.
[0011] The higher-level system 120 includes, for example, at least one processor and memory in which a program executed by the processor is stored, and is configured to realize various functions by software or a combination of software and hardware. The higher-level system 120 acquires data / information from multiple image sensors 100 via the GW device 112. The higher-level system 120 also transmits data / information to the multiple image sensors 100 via the GW device 112. Similarly, the higher-level system 120 transmits data / information to the security room alarm device 130 and acquires data / information from the security room alarm device 130. The specific operation of the higher-level system 120 will be described later. Note that the higher-level system 120 is an example of a higher-level device.
[0012] Next, the details of the image sensor 100 will be described. Figure 2 is a block diagram showing an example of the configuration of the image sensor according to the first embodiment.
[0013] The image sensor 100 is mounted, for example, above the target space, and recognizes moving objects (e.g., people or vehicles) that have entered a predetermined area within the imaging range, and issues an alarm according to the area to which the moving object belongs. The image sensor 100 also recognizes moving objects that have entered a predetermined area and issues an alarm in response to sounds emitted by the moving object.
[0014] The image sensor 100 is an example of a monitoring and control device and comprises a camera 101, an image recognition unit 102, a microphone 103, a voice recognition unit 104, a speaker 105, an audio output unit 106, an information processing unit 108, and a data communication unit 111. For example, in the image sensor 100, the image recognition unit 102, the voice recognition unit 104, the audio output unit 106, and the information processing unit 108 are examples of processors, the camera 101, the microphone 103, and the speaker 105 are examples of input / output interfaces, and the data communication unit 111 is an example of a communication interface.
[0015] In addition, the image sensor 100 includes a storage unit (not shown). The storage unit is composed of, for example, a ROM, a RAM, and an auxiliary storage unit. The ROM is a read-only non-volatile memory corresponding to the main storage of a computer built around the processor. The ROM stores programs such as an operating system or application software. The ROM also stores data used by the processor for performing various processes and the like.
[0016] The RAM is a volatile memory corresponding to the main storage of a computer built around the processor. The RAM is used as a so-called work area that stores data temporarily used by the processor for performing various processes.
[0017] The auxiliary storage unit corresponds to the auxiliary storage of a computer built around the processor. The auxiliary storage unit is, for example, an EEPROM (registered trademark) (electrically erasable programmable read-only memory), an HDD (hard disk drive), or an SSD (solid state drive). The auxiliary storage unit may store a part or all of the above program. The auxiliary storage unit also stores data used by the processor for performing various processes, data generated by the processor through various processes, various setting values, and the like.
[0018] The ROM and the auxiliary storage unit are non-transitory computer-readable storage media. The image sensor 100 may be transferred with the above program already stored, or it may be transferred without the above program stored. In the latter case, the image sensor 100 reads the above program from a removable storage medium such as an optical disk or a semiconductor memory, and writes the read program to the auxiliary storage unit or the like. Alternatively, the image sensor 100 downloads the above program via a network or the like, and writes the downloaded program to the auxiliary storage unit or the like.
[0019] The storage unit stores, for example, detection data from the image recognition unit 102 (described later), detection data from the voice recognition unit 104 (described later), and detection data from the information processing unit 108 (also described later) in the ROM or the auxiliary storage unit.
[0020] The image recognition unit 102, the voice recognition unit 104, and the information processing unit 108, which are examples of the processor described above, are typically CPUs (Central Processing Units) and / or GPUs (Graphics Processing Units), but may also be microcontrollers, FPGAs (Field Programmable Gate Arrays), DSPs (Digital Signal Processors), or the like.
[0021] The image recognition unit 102, the voice recognition unit 104, and the information processing unit 108 can realize various functions of the image sensor 100 by executing programs such as system software, application software, or firmware stored in at least one of the ROM and auxiliary storage unit described above. The image sensor 100 may also be configured to perform various processes by combining two or more processors and having these processors cooperate.
[0022] The camera 101 may include, for example, a fisheye lens, an aperture mechanism, an imaging element, and a register. The fisheye lens captures the target space in its field of view, for example, by looking down from the ceiling, and forms an image on the imaging element. The amount of light incident on the imaging element is adjusted by the aperture mechanism. The imaging element is, for example, a CMOS (complementary metal-oxide-semiconductor) sensor, and generates a video signal with a frame rate of, for example, 30 frames per second. This video signal is digitally encoded and output as image data. In other words, the imaging element is configured to capture an image within the field of view and obtain image data. The register stores camera information. Camera information includes, for example, information about the camera 101, such as the status of the auto gain control function, the gain value, and the exposure time, or information about the image sensor itself.
[0023] The image recognition unit 102 acquires multiple pieces of image data from the camera 101 and detects moving objects from the differences between the images contained in the image data. The image recognition unit 102 also compares the images contained in the image data with pre-registered dictionary information to identify the shape of objects present in the image. The dictionary information includes, but is not limited to, the shape of a person, the posture of a person, the shape of an object, and the state of an object.
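The motion detection described in paragraph [0023] can be illustrated with simple frame differencing. The following is a minimal sketch only, not the implementation of the embodiment: the function names `detect_motion` and `bounding_box`, the thresholds, and the representation of a frame as a nested list of grayscale values are all assumptions made for illustration. Shape determination against the dictionary is not covered here.

```python
from typing import List, Tuple

# Hypothetical frame representation: rows of grayscale pixel values (0-255).
Frame = List[List[int]]
Pixel = Tuple[int, int]


def detect_motion(
    prev: Frame,
    curr: Frame,
    pixel_threshold: int = 30,    # assumed value, not from the source
    min_changed_pixels: int = 4,  # assumed value, not from the source
) -> Tuple[bool, List[Pixel]]:
    """Compare two frames and report whether enough pixels changed."""
    changed = [
        (row, col)
        for row, (prev_row, curr_row) in enumerate(zip(prev, curr))
        for col, (p, c) in enumerate(zip(prev_row, curr_row))
        if abs(c - p) > pixel_threshold
    ]
    return len(changed) >= min_changed_pixels, changed


def bounding_box(pixels: List[Pixel]) -> Tuple[int, int, int, int]:
    """Return (min_row, min_col, max_row, max_col) of the changed pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


# Usage: a dark 6x6 background and a frame with a bright 2x3 blob.
background = [[10] * 6 for _ in range(6)]
frame = [row[:] for row in background]
for r in range(2, 4):
    for c in range(1, 4):
        frame[r][c] = 200

moved, pixels = detect_motion(background, frame)
if moved:
    print("moving object at", bounding_box(pixels))
```

A bounding box of this kind could then be handed to a shape classifier; which classifier the embodiment uses is not stated in the text.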
[0024] The image recognition unit 102 may process the motion detection function and the shape detection function in parallel, or it may detect the shape of the moving object after detecting the moving object using the motion detection function. The image recognition unit 102 outputs detection data to the information processing unit 108, which includes information about the moving object detected by the motion detection function and information about the moving object detected by the shape detection function.
[0025] The microphone 103 is, for example, a directional sound collector that collects sound from any point. The microphone 103 may collect sound by tracking a moving object recognized by the image recognition unit 102, or it may be configured as an omnidirectional sound collector to collect sound from the entire area monitored by the image sensor 100.
[0026] The voice recognition unit 104 acquires multiple pieces of audio data from the microphone 103 and detects the audio contained in the audio data and the pattern of the audio. The voice recognition unit 104 may be configured to recognize audio by comparing the audio contained in the audio data with a dictionary in which multiple audio patterns are registered in advance. For example, if the dictionary contains recordings of shouts or warning sounds, the voice recognition unit 104 outputs detection data related to the audio to the information processing unit 108 when it detects the target audio.
[0027] The speaker 105 is an output device that outputs a predetermined sound; for example, it issues an alarm in response to an alarm activation signal from the audio output unit 106. That is, the speaker 105 outputs sound when triggered by a signal related to sound output received from the audio output unit 106.
[0028] The audio output unit 106 receives an alarm generation signal from the information processing unit 108 and generates an alarm activation signal according to the alarm generation signal. The alarm activation signals generated by the audio output unit 106 may be separated into different levels, for example, by volume or sound pattern. The audio output unit 106 outputs the generated alarm activation signals to the speaker 105.
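The conversion from an alarm generation signal to a leveled alarm activation signal in paragraph [0028] might be sketched as a lookup table. This is an illustrative assumption only: the signal names `"caution"` and `"no_entry"`, the volume figures, and the pattern labels are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmActivationSignal:
    """Hypothetical payload sent from the audio output unit to the speaker."""
    volume_percent: int
    sound_pattern: str


# Assumed levels; the source only says levels may differ by volume or pattern.
ALARM_LEVELS = {
    "caution": AlarmActivationSignal(60, "slow_beep_with_caution_message"),
    "no_entry": AlarmActivationSignal(100, "fast_beep_with_no_entry_message"),
}


def to_activation_signal(alarm_generation_signal: str) -> AlarmActivationSignal:
    """Map an alarm generation signal to the activation signal for the speaker."""
    try:
        return ALARM_LEVELS[alarm_generation_signal]
    except KeyError:
        raise ValueError(
            f"unknown alarm generation signal: {alarm_generation_signal!r}"
        ) from None


print(to_activation_signal("caution"))
print(to_activation_signal("no_entry"))
```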
[0029] The information processing unit 108 has the function of generating and outputting an alarm generation signal according to the detection data obtained from the image recognition unit 102 and the voice recognition unit 104. The detection data from the image recognition unit 102 includes information about moving objects detected by the motion detection function and by the shape detection function, and the detection data from the voice recognition unit 104 relates to voice. Specifically, the information processing unit 108 comprises a detection area identification unit 109 and a detection data processing unit 110, which together realize the above function.
[0030] The detection area identification unit 109 sets the area to be monitored within the imageable range using detection area information input from the configuration tool PC 107. In other words, the detection area identification unit 109 has the function of identifying multiple areas using detection area information. The area to be monitored may be set in advance by the administrator of the configuration tool PC 107, or the entire imageable range may be set as the area to be monitored. Furthermore, the detection area identification unit 109 may be configured to divide the area to be monitored by, for example, a security level or a risk level.
[0031] The detection data processing unit 110 determines whether a moving object of a specific shape, as contained in the detection data acquired from the image recognition unit 102, belongs to the monitoring area set by the detection area identification unit 109. The detection data processing unit 110 generates an alarm generation signal according to the area to which the moving object of the specific shape belongs. The detection data processing unit 110 outputs the alarm generation signal to the audio output unit 106, thereby triggering an alarm from the speaker 105.
[0032] Furthermore, the detection data processing unit 110 outputs detection data, including a moving object of a specific shape and the region to which the moving object belongs, to the GW device 112 via the data communication unit 111. The data communication unit 111 receives data / information from the GW device 112 and passes it to the components within the image sensor 100, and transmits data / information received from the components within the image sensor 100 to the GW device 112. The data communication unit 111 may also exchange data / information with an external source via a network such as the Internet.
[0033] Next, an example of the operation of the image sensor 100 according to the first embodiment will be described using Figures 3 and 4. Figure 3 is a diagram showing an example of the operation of the image sensor according to the first embodiment. Figure 4 is a flowchart showing an example of the operation of the image sensor according to the first embodiment.
[0034] As shown in Figure 3, the detection area identification unit 109 of the image sensor 100 identifies three areas using detection area information input via the configuration tool PC 107. For example, the first area is the image sensor detection area 200. The image sensor 100 starts monitoring a moving object when it detects that the moving object has entered the image sensor detection area 200. Note that the image sensor detection area 200 is an example of the first area.
[0035] The second area is the pre-alarm area 201. The image sensor 100 detects when a moving object enters the pre-alarm area 201 and, for example, issues a cautionary alarm. The image sensor 100 also stops issuing the alarm when it detects that the moving object has moved from the pre-alarm area 201 to the image sensor detection area 200. The pre-alarm area 201 is included in at least the image sensor detection area 200 and is an example of the second area.
[0036] The third area is the hazardous area 202. The image sensor 100 detects when a moving object enters the hazardous area 202 and issues an alarm, for example, to signal "no entry." Furthermore, when the image sensor 100 detects that a moving object has moved from the hazardous area 202 to the pre-alarm area 201, it changes the "no entry" alarm to a cautionary alarm and continues to issue the alarm. Note that the hazardous area 202 is an area that is included in at least the pre-alarm area 201 and is an example of the third area.
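Because the three areas are nested (202 inside 201 inside 200), determining the area to which a moving object belongs can be sketched as a search from the innermost area outward. The rectangle coordinates, the `Area` class, and the function `area_of` below are hypothetical; the source does not specify the geometry of the areas or how membership is computed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Area:
    """Axis-aligned rectangle; an assumed shape for a detection area."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


# Ordered innermost first so the most specific area wins.
# Coordinates are illustrative only.
AREAS = (
    Area("hazardous_area_202", 4, 4, 6, 6),
    Area("pre_alarm_area_201", 2, 2, 8, 8),
    Area("image_sensor_detection_area_200", 0, 0, 10, 10),
)


def area_of(x: float, y: float, areas: Sequence[Area] = AREAS) -> Optional[str]:
    """Return the innermost area containing (x, y), or None if outside all."""
    for area in areas:
        if area.contains(x, y):
            return area.name
    return None


print(area_of(5, 5))   # inside all three rectangles, innermost is reported
print(area_of(3, 5))
print(area_of(11, 1))
```

Ordering the areas innermost first keeps the lookup a single pass and makes the nesting assumption explicit in the data rather than in the control flow.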
[0037] When the detection data processing unit 110 determines that a moving object of a specific shape, as contained in the detection data acquired from the image recognition unit 102, belongs to an area identified by the detection area identification unit 109 using detection area information, the image sensor 100 starts the process shown in Figure 4. In other words, when the detection data processing unit 110 determines that a moving object belongs to the image sensor detection area 200, the monitoring and control process by the image sensor 100 is started.
[0038] The detection data processing unit 110 determines, using the detection data, whether a moving object belonging to the image sensor detection area 200 has entered the pre-alarm area 201 (step S10). If the detection data processing unit 110 determines, using the detection data, that a moving object belonging to the image sensor detection area 200 has entered the pre-alarm area 201 (step S10, YES), the detection data processing unit 110 generates an alarm generation signal corresponding to the pre-alarm area 201 and outputs it to the audio output unit 106. The audio output unit 106 generates an alarm activation signal based on the alarm generation signal corresponding to the pre-alarm area 201 and outputs it to the speaker 105, causing the speaker 105 to emit an alarm that warns of the pre-alarm area 201 (step S11).
[0039] The detection data processing unit 110 uses the detection data from the image recognition unit 102 to determine whether the moving object belonging to the pre-alarm area 201 has left the pre-alarm area 201 (step S12). That is, the detection data processing unit 110 determines whether the moving object has moved from the pre-alarm area 201 to the image sensor detection area 200.
[0040] If the detection data processing unit 110 determines using the detection data that the moving object still belongs to the pre-alarm area 201 (step S12, NO), it determines whether the moving object has entered the hazardous area 202 (step S13). By using the detection data output in real time from the image recognition unit 102, the detection data processing unit 110 can monitor the area to which a moving object of a specific shape included in the detection data belongs.
[0041] When the detection data processing unit 110 determines, using the detection data, that the moving object has moved from the pre-alarm area 201 to the hazardous area 202 (step S13, YES), the detection data processing unit 110 generates an alarm generation signal corresponding to the hazardous area 202 and outputs it to the audio output unit 106. The audio output unit 106 generates an alarm activation signal based on the alarm generation signal corresponding to the hazardous area 202 and outputs it to the speaker 105, causing the speaker 105 to emit an alarm that communicates "no entry" for the hazardous area 202 (step S14). For example, the "no entry" alarm emitted for the hazardous area 202 may be set to be louder and use stronger language to warn of danger than the cautionary alarm emitted for the pre-alarm area 201.
[0042] The detection data processing unit 110 uses the detection data from the image recognition unit 102 to determine whether the moving object belonging to the hazardous area 202 has left the hazardous area 202 (step S15). In other words, the detection data processing unit 110 determines whether the moving object has moved from the hazardous area 202 to the pre-alarm area 201.
[0043] When the detection data processing unit 110 determines, using the detection data, that the moving object still belongs to the hazardous area 202 (step S15, NO), the speaker 105 continues to emit the "no entry" alarm corresponding to the hazardous area 202. When the detection data processing unit 110 determines, using the detection data, that the moving object belonging to the hazardous area 202 has left the hazardous area 202 (step S15, YES), it generates an alarm generation signal corresponding to the pre-alarm area 201, and the audio output unit 106 generates and outputs an alarm activation signal based on that alarm generation signal. As a result, the alarm output from the speaker 105 changes from the alarm corresponding to the hazardous area 202 to the alarm corresponding to the pre-alarm area 201, and the process proceeds to step S13.
[0044] When the detection data processing unit 110 detects that a moving object belonging to the pre-alarm area 201 has entered the hazardous area 202 again, it repeats the processing in steps S14 and S15. On the other hand, if the detection data processing unit 110 determines that the moving object belonging to the pre-alarm area 201 still belongs to the pre-alarm area 201 (step S13, NO), it proceeds to the processing in step S12.
[0045] When the detection data processing unit 110 determines that a moving object belonging to the pre-alarm area 201 has left the pre-alarm area 201 (step S12, YES), it stops the alarm corresponding to the pre-alarm area 201 output from the speaker 105 and terminates processing. For example, the detection data processing unit 110 stops the alarm output from the speaker 105 by outputting an alarm stop signal to the audio output unit 106.
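The flow of steps S10 to S15 can be read as a small state machine driven by the area reported for the moving object at each observation. The sketch below is one possible reading under stated assumptions: the area and alarm labels and the function `run_alarm_flow` are hypothetical, observations arrive as a simple sequence, and a jump straight from area 200 into area 202 (which the described flow does not cover) is ignored.

```python
from typing import Iterable, List

# Hypothetical labels for the three areas and the alarm states.
DETECTION, PRE_ALARM, HAZARDOUS = "area_200", "area_201", "area_202"
NO_ALARM, CAUTION, NO_ENTRY = "none", "caution", "no_entry"


def run_alarm_flow(areas: Iterable[str]) -> List[str]:
    """Return the alarm state after each observation of the object's area."""
    alarm = NO_ALARM
    history: List[str] = []
    for area in areas:
        if alarm == NO_ALARM:
            if area == PRE_ALARM:        # step S10, YES
                alarm = CAUTION          # step S11
        elif alarm == CAUTION:
            if area == DETECTION:        # step S12, YES: stop alarm and end
                history.append(NO_ALARM)
                break
            if area == HAZARDOUS:        # step S13, YES
                alarm = NO_ENTRY         # step S14
        else:                            # alarm == NO_ENTRY
            if area != HAZARDOUS:        # step S15, YES: back to cautionary
                alarm = CAUTION
        history.append(alarm)
    return history


# Usage: enter 201, step into 202, retreat to 201, then leave to 200.
trace = [DETECTION, PRE_ALARM, PRE_ALARM, HAZARDOUS,
         HAZARDOUS, PRE_ALARM, DETECTION]
print(run_alarm_flow(trace))
```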
[0046] [Second Embodiment] The second embodiment differs from the first embodiment in that multiple image sensors 100 each monitor a target area, and a higher-level system 120 determines the priority of the image sensors 100 that issue alarms. The configuration of the monitoring and control system and the image sensors 100 according to the second embodiment is the same as in the first embodiment, so a description is omitted.
[0047] An example of the operation of the monitoring and control system according to the second embodiment will be explained using Figures 5 and 6. Figure 5 is a diagram showing an example of the operation of the monitoring and control system including the image sensor according to the second embodiment. Figure 6 is a flowchart showing an example of the operation of the monitoring and control system including the image sensor according to the second embodiment.
[0048] As shown in Figure 5, each of the multiple image sensors 100 has an image sensor detection area 301, an image sensor detection area 302, an image sensor detection area 303, an image sensor detection area 304, an image sensor detection area 305, and an image sensor detection area 306, which are areas to be monitored. Each image sensor 100 is connected to a higher-level system 120 via a GW device 112, and the higher-level system 120 comprehensively controls the multiple image sensors 100.
[0049] For example, in the example shown in Figure 5, there is a person lying down in image sensor detection area 301, and there are people working in image sensor detection areas 303, 304, and 306, respectively. The following description of Figure 6 assumes the arrangement shown in Figure 5.
[0050] The process shown in Figure 6 can be started at any time. In this embodiment, the process starts when the image sensor 100 identifies from the dictionary information that a person, which is an example of a moving object, is lying down. That is, the image recognition unit 102 of the image sensor 100, which monitors the image sensor detection area 301, compares multiple image data acquired from the camera 101 with the dictionary information and determines (identifies) whether an object present in the image data, in this embodiment a person, is in a lying position (step S20).
[0051] When the image recognition unit 102 identifies from dictionary information that a person is in a fallen position (step S20, YES), it outputs detection data including information about the person's posture to the detection data processing unit 110. The detection data processing unit 110 uses the detection data to generate an alarm generation signal corresponding to the image sensor detection area 301 and outputs it to the audio output unit 106. The audio output unit 106 generates an alarm activation signal based on the alarm generation signal corresponding to the image sensor detection area 301 and outputs it to the speaker 105, causing the speaker 105 to emit an alarm corresponding to the image sensor detection area 301 (for example, an alarm indicating the presence of a person in need of rescue, etc.) (step S21).
[0052] The detection data processing unit 110 outputs detection data to the GW device 112 via the data communication unit 111. The higher-level system 120 obtains the detection data from the GW device 112 and determines that the moving object maintaining a fallen posture belongs to the image sensor detection area 301. The higher-level system 120 also identifies the image sensor detection area to which the moving object belongs from the multiple image sensors 100. That is, the higher-level system 120 obtains detection data from each image sensor 100 from the GW device 112. Since each detection data contains identification information, it is assumed that the detection data is associated with the image sensor detection areas 301 to 306, respectively.
[0053] The higher-level system 120 uses the multiple detection data acquired from the multiple image sensors 100 to extract the image sensor 100 that monitors an area in which a moving object is present and that is closest to the image sensor detection area 301 (step S22). The higher-level system 120 transmits a rescue assistance request signal to the extracted image sensor 100.
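The extraction in step S22 can be sketched as a nearest-neighbor choice among the areas that currently contain a person. The grid positions in `AREA_CENTERS` are an assumed layout (two rows of three areas) chosen only for illustration; the source does not give coordinates, and `nearest_occupied_area` is a hypothetical name.

```python
import math
from typing import Dict, Iterable, Optional, Tuple

# Assumed centers of the image sensor detection areas (not from the source).
AREA_CENTERS: Dict[int, Tuple[float, float]] = {
    301: (0, 0), 302: (1, 0), 303: (2, 0),
    304: (0, 1), 305: (1, 1), 306: (2, 1),
}


def nearest_occupied_area(
    incident_area: int,
    occupied_areas: Iterable[int],
    centers: Dict[int, Tuple[float, float]] = AREA_CENTERS,
) -> Optional[int]:
    """Pick the occupied area closest to the area where the incident occurred."""
    candidates = [a for a in occupied_areas if a != incident_area]
    if not candidates:
        return None
    origin = centers[incident_area]
    return min(candidates, key=lambda a: math.dist(centers[a], origin))


# Usage with the arrangement described for Figure 5:
# a fallen person in 301 and workers in 303, 304, and 306.
print(nearest_occupied_area(301, [303, 304, 306]))
```

Under this assumed layout the result matches the area named in the description of Figure 5, but a real deployment would need the actual floor geometry and possibly walking distance rather than straight-line distance.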
[0054] For example, as shown in Figure 5, the higher-level system 120 transmits a rescue assistance request signal to the image sensor 100, which monitors the image sensor detection area 304, via the GW device 112. The detection data processing unit 110 acquires the rescue assistance request signal via the data communication unit 111 and outputs it to the audio output unit 106. The audio output unit 106 outputs the rescue assistance request signal to the speaker 105, which then emits an alarm corresponding to the image sensor detection area 304 (for example, an alarm indicating that there is a person in need of rescue in the image sensor detection area 301 and that assistance is needed) (step S23).
[0055] The image recognition unit 102 of the image sensor 100, which monitors the image sensor detection area 301, outputs detection data to the detection data processing unit 110 that includes information indicating that at least one additional moving object of a specific shape belonging to the image sensor detection area 301 has been detected. If the detection data processing unit 110 determines that a moving object of a specific shape (for example, a person who belonged to the image sensor detection area 304) is near a moving object that is maintaining a fallen position (step S24, YES), it stops the alarms being emitted from the image sensor 100 in the image sensor detection area 301 and the alarms being emitted from the image sensor 100 in the image sensor detection area 304, and terminates processing. In other words, the speaker 105 continues to emit alarms until the moving object of a specific shape arrives near the moving object that is maintaining a fallen position (step S24, NO).
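The stop condition in step S24 amounts to a proximity test between the fallen person and any other detected person. The sketch below assumes that positions are available as 2-D coordinates and that "near" means within a fixed radius; the radius value and the names `rescuer_is_near` and `alarms_should_continue` are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def rescuer_is_near(
    fallen_person: Point,
    other_people: Iterable[Point],
    radius: float = 1.0,  # assumed distance threshold, not from the source
) -> bool:
    """Step S24: is any other person within `radius` of the fallen person?"""
    return any(math.dist(fallen_person, p) <= radius for p in other_people)


def alarms_should_continue(
    fallen_person: Point,
    other_people: Iterable[Point],
    radius: float = 1.0,
) -> bool:
    """Alarms keep sounding until someone arrives near the fallen person."""
    return not rescuer_is_near(fallen_person, other_people, radius)


print(alarms_should_continue((0.0, 0.0), [(3.0, 4.0)]))  # helper still far away
print(alarms_should_continue((0.0, 0.0), [(0.5, 0.5)]))  # helper has arrived
```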
[0056] [Third Embodiment] The third embodiment differs from the first and second embodiments in that multiple image sensors 100 each monitor a target area, a higher-level system 120 determines which image sensor 100 will issue an alarm, and the security room alarm device 130 issues an alarm based on the signal from the higher-level system. The configuration of the monitoring and control system and the image sensors 100 according to the third embodiment is the same as in the first and second embodiments, so a description is omitted.
[0057] An example of the operation of the monitoring and control system according to the third embodiment will be explained using Figures 7 and 8. Figure 7 is a diagram showing an example of the operation of the monitoring and control system including the image sensor according to the third embodiment. Figure 8 is a flowchart showing an example of the operation of the monitoring and control system including the image sensor according to the third embodiment.
[0058] For example, in the example shown in Figure 7, a suspicious person is located in image sensor detection area 301, and there are people working in image sensor detection areas 302, 303, 304, and 305, respectively. The explanation of Figure 8 will use the above arrangement setting from Figure 7.
[0059] The process shown in Figure 8 can be started at any time. In this embodiment, the process is started when a person, which is an example of a moving object, is identified as performing suspicious behavior that can be identified from the dictionary information. The dictionary information may have suspicious behaviors registered in advance, or it may be configured so that only normal behavior in a task is registered, and the process is started when behavior that deviates from normal behavior in that task, i.e., behavior not registered in the dictionary information, is identified.
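As a non-limiting illustration, the two ways of configuring the dictionary information described above (registering suspicious behaviors in advance, or registering only the normal behaviors of a task) can be sketched as follows. The class name `BehaviorDictionary`, the mode strings, and the behavior labels are hypothetical and do not appear in the embodiment; how a behavior label is derived from image or audio data is outside the scope of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BehaviorDictionary:
    """Hypothetical stand-in for the dictionary information.

    mode "suspicious": entries are suspicious behaviors registered in advance.
    mode "normal": entries are the normal behaviors of a task; any behavior
    that is not registered is treated as suspicious.
    """
    mode: str
    entries: frozenset = field(default_factory=frozenset)

    def is_suspicious(self, behavior: str) -> bool:
        if self.mode == "suspicious":
            return behavior in self.entries
        if self.mode == "normal":
            return behavior not in self.entries
        raise ValueError(f"unknown dictionary mode: {self.mode}")


# Assumed example labels, for illustration only.
denylist = BehaviorDictionary("suspicious", frozenset({"climbing_fence", "forcing_door"}))
allowlist = BehaviorDictionary("normal", frozenset({"carrying_box", "operating_forklift"}))

print(denylist.is_suspicious("forcing_door"))   # True: registered as suspicious
print(allowlist.is_suspicious("forcing_door"))  # True: not registered as normal
print(allowlist.is_suspicious("carrying_box"))  # False: registered as normal
```

Under the second configuration, a behavior never seen before is flagged by default, which matches the description of starting the process when behavior deviates from the registered normal behavior.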
[0060] The image recognition unit 102 of the image sensor 100, which monitors the image sensor detection area 301, compares multiple image data acquired from the camera 101 with the dictionary information and determines (detects) whether an object present in the image data, in this embodiment a person, is engaging in suspicious behavior (step S30). Similarly, the voice recognition unit 104 of the image sensor 100, which monitors the image sensor detection area 301, compares multiple audio data acquired from the microphone 103 with the dictionary information and determines (detects) whether the audio data contains shouts, warnings of danger, or other sounds indicating suspicious behavior (step S30).
[0061] If the image recognition unit 102 or the voice recognition unit 104 identifies from the dictionary information that a moving object is behaving suspiciously (step S30, YES), it outputs detection data containing information about the suspicious behavior to the detection data processing unit 110. The detection data processing unit 110 uses the detection data to generate an alarm generation signal corresponding to the image sensor detection area 301 and outputs it to the audio output unit 106. The audio output unit 106 generates an alarm activation signal based on the alarm generation signal corresponding to the image sensor detection area 301 and outputs it to the speaker 105, causing the speaker 105 to issue an alarm corresponding to the image sensor detection area 301 (for example, an alarm indicating that there is a person behaving suspiciously) (step S31).
[0062] Furthermore, the detection data processing unit 110 outputs the detection data to the GW device 112 via the data communication unit 111 in parallel with the issuance of the alarm. The higher-level system 120 obtains the detection data from the GW device 112 and determines that the moving object maintaining suspicious behavior belongs to the image sensor detection area 301. The higher-level system 120 also identifies, from among the areas monitored by the multiple image sensors 100, the image sensor detection area to which the moving object belongs. In other words, the higher-level system 120 obtains detection data from each image sensor 100 via the GW device 112.
[0063] The higher-level system 120 outputs an alarm signal to the security room alarm device 130 that includes information about the image sensor detection area 301 to which the moving object exhibiting suspicious behavior belongs. The security room alarm device 130 issues an alarm based on the alarm signal (for example, an alarm indicating that there is a person exhibiting suspicious behavior in the image sensor detection area 301) (step S31).
[0064] In parallel with the above processing, the higher-level system 120 outputs an evacuation alarm signal to the image sensors 100 that monitor the area near the image sensor detection area 301 to which the moving object maintaining suspicious behavior belongs. For example, as the first process, the higher-level system 120 sends an evacuation alarm signal to the image sensor 100 that monitors the image sensor detection area 302, as the second process, it sends an evacuation alarm signal to the image sensor 100 that monitors the image sensor detection area 304, and as the third process, it sends an evacuation alarm signal to the image sensor 100 that monitors the image sensor detection area 305. In other words, the higher-level system 120 determines the priority order of the image sensors 100 that output the evacuation alarm signal and sends the evacuation alarm signal to the corresponding image sensor 100 based on the priority order. The priority order may be set in advance for each area to which the moving object maintaining suspicious behavior belongs, or the evacuation alarm signal may be output to the target areas at the same timing.
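As a non-limiting illustration, the priority-ordered transmission of evacuation alarm signals described above can be sketched as follows. The table `EVACUATION_PRIORITY` and the function `plan_evacuation_alarms` are hypothetical names; the table content simply mirrors the example order (302, then 304, then 305) given for a suspicious person in area 301, and the plan is expressed as a list of batches so that the simultaneous variant is a single batch.

```python
from typing import Dict, List

# Hypothetical preset table: area containing the suspicious moving object
# -> neighbouring areas in the order they should receive the signal.
EVACUATION_PRIORITY: Dict[int, List[int]] = {
    301: [302, 304, 305],
}


def plan_evacuation_alarms(source_area: int, simultaneous: bool = False) -> List[List[int]]:
    """Return batches of target areas; each batch is sent at the same timing."""
    targets = EVACUATION_PRIORITY.get(source_area, [])
    if not targets:
        return []
    if simultaneous:
        # All target areas receive the signal at the same timing.
        return [list(targets)]
    # One area per batch, in priority order (first, second, third process).
    return [[area] for area in targets]


print(plan_evacuation_alarms(301))                     # [[302], [304], [305]]
print(plan_evacuation_alarms(301, simultaneous=True))  # [[302, 304, 305]]
```

An area with no preset entry yields an empty plan in this sketch; how the higher-level system 120 would handle such a case is not specified in the embodiment.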
[0065] The detection data processing unit 110 of the image sensor 100, which monitors image sensor detection areas 302, 304, and 305 respectively, acquires an evacuation alarm signal via the data communication unit 111 and outputs it to the audio output unit 106. The audio output unit 106 outputs the evacuation alarm signal to the speaker 105, which then issues an alarm corresponding to each of the image sensor detection areas 302, 304, and 305 (for example, an alarm indicating that there is a suspicious person in image sensor detection area 301 and that it is necessary to evacuate while avoiding that area) (step S31). It is desirable that the processing in step S31 be carried out in parallel.
[0066] The image recognition unit 102 of the image sensor 100, which monitors the image sensor detection area 301, outputs detection data to the detection data processing unit 110 that includes information indicating that at least one additional moving object of a specific shape belonging to the image sensor detection area 301 has been detected. If the detection data processing unit 110 determines that a moving object of a specific shape (for example, a security guard stationed in the space where the security room alarm device 130 is installed) is in the vicinity of a moving object that is maintaining suspicious behavior (step S32, YES), it stops the alarms being emitted from the image sensors 100 that monitor image sensor detection area 301, image sensor detection area 302, image sensor detection area 304, and image sensor detection area 305, respectively, and the alarm being emitted from the security room alarm device 130, and terminates processing. In other words, the speaker 105 continues to emit alarms until the moving object of a specific shape arrives in the vicinity of the moving object that is maintaining suspicious behavior (step S32, NO).
[0067] Furthermore, alarms issued by the image sensors 100 that monitor image sensor detection areas 302, 304, and 305 may be stopped when a specific moving object can no longer be detected in each of these areas. In addition, alarms issued by the image sensor 100 that monitors image sensor detection area 301 and alarms issued by the security room alarm device 130 may be manually stopped by a security guard or other person who arrives at image sensor detection area 301.
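As a non-limiting illustration, the alarm stop conditions of step S32 and the optional conditions described above can be sketched as follows. The embodiment does not define what "in the vicinity" means numerically, so the Euclidean distance test and the 2.0 m default radius are assumptions, as are all function names.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]  # assumed floor-plane coordinates in metres


def responder_is_near(target: Point, responders: Iterable[Point], radius_m: float = 2.0) -> bool:
    """Step S32 sketch: is any moving object of the specific shape within radius_m?"""
    return any(math.dist(target, r) <= radius_m for r in responders)


def source_alarm_should_stop(target: Point, responders: Iterable[Point],
                             manual_stop: bool = False, radius_m: float = 2.0) -> bool:
    """Area 301 and security room alarms: stop on responder arrival or manual stop."""
    return manual_stop or responder_is_near(target, responders, radius_m)


def neighbour_alarm_may_stop(people_detected: int) -> bool:
    """Areas 302, 304, 305: the alarm may stop once no person is detected there."""
    return people_detected == 0


suspect = (0.0, 0.0)
print(source_alarm_should_stop(suspect, [(5.0, 5.0)]))  # False: guard still far away
print(source_alarm_should_stop(suspect, [(1.0, 1.0)]))  # True: guard within 2.0 m
```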
[0068] [Fourth Embodiment] The fourth embodiment differs from the first to third embodiments in that it triggers an alarm in the image sensor 100 by acquiring collision sounds or similar sounds in the image sensor detection area, and the security room alarm device 130 triggers an alarm based on a signal from a higher-level system. The configuration of the monitoring and control system and the image sensor 100 according to the fourth embodiment is the same as in the first to third embodiments, so a description is omitted.
[0069] An example of the operation of the monitoring and control system according to the fourth embodiment will be explained using Figures 9 and 10. Figure 9 is a diagram showing an example of the operation of the monitoring and control system including the image sensor according to the fourth embodiment. Figure 10 is a flowchart showing an example of the operation of the monitoring and control system including the image sensor according to the fourth embodiment.
[0070] As shown in Figure 9, in this embodiment, the image sensor 100 sets the image sensor detection area 400 using detection area information input via the setting tool PC 107. For example, the image sensor detection area 400 is assumed to contain both a person and a vehicle as moving objects. When explaining Figure 10, the arrangement of the person and vehicle shown in Figure 9 will be used as an example.
[0071] The process shown in Figure 10 can be started at any time. In this embodiment, the process starts when the detection data processing unit 110 detects that there is a possibility of collision between a person and a vehicle, which are examples of moving objects. The image recognition unit 102 recognizes the movement of the moving object using the difference between images contained in multiple image data acquired from the camera 101. The image recognition unit 102 outputs detection data including the movement of the moving object to the detection data processing unit 110. In addition, the voice recognition unit 104 recognizes the state of the moving object using the voice contained in multiple audio data acquired from the microphone 103. The voice recognition unit 104 outputs detection data including the state of the moving object to the detection data processing unit 110.
[0072] The detection data processing unit 110 estimates the movements of multiple moving objects using the movements of the moving objects included in the detection data output from the image recognition unit 102 and the voice recognition unit 104. For example, the detection data processing unit 110 estimates the position information and trajectory of a moving object after an arbitrary time period from the trajectory of the moving object generated by combining multiple detection data. The detection data processing unit 110 then determines whether at least two of the multiple moving objects are likely to collide (step S40).
[0073] When the detection data processing unit 110 identifies at least two moving objects that may collide (step S40, YES), it generates a collision avoidance warning signal and outputs it to the audio output unit 106. The audio output unit 106 generates a collision avoidance warning activation signal based on the collision avoidance warning signal corresponding to the image sensor detection area 400 and outputs it to the speaker 105, causing the speaker 105 to issue an alarm corresponding to the image sensor detection area 400 (for example, an alarm that announces the names of the moving objects that may collide and the estimated time of the possible collision) (step S41).
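As a non-limiting illustration, one simple way to carry out the estimation of step S40 is to extrapolate each moving object at constant velocity and test the distance at the time of closest approach. The embodiment does not specify the estimation method, so the constant-velocity model, the `Track` type, the 1.0 m collision radius, and the 5.0 s look-ahead horizon are all assumptions. Note that the returned value is the time of closest approach within the horizon, not the instant of first contact.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    """Assumed state of one moving object: position (m) and velocity (m/s)."""
    x: float
    y: float
    vx: float
    vy: float


def predict_collision(a: Track, b: Track, radius_m: float = 1.0,
                      horizon_s: float = 5.0) -> Optional[float]:
    """Return the time of closest approach if the objects come within radius_m."""
    rx, ry = b.x - a.x, b.y - a.y          # relative position
    vx, vy = b.vx - a.vx, b.vy - a.vy      # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0                            # same velocity: the gap never changes
    else:
        t = -(rx * vx + ry * vy) / speed_sq
        t = min(max(t, 0.0), horizon_s)    # clamp to the look-ahead window
    gap = math.hypot(rx + vx * t, ry + vy * t)
    return t if gap <= radius_m else None


person = Track(0.0, 0.0, 1.0, 0.0)         # walking along +x
vehicle = Track(4.0, -4.0, 0.0, 1.0)       # driving along +y
print(predict_collision(person, vehicle))  # 4.0: paths cross at (4, 0) after 4 s
```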
[0074] The detection data processing unit 110 uses the detection data output from the image recognition unit 102 and the voice recognition unit 104 to determine whether a collision was avoided (step S42). For example, the detection data processing unit 110 may extract an image of the moving object and audio from the detection data at the estimated collision time, compare the image and audio with the dictionary information to identify actions not registered in the dictionary information, and determine whether a collision occurred.
[0075] If the detection data processing unit 110 determines that a collision has been avoided using the detection data output from the image recognition unit 102 and the voice recognition unit 104 (step S42, YES), it stops issuing alarms corresponding to the image sensor detection area 400 and terminates processing.
[0076] If the detection data processing unit 110 determines that a collision could not be avoided using the detection data output from the image recognition unit 102 and the voice recognition unit 104 (step S42, NO), it generates a collision warning signal that includes information about the image sensor detection area 400 to which the two or more moving objects that collided belong, and outputs the collision warning signal to the higher-level system 120. The higher-level system 120 outputs the collision warning signal to the security room alarm device 130, which then issues an alarm based on the collision warning signal (for example, an alarm indicating that there are two or more people or vehicles that have collided in the image sensor detection area 400) (step S43).
[0077] The image recognition unit 102 of the image sensor 100, which monitors the image sensor detection area 400, outputs detection data to the detection data processing unit 110 that includes information indicating that at least one additional moving object of a specific shape belonging to the image sensor detection area 400 has been detected. If the detection data processing unit 110 determines that a moving object of a specific shape (for example, a security guard stationed in the space where the security room alarm device 130 is installed) is near the two or more moving objects that have collided with each other (step S44, YES), it stops the alarms being emitted from the image sensor 100 that monitors the image sensor detection area 400 and the alarms being emitted from the security room alarm device 130, and terminates processing. In other words, the speaker 105 continues to emit alarms until the moving object of a specific shape arrives near the two or more moving objects that have collided with each other (step S44, NO).
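As a non-limiting illustration, the flow of steps S40 to S44 in Figure 10 can be sketched as a small state machine. The phase names and the function `next_phase` are hypothetical, and the sketch assumes that the three determinations (collision risk, collision avoided, responder nearby) are supplied as booleans by the detection data processing unit 110.

```python
from enum import Enum, auto


class Phase(Enum):
    MONITORING = auto()         # waiting for a collision risk (step S40)
    AVOIDANCE_WARNING = auto()  # speaker 105 issues the avoidance alarm (step S41)
    COLLISION_ALARM = auto()    # security room alarm device 130 is alarming (step S43)
    DONE = auto()               # alarms stopped, processing terminated


def next_phase(phase: Phase, *, collision_risk: bool = False,
               collision_avoided: bool = False, responder_near: bool = False) -> Phase:
    if phase is Phase.MONITORING:
        # S40: YES -> issue the collision avoidance warning (S41)
        return Phase.AVOIDANCE_WARNING if collision_risk else Phase.MONITORING
    if phase is Phase.AVOIDANCE_WARNING:
        # S42: YES -> stop alarms; NO -> notify the higher-level system (S43)
        return Phase.DONE if collision_avoided else Phase.COLLISION_ALARM
    if phase is Phase.COLLISION_ALARM:
        # S44: keep alarming until a responder arrives near the collision
        return Phase.DONE if responder_near else Phase.COLLISION_ALARM
    return Phase.DONE


phase = Phase.MONITORING
phase = next_phase(phase, collision_risk=True)      # -> AVOIDANCE_WARNING
phase = next_phase(phase, collision_avoided=False)  # -> COLLISION_ALARM
phase = next_phase(phase, responder_near=False)     # stays COLLISION_ALARM
phase = next_phase(phase, responder_near=True)      # -> DONE
print(phase.name)  # DONE
```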
[0078] According to the image sensor 100 of the first to fourth embodiments, by adding a microphone and speaker to the image sensor 100, it is possible to improve responsiveness and reduce the frequency of calling by minimizing the number of components. Furthermore, by equipping the image sensor 100 with a microphone, it is possible to improve detection accuracy. In addition, since the image sensor 100 acquires images and sound, identifies moving objects belonging to a pre-identified area from the images and sound, and outputs an alarm according to the area to which the moving object belongs, the image sensor 100 can provide these services on its own, thus minimizing delay. Furthermore, when a host device is connected to multiple image sensors 100, each image sensor 100 performs the basic processing and transmits only the necessary information to the host device, so that processing on the host device is minimized while the host device and the image sensors 100 cooperate.
[0079] Furthermore, the image sensor 100 makes it possible to immediately notify nearby moving objects, security rooms, and the like of a moving object determined to be in a dangerous state registered in the dictionary, or of a moving object determined to be in a suspicious state because it takes actions not registered in the dictionary.
[0080] In other words, the monitoring and control device, monitoring and control system, monitoring and control method, and program of this embodiment make it possible to immediately notify the target person of information.
[0081] While several embodiments of the present invention have been described, these embodiments are presented as examples only and are not intended to limit the scope of the invention. These novel embodiments can be carried out in a variety of other forms, and various omissions, substitutions, and modifications can be made without departing from the spirit of the invention. These embodiments and their variations are included in the scope and spirit of the invention, as well as in the claims of the invention and its equivalents.
[Explanation of Symbols]
[0082] 100...Image sensor 101...Camera 102...Image recognition unit 103...Microphone 104...Voice recognition unit 105...Speaker 106...Audio output unit 107...Setting tool PC 108...Information processing unit 109...Detection area identification unit 110...Detection data processing unit 111...Data communication unit 112...GW device 120...Higher-level system 130...Security room alarm device
Claims
1. A monitoring and control device comprising: an image recognition unit that acquires multiple images from an imaging device, detects a moving object using the difference between the multiple images, and determines the shape of the moving object by comparing the multiple images with a dictionary containing information about the shape; an information processing unit that, when the image recognition unit detects the moving object in a region identified using detection region information and determines that the shape of the moving object is a specific shape, determines the region to which the moving object belongs; and an output unit that issues an alarm according to the region to which the moving object belongs, as determined by the information processing unit.
2. The monitoring and control device according to claim 1, wherein the output unit issues an alarm when the information processing unit determines that the region to which the moving object belongs has changed from a first region to a second region, and issues an alarm different from the alarm issued when the moving object belongs to the second region when the information processing unit determines that the region to which the moving object belongs has changed from the second region to a third region.
3. The monitoring and control device according to claim 1, further comprising a speech recognition unit that acquires multiple speech samples from a sound collection device and detects a speech pattern by comparing the multiple speech samples with a dictionary containing information about the speech pattern.
4. The monitoring and control device according to claim 1, wherein the information processing unit stops the output unit from issuing the alarm upon determining, after the alarm is issued, that at least one other moving object different from the moving object belongs to the region to which the moving object belongs.
5. A monitoring and control system comprising: multiple monitoring and control devices, each comprising an image recognition unit that acquires multiple images from an imaging device, detects a moving object using the difference between the multiple images, and determines the shape of the moving object by comparing the multiple images with a dictionary containing information about the shape, an information processing unit that, when the image recognition unit detects the moving object in a region identified using detection region information and determines that the shape of the moving object is a specific shape, determines the region to which the moving object belongs, and an output unit that issues an alarm according to the region to which the moving object belongs, as determined by the information processing unit; and a higher-level device that causes any nearby monitoring and control device among the multiple monitoring and control devices, excluding the monitoring and control device that has issued the alarm, to issue the alarm according to the shape of the moving object.
6. A monitoring and control method comprising: acquiring, by image recognition means, multiple images from an imaging device, detecting a moving object using the difference between the multiple images, and determining the shape of the moving object by comparing the multiple images with a dictionary containing information about the shape; determining, by information processing means, the region to which the moving object belongs when the image recognition means detects the moving object in a region identified using detection region information and determines that the shape of the moving object is a specific shape; and issuing, by output means, an alarm according to the region to which the moving object belongs, as determined by the information processing means.
7. A program that causes a computer to execute the monitoring and control method described in claim 6.