Illegal stall detection method and device
An illegal stall detection method and device, applied in the fields of instruments, character and pattern recognition, computer parts, etc. It addresses the problems that manual monitoring cannot meet the needs of real-time video monitoring and alarm, that human senses become fatigued, and that abnormal events are missed. Its effects are efficient detection, maintenance of urban traffic management order, and alleviation of disorderly stalls.
Active Publication Date: 2020-01-07
HANGZHOU HIKVISION DIGITAL TECH
AI-Extracted Technical Summary
Problems solved by technology
 Most current video surveillance systems rely on traditional manual interpretation, which requires staff to be on duty day and night in front of the video images, constantly judging with the naked eye whether sudden abnormalities appear in them.
This kind of monitoring method has a heavy workload, which will inevitably make the human senses enter ...
Further, the application adopts deep learning and uses a semantic segmentation network model for target segmentation. It has correspondingly completed the training of the stall target segmentation model, which can segment and locate stall groups with variable shapes. By making full use of the advantages of both target detection and target segmentation, the method achieves efficient detection of the various stalls in the scene.
In summary, the application efficiently detects illegal stalls, which not only reduces labor costs and maintains urban traffic management order, but also provides real-time reference information for city supervision ...
The invention provides an illegal stall detection method and device. The method comprises: obtaining a video frame corresponding to a place to be detected; performing target detection on the video frame using a trained detection model to obtain a first stall, which is a stall with a fixed form; performing image segmentation on the video frame using a trained segmentation model, and performing connected domain processing on the segmentation result to obtain a second stall, which is a stall with a variable form; acquiring a pre-configured area of the place to be detected where stalls are not allowed; and regarding any stall, among the first stall and the second stall, whose overlapping area with the area where stalls are not allowed reaches a set threshold as an illegal stall. The method makes full use of the respective advantages and disadvantages of the detection model and the segmentation model to detect illegal stalls efficiently, so that labor cost can be reduced, urban traffic management order can be maintained, real-time reference information can be provided for urban supervision departments, management is made convenient, stall management efficiency is effectively improved, and the problem of disorderly stall arrangement is relieved.
 Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present application as recited in the appended claims.
 The terminology used in this application is for the purpose of describing particular embodiments only, and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
 It should be understood that although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "upon" or "when" or "in response to a determination."
 With the rapid development of computer technology, intelligent techniques are increasingly applied in the field of digital security. Intelligent video analysis can be added to video surveillance so that the video source of concern is analyzed in real time, information omission is effectively avoided, and the various kinds of illegal stall behavior (such as shops operating outside their premises and itinerant vendors) are discovered as soon as they occur and reported to the on-duty staff.
 Based on this idea, the embodiment of this application proposes target detection and target segmentation algorithms based on deep learning. A large number of samples are trained according to the calibration rules set for target detection to obtain a detection model, and according to the calibration rules set for target segmentation to obtain a segmentation model. Making full use of the respective advantages and disadvantages of the two models, the detection model is used to locate and detect stalls with high consistency, and the segmentation model is used to accurately locate and segment stalls with variable shapes, which completes the task of detecting and locating all stalls in the current frame of the monitoring field of view. Whether a stall is illegal is then judged from the detected stalls and the pre-configured areas where stalls are not allowed. Finally, by comprehensively analyzing the stall information of a certain number of consecutive frames, an alarm is issued for stalls that are determined to be illegal in multiple consecutive frames.
 The embodiment of the present application may include a training process and an application process of a deep learning network. The training process is first introduced below.
 The embodiment of the present application needs to train two kinds of deep learning networks: a detection model for detecting booths with fixed shapes, and a segmentation model for segmenting booths with variable shapes. Most fixed-shape booths are single booths with high consistency that occupy a small area; for a schematic diagram, see Figure 1. Booths with variable shapes occupy a relatively large area; for a schematic diagram, see Figure 2.
 Here we first introduce the training process of the detection model:
 The first step is sample collection. Obtain sample booth images under one, any two, any three, any four, or all of the following varying conditions: different time periods, different weather (such as sunny, rainy, cloudy, etc.), different light intensities (such as day and night), different monitoring camera installations, and different scenes (that is, locations).
 The monitoring camera setup here includes camera pitch angle and camera imaging quality. As an example, in order to improve the detection effect, the embodiment of the present application may put forward the following requirements on the camera pitch angle and camera imaging quality:
 1) The camera pitch angle is limited to 15 to 90 degrees. As shown in Figure 3, the camera pitch angle is the angle between the road surface and the line connecting the camera and the detection target (the booth, in the embodiment of the present application). The camera pitch angle determines the monitoring range of the monitoring camera: the smaller the pitch angle, the larger the monitoring range, but the fewer pixels the captured detection target occupies;
 2) On the imaging image, the width of a single booth with high consistency is required to be in the range of 80 to 900 pixels, and the area of a group of booths with variable shapes is greater than 200 pixels.
 It should be noted that, when the camera pitch angle and/or camera imaging quality do not meet the above conditions, this solution still has a certain effect on booth detection and segmentation.
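The two requirements above can be expressed as a simple pre-check. The following is only a minimal sketch: the function name `check_capture_conditions` and its parameters are hypothetical, and the thresholds are the ones stated in the text. Because the solution is said to still work to some extent outside these ranges, the sketch returns warnings rather than rejecting the input.

```python
def check_capture_conditions(pitch_deg, booth_width_px=None, group_area_px=None):
    """Return a list of warnings; an empty list means every stated condition holds.

    pitch_deg: angle between the camera-to-booth line and the road surface.
    booth_width_px: width of a single high-consistency booth on the image.
    group_area_px: area of a group of variable-shape booths on the image.
    """
    warnings = []
    if not 15 <= pitch_deg <= 90:
        warnings.append("pitch angle outside 15 to 90 degrees")
    if booth_width_px is not None and not 80 <= booth_width_px <= 900:
        warnings.append("single-booth width outside 80 to 900 pixels")
    if group_area_px is not None and not group_area_px > 200:
        warnings.append("booth-group area not greater than 200 pixels")
    return warnings


ok = check_capture_conditions(45, booth_width_px=120, group_area_px=500)
bad = check_capture_conditions(10, booth_width_px=50, group_area_px=100)
```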
 The second step is sample calibration. The fixed-shape booths with high consistency in the booth picture samples (such as trolleys, vehicle booths, tables and chairs, etc.) are calibrated by their circumscribed rectangles. Each calibration is a rectangular frame enclosing the booth, as shown by the white boxes in Figure 1.
 The third step is sample training. After a certain amount of training is performed on the calibrated booth picture samples by using the pre-built first deep learning network, a trained detection model is obtained.
 In the application, based on the Caffe (Convolutional Architecture for Fast Feature Embedding, convolutional neural network framework) environment, the deep learning network structure can be iterated more than 1 million times to achieve convergence and obtain the detection model.
 The following describes the training process of the segmentation model:
 The first step is sample collection. Obtain booth picture samples under one, any two, any three, any four, or all of the following varying conditions: different time periods, different weather, different light intensities, different surveillance camera setups, and different scenes.
 For the description of the surveillance camera setup, refer to the detection model section; it is not repeated here.
 The second step is sample calibration. The calibration rules for target segmentation differ from those for target detection. Target segmentation uses pixel-by-pixel calibration: every pixel in the booth picture sample is labeled as one of two classes, booth area or background. The calibration yields the outlines of the booths with variable shapes.
 The third step is sample training. After a certain amount of training is performed on the calibrated booth picture samples by using the pre-built second deep learning network, a trained segmentation model is obtained.
 In the application, based on the Caffe environment, the deep learning semantic segmentation network structure can be used to iterate more than 1 million times to achieve convergence and obtain the segmentation model.
 Based on the detection model and segmentation model trained above, the method for detecting illegal booths provided by this application is described below through the flow shown in Figure 4. This method can be applied to surveillance cameras, and can also be applied to back-end servers connected to surveillance cameras. Referring to Figure 4, the method may include the following steps:
 Step 401: Obtain video frames corresponding to the location to be detected.
 Here, the corresponding video frames to be detected may be collected by a surveillance camera. In one example, the installation of the monitoring camera at the place to be inspected meets the following conditions: the angle between the line between the monitoring camera and the booth and the road surface is 15 degrees to 90 degrees.
 In this embodiment, the method shown in Figure 4 can be executed on every frame of the original video stream. Alternatively, considering that the frame rate may be as high as 20 frames per second and that adjacent frames differ only slightly, the original video stream can be sampled to reduce the amount of processing, and the steps of Figure 4 executed on the sampled video frames only.
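The sampling option can be sketched as follows. The text does not specify a sampling rule, so keeping every `step`-th frame is an assumption, and `sample_frames` is a hypothetical helper name; integers stand in for decoded frames.

```python
from itertools import islice


def sample_frames(frames, step):
    """Return an iterator over every `step`-th frame, starting with the first."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return islice(frames, 0, None, step)


# One second of a 20 fps stream, represented by frame indices 0..19.
one_second = range(20)
sampled = list(sample_frames(one_second, 5))  # keeps frames 0, 5, 10, 15
```

Because the helper accepts any iterable, it can wrap a live frame generator without buffering the stream.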
 In practice, the video frame is generally converted into RGB format, and the steps of Figure 4 are then executed on the RGB video frame.
 Step 402: Use the trained detection model to perform target detection on the acquired video frames to obtain the first booth, which is a booth with a fixed shape.
 Due to the advantage of target detection, the first booth with high consistency and its position coordinates can be detected relatively easily from the video frame. It should be noted that, in the embodiment of the present application, the first booth does not specifically refer to a certain fixed booth, but refers to one or more booths detected by the detection model.
 Step 403: Use the trained segmentation model to perform image segmentation on the same video frame, and perform CCL (Connected Component Labeling, that is, connected domain processing) on the segmentation result to obtain the second booth, which is a booth with a variable shape.
 Owing to the advantages of target segmentation, the contour of the second booth, which has low consistency and may be missed by the detection model, can be segmented from the video frame relatively easily. Morphological dilation and erosion can then be applied to the contour of the second booth to remove noise regions, after which CCL is used to extract the circumscribed rectangle enclosing the contour of the second booth together with the position coordinates of that rectangle, which serve as the position coordinates of the second booth.
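The connected domain step can be illustrated with a small pure-Python sketch. It is not the patent's implementation: the function name `connected_component_boxes` is hypothetical, 4-connectivity is an assumption, and a minimum-area filter (`min_area`) stands in for the morphological noise removal described above, which is omitted here.

```python
from collections import deque


def connected_component_boxes(mask, min_area=1):
    """Label 4-connected foreground regions of a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = booth pixel).
    Returns (left, top, right, bottom) boxes with inclusive pixel
    coordinates, one per region containing at least `min_area` pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            left = right = x0
            top = bottom = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # small regions are treated as noise
                boxes.append((left, top, right, bottom))
    return boxes


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
boxes = connected_component_boxes(mask, min_area=2)
```

In this toy mask the isolated pixel at column 4, row 1 is discarded as noise, leaving one box per remaining region.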
 Similarly, the second booth here does not specifically refer to a certain fixed booth, but refers to one or more booths obtained with the segmentation model. It should be noted that the first booth and the second booth may include the same booth, so deduplication may be performed on the first booth and the second booth before step 405 is executed.
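The text does not say how deduplication is performed. One common choice, shown here purely as an assumption, is to drop a segmentation box when its intersection over union (IoU) with an already kept box reaches a threshold. The names `iou` and `merge_booths` and the default threshold of 0.5 are hypothetical; boxes are `(left, top, right, bottom)` with exclusive right and bottom edges.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def merge_booths(first_booths, second_booths, iou_threshold=0.5):
    """Keep all detection boxes, then add segmentation boxes that are not duplicates."""
    merged = list(first_booths)
    for box in second_booths:
        if all(iou(box, kept) < iou_threshold for kept in merged):
            merged.append(box)
    return merged


first = [(10, 10, 50, 50)]
second = [(12, 12, 50, 50), (100, 100, 180, 160)]
booths = merge_booths(first, second)
```

Here the first segmentation box overlaps the detection box with an IoU of about 0.90 and is dropped, while the second is kept.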
 Step 404: Obtain the pre-configured area where stalls are not allowed in the place to be detected.
 Here, the area where stalls are not allowed may be configured manually for the region shown in the video frame, and may be represented by the coordinates of one or more rectangular boxes.
 Step 405: Among the first booth and the second booth, any booth whose overlapping area with the above-mentioned area where stalls are not allowed reaches a set threshold is regarded as an illegal booth.
 In an optional embodiment, the process described in step 405 can be implemented by the method shown in Figure 5:
 Step 501: Input the position coordinates of the first booth and the second booth identified from the current video frame;
 Step 502: Let variable i=0;
 Step 503: judging whether i is less than the total number of booths identified from the current video frame; if yes, execute step 504; if no, execute step 507;
 Step 504: Select an unprocessed booth from the first booth and the second booth, and determine whether the position coordinates of the booth overlap with the preset position coordinates of the area where stalls are not allowed and whether the overlapping area reaches the set threshold; if yes, execute step 505; if no, execute step 506;
 For example, suppose the coordinates of the upper left corner of a stall are (50, 50) and the coordinates of its lower right corner are (60, 60), while the coordinates of the upper left corner of the preset area where stalls are not allowed are (0, 100) and the coordinates of its lower right corner are (100, 0). The stall then lies inside the preset area where stalls are not allowed, and the overlapping area is 10 × 10 = 100 pixels.
 Step 505: save the position coordinates of the booth, and continue to execute step 506;
 Step 506: Add 1 to the value of i, and return to step 503;
 Step 507: Confirm that the saved booth coordinates are the location coordinates of illegal booths.
 Through steps 401-405 and steps 501-507, the illegal booth information of a single video frame can be determined.
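Steps 501 to 507 amount to one pass over the booths with a rectangle-overlap test. The sketch below is illustrative only: the function names are hypothetical, each rectangle is given as two opposite corners, and the corners are normalized first so that the worked example from step 504 can be used exactly as written regardless of the axis convention.

```python
def normalize(rect):
    """Return (left, top, right, bottom) from two opposite corners in any order."""
    (x1, y1), (x2, y2) = rect
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)


def overlap_area(rect_a, rect_b):
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    a_left, a_top, a_right, a_bottom = normalize(rect_a)
    b_left, b_top, b_right, b_bottom = normalize(rect_b)
    width = min(a_right, b_right) - max(a_left, b_left)
    height = min(a_bottom, b_bottom) - max(a_top, b_top)
    return max(width, 0) * max(height, 0)


def find_illegal_booths(booths, forbidden_areas, area_threshold):
    """Return the booths whose overlap with any forbidden area reaches the threshold."""
    illegal = []
    for booth in booths:  # steps 502 to 506: visit each booth once
        if any(overlap_area(booth, area) >= area_threshold for area in forbidden_areas):
            illegal.append(booth)  # step 505: save the position coordinates
    return illegal  # step 507: the saved coordinates are the illegal booths


booth = ((50, 50), (60, 60))      # stall corners from the example in step 504
forbidden = ((0, 100), (100, 0))  # preset area corners from the same example
area = overlap_area(booth, forbidden)  # 100
```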
 As an embodiment, compared with raising an alarm from the illegal booth identification result of a single video frame alone, using the identification results of multiple video frames helps prevent false detections. The specific implementation is as follows:
 1) Create corresponding illegal records for the places monitored by each surveillance camera, and the content of the illegal records is empty when they are initially created.
 2) After the illegal booth included in the current video frame is determined, the location coordinates of the determined illegal booth are obtained; then, the location to be detected monitored by the current video frame is determined.
 3) For the position coordinates of each illegal booth identified from the current video frame, determine whether the illegal records of the place to be detected have included the position coordinates of the illegal booth;
 If it is not included, add the position coordinates of the illegal booth to the illegal record, set the number of illegal frames in which the position coordinates have been detected to 1, and set the number of frames for which the position coordinates are allowed to disappear briefly to an initial value, which can be an integer greater than 0. Counting the number of illegal frames detected at the same position coordinates prevents false detections. Setting a number of frames for which a booth is allowed to disappear briefly improves the detection rate of illegal booths: it avoids wrongly concluding that an illegal booth no longer exists at the place to be detected merely because it was not detected in some later frame.
 If the illegal record of the place to be detected already includes the position coordinates of the illegal booth, the number of illegal frames recorded for those position coordinates is increased by 1. The number of frames allowed for a brief disappearance recorded for those position coordinates can either be reset to the initial value or be left unchanged. It is then judged whether the number of illegal frames after the increment equals the set threshold. If it does, an alarm is generated for the place to be detected; if it is less than or greater than the set threshold, no alarm is generated, which avoids repeated alarms for the same place.
 5) For every other position coordinate in the illegal record of the place to be detected that does not match the position coordinates of any illegal booth identified from the current frame, subtract 1 from the number of frames allowed for a brief disappearance corresponding to that position coordinate; then judge whether this number is 0 after the subtraction, and if so, delete that position coordinate from the illegal record of the place to be detected.
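The record-keeping logic above can be sketched for a single place as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the `reset_grace` flag are hypothetical, and two position coordinates count as the same booth only when they are exactly equal, since the text does not say how a match is decided.

```python
def update_records(records, illegal_coords, alarm_threshold, initial_grace, reset_grace=True):
    """Update one place's illegal record with the current frame; return coordinates to alarm.

    records: dict mapping position coordinates -> {"hits": int, "grace": int},
             where "hits" is the number of illegal frames detected and "grace"
             is the number of frames the booth may still disappear briefly.
    illegal_coords: position coordinates of illegal booths in the current frame.
    """
    alarms = []
    current = set(illegal_coords)
    for coords in current:
        entry = records.get(coords)
        if entry is None:
            # First sighting: start counting and grant the initial grace period.
            records[coords] = {"hits": 1, "grace": initial_grace}
            continue
        entry["hits"] += 1
        if reset_grace:  # the text allows either resetting or leaving it unchanged
            entry["grace"] = initial_grace
        if entry["hits"] == alarm_threshold:  # alarm exactly once per booth
            alarms.append(coords)
    for coords in list(records):
        if coords in current:
            continue
        # Recorded booth not seen in this frame: consume one grace frame.
        records[coords]["grace"] -= 1
        if records[coords]["grace"] == 0:
            del records[coords]
    return alarms


records = {}
booth = (50, 50, 60, 60)
frames = [[booth], [booth], [], [booth], [booth]]  # booth missed in the third frame
alarms_per_frame = [
    update_records(records, frame, alarm_threshold=3, initial_grace=2)
    for frame in frames
]
```

In this run the missed third frame only consumes one grace frame, so the booth's count survives and the single alarm is raised on the fourth frame; the fifth frame raises nothing because the count has passed the threshold.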
 In an optional embodiment, the process of generating an alarm can be implemented by the method shown in Figure 6:
 Step 601: Input the location coordinates of the illegal booths identified from the current video frame, and the number of illegal frames detected by each updated location coordinate;
 Step 602: Let variable i=0;
 Step 603: Determine whether i is smaller than the total number of illegal stalls identified from the current frame; if yes, execute step 604, if not, execute step 606.
 Step 604: Screen out an unprocessed position coordinate from the position coordinates of the above-mentioned illegal stalls, and update the state of the position coordinate;
 Specifically, the state of the position coordinate can be updated as follows: if the number of illegal frames detected at the position coordinate is less than the set threshold, update the attribute of the position coordinate to "suspected illegal target"; if the number of illegal frames detected at the position coordinate equals the set threshold, update the attribute of the position coordinate to "new illegal target"; if the number of illegal frames detected at the position coordinate is greater than the set threshold, update the attribute of the position coordinate to "illegal target already reported";
 Step 605: Add 1 to the value of i, and return to step 603;
 Step 606: Issue an alarm for the position coordinates of each illegal booth identified in the current video frame whose attribute is "new illegal target".
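Steps 601 to 606 can be sketched as a three-way classification followed by a filter. The function names are hypothetical and the label strings paraphrase the three states described in step 604.

```python
def classify(illegal_frames, threshold):
    """Map a booth's illegal-frame count to one of the three states."""
    if illegal_frames < threshold:
        return "suspected illegal target"
    if illegal_frames == threshold:
        return "new illegal target"
    return "illegal target already reported"


def coordinates_to_alarm(frame_counts, threshold):
    """Return the position coordinates whose state is "new illegal target".

    frame_counts: dict mapping position coordinates -> illegal-frame count.
    """
    return [
        coords
        for coords, count in frame_counts.items()
        if classify(count, threshold) == "new illegal target"
    ]


frame_counts = {(50, 50, 60, 60): 3, (10, 10, 40, 40): 1, (70, 20, 90, 45): 7}
to_alarm = coordinates_to_alarm(frame_counts, threshold=3)
```

Only the booth whose count has just reached the threshold is alarmed, which is what prevents both premature and repeated alarms.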
 So far, the description of the method provided in this application is completed.
 As can be seen from the above description, this application implements a target detection algorithm based on deep learning and has correspondingly completed the training of the booth target detection model, which mainly detects booths with high consistency and achieves a high detection rate for most illegal booths;
 Furthermore, this application adopts deep learning and uses a semantic segmentation network model for target segmentation, and has correspondingly completed the training of the booth target segmentation model, which can segment and locate stall groups with variable shapes. By making full use of the advantages of both target detection and target segmentation, efficient detection of the various booths in the scene is achieved;
 Furthermore, this application proposes judgment logic that uses multi-frame statistical information to decide whether a booth is illegal. Compared with making judgments from single-frame detection information alone, this improves the detection rate, eliminates false triggers, and prevents repeated alarms;
 To sum up, this application efficiently detects illegal stalls, which can not only reduce labor costs and maintain the order of urban traffic management, but also provide real-time reference information for urban supervision departments, facilitate management, effectively improve stall management efficiency, and alleviate the problem of disorderly stalls.
 The method provided by the present application has been described above. The device provided by this application is described below.
 Refer to Figure 7, which is a functional block diagram of the illegal booth detection device provided by this application. As shown in Figure 7, the device includes:
 A video frame acquisition module 701, configured to acquire a video frame corresponding to the place to be detected;
 The target detection module 702 is used to utilize the trained detection model to perform target detection on the video frame to obtain a first booth, and the first booth is a booth with a fixed form;
 The target segmentation module 703 is used to use the trained segmentation model to perform image segmentation on the video frame, and perform connected domain processing on the segmentation result to obtain a second booth, and the second booth is a booth with a variety of shapes;
 A configuration acquiring module 704, configured to acquire a pre-configured area where stalls are not allowed in the place to be detected;
 The detection and judging module 705 is configured to consider a booth whose overlapping area with the area not allowed to set up a stall reaches a set threshold among the first booth and the second booth as an illegal booth.
 In one of the embodiments, the device may also include:
 The statistical module is used to obtain the position coordinates of the illegal booths; for the obtained position coordinates of each illegal booth, judge whether the position coordinates are already included in the illegal record of the place to be detected; and if not, add the position coordinates to the illegal record, set the number of illegal frames detected at the position coordinates to 1, and set the number of frames allowed for a brief disappearance corresponding to the position coordinates to the initial value.
 In one of the implementation manners, the statistical module is further configured to: if the position coordinates are included in the illegal record of the place to be detected, add 1 to the number of illegal frames recorded in the illegal record for the position coordinates; and determine whether the number of illegal frames after adding 1 equals the set threshold, and if so, generate an alarm for the place to be detected.
 In one of the implementation manners, the statistical module is further configured to subtract 1 from the number of frames allowed for a brief disappearance corresponding to each other position coordinate in the illegal record that does not match the position coordinates of the illegal booth; and judge whether that number is 0 after the subtraction, and if so, delete the other position coordinate from the illegal record.
 In one of the implementation manners, the target detection module 702 can obtain the detection model by training in the following manner: acquiring booth picture samples under one or more of the following varying conditions: different time periods, different weather, different light intensities, different monitoring camera installations, and different scenes, where the booths with a fixed shape are marked on the booth picture samples by circumscribed rectangles; and, after using the pre-built first deep learning network to perform a certain amount of training on the booth picture samples, obtaining the trained detection model.
 In one of the implementation manners, the target segmentation module 703 can obtain the segmentation model by training in the following manner: acquiring booth picture samples under one or more of the following varying conditions: different time periods, different weather, different light intensities, different monitoring camera installations, and different scenes, where the outlines of the booths with variable shapes are marked on the booth picture samples by pixel-by-pixel calibration; and, after using the pre-built second deep learning network to perform a certain amount of training on the booth picture samples, obtaining the trained segmentation model.
 In one of the implementation manners, the video frame acquisition module 701 is configured to collect video frames corresponding to the place to be detected through a monitoring camera, wherein the installation of the monitoring camera at the place to be detected meets the following conditions: The included angle between the line between the monitoring camera and the booth and the road surface is 15° to 90°.
 It should be noted that the division of units in the embodiment of the present application is schematic, and is only a logical function division, and there may be another division manner in actual implementation. Each functional unit in the embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
 For the implementation process of the functions and effects of each unit in the above device, please refer to the implementation process of the corresponding steps in the above method for details, and will not be repeated here.
 This completes the description of the device shown in Figure 7.
 The present application also provides an illegal booth detection device, including a processor, a memory and a bus, where the processor and the memory are connected to each other through the bus; machine-readable instructions are stored in the memory, and the processor calls the machine-readable instructions through the bus to implement the method shown in Figure 4.
 In addition, the present application also provides a machine-readable storage medium that stores machine-readable instructions; when the machine-readable instructions are called and executed by a processor, they cause the processor to implement the method shown in Figure 4.
 The above is only a preferred embodiment of the application, and is not intended to limit the application. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the application should be included within the scope of protection of the application.