A scene detection method, apparatus, medium and device
By calculating the change in the ratio of the target to its shadow, the software distinguishes between indoor and outdoor scenes, solving the problems of low detection accuracy and poor image quality caused by the lack of sensors in cameras, and achieving higher detection accuracy and improved image quality.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHEJIANG UNIVIEW TECH CO LTD
- Filing Date
- 2021-11-01
- Publication Date
- 2026-06-12
Abstract
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a scene detection method, apparatus, medium and device.
Background Technology
[0002] With the rapid development of technology, scene detection has improved in terms of event detection and image quality in the field of security monitoring, and has gained increasing popularity among users.
[0003] In the field of security monitoring, effectively distinguishing between indoor and outdoor scenes not only improves the recognition performance of traditional algorithms but also significantly enhances image quality. Currently, most commonly used scene detection methods rely on multiple sensors integrated into the terminal to collect data in indoor and outdoor states and to build a sample dataset for each state. Classification attribute data are then extracted from the collected data and passed to the classifier corresponding to each state to determine the current scene category.
[0004] Traditional scene detection methods rely on sensors to distinguish scene categories, but most surveillance cameras do not have such sensors, resulting in low scene detection accuracy and poor quality of the images captured by the cameras.
Summary of the Invention
[0005] This application provides a scene detection method, apparatus, medium, and device that calculate the ratio of a target to its shadow and determine the change in that ratio over at least two frames, thereby determining more accurately the scene category in which the target is located.
[0006] In a first aspect, embodiments of this application provide a scene detection method, the method comprising:
[0007] Acquire a preset number of images;
[0008] Target detection and motion detection are performed on the preset number of frames of images to determine the target information, target shadow information, and target motion information in each frame of the image.
[0009] If the target motion information meets the recognition conditions, then the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information.
[0010] The target is determined to be in an outdoor or indoor scene based on the change in the ratio of the target to its shadow over at least two frames.
[0011] In a second aspect, embodiments of this application provide a scene detection device, which includes:
[0012] The image acquisition module is used to acquire a preset number of frames of images;
[0013] The image information determination module is used to perform target detection and motion detection on the preset number of frames of images to determine the target information, target shadow information and target motion information in each frame of images;
[0014] The ratio determination module is used to determine the ratio of the target to the target shadow frame by frame based on the target information and the target shadow information if the target motion information meets the recognition conditions.
[0015] The scene determination module is used to determine whether the target is in an outdoor scene or an indoor scene based on the change in the ratio of the target to its shadow over at least two frames.
[0016] In a third aspect, embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the scene detection method as described in embodiments of this application.
[0017] In a fourth aspect, embodiments of this application provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the scene detection method as described in the embodiments of this application.
[0018] The technical solution provided in this application calculates the ratio of the target to the target shadow frame by frame, and then determines the scene category of the target based on the change in that ratio over at least two frames. Scene categories are thus distinguished in software, which improves the accuracy of scene detection results and allows image parameters to be controlled according to the scene to improve image quality. In addition, this technical solution does not rely on the sensor hardware used by traditional scene detection methods; because it uses software recognition, it is applicable to existing cameras, has wide adaptability, and is easy to popularize.
Attached Figure Description
[0019] Figure 1 is a flowchart of the scene detection method provided in the embodiments of this application;
[0020] Figure 2 is a flowchart of another scene detection method provided in the embodiments of this application;
[0021] Figure 3 is a flowchart of another scene detection method provided in the embodiments of this application;
[0022] Figure 4 is a structural block diagram of a scene detection device provided in an embodiment of this application;
[0023] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0024] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and not intended to limit it. Furthermore, it should be noted that, for ease of description, the accompanying drawings show only the parts relevant to the present application, not the entire structure.
[0025] Before discussing the exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the steps as sequential processes, many of these steps can be performed in parallel, concurrently, or simultaneously. Furthermore, the order of the steps can be rearranged. The process can be terminated when its operation is complete, but may also have additional steps not included in the figures. The process can correspond to a method, function, procedure, subroutine, subprogram, etc.
[0026] Figure 1 is a flowchart of a scene detection method provided in an embodiment of this application. This embodiment is applicable to scene detection in security monitoring. The method can be executed by the scene detection device provided in this embodiment, which can be implemented in software and/or hardware and can be integrated into an electronic device. As shown in Figure 1, the scene detection method includes:
[0027] S110, acquire a preset number of frames of images.
[0028] This application embodiment can acquire images and identify the scene using a front-end camera, or it can acquire images and identify the scene using a back-end server. After determining the scene, image parameters can be controlled according to the scene. Therefore, the execution entity of this application embodiment can be the camera itself or the back-end connected server; no specific limitation is made here.
[0029] In animation, a frame is the smallest unit of visual imagery, equivalent to a single shot on film. A frame is a still image, and consecutive frames form animation, such as television images. A preset number of frames can be understood as a set of frame images.
[0030] The preset number of image frames can be obtained through backend network transmission. For example, if the video runs at 30 frames per second, the backend server can extract a preset number of 3 frames from it over the network.
[0031] For example, a backend server connected to the camera can acquire a preset number of 3 frames of images from the camera via network transmission.
[0032] Therefore, the preset number of frames can be consecutive frames from the video, or frames extracted from it. Specifically, extraction can be performed within a certain time period. For example, when a moving target is detected within the camera's field of view, a detection period of 5 seconds can be defined. This means that all frames within that 5-second period can be used as the analysis object, or a certain number of frames can be extracted from that 5-second period as the analysis object.
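As an illustration of extracting a preset number of frames from a detection period, the following minimal Python sketch picks evenly spaced frame indices. The function name `sample_frame_indices` and the even-spacing strategy are assumptions made for illustration only; the embodiment does not prescribe how the extracted frames are chosen.

```python
def sample_frame_indices(fps: int, window_seconds: float, count: int) -> list[int]:
    """Return up to `count` frame indices spread evenly over a detection window.

    Hypothetical helper: `fps`, `window_seconds`, and `count` are illustrative
    parameter names, not terms defined by the embodiment.
    """
    total = int(fps * window_seconds)  # frames available in the window
    if count <= 0 or total <= 0:
        return []
    if count >= total:
        # Fewer frames available than requested: use every frame.
        return list(range(total))
    step = total / count
    return [int(i * step) for i in range(count)]


# 30 frames per second, 1-second window, preset number of 3 frames.
indices = sample_frame_indices(30, 1, 3)
```

Passing a `count` at least as large as the number of frames in the window corresponds to the case in which all frames within the detection period are used as the analysis object.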
[0033] S120, Perform target detection and motion detection on a preset number of frames of images to determine the target information, target shadow information and target motion information in each frame of images.
[0034] The target can be understood as the object to be detected in the pre-set number of image frames, which can be a person or an object.
[0035] Object detection, also known as object extraction, can be understood as finding all objects of interest in an image. It includes two sub-tasks: object localization and object classification, simultaneously determining the object's category and location. Automatic object extraction is particularly important in complex scenes where multiple targets need to be processed in real time. Taking human detection as an example, human detection is a method based on extracting human contours and pedestrian body features, such as height, and matching them with a pre-trained human model library to calculate the pedestrian's location.
[0036] Motion detection, also known as movement detection, is commonly used in unattended video surveillance and automatic alarm systems. Images captured by a camera at different frame rates are computed and compared according to a certain algorithm. When the scene changes, for example when someone walks by or the camera is moved, the comparison result exceeds a threshold, and the system is instructed to take the appropriate action automatically.
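The embodiment does not name the comparison algorithm. Purely as an illustration, the sketch below uses simple frame differencing, a common technique that is substituted here as an assumption: two grayscale frames are compared pixel by pixel, and motion is reported when the mean absolute difference exceeds a threshold. The function name `motion_detected` and the default threshold of 10.0 are hypothetical.

```python
def motion_detected(prev_frame, curr_frame, threshold: float = 10.0) -> bool:
    """Report motion when the mean absolute pixel difference exceeds `threshold`.

    Frames are 2D lists of grayscale values (0-255) with identical shape.
    Frame differencing is an assumed stand-in for the unspecified algorithm.
    """
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        for p, c in zip(row_prev, row_curr):
            total += abs(p - c)
            count += 1
    if count == 0:
        return False
    return total / count > threshold


still = [[0, 0], [0, 0]]
moved = [[255, 255], [0, 0]]  # half of the pixels changed strongly
```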
[0037] Target information can be the feature information of the target that needs to be detected and motion detected, such as the target's outline. Target shadow information can be the length and position of the shadow cast by the target due to illumination, or the angle between the target and its shadow. Target motion information can be the target's speed, displacement, and direction of motion.
[0038] After acquiring a preset number of images, target detection and motion detection can be performed on the preset number of images through a backend connected server.
[0039] For example, for the three frames of images captured by the camera obtained in S110 above, the objects to be detected are determined by target detection, such as the height features of a person and the length of a person's shadow, and the motion information of the person in each frame of the image is determined by motion detection, such as the distance of the person's movement and the direction of the person's movement.
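The three kinds of information determined in S120 can be pictured as one record per frame. The following Python sketch is only one possible representation; all class and field names are hypothetical, and the embodiment does not define a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetInfo:
    height_m: float  # target feature, e.g. a person's height in meters


@dataclass(frozen=True)
class ShadowInfo:
    length_m: float                 # length of the target's shadow in meters
    direction: tuple[float, float]  # direction of the shadow as a 2D vector


@dataclass(frozen=True)
class MotionInfo:
    displacement_m: float           # distance moved since the previous frame
    direction: tuple[float, float]  # direction of movement as a 2D vector


@dataclass(frozen=True)
class FrameResult:
    frame_number: int
    target: TargetInfo
    shadow: ShadowInfo
    motion: MotionInfo


# One frame of the person example, with illustrative values.
frame1 = FrameResult(
    frame_number=1,
    target=TargetInfo(height_m=1.8),
    shadow=ShadowInfo(length_m=1.5, direction=(1.0, 0.0)),
    motion=MotionInfo(displacement_m=0.7, direction=(1.0, 0.0)),
)
```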
[0040] S130, if the target motion information meets the recognition conditions, then the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information.
[0041] After performing target detection and motion detection on a preset number of image frames, a backend connected server can be used to determine whether the target motion information meets the recognition conditions.
[0042] The recognition conditions can be understood as constraints that the target's motion information must satisfy. The ratio can be the ratio of the target's height to the length of its shadow.
[0043] For example, in S120 above, if the motion information of the person in each frame of the image meets a certain requirement, that is, the recognition condition is satisfied, then the ratio of the person to the person's shadow is determined frame by frame based on the person's characteristics and the person's shadow information.
[0044] S140, based on the amount of change in the ratio of the target to its shadow over at least two frames, determine whether the target is in an outdoor or indoor scene.
[0045] After determining the ratio of the target to its shadow frame by frame, the change in this ratio over at least two frames is calculated. This can be done using a backend-connected server. Specifically, based on the magnitude of the change in the target-to-shadow ratio over at least two frames, it is determined whether the change falls within the range for indoor or outdoor scenes, thus identifying whether the target is in an indoor or outdoor scene.
[0046] For example, based on the magnitude of the change in the ratio of a person to their shadow determined in S130 above over at least two frames, if the magnitude of the change is within the range of ratio changes in an outdoor scene, then the person is determined to be in an outdoor scene; otherwise, the person is determined to be in an indoor scene.
[0047] The technical solution provided in this application involves acquiring a preset number of image frames; performing target detection and motion detection on these frames to determine target information, target shadow information, and target motion information in each frame; if the target motion information meets the recognition criteria, determining the target-to-target shadow ratio frame by frame based on the target information and target shadow information; and determining whether the target is in an outdoor or indoor scene based on the change in the target-to-target shadow ratio over at least two frames. This technical solution can determine the scene category based on the change in the target-to-target shadow ratio over at least two frames, achieving software-based scene category differentiation, improving the accuracy of scene detection results, and thus controlling image parameters according to the scene to improve image quality.
[0048] In some embodiments, determining whether the target is in an outdoor scene or an indoor scene based on the change in the ratio of the target to its shadow over at least two frames includes: if the change in the ratio of the target to its shadow over at least two frames is less than a set threshold, then the target is determined to be in an outdoor scene; if the change in the ratio of the target to its shadow over at least two frames is greater than a set threshold, then the target is determined to be in an indoor scene.
[0049] When calculating the change in the ratio of the target to its shadow over at least two frames to determine whether the target is in an outdoor or indoor scene, this is done by comparing the change in the ratio of the target to its shadow over at least two frames with a threshold. A backend server can be used to compare the change in the ratio of the target to its shadow over at least two frames with the threshold to determine whether the target is in an outdoor or indoor scene. The threshold, also called the critical value, refers to the minimum or maximum value that an effect can produce. It can be used to determine the scene category to which the change in the ratio of the target to its shadow over at least two frames belongs. For example, the range of change in the ratio of the target to its shadow over at least two frames is 0-1.3 for outdoor scenes, and the range of change in the ratio of the target to its shadow over at least two frames is above 1.3 for indoor scenes; therefore, the threshold can be set to 1.3.
[0050] For example, based on the magnitude of the change, over at least two frames, in the ratio of a person to that person's shadow determined in the above steps, if the magnitude of the change is less than 1.3, the target is determined to be in an outdoor scene; if the magnitude of the change is greater than 1.3, the target is determined to be in an indoor scene. The advantage of this setting is that it can improve the accuracy of determining whether the target is in an indoor or outdoor scene.
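The threshold comparison can be sketched in Python as follows. The function name `classify_scene` is hypothetical, and returning `None` when the change equals the threshold is an assumption, since the embodiment only covers the "less than" and "greater than" cases.

```python
from typing import Optional


def classify_scene(ratio_change: float, threshold: float = 1.3) -> Optional[str]:
    """Map the change in the target-to-shadow ratio to a scene category.

    The default threshold of 1.3 is the example value from the text.
    """
    if ratio_change < threshold:
        return "outdoor"
    if ratio_change > threshold:
        return "indoor"
    return None  # assumption: the equal case is left undetermined
```

For example, `classify_scene(0.4)` yields `"outdoor"` and `classify_scene(1.5)` yields `"indoor"`.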
[0051] In some embodiments, the ratio of the target to its shadow is calculated using the following formula:
[0052] $$R = \frac{H}{L}$$
[0053] where $R$ is the ratio of the target to its shadow, $H$ is the target height, and $L$ is the length of the target's shadow.
[0054] When calculating the ratio of the target to its shadow, the ratio is determined by the target height and the length of its shadow. This ratio calculation can be performed on a backend server. Both the target height and shadow length are measured in meters.
[0055] For example, assuming a person's height is 1.8 meters and their shadow is 1.5 meters long, substituting these values into the ratio calculation formula yields a ratio of 1.2 between the person and their shadow. The advantage of this setting is that it allows for a more accurate determination of the scene category in which the target is located.
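The worked example above can be reproduced with a short Python sketch. The function name `target_shadow_ratio` is hypothetical, and rejecting a non-positive shadow length is an added assumption to avoid division by zero.

```python
def target_shadow_ratio(height_m: float, shadow_length_m: float) -> float:
    """Ratio of the target height to the length of the target's shadow."""
    if shadow_length_m <= 0:
        # Assumption: a frame without a measurable shadow cannot be scored.
        raise ValueError("shadow length must be positive")
    return height_m / shadow_length_m


# A person 1.8 m tall with a 1.5 m shadow gives a ratio of about 1.2.
ratio = target_shadow_ratio(1.8, 1.5)
```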
[0056] In some embodiments, the process of determining whether the target motion information meets the recognition conditions includes: if the cumulative motion displacement of the target motion information reaches a preset length, it is determined that the recognition conditions are met; if the cumulative motion displacement of the target motion information does not reach the preset length, it is determined that the recognition conditions are not met.
[0057] To determine whether target motion information meets the recognition criteria, it is necessary to compare the cumulative displacement of the target motion information with a preset length. This comparison can be performed using a backend server. The preset length can be understood as a critical value for measuring the cumulative displacement of the target motion information, such as 2 meters.
[0058] For example, if the cumulative displacement of a person's motion information is 2.2 meters, reaching a preset length of 2 meters, then the person's motion information is determined to meet the recognition criteria. The advantage of this setting is that it makes the determination of whether the recognition criteria are met more accurate.
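A minimal Python sketch of this check is given below. It assumes that the target's position in each frame is already available as ground-plane coordinates in meters; how those positions are obtained from the image is not described in the embodiment, and the function names are hypothetical.

```python
import math


def cumulative_displacement(positions: list[tuple[float, float]]) -> float:
    """Sum of the distances between consecutive target positions, in meters."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def meets_recognition_condition(
    positions: list[tuple[float, float]], preset_length_m: float = 2.0
) -> bool:
    """True when the cumulative displacement reaches the preset length."""
    return cumulative_displacement(positions) >= preset_length_m


# The person moves 1.0 m and then 1.2 m, for 2.2 m in total.
track = [(0.0, 0.0), (1.0, 0.0), (2.2, 0.0)]
```

With the example preset length of 2 meters, `meets_recognition_condition(track)` is true, while a track covering only 1 meter would not meet the condition.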
[0059] In some embodiments, if the target motion information meets the recognition conditions, the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information, including: if the target motion information meets the recognition conditions, for each frame image, the target information and the target shadow information are numbered according to the frame number; the target information and the target shadow information with the same number are retrieved to calculate the ratio of the target to the target shadow.
[0060] When determining the target-to-shadow ratio frame by frame, the target and shadow information in each frame can be numbered, and the target and shadow information with the same number can be retrieved to calculate the target-to-shadow ratio. The backend server can number the target and shadow information frame by frame and extract the target and shadow information with the same number to calculate the target-to-shadow ratio. The frame number can be understood as the sequence number of each acquired image frame; for example, the first frame has a frame number of 1.
[0061] For example, for a preset number of 3 frames of images captured by a camera, the frame numbers are 1, 2, and 3 respectively. The person information and the person's shadow information are then numbered according to the frame number: the person information is numbered 1, 2, and 3, and the shadow information is likewise numbered 1, 2, and 3. The person information and shadow information with the same number are retrieved, and the ratio of the person to the person's shadow in each frame is calculated according to the ratio calculation formula. The specific calculation method is the same as above and will not be repeated. Then, the change in the ratio of the person to the person's shadow over at least two frames is calculated. The advantage of this setting is that the ratio is calculated more accurately frame by frame, improving the accuracy of the calculated change in the ratio over at least two frames.
[0062] Based on the above technical solutions, Figure 2 is a flowchart of another scene detection method provided in an embodiment of this application. This solution is a further optimization based on the above-described technical solutions. As shown in Figure 2, the scene detection method includes:
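The numbering and retrieval step can be sketched in Python by keying both kinds of information on the frame number. The function name `ratios_by_frame` and the use of dictionaries are assumptions for illustration, and the shadow lengths are made-up sample values.

```python
def ratios_by_frame(
    target_heights: dict[int, float], shadow_lengths: dict[int, float]
) -> dict[int, float]:
    """Pair target and shadow entries that share a frame number and
    compute the height-to-shadow-length ratio for each pair."""
    shared = sorted(target_heights.keys() & shadow_lengths.keys())
    return {n: target_heights[n] / shadow_lengths[n] for n in shared}


# Three frames numbered 1, 2, and 3 (illustrative values in meters).
heights = {1: 1.8, 2: 1.8, 3: 1.8}
shadows = {1: 1.5, 2: 1.2, 3: 0.9}
ratios = ratios_by_frame(heights, shadows)
```

Frames for which only one of the two entries exists are skipped, so a ratio is never computed from mismatched frames.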
[0063] S210, acquire a preset number of frames of images.
[0064] S220, target detection and motion detection are performed on a preset number of frames of images to determine the target information, target shadow information and target motion information in each frame of images.
[0065] S230, if the target motion information meets the recognition conditions, then the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information.
[0066] S240, based on the amount of change in the ratio of the target to its shadow over at least two frames, determine whether the target is in an outdoor or indoor scene.
[0067] S250, determine the target's motion direction based on the target's motion information; if the target's motion direction is opposite to the direction of the target's shadow, then determine the target's motion state as a backlighting scene; if the target's motion direction is the same as the direction of the target's shadow, then determine the target's motion state as a frontlighting scene.
[0068] After determining whether the target is in an outdoor or indoor scene, it's necessary to determine whether the target's motion state is in a front-lit or back-lit scene based on the target's movement direction and the direction of its shadow. This can be achieved by a backend server using a preset number of image frames captured by the camera to determine the target's movement direction and shadow direction. The target's motion state can be understood as whether the target is in a front-lit or back-lit scene while moving.
[0069] Continuing with the example above, if a person is identified as being in an indoor scene, and the direction of movement in the person's motion information is opposite to the direction of their shadow, then the person's motion state is a backlit scene; otherwise, the person's motion state is a front-lit scene. The advantage of this setting is that image parameters can be determined based on whether the target is in a front-lit or backlit scene, allowing for clearer acquisition of the target's feature information.
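One way to compare the two directions is the sign of the dot product of the movement vector and the shadow vector. This test is an assumption introduced for illustration; the embodiment only states the "same" and "opposite" cases, so perpendicular directions are left undetermined here. The function name `lighting_state` is hypothetical.

```python
from typing import Optional


def lighting_state(
    motion_dir: tuple[float, float], shadow_dir: tuple[float, float]
) -> Optional[str]:
    """Classify the motion state from the movement and shadow directions."""
    dot = motion_dir[0] * shadow_dir[0] + motion_dir[1] * shadow_dir[1]
    if dot > 0:
        return "front-lit"  # movement and shadow point the same way
    if dot < 0:
        return "backlit"    # movement is opposite to the shadow
    return None             # assumption: perpendicular or zero vectors


state = lighting_state((1.0, 0.0), (-1.0, 0.0))  # moving against the shadow
```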
[0070] The technical solution provided in this application determines the target's motion direction based on the target's motion information. If the target's motion direction is opposite to the direction of its shadow, the target's motion state is determined to be a backlit scene; if the target's motion direction is the same as the direction of its shadow, the target's motion state is determined to be a front-lit scene. This technical solution can determine the scene category based on the change in the ratio of the target to its shadow over at least two frames. Based on whether the target is in a front-lit or backlit scene, image parameters are determined, allowing for clearer acquisition of the target's feature information. This software-based approach distinguishes scene categories, improves the accuracy of scene detection results, and enables control of image parameters according to the scene to enhance image quality.
[0071] Based on the above technical solutions, this application provides a preferred scene detection method. Figure 3 is a flowchart of another scene detection method provided in the embodiments of this application. As shown in Figure 3, the method in this embodiment specifically includes the following steps:
[0072] S310, acquire a preset number of frames of images.
[0073] S320, perform target detection and motion detection on a preset number of frames of images to determine the target information, target shadow information, and target motion information in each frame of images.
[0074] After a preset number of image frames are acquired, the backend server, which receives them over the network, performs target detection and motion detection on these frames. Specifically, target detection and motion detection determine the target information, target shadow information, and target motion information in each frame. For example, the target information might be that a person's height is 1.8 meters, the target shadow information that the person's shadow length is 1.2 meters, and the target motion information that the person's movement direction is the same as the direction of the person's shadow.
[0075] For example, target detection and motion detection are performed on a preset number of frames of images to obtain the height of the person in each frame as 1.8 meters, the length of the person's shadow as 1.2 meters, and the direction of the person's movement as the same as the direction of the person's shadow.
[0076] S330, if the cumulative displacement of the target motion information reaches the preset length, it is determined that the recognition condition is met; if the cumulative displacement of the target motion information does not reach the preset length, it is determined that the recognition condition is not met.
[0077] When determining whether the target motion information meets the recognition criteria, the cumulative displacement in the collected target motion information is compared with a preset length to determine whether the recognition criteria are met. Specifically, the cumulative displacement length can be understood as the distance the target moves during its motion, for example, 2.1 meters. By determining whether the cumulative displacement of the target motion information reaches the preset length, it is thus determined whether the recognition criteria are met.
[0078] For example, if the cumulative displacement of a person's motion information is 2.1 meters, which exceeds the preset length of 2 meters, then the person's motion information does indeed meet the recognition conditions.
[0079] S340, if the target motion information meets the recognition conditions, then for each frame of image, the target information and target shadow information are numbered according to the frame number; the target information and target shadow information with the same number are retrieved to calculate the ratio of the target to the target shadow.
[0080] Once the target motion information is determined to meet the recognition criteria, the target information and target shadow information can be numbered frame by frame. The server connected to the backend retrieves the target information and target shadow information with the same number and calculates the ratio of the target information to the target shadow information.
[0081] For example, if the human motion information meets the recognition conditions, then for the acquired image of a preset number of frames, the height of the person and the shadow of the person are numbered frame by frame according to the frame number. For example, in the first frame of the image, the height of the person is numbered 1 and the shadow of the person is numbered 1. The height information and shadow information of the person are numbered in the second to sixth frames in the same way. Then, the information of the person and the shadow information with the same number are retrieved. For example, the height information of the person is 1.8 meters and the shadow information of the person is 1.2 meters. The ratio of the height of the person to the length of the shadow is calculated according to the ratio formula. The specific calculation method is the same as above and will not be repeated.
[0082] S350: Based on the change in the ratio of the target to its shadow over at least two frames, determine whether the target is in an outdoor scene or an indoor scene; if the change in the ratio of the target to its shadow over at least two frames is less than a set threshold, determine that the target is in an outdoor scene; if the change in the ratio of the target to its shadow over at least two frames is greater than a set threshold, determine that the target is in an indoor scene.
[0083] After calculating the ratio of target information to target shadow information frame by frame, it is necessary to calculate the change in the ratio of target to target shadow over at least two frames to determine the scene category in which the target is located.
[0084] For example, based on the ratio of the person's information to their shadow information obtained in the above steps, the change in the ratio over at least two frames is calculated to determine whether the person is in an outdoor or indoor scene. For instance, if the change in the ratio of the person to their shadow is 1.5 in two frames (the first and third frames) of a predetermined number of images, which is greater than a set threshold of 1.3, then the person is determined to be in an indoor scene.
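The text does not define how the change is computed. The Python sketch below assumes it is the absolute difference between the ratios of two selected frames; the function name `ratio_change` and the per-frame ratio values are illustrative only, chosen so that the change between the first and third frames is 1.5.

```python
def ratio_change(ratios: dict[int, float], first: int, second: int) -> float:
    """Absolute difference between the ratios of two numbered frames
    (assumed definition of the change in ratio)."""
    return abs(ratios[second] - ratios[first])


ratios = {1: 1.2, 2: 1.9, 3: 2.7}    # illustrative per-frame ratios
change = ratio_change(ratios, 1, 3)  # about 1.5
is_indoor = change > 1.3             # exceeds the set threshold
```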
[0085] S360: Determine the target's motion direction based on the target's motion information; if the target's motion direction is opposite to the direction of the target's shadow, then determine that the target's motion state is a backlighting scene; if the target's motion direction is the same as the direction of the target's shadow, then determine that the target's motion state is a frontlighting scene.
[0086] After determining the scene category of the target, the lighting scene category is further determined by comparing the target's movement direction with the direction of its shadow. For example, based on the fact that the direction of the person's movement is the same as the direction of their shadow in the person's motion information, the person's motion state is determined to be a front-lit scene.
[0087] The technical solution provided in this application can determine the scene category based on the change in the ratio of the target to its shadow in at least two frames. This achieves the use of software to distinguish scene categories, improves the accuracy of scene detection results, and allows for the control of image parameters based on the scene to improve image quality.
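Steps S330 to S360 can be combined into one end-to-end Python sketch. It is not the implementation of the embodiment: the `Observation` record, the function name `detect_scene`, the use of the largest minus the smallest ratio as the change, the dot-product direction test, and the handling of boundary cases are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    frame_number: int
    height_m: float
    shadow_length_m: float
    position_m: tuple[float, float]       # assumed ground-plane position
    shadow_direction: tuple[float, float]


def detect_scene(
    observations: list[Observation],
    preset_length_m: float = 2.0,
    threshold: float = 1.3,
) -> Optional[tuple[Optional[str], Optional[str]]]:
    obs = sorted(observations, key=lambda o: o.frame_number)
    if len(obs) < 2:
        return None
    # S330: the recognition condition is met when the cumulative
    # displacement reaches the preset length.
    path = sum(math.dist(a.position_m, b.position_m) for a, b in zip(obs, obs[1:]))
    if path < preset_length_m:
        return None
    # S340: ratio of target height to shadow length, frame by frame.
    ratios = [o.height_m / o.shadow_length_m for o in obs]
    # S350: compare the change in ratio with the set threshold.
    change = max(ratios) - min(ratios)
    scene = "outdoor" if change < threshold else "indoor" if change > threshold else None
    # S360: compare the overall movement direction with the shadow direction.
    dx = obs[-1].position_m[0] - obs[0].position_m[0]
    dy = obs[-1].position_m[1] - obs[0].position_m[1]
    sx, sy = obs[-1].shadow_direction
    dot = dx * sx + dy * sy
    lighting = "front-lit" if dot > 0 else "backlit" if dot < 0 else None
    return scene, lighting


frames = [
    Observation(1, 1.8, 1.5, (0.0, 0.0), (1.0, 0.0)),
    Observation(2, 1.8, 1.0, (1.1, 0.0), (1.0, 0.0)),
    Observation(3, 1.8, 0.6, (2.2, 0.0), (1.0, 0.0)),
]
result = detect_scene(frames)
```

In this sample the ratio grows from about 1.2 to about 3.0 while the person walks 2.2 meters in the direction of the shadow, so the sketch reports an indoor, front-lit scene.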
[0088] Figure 4 is a structural block diagram of a scene detection device provided in an embodiment of this application. This device can execute the scene detection method provided in any embodiment of this application, and has the corresponding functional modules and beneficial effects for executing the method. As shown in Figure 4, the device may include:
[0089] Image acquisition module 410 is used to acquire a preset number of frames of images;
[0090] The image information determination module 420 is used to perform target detection and motion detection on the preset number of frames of images to determine the target information, target shadow information and target motion information in each frame of images.
[0091] The ratio determination module 430 is used to determine the ratio of the target to the target shadow frame by frame based on the target information and the target shadow information if the target motion information meets the recognition conditions.
[0092] The scene determination module 440 is used to determine whether the target is in an outdoor scene or an indoor scene based on the change in the ratio of the target to its shadow over at least two frames.
[0093] Optionally, the scene determination module 440 includes:
[0094] An outdoor scene determination unit is used to determine that the target is in an outdoor scene if the change in the ratio of the target to its shadow is less than a set threshold in at least two frames.
[0095] An indoor scene determination unit is used to determine that the target is in an indoor scene if the change in the ratio of the target to its shadow is greater than a set threshold in at least two frames.
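The threshold comparison performed by the two determination units can be illustrated with a minimal sketch. It rests on assumptions that the text does not state: the "change" is taken here as the spread (maximum minus minimum) of the per-frame ratios, the function name `classify_scene` is hypothetical, and the case where the change exactly equals the threshold is left undetermined because the text only covers "less than" and "greater than".

```python
from typing import Optional, Sequence


def classify_scene(ratios: Sequence[float], threshold: float) -> Optional[str]:
    """Classify a scene from per-frame target-to-shadow ratios.

    Assumption: the change is measured as max(ratios) - min(ratios);
    the source does not say how the change is computed.
    """
    if len(ratios) < 2:
        raise ValueError("at least two frames are required")
    change = max(ratios) - min(ratios)
    if change < threshold:
        return "outdoor"  # ratio stays nearly constant
    if change > threshold:
        return "indoor"   # ratio varies noticeably between frames
    return None           # change == threshold: not covered by the source


print(classify_scene([1.50, 1.52, 1.49], threshold=0.2))
print(classify_scene([1.0, 1.5, 2.5], threshold=0.2))
```

A different measure of change, such as the largest difference between consecutive frames, could be substituted without altering the comparison itself.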
[0096] Optionally, the device also includes a motion direction recognition module for:
[0097] The target's motion direction is determined based on the target's motion information; if the target's motion direction is opposite to the direction of the target's shadow, the target's motion state is determined to be a backlit scene; if the target's motion direction is the same as the direction of the target's shadow, the target's motion state is determined to be a frontlit scene.
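One possible way to compare the two directions is sketched below. The text does not specify how directions are represented or compared; this sketch assumes both are 2D vectors in image coordinates and treats a positive dot product as "same direction" and a negative one as "opposite direction". The name `classify_lighting` and the handling of perpendicular vectors are illustrative assumptions.

```python
from typing import Optional, Tuple

Vector = Tuple[float, float]


def classify_lighting(motion: Vector, shadow: Vector) -> Optional[str]:
    """Compare the target's motion direction with its shadow direction.

    Assumption: the sign of the dot product decides "same" versus
    "opposite"; the source does not define the comparison.
    """
    dot = motion[0] * shadow[0] + motion[1] * shadow[1]
    if dot > 0:
        return "front-lit"  # motion and shadow point the same way
    if dot < 0:
        return "backlit"    # motion and shadow point opposite ways
    return None             # perpendicular or zero vector: undetermined


print(classify_lighting((1.0, 0.0), (2.0, 0.5)))
print(classify_lighting((1.0, 0.0), (-1.0, 0.0)))
```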
[0098] Optionally, the ratio of the target to its shadow is calculated using the following formula:
[0099] $R = \dfrac{H}{L}$;
[0100] where $R$ is the ratio of the target to its shadow, $H$ is the target height, and $L$ is the length of the target's shadow.
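As a small worked illustration, the ratio can be computed as follows, assuming it is the target height divided by the shadow length, as the definitions above suggest. The function name, the pixel units, and the rejection of non-positive shadow lengths are assumptions made for this sketch.

```python
def target_shadow_ratio(target_height: float, shadow_length: float) -> float:
    """Return target height divided by shadow length.

    Assumption: both values are measured in the same unit (e.g. pixels).
    """
    if shadow_length <= 0:
        raise ValueError("shadow length must be positive")
    return target_height / shadow_length


# A target 180 px tall with a 120 px shadow.
print(target_shadow_ratio(180.0, 120.0))
```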
[0101] Optionally, the ratio determination module 430 includes:
[0102] The identification condition determination unit is used to determine that the identification condition is met if the cumulative motion displacement of the target motion information reaches a preset length, and to determine that the identification condition is not met if the cumulative motion displacement of the target motion information does not reach the preset length.
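The recognition condition could be checked along the following lines. The sketch assumes that the cumulative motion displacement is the sum of frame-to-frame Euclidean distances between the target's positions and that "reaches" means greater than or equal to; neither detail is stated in the text, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def cumulative_displacement(positions: Sequence[Point]) -> float:
    """Sum the distances between consecutive target positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def meets_recognition_condition(positions: Sequence[Point],
                                preset_length: float) -> bool:
    """True if the accumulated displacement reaches the preset length."""
    return cumulative_displacement(positions) >= preset_length


track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(cumulative_displacement(track))
print(meets_recognition_condition(track, preset_length=10.0))
```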
[0103] Optionally, the ratio determination module 430 includes:
[0104] The information numbering unit is used to number the target information and the target shadow information according to the frame number for each frame of image if the target motion information meets the recognition conditions.
[0105] The information number retrieval unit is used to retrieve target information and target shadow information with the same number in order to calculate the ratio of target to target shadow.
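The numbering and retrieval performed by these two units might be sketched as below. The dictionaries keyed by frame number, the function name `ratios_by_frame`, and the choice to skip frames that lack either measurement are assumptions for illustration; the ratio is again taken as target height divided by shadow length.

```python
from typing import Dict


def ratios_by_frame(target_heights: Dict[int, float],
                    shadow_lengths: Dict[int, float]) -> Dict[int, float]:
    """Pair target and shadow measurements that share a frame number.

    Assumption: frames missing either measurement are skipped.
    """
    shared = sorted(target_heights.keys() & shadow_lengths.keys())
    return {n: target_heights[n] / shadow_lengths[n] for n in shared}


heights = {1: 180.0, 2: 180.0, 3: 178.0}  # keyed by frame number
shadows = {1: 120.0, 2: 90.0}             # frame 3 has no shadow detected
print(ratios_by_frame(heights, shadows))
```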
[0106] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional modules is merely an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. The specific working process of the functional modules described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.
[0107] The apparatus in this embodiment acquires a preset number of image frames through the cooperation of various modules; performs target detection and motion detection on the preset number of image frames to determine target information, target shadow information, and target motion information in each frame; if the target motion information meets the recognition conditions, the ratio of the target to the target shadow is determined frame by frame based on the target information and target shadow information; and the target is determined to be in an outdoor or indoor scene based on the change in the ratio of the target to the target shadow over at least two frames. This technical solution can determine the scene category based on the change in the ratio of the target to the target shadow over at least two frames, realizing the differentiation of scene categories using software, improving the accuracy of scene detection results, and thus controlling image parameters according to the scene to improve image quality.
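The cooperation of the modules can be summarized in a compact end-to-end sketch. Everything named here is hypothetical: `FrameObservation` stands in for the output of target and motion detection, the recognition condition uses summed frame-to-frame distances, the ratio is target height over shadow length, and the change is the spread of the ratios. Returning `None` when the condition is not met, or when the change equals the threshold, is an assumption of this sketch.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FrameObservation:
    """Hypothetical per-frame detection output (not defined in the source)."""
    position: Tuple[float, float]  # target position in image coordinates
    target_height: float
    shadow_length: float


def detect_scene(frames: List[FrameObservation],
                 preset_length: float,
                 threshold: float) -> Optional[str]:
    if len(frames) < 2:
        return None  # the method needs at least two frames
    # Recognition condition: accumulated displacement must reach the preset length.
    path = sum(math.dist(a.position, b.position)
               for a, b in zip(frames, frames[1:]))
    if path < preset_length:
        return None
    # Frame-by-frame ratio of target to shadow.
    ratios = [f.target_height / f.shadow_length for f in frames]
    # Compare the change in the ratio against the set threshold.
    change = max(ratios) - min(ratios)
    if change < threshold:
        return "outdoor"
    if change > threshold:
        return "indoor"
    return None


frames = [FrameObservation((0.0, 0.0), 180.0, 120.0),
          FrameObservation((30.0, 0.0), 180.0, 90.0),
          FrameObservation((60.0, 0.0), 180.0, 60.0)]
print(detect_scene(frames, preset_length=50.0, threshold=0.1))
```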
[0108] This application provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the scene detection method provided in the embodiments of this application, the method including:
[0109] Acquire a preset number of frames of images;
[0110] Target detection and motion detection are performed on the preset number of frames of images to determine the target information, target shadow information, and target motion information in each frame of the image.
[0111] If the target motion information meets the recognition conditions, then the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information.
[0112] The target is determined to be in an outdoor or indoor scene based on the change in the ratio of the target to its shadow over at least two frames.
[0113] Any combination of one or more computer-readable media may be used. A computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium can be, for example—but not limited to—an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this document, a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
[0114] Computer-readable signal media may include data signals propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals may take various forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that is capable of sending, propagating, or transporting a program for use by or in connection with an instruction execution system, apparatus, or device.
[0115] The program code contained on a computer-readable medium may be transmitted using any suitable medium, including—but not limited to—wireless, wire, optical fiber, RF, etc., or any suitable combination thereof.
[0116] Computer program code for performing the operations of this application can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0117] This application provides an electronic device. Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. As shown in Figure 5, this embodiment provides an electronic device 500, which includes: one or more processors 520; and a storage device 510 for storing one or more programs. When the one or more programs are executed by the one or more processors 520, the one or more processors 520 implement the scene detection method provided in this embodiment, the method including:
[0118] Acquire a preset number of frames of images;
[0119] Target detection and motion detection are performed on the preset number of frames of images to determine the target information, target shadow information, and target motion information in each frame of the image.
[0120] If the target motion information meets the recognition conditions, then the ratio of the target to the target shadow is determined frame by frame based on the target information and the target shadow information.
[0121] Based on the change in the ratio of the target to its shadow over at least two frames, the target is determined to be in either an outdoor or an indoor scene. Of course, those skilled in the art will understand that the processor 520 can also implement the technical solutions of the scene detection methods provided in any embodiment of this application.
[0122] The electronic device 500 shown in Figure 5 is merely an example and should not impose any limitations on the functionality and scope of use of the embodiments of this application.
[0123] As shown in Figure 5, the electronic device 500 includes a processor 520, a storage device 510, an input device 530, and an output device 540. The number of processors 520 in the electronic device can be one or more; Figure 5 takes one processor 520 as an example. The processor 520, storage device 510, input device 530, and output device 540 in the electronic device can be connected via a bus or other means; Figure 5 takes connection via bus 550 as an example.
[0124] The storage device 510, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and module units, such as the program instructions corresponding to the scene detection method in the embodiments of this application.
[0125] Storage device 510 may primarily include a program storage area and a data storage area. The program storage area may store the operating system and at least one application program required for a function; the data storage area may store data created based on terminal usage. Furthermore, storage device 510 may include high-speed random access memory and non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. In some instances, storage device 510 may further include memory remotely located relative to processor 520, and these remote memories can be connected via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
[0126] Input device 530 can be used to receive numeric, character, or voice input, and to generate key signal inputs related to user settings and function control of the electronic device. Output device 540 may include devices such as a display screen and a speaker.
[0127] The electronic device provided in this application embodiment can determine the scene category based on the change in the ratio of the target to the target shadow in at least two frames. This realizes the use of software to distinguish scene categories, improves the accuracy of scene detection results, and thus controls image parameters according to the scene to improve image quality.
[0128] The scene detection device, medium, and electronic device provided in the above embodiments can execute the scene detection method provided in any embodiment of this application, and have the corresponding functional modules and beneficial effects for executing the method. Technical details not described in detail in the above embodiments can be found in the scene detection method provided in any embodiment of this application.
[0129] Note that the above are merely preferred embodiments and the technical principles employed in this application. Those skilled in the art will understand that this application is not limited to the specific embodiments described herein, and various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of this application. Therefore, although this application has been described in detail through the above embodiments, this application is not limited to the above embodiments, and may include many other equivalent embodiments without departing from the concept of this application, the scope of which is determined by the scope of the appended claims.
Claims
1. A scene detection method, characterized in that the method includes: acquiring a preset number of frames of images; performing target detection and motion detection on the preset number of frames of images to determine the target information, target shadow information, and target motion information in each frame of the image; if the target motion information meets the recognition conditions, determining the ratio of the target to the target shadow frame by frame based on the target information and the target shadow information; and determining whether the target is in an outdoor scene or an indoor scene based on the magnitude of the change in the ratio of the target to its shadow over at least two frames.
2. The method according to claim 1, characterized in that determining whether the target is in an outdoor or indoor scene based on the change in the ratio of the target to its shadow over at least two frames includes: if the change in the ratio of the target to its shadow over at least two frames is less than a set threshold, determining that the target is in an outdoor scene; and if the change in the ratio of the target to its shadow over at least two frames is greater than the set threshold, determining that the target is in an indoor scene.
3. The method according to claim 1, characterized in that, after determining whether the target is in an outdoor or indoor scene, the method further includes: determining the target's direction of motion based on the target's motion information; if the direction of the target's movement is opposite to the direction of the target's shadow, determining the target's movement state to be a backlit scene; and if the direction of the target's movement is the same as the direction of the target's shadow, determining the target's movement state to be a front-lit scene.
4. The method according to claim 1, characterized in that the ratio of the target to its shadow is calculated using the following formula: $R = \dfrac{H}{L}$; where $R$ is the ratio of the target to its shadow, $H$ is the target height, and $L$ is the length of the target's shadow.
5. The method according to claim 1, characterized in that the process of determining whether the target motion information meets the recognition conditions includes: if the cumulative displacement of the target motion information reaches a preset length, determining that the recognition condition is met; and if the cumulative displacement of the target motion information does not reach the preset length, determining that the recognition condition is not met.
6. The method according to claim 5, characterized in that determining the ratio of the target to the target shadow frame by frame, based on the target information and the target shadow information, if the target motion information meets the recognition conditions includes: if the target motion information meets the recognition conditions, numbering the target information and the target shadow information according to the frame number for each frame of image; and retrieving target information and target shadow information with the same number to calculate the ratio of the target to the target shadow.
7. A scene detection device, characterized in that it includes: an image acquisition module, used to acquire a preset number of frames of images; an image information determination module, used to perform target detection and motion detection on the preset number of frames of images to determine the target information, target shadow information and target motion information in each frame of images; a ratio determination module, used to determine the ratio of the target to the target shadow frame by frame based on the target information and the target shadow information if the target motion information meets the recognition conditions; and a scene determination module, used to determine whether the target is in an outdoor scene or an indoor scene based on the magnitude of the change in the ratio of the target to its shadow over at least two frames.
8. The apparatus according to claim 7, characterized in that the scene determination module includes: an outdoor scene determination unit, used to determine that the target is in an outdoor scene if the change in the ratio of the target to its shadow over at least two frames is less than a set threshold; and an indoor scene determination unit, used to determine that the target is in an indoor scene if the change in the ratio of the target to its shadow over at least two frames is greater than the set threshold.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the scene detection method as described in any one of claims 1-6.
10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the scene detection method as described in any one of claims 1-6.