Video stream processing method and apparatus, and electronic equipment

The video stream processing method addresses occlusion issues in live streaming by dynamically adjusting detection and processing frequencies based on occlusion, enhancing effect processing and user experience.

JP2026521779A · Pending · Publication Date: 2026-07-01 · BEIJING ZITIAO NETWORK TECH CO LTD


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2024-08-29
Publication Date: 2026-07-01

Application Information

Patent Timeline

29 Aug 2024: Application
01 Jul 2026: Publication (JP2026521779A)

IPC: H04N21/2343; H04N21/24; H04N5/262

AI Tagging

Technology Topics

Computer graphics (images); Engineering


AI Technical Summary

Technical Problem

In live video streaming, a target object in a video stream image frame can become occluded, for example when something blocks it or when its pose changes. Effect processing such as beautification then deteriorates, so a method is needed to improve effect processing on occluded objects.

Method used

A video stream processing method that detects target objects in real-time, determines the degree of occlusion, adjusts detection frequency based on occlusion, and performs effect processing accordingly to enhance user experience.

Benefits of technology

The method improves the effectiveness of effect processing by dynamically adjusting detection and processing frequencies based on occlusion, resulting in enhanced user experience and balanced performance overhead.

✦ Generated by Eureka AI based on patent content.


Abstract

This disclosure provides a video stream processing method and apparatus and electronic equipment, the method comprising: detecting a target object in the current video stream image frame; obtaining a detection result that includes first information representing the degree to which the target object is obscured; determining a detection time for the next detection of the target object in the video stream image frame based on the first information; and performing an effect processing on the target object based on the detection result.


Description

Technical Field

[0001] This application claims the priority of Chinese Patent Application No. 202311120084.8 filed on August 31, 2023, and hereby incorporates by reference in its entirety the content disclosed in the above-mentioned Chinese Patent Application to form a part of this application.

[0002] This disclosure relates to a video stream processing method and apparatus, and an electronic device.

Background Art

[0003] With the continuous development of network technology and streaming media technology, live video streams have become part of more and more people's lives, providing entertainment services and convenience. Currently, during a live broadcast, people generally need to perform effect processing, such as beautification, on specific objects in the video stream image frame. However, during the live broadcast a specific object may become occluded, either because something blocks it or because its pose changes in the image frame, and the effect applied to that object then deteriorates; for example, the beautification may disappear. Therefore, there is a need for a method for performing effect processing on live video streams.

Summary of the Invention

[0004] This disclosure provides a video stream processing method and apparatus, and an electronic device.

[0005] According to a first aspect, a video stream processing method is provided, the method including: detecting a target object in the current video stream image frame, and obtaining a detection result including first information representing the degree to which the target object is occluded; determining, based on the first information, a detection time for the next detection of the target object in the video stream image frame; and performing effect processing on the target object based on the detection result.

[0006] According to a second aspect, a video stream processing device is provided, the device including: a detection module for detecting a target object in the current video stream image frame and obtaining a detection result that includes first information representing the degree to which the target object is occluded; a determination module for determining, based on the first information, the detection time for the next detection of the target object in the video stream image frame; and a processing module for performing effect processing on the target object based on the detection result.

[0007] According to a third aspect, a computer-readable storage medium is provided, the storage medium storing a computer program, and when the computer program is executed by a processor, the method described in any one of the first aspects is realized.

[0008] According to a fourth aspect, an electronic device is provided, the electronic device comprising a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein when the processor executes the program, the method according to any one of the first aspects is realized.

Brief Description of the Drawings

[0009] To more clearly illustrate the technical solutions of the embodiments of this disclosure, the following briefly introduces the drawings that may be used in describing the embodiments. Clearly, the drawings in the following description are merely some of the embodiments described in this disclosure, and those skilled in the art can obtain further drawings based on these without any creative work.

[Figure 1] This is a schematic diagram of an exemplary system architecture applying the embodiments of this disclosure.
[Figure 2] This is a schematic diagram of a video stream processing scene according to one exemplary embodiment of the present disclosure.
[Figure 3] This is a flowchart of a video stream processing method according to one exemplary embodiment of the present disclosure.
[Figure 4] This is a flowchart of another video stream processing method according to one exemplary embodiment of the present disclosure.
[Figure 5] This is a block diagram of a video stream processing device according to one exemplary embodiment of the present disclosure.
[Figure 6] This is a schematic block diagram of an electronic device according to some embodiments of the present disclosure.
[Figure 7] This is a schematic block diagram of other electronic devices according to some embodiments of the present disclosure.
[Figure 8] This is a schematic diagram of a storage medium according to some embodiments of the present disclosure.

Modes for Carrying Out the Invention

[0010] To enable those skilled in the art to better understand the technical solutions in this disclosure, the technical solutions in the embodiments of this disclosure will be described clearly and completely below with reference to the drawings of this disclosure. Clearly, the embodiments described are only a selection of embodiments of this disclosure, not all embodiments. All other embodiments obtained based on the embodiments of this disclosure without creative effort by those skilled in the art should also fall within the scope of this disclosure.

[0011] Where the following description relates to drawings, unless otherwise stated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure detailed in the claims.

[0012] The terms used in this disclosure are for the purpose of describing specific embodiments only and are not intended to limit this disclosure. The singular forms “a,” “said,” and “the” as used in this disclosure are also intended to include the plural forms unless the context clearly indicates otherwise. The term “and / or” as used herein should be understood to refer to any or all possible combinations of one or more related enumerated items, and to include these combinations.

[0013] The terms First, Second, Third, etc., may be used in this disclosure to describe various types of information, but it should be understood that such information should not be limited to these terms. These terms are used solely to distinguish information of the same kind from one another. For example, without departing from the scope of this disclosure, First Information may be referred to as Second Information, and similarly, Second Information may be referred to as First Information. Depending on the context, the term “if” as used herein may be interpreted as “when,” “in the event of,” or “in response to a decision.”

[0014] Referring to Figure 1, this is a schematic diagram of an exemplary system architecture applying the embodiments of this disclosure.

[0015] As shown in Figure 1, the system architecture 100 may include terminal equipment 102, terminal equipment 105, a network 103, and a server 104. It should be understood that the number or types of terminal equipment, networks, and servers in Figure 1 are merely illustrative. Any number or type of terminal equipment, networks, and servers may be used as required by implementation.

[0016] Network 103 is used to provide a medium for a communication link between terminal equipment and a server. Network 103 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.

[0017] The terminal device 102 and the terminal device 105 can interact with the server via the network 103 to receive or transmit requests, information, etc. The terminal devices 102 and 105 may be various electronic devices including, but not limited to, smartphones, tablet computers, smart wearable devices, and personal digital assistants.

[0018] The server 104 can perform processes such as storage and analysis on the received data, and can also transmit control commands or requests to terminal devices or other servers. The server can provide services in response to user service requests. It should be understood that one server may provide one or more services, or the same service may be provided by multiple servers.

[0019] Based on the system architecture shown in FIG. 1, in an embodiment of the present disclosure, the user 101 can collect image frames of a live video stream using the terminal device 102, detect target objects in at least some of the image frames, and obtain detection results. The detection results include information indicating the degree to which the target object is occluded. Then, based on the detection results, effect processing is performed on the target object, the detection time for the next detection of the target object in the image frames of the live video stream is determined according to the degree to which the target object is occluded, and the next detection is performed at that detection time. The terminal device 102 transmits the live video stream after the effect processing to the server 104 via the network 103, and the server 104 transmits that live video stream to the terminal device 105 via the network 103 so that the user 106 can view the live video after the effect processing via the terminal device 105.

[0020] FIG. 2 is a schematic diagram of a video stream processing scene according to an exemplary embodiment.

[0021] As shown in FIG. 2, first, the image frames of the live video stream S are collected in real time. When the time J0 arrives, the target X in the current image frame T0 is detected and identified to obtain a detection result G0. The detection result G0 may include information on the degree of occlusion of the target X in the image frame T0 and the position information of the target X in the image frame T0. According to the information on the degree of occlusion of the target X in the image frame T0, the time J1 of the next detection of the target X is determined, and the target X in the image frame Tn that is current when the time J1 arrives can be detected and identified. Here, the greater the degree of occlusion of the target X in the image frame T0, the fewer image frames lie between the image frame T0 and the image frame Tn, that is, the higher the detection frequency. The smaller the degree of occlusion of the target X in the image frame T0, the more image frames lie between the image frame T0 and the image frame Tn, that is, the lower the detection frequency. In addition, based on the detection result G0, effects on the target X are added to the image frame T0 and the subsequent image frames T1, T2... Tn-1 in sequence, yielding the processed image frames t0, t1, t2... tn-1 respectively. When the time J1 arrives, a new round of detection and effect processing can be performed, and the specific process will not be described in detail here.

[0022] Hereinafter, the present disclosure will be described in detail with reference to specific embodiments.

[0023] FIG. 3 is a flowchart of a video stream processing method according to an exemplary embodiment. This method can be applied to terminal devices. In this embodiment, for ease of understanding, a terminal device capable of installing a third-party application program is taken as an example for description. Those skilled in the art should understand that the terminal device may include, but is not limited to, mobile terminal devices such as smartphones, smart wearable devices, tablet computers, laptop portable computers, and desktops. This method may include the following steps.

[0024] As shown in Figure 3, in step 301, the target object in the current video stream image frame is detected and the detection result is obtained.

[0025] In this embodiment, the video stream involved may be a video stream from a live streaming scene, and during the live streaming process, target objects may be detected and identified in the image frames of the video stream at regular intervals. For example, detection of target objects may be performed every m seconds, or every s image frames. Here, the target object may be an object that requires effect processing in the live streaming video stream; for example, the target object may be a person, an animal, an object, or a specific area of a specific object. The target object may be any object that can be detected and identified, and it should be understood that this embodiment does not limit the specific type of target object.

[0026] In this embodiment, the detection time is the time at which the target object in the video stream image frame is to be detected. The detection time may be expressed as the time interval since the previous detection, or as the number of image frames elapsed since the previous detection. For example, the detection time may be deemed to have arrived a seconds after completion of the previous detection, or b frames after completion of the previous detection.

[0027] Upon arrival at the detection time, the target object in the current video stream image frame may be detected, and a detection result may be obtained. The detection result may include first information representing the degree to which the target object is occluded. Here, the first information may include a confidence level corresponding to the target object, a ratio of the area of the occluded region of the target object to the area of the region occupied by the target object, or data obtained based on the confidence level corresponding to the target object and the ratio of the area of the occluded region of the target object to the area of the region occupied by the target object. The first information may be any information that can represent the degree to which the target object is occluded, and it should be understood that this embodiment is not limited in this respect.

[0028] Furthermore, the detection result may include second information representing the location of the target object. For example, this second information may include keypoint information corresponding to the target object, or contour information corresponding to the target object.
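As an illustration only, the detection result described in [0027] and [0028] can be sketched as a small data structure. The disclosure does not say how the confidence level and the area ratio are combined, so the `occlusion_degree` function below, its name, and its "take the larger of the two signals" rule are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DetectionResult:
    """Hypothetical container for one detection (names are illustrative)."""
    confidence: float      # detector confidence for the target, in [0, 1]
    occluded_area: float   # area of the occluded region of the target
    object_area: float     # area of the region occupied by the target
    # Second information: keypoints locating the target in the frame.
    keypoints: List[Tuple[float, float]] = field(default_factory=list)


def _clamp01(value: float) -> float:
    return min(max(value, 0.0), 1.0)


def occlusion_degree(result: DetectionResult) -> float:
    """Return first information as a number in [0, 1] (1 = fully occluded).

    Assumption: low confidence and a high occluded-area ratio both signal
    occlusion, and the stronger signal wins. Other combinations are possible.
    """
    if result.object_area > 0:
        area_ratio = _clamp01(result.occluded_area / result.object_area)
    else:
        area_ratio = 1.0  # no visible region is treated as fully occluded
    low_confidence = 1.0 - _clamp01(result.confidence)
    return max(area_ratio, low_confidence)
```

Keeping the degree in a fixed range makes it straightforward to compare against the preset degrees used in the later steps.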

[0029] In step 302, based on the first information described above, the detection time for the next detection of the target object in the video stream image frame is determined.

[0030] In this embodiment, the detection time for the next detection of the target object in the video stream image frame may be determined according to the degree to which the target object is obscured. In one implementation, it may be determined, based on the first information, whether the degree to which the target object is obscured is less than a first preset degree. If the degree to which the target object is obscured is less than the first preset degree, detection may be performed using the default first detection frequency. If the degree to which the target object is obscured is greater than or equal to the first preset degree, the detection frequency may be increased, and detection may be performed using a second detection frequency that is greater than the first detection frequency. Therefore, the detection time for the next detection of the target object in the video stream image frame may be determined according to the determined detection frequency.

[0031] For example, if it is determined that the degree to which the target object is obscured is less than a first preset degree, detection may be performed at a default detection frequency of A times per second, and the detection time may be 1 / A seconds later. If it is determined that the degree to which the target object is obscured is greater than or equal to a first preset degree, detection may be performed at a detection frequency of B times per second, where A is less than B. The detection time may be 1 / B seconds later.

[0032] Alternatively, for example, if it is determined that the degree to which the target object is occluded is less than a first preset degree, detection may be performed once every C image frames by default, and the time C frames later may be used as the detection time. If it is determined that the degree to which the target object is occluded is greater than or equal to a first preset degree, detection may be performed once every D image frames, where C is greater than D, and the time D frames later may be used as the detection time.
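The threshold rule of [0030] to [0032] can be sketched as follows. The function names, the occlusion scale of 0 to 1, and all default values (the threshold 0.5, the frequencies, the frame gaps) are assumptions chosen for illustration; the disclosure only requires A < B and C > D.

```python
def next_detection_delay_seconds(occlusion: float,
                                 first_preset_degree: float = 0.5,
                                 default_hz: float = 2.0,    # "A" in [0031]
                                 boosted_hz: float = 10.0    # "B" in [0031]
                                 ) -> float:
    """Return the delay until the next detection, as 1/A or 1/B seconds."""
    if not default_hz < boosted_hz:
        raise ValueError("the boosted frequency must exceed the default one")
    hz = default_hz if occlusion < first_preset_degree else boosted_hz
    return 1.0 / hz


def next_detection_frame_gap(occlusion: float,
                             first_preset_degree: float = 0.5,
                             default_gap: int = 6,   # "C" in [0032]
                             boosted_gap: int = 2    # "D" in [0032]
                             ) -> int:
    """Return the number of frames until the next detection, C or D."""
    if not default_gap > boosted_gap:
        raise ValueError("the default gap must exceed the boosted gap")
    return default_gap if occlusion < first_preset_degree else boosted_gap
```

With these illustrative defaults, a lightly occluded target is re-detected after 0.5 seconds or 6 frames, and a heavily occluded one after 0.1 seconds or 2 frames.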

[0033] In another implementation, the degree to which the target object is obscured is quantified numerically, and a detection frequency is calculated based on this numerical value according to a pre-set algorithm. A positive correlation is established between the detection frequency and the numerical value (i.e., the greater the degree to which the target object is obscured, the higher the detection frequency), and the detection time for the next detection may be determined based on this detection frequency.
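The "pre-set algorithm" of [0033] is not specified in the disclosure. One possible choice, shown here purely as an assumption, is linear interpolation between a minimum and a maximum detection frequency, which satisfies the required positive correlation.

```python
def detection_frequency(occlusion: float,
                        min_hz: float = 2.0,
                        max_hz: float = 10.0) -> float:
    """Map an occlusion degree in [0, 1] to a detection frequency.

    Assumption: a linear mapping. Any monotonically increasing function
    would also give the positive correlation described in [0033].
    """
    occlusion = min(max(occlusion, 0.0), 1.0)
    return min_hz + (max_hz - min_hz) * occlusion


def next_detection_delay(occlusion: float) -> float:
    """Delay in seconds until the next detection at the mapped frequency."""
    return 1.0 / detection_frequency(occlusion)
```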

[0034] In this embodiment, the detection time for the next detection is determined based on the degree to which the target object is obscured. Therefore, if the degree to which the target object is obscured is large, the detection frequency is increased. This improves the speed at which the effect is adjusted according to the degree to which the target object is obscured, thereby improving the effectiveness of the effect processing and enhancing the user experience.

[0035] In step 303, based on the above detection results, effect processing is performed on the target object.

[0036] In this embodiment, based on the detection results, effect processing for the target object may be performed on the video stream. Here, the effect processing on the target object may be a transformation process or a color adjustment process for the target object. Specifically, based on the detection results, position information corresponding to the target object may be determined, and based on the first information, the target effect intensity of the effect that needs to be added to the video stream may be determined. Finally, based on the position information corresponding to the target object, effect processing may be performed on the target object in the image frames of the video stream according to the target effect intensity.

[0037] It should be noted that if it is determined, based on the first information, that the degree to which the target object is occluded is greater than a preset threshold (a relatively large threshold), or if the target object is not detected, timing or frame counting may be started. If, after a preset time length or a preset number of image frames has elapsed, the degree to which the target object is occluded is still greater than the preset threshold or the target object is still not detected, the effect may be turned off.
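A minimal sketch of the turn-off rule in [0037], using frame counting. The class name, the threshold 0.9, and the 30-frame patience are hypothetical. As a simplification, the counter here resets whenever the target is seen again with acceptable occlusion, which is one reasonable reading rather than something the disclosure states.

```python
from typing import Optional


class EffectGate:
    """Decides whether the effect stays on (illustrative, frame-counted)."""

    def __init__(self, off_threshold: float = 0.9, patience_frames: int = 30):
        self.off_threshold = off_threshold      # the "relatively large" threshold
        self.patience_frames = patience_frames  # preset number of frames
        self._bad_frames = 0

    def update(self, occlusion: Optional[float]) -> bool:
        """Feed one observation; None means the target was not detected.

        Returns True while the effect should remain enabled.
        """
        heavily_occluded = occlusion is None or occlusion > self.off_threshold
        if heavily_occluded:
            self._bad_frames += 1
        else:
            self._bad_frames = 0  # assumption: recovery resets the count
        return self._bad_frames < self.patience_frames
```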

[0038] The video stream processing method according to this disclosure obtains a detection result by detecting a target object in the current video stream image frame. This detection result includes first information representing the degree to which the target object is obscured. Based on this first information, the detection time for the next detection of the target object in the video stream image frame is determined, and effect processing can be performed on the target object based on the detection result. This embodiment can determine the detection frequency for detecting the target object in the video stream image frame according to the degree to which the target object is obscured, thereby improving the effect and enhancing the user experience while balancing the performance overhead of the terminal equipment.

[0039] Figure 4 is a flowchart of another video stream processing method according to an exemplary embodiment, which describes the process of performing effect processing on a target object and includes the following steps.

[0040] As shown in Figure 4, in step 401, the location information corresponding to the target object is determined based on the detection results.

[0041] In this embodiment, the positional information corresponding to the target object may be any information that can represent the position of the target object in the image, for example, the positional information may be information about the keypoint of the target object, and may include the type and coordinate information of the keypoint of the target object.

[0042] Specifically, in one implementation, the degree to which the target object is obscured may be compared with a second preset degree based on the first information included in the detection result. Location reference information may be stored in a cache in advance, and this location reference information is continuously updated using new location information of the target object. If the degree to which the target object is obscured is less than the second preset degree, the second information included in the detection result may be directly used as the location information corresponding to the target object, and at the same time the location reference information stored in the cache may be updated using the second information. If the degree to which the target object is obscured is greater than or equal to the second preset degree, the pre-stored location reference information may be used as the location information corresponding to the target object.

[0043] In another implementation, the degree to which the target object is occluded may be compared to a second preset degree based on the first information included in the detection result. Simultaneously, the difference between the position of the target object in the current image frame and the position of the target object in the previous image frame is compared based on the second information included in the detection result. If the degree to which the target object is occluded is less than the second preset degree, or if the position difference is greater than or equal to the preset difference, the second information included in the detection result may be directly used as the position information corresponding to the target object. Simultaneously, the position reference information pre-stored in the cache is updated using the second information. If the degree to which the target object is occluded is greater than or equal to the second preset degree, and the position difference is less than the preset difference, the pre-stored position reference information may be used as the position information corresponding to the target object.
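The two position-selection rules in [0042] and [0043] can be sketched in one function. Everything named here is hypothetical: positions are lists of keypoint coordinates, the position difference is taken as the mean keypoint displacement against the cached reference, and the default thresholds are arbitrary. Passing `min_shift=None` gives the simpler rule of [0042].

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def mean_shift(current: List[Point], previous: List[Point]) -> float:
    """Mean Euclidean displacement between matching keypoints."""
    if not current:
        return 0.0
    return sum(math.dist(p, q) for p, q in zip(current, previous)) / len(current)


def resolve_position(occlusion: float,
                     detected: List[Point],
                     reference: Optional[List[Point]],
                     second_preset_degree: float = 0.4,
                     min_shift: Optional[float] = None
                     ) -> Tuple[List[Point], List[Point]]:
    """Return (position to use, updated cached reference)."""
    if reference is None:
        # Assumption: with an empty cache, the fresh detection is used.
        return detected, detected
    trusted = occlusion < second_preset_degree
    if not trusted and min_shift is not None:
        # [0043]: a large move is trusted even under heavy occlusion.
        trusted = mean_shift(detected, reference) >= min_shift
    if trusted:
        return detected, detected   # use the detection and refresh the cache
    return reference, reference     # fall back to the cached position
```

Falling back to the cached reference keeps the effect anchored when a heavily occluded detection is likely to be unreliable.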

[0044] It should be understood that this embodiment is not limited in this respect, and that the location information corresponding to the target object may be determined by any other reasonable method. It should be noted that the first preset degree and the second preset degree are two independent conditions, and that they may be the same or different. In general, the first preset degree may be set to be greater than the second preset degree.

[0045] In step 402, the target effect intensity is determined based on the first piece of information.

[0046] In this embodiment, the target effect intensity is the intensity at which the effect is applied to the target object. For example, the target effect intensity may be the intensity at which deformation is applied to the target object or the intensity at which color adjustment is applied. Here, the intensity at which deformation is applied to the target object may indicate the amount of deformation of the target object, and the greater the deformation intensity, the greater the shape change of the target object. The intensity at which color adjustment is applied to the target object may indicate the amount of color change of the target object, and the greater the color adjustment intensity, the greater the color transformation of the target object.

[0047] In this embodiment, the target effect intensity may be determined based on the first information, and there may be a negative correlation between the degree to which the target object is obscured and the target effect intensity. In one implementation, an effect intensity adjustment parameter may be determined according to the first information, and this effect intensity adjustment parameter is negatively correlated with the degree to which the target object is obscured. The target effect intensity may be obtained by multiplying this effect intensity adjustment parameter as a coefficient by the default preset effect intensity.

[0048] In another implementation, an effect intensity adjustment parameter may be determined according to the first information, and this effect intensity adjustment parameter is positively or negatively correlated with the degree to which the target object is obscured. Furthermore, according to a pre-set algorithm, the default pre-set effect intensity is adjusted using this effect intensity adjustment parameter to obtain a target effect intensity that is negatively correlated with the degree to which the target object is obscured.

[0049] In this embodiment, there is a negative correlation between the target effect intensity and the degree to which the target object is obscured. That is, the greater the degree to which the target object is obscured, the lower the effect intensity; conversely, the less the target object is obscured, the greater the effect intensity. For example, the more the target object is obscured, the smaller the deformation and / or color change when the effect processing is performed, and the less the target object is obscured, the larger the deformation and / or color change when the effect processing is performed. This achieves a dynamic and smooth gradient effect, improving the effect and further enhancing the user experience.
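The coefficient approach of [0047] can be sketched as below. The disclosure only requires that the adjustment parameter be negatively correlated with the degree of occlusion; taking the coefficient as `1 - occlusion` on a 0 to 1 scale is an assumption, as are the function and parameter names.

```python
def target_effect_intensity(occlusion: float,
                            default_intensity: float = 1.0) -> float:
    """Scale the default preset intensity down as occlusion grows.

    Assumption: the adjustment coefficient is (1 - occlusion), with the
    occlusion degree clamped to [0, 1].
    """
    occlusion = min(max(occlusion, 0.0), 1.0)
    coefficient = 1.0 - occlusion   # effect intensity adjustment parameter
    return coefficient * default_intensity
```

An unoccluded target receives the full default intensity, and the effect fades smoothly toward zero as occlusion increases, which matches the gradient behavior described in [0049].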

[0050] In step 403, based on the location information, effect processing is performed on the target object according to the target effect intensity.

[0051] In this embodiment, effect processing may be performed on the target image frame for each frame based on the position information corresponding to the target object, according to the target effect intensity. Here, undetected image frames may be processed according to the position information and target effect intensity of the most recently detected image frame. For example, when image frame Z0 is collected, detection result L0 can be obtained by detecting image frame Z0, and the position information W0 and effect intensity Q0 of the target object can be determined based on the detection result L0. Effect processing is performed on image frame Z0 based on the position information W0 and effect intensity Q0. When image frames Z1, Z2, Z3, Z4, and Z5 are collected, there is no need to detect and identify the target object in any of them, and effect processing can be performed on image frames Z1, Z2, Z3, Z4, and Z5 respectively based on the position information W0 and effect intensity Q0. When image frame Z6 is collected, detection result L1 can be obtained by detecting image frame Z6, and the position information W1 and effect intensity Q1 of the target object can be determined based on the detection result L1. Based on the location information W1 and effect intensity Q1, the image frame Z6 is subjected to effect processing applied to the target object, and the subsequent process will not be described in further detail.
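The frame-by-frame behavior in [0051] (detect on Z0, reuse W0 and Q0 for Z1 to Z5, detect again on Z6) can be sketched as a loop. The callables `detect`, `apply_effect`, and `frame_gap` are hypothetical stand-ins supplied by the caller; they are not part of the disclosure.

```python
from typing import Any, Callable, Iterable, List, Tuple


def process_stream(frames: Iterable[Any],
                   detect: Callable[[Any], Tuple[Any, float, float]],
                   apply_effect: Callable[[Any, Any, float], Any],
                   frame_gap: Callable[[float], int]) -> List[Any]:
    """Apply the effect to every frame, detecting only at detection times.

    detect(frame) returns (position, intensity, occlusion).
    frame_gap(occlusion) returns how many frames to wait before detecting again.
    """
    processed = []
    next_detection_index = 0
    position, intensity = None, 0.0
    for index, frame in enumerate(frames):
        if index >= next_detection_index:
            # Detection time reached: refresh position and intensity.
            position, intensity, occlusion = detect(frame)
            next_detection_index = index + max(1, frame_gap(occlusion))
        # Frames between detections reuse the most recent result.
        processed.append(apply_effect(frame, position, intensity))
    return processed
```

With a constant gap of 6 frames, detection runs on the frames at index 0 and 6, and the frames at index 1 to 5 reuse the result from index 0, mirroring the Z0 to Z6 example.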

[0052] In this embodiment, based on the detection results for the target object in the image frame, positional information corresponding to the target object is determined, and the target effect intensity is determined according to the degree to which the target object is obscured. Therefore, based on the positional information, effect processing can be performed on the target object according to the target effect intensity. By changing the target effect intensity according to the degree to which the target object is obscured, a gradient effect can be achieved after the target object is obscured, thereby enhancing the user experience.

[0053] In the above embodiments, the operation of the methods of the embodiments of this disclosure is described in a specific order. However, it should be understood that this does not require or suggest that these operations must be performed in that specific order, or that all operations shown must be performed to obtain the desired results. Conversely, the steps depicted in the flowchart may be performed in a different order. Additionally or alternatively, some steps may be omitted, several steps may be combined into one, and / or one step may be broken down into multiple steps.

[0054] In correspondence with the embodiments of the video stream processing method described above, this disclosure further provides embodiments of a video stream processing device.

[0055] Figure 5 is a block diagram of a video stream processing device according to an exemplary embodiment of the present disclosure. As shown in Figure 5, the device may include a detection module 501, a determination module 502, and a processing module 503.

[0056] Here, the detection module 501 is used to detect a target object in the current video stream image frame and to obtain a detection result that includes first information representing the degree to which the target object is occluded.

[0057] The determination module 502 is used to determine the detection time for the next detection of the target object in the video stream image frame, based on the first information.

[0058] The processing module 503 is used to perform effect processing on the target object based on the above detection results.

[0059] In some embodiments, the first information may include the confidence level corresponding to the target object, and / or the ratio of the area in which the target object is occluded to the area occupied by the target object.
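
By way of non-limiting illustration, one way to compute such an area ratio is sketched below. The sketch assumes that the target object and the occluding content are each available as a boolean mask of the same size; the function name `occluded_area_ratio` and both mask parameters are hypothetical and are not part of this disclosure.

```python
from typing import Sequence


def occluded_area_ratio(
    object_mask: Sequence[Sequence[bool]],
    occluder_mask: Sequence[Sequence[bool]],
) -> float:
    """Return (occluded object pixels) / (all object pixels), or 0.0 if no object."""
    object_area = 0
    occluded_area = 0
    for object_row, occluder_row in zip(object_mask, occluder_mask):
        for is_object, is_occluder in zip(object_row, occluder_row):
            if is_object:
                object_area += 1
                # A pixel counts as occluded when it belongs to the object
                # region and is covered by the occluder.
                if is_occluder:
                    occluded_area += 1
    if object_area == 0:
        return 0.0  # assumption: treat "no object region" as not occluded
    return occluded_area / object_area
```

A ratio near 1 then indicates a heavily occluded target object, and a ratio near 0 an essentially unoccluded one.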

[0060] In some other embodiments, the determination module 502 is configured as follows: If it is determined based on the first information that the degree to which the target object is occluded is greater than or equal to a first preset degree, the time after a first time length is set as the detection time. If it is determined based on the first information that the degree to which the target object is occluded is less than the first preset degree, the time after a second time length is set as the detection time. Here, the first time length is less than the second time length.
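
By way of non-limiting illustration, the selection of the detection time can be sketched as follows. This is a minimal sketch under stated assumptions: the degree of occlusion is taken to be a number in the range 0 to 1, times are in seconds, and the function name and all default values are hypothetical.

```python
def next_detection_time(
    now: float,
    occlusion_degree: float,
    first_preset_degree: float = 0.5,  # assumed threshold
    first_time_length: float = 0.25,   # assumed short interval, in seconds
    second_time_length: float = 1.0,   # assumed long interval, in seconds
) -> float:
    """Return the time at which the target object should next be detected."""
    if not first_time_length < second_time_length:
        raise ValueError("first_time_length must be less than second_time_length")
    if occlusion_degree >= first_preset_degree:
        # Heavily occluded: detect again sooner.
        return now + first_time_length
    # Lightly occluded or unoccluded: detect again later.
    return now + second_time_length
```

Because the first time length is the shorter one, detection runs more often while the target object is heavily occluded and less often otherwise.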

[0061] In some other embodiments, the processing module 503 may include a first determination submodule, a second determination submodule, and a processing submodule (not shown).

[0062] Here, the first determination submodule is used to determine the position information corresponding to the target object based on the detection result described above.

[0063] The second determination submodule is used to determine the target effect intensity based on the first information.

[0064] The processing submodule is used to perform effect processing on the target object according to the target effect intensity based on the above position information.

[0065] In some other embodiments, the detection result may further include second information representing the position of the target object, and the first determination submodule is configured as follows: If it is determined based on the detection result that the first condition is met, the second information is set as the position information corresponding to the target object, and the pre-stored position reference information is updated using the second information. If it is determined based on the detection result that the second condition is met, the pre-stored position reference information is set as the position information corresponding to the target object.

[0066] In some other embodiments, the first condition may include the degree to which the target object is occluded being less than a second preset degree, or the positional difference of the target object between preceding and succeeding image frames being greater than or equal to a preset difference. The second condition may include the degree to which the target object is occluded being greater than or equal to a second preset degree, and the positional difference of the target object between preceding and succeeding image frames being less than a preset difference.
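
By way of non-limiting illustration, the choice between the detected position and the pre-stored reference can be sketched as follows. The class `PositionTracker`, its default thresholds, and the representation of a position as a coordinate pair are hypothetical; the positional difference between preceding and succeeding image frames is taken as an input, and falling back to the detected position when no reference has been stored yet is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class PositionTracker:
    second_preset_degree: float = 0.5  # assumed occlusion threshold
    preset_difference: float = 20.0    # assumed positional-difference threshold
    reference: Optional[Point] = None  # pre-stored position reference information

    def resolve(
        self, detected: Point, occlusion_degree: float, position_difference: float
    ) -> Point:
        """Return the position to use for effect processing on this frame."""
        second_condition = (
            occlusion_degree >= self.second_preset_degree
            and position_difference < self.preset_difference
        )
        if second_condition and self.reference is not None:
            # Heavily occluded and nearly stationary: keep the stored reference.
            return self.reference
        # First condition (or no reference yet): trust the new detection
        # and update the stored reference with it.
        self.reference = detected
        return detected
```

Since the first condition is the logical negation of the second, every detection result falls into exactly one of the two branches.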

[0067] In some other embodiments, the second determination submodule is configured as follows: Based on the first information described above, an effect intensity adjustment parameter is determined. The effect intensity adjustment parameter is used to adjust a preset effect intensity to obtain a target effect intensity, and a negative correlation is established between the target effect intensity and the degree to which the target object is occluded.
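
By way of non-limiting illustration, one possible negatively correlated adjustment is sketched below. The linear form, the assumption that the degree of occlusion lies in the range 0 to 1, and the function name are choices made for this sketch only; the disclosure requires only that the target effect intensity decrease as the degree of occlusion increases.

```python
def target_effect_intensity(preset_intensity: float, occlusion_degree: float) -> float:
    """Scale the preset intensity down as the degree of occlusion rises."""
    # Clamp to [0, 1] so out-of-range inputs cannot produce a negative
    # or amplified intensity.
    clamped = min(max(occlusion_degree, 0.0), 1.0)
    # Effect intensity adjustment parameter (assumed linear): 1 when the
    # object is unoccluded, 0 when it is fully occluded.
    adjustment = 1.0 - clamped
    return preset_intensity * adjustment
```

With this choice, an unoccluded target object receives the full preset effect intensity, and the effect fades out progressively as the occlusion increases.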

[0068] Because the embodiments of the apparatus substantially correspond to the embodiments of the method, refer to the description of the method embodiments for the relevant parts. The embodiments of the apparatus described above are merely schematic. The units described as separate parts may or may not be physically separated, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Depending on actual needs, some or all of these modules can be selected to achieve the objectives of the technical solution of the embodiments of this disclosure. Those skilled in the art will be able to understand and implement these without any creative effort.

[0069] Figure 6 is a schematic block diagram of an electronic device according to some embodiments of the present disclosure. As shown in Figure 6, the electronic device 910 includes a processor 911 and a memory 912, and may be used to implement a client or a server. The memory 912 is used for non-transitory storage of computer-executable instructions (e.g., one or more computer program modules). The processor 911 is used to execute the computer-executable instructions, which, when executed by the processor 911, can perform one or more steps of the video stream processing method described above, thereby implementing that method. The memory 912 and the processor 911 can be interconnected by a bus system and / or other forms of connection mechanisms (not shown).

[0070] For example, the processor 911 may be a central processing unit (CPU), a graphics processing unit (GPU), or another type of processing unit having data processing capability and / or program execution capability. For example, the central processing unit (CPU) may have an x86 or ARM architecture, etc. The processor 911 may be a general-purpose processor or a dedicated processor, and can control other components in the electronic device 910 to perform desired functions.

[0071] For example, memory 912 may include any combination of one or more computer program products that can be various forms of computer-readable storage media, such as volatile memory and / or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and / or high-speed cache memory (cache). Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disk read-only memory (CD-ROM), USB memory, flash memory, etc. One or more computer program modules can be stored in the computer-readable storage media, and the processor 911 can execute one or more computer program modules to realize various functions of the electronic device 910. The computer-readable storage media may store various applications and various data, as well as various data used and / or generated by the applications.

[0072] In the embodiments of this disclosure, the specific functions and technical effects of the electronic device 910 can be found in the description of the video stream processing method described above, and will not be described in further detail here.

[0073] Figure 7 is a schematic block diagram of another electronic device according to some embodiments of the present disclosure. The electronic device 920 is suitable, for example, for implementing a video stream processing method according to an embodiment of the present disclosure. The electronic device 920 may be a terminal device and can be used to implement a client or server. The electronic device 920 includes, but is not limited to, mobile terminals such as mobile phones, laptop computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet PCs), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and wearable electronic devices, and fixed terminals such as digital TVs, desktop computers, and smart home devices. Note that the electronic device 920 shown in Figure 7 is merely an example and does not impose any limitations on the functions and scope of use of the embodiments of the present disclosure.

[0074] As shown in Figure 7, the electronic device 920 may include a processing unit (e.g., a central processing unit, graphics processor, etc.) 921 which can perform various appropriate operations and processes depending on the program stored in the read-only memory (ROM) 922 or the program loaded from the storage device 928 into the random access memory (RAM) 923. The RAM 923 further stores various programs and data necessary for the operation of the electronic device 920. The processing unit 921, ROM 922, and RAM 923 are connected to each other via a bus 924. An input / output (I / O) interface 925 is also connected to the bus 924.

[0075] Typically, devices such as input devices 926 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, and gyroscopes; output devices 927 including, for example, liquid crystal displays (LCDs), speakers, and vibrators; storage devices 928 including, for example, magnetic tape and hard disks; and communication devices 929 can be connected to the I / O interface 925. The communication device 929 can enable the electronic device 920 to exchange data with other electronic devices via wireless or wired communication. Figure 7 shows an electronic device 920 with various devices, but it is not necessary to implement or include all of the devices shown, and the electronic device 920 may alternatively implement or include more or fewer devices.

[0076] For example, according to an embodiment of the present disclosure, the video stream processing method can be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product which includes a computer program carried on a non-transitory computer-readable medium, and which includes program code for executing the video stream processing method. In such an embodiment, the computer program can be downloaded and installed from a network via the communication device 929, or installed from the storage device 928, or installed from the ROM 922. When the computer program is executed by the processing unit 921, it can realize the functions defined in the video stream processing method according to an embodiment of the present disclosure.

[0077] Figure 8 is a schematic diagram of a storage medium according to some embodiments of the present disclosure. For example, as shown in Figure 8, the storage medium 930 may be a non-transitory computer-readable storage medium for storing non-transitory computer-executable instructions 931. When the non-transitory computer-executable instructions 931 are executed by a processor, the video stream processing methods described in embodiments of the present disclosure can be realized. For example, when the non-transitory computer-executable instructions 931 are executed by a processor, one or more steps of the video stream processing methods described above can be performed.

[0078] For example, the storage medium 930 may be applied to the electronic device described above, and for example, the storage medium 930 may include memory in the electronic device.

[0079] For example, the storage medium may include a smartphone memory card, a tablet PC memory component, a personal computer hard disk, RAM (Random Access Memory), ROM (Read Only Memory), EPROM (Erasable Programmable Read Only Memory), CD-ROM (Portable Compact Disc Read Only Memory), flash memory, or any combination of the above-mentioned storage media, or other applicable storage media.

[0080] For example, a description of the storage medium 930 can be found in the description of memory in the embodiment of the electronic device, and will not be described in further detail here. The specific functions and technical effects of the storage medium 930 can be found in the description of the video stream processing method described above, and will not be described in further detail here.

[0081] In the context of this disclosure, a computer-readable medium may be a tangible medium that contains or stores a program used by or in combination with an instruction execution system, apparatus, or device. A computer-readable medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above. In this disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in combination with an instruction execution system, apparatus, or device. In this disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Data signals propagated in this manner may include, but are not limited to, electromagnetic signals, optical signals, or any suitable combination thereof. The computer-readable signal medium may be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained in the computer-readable medium may be transmitted by any suitable medium, including, but not limited to, electric wires, optical cables, RF (radio frequency), or any suitable combination thereof.

[0082] Other embodiments of this disclosure will be readily conceivable to those skilled in the art in view of the examples of this disclosure. This disclosure is intended to encompass all variations, uses, or adaptations of this disclosure that follow its general principles, including common general knowledge or customary technical means in the art that are not disclosed herein. The examples of this disclosure should be considered illustrative only, and the true scope and spirit of this disclosure are indicated by the claims.

[0083] This disclosure is not limited to the precise structure described above and shown in the drawings, and it should be understood that various modifications and changes may be made without departing from its scope. The scope of this disclosure is limited only by the attached claims.

Claims

1. A method for processing a video stream, the method comprising: detecting a target object in a current video stream image frame to obtain a detection result that includes first information representing the degree to which the target object is occluded; determining, based on the first information, a detection time for the next detection of the target object in the video stream image frame; and performing effect processing on the target object based on the detection result.

2. The method according to claim 1, wherein the first information includes the confidence level corresponding to the target object, and / or the ratio of the area in which the target object is occluded to the area occupied by the target object.

3. The method according to claim 1 or 2, wherein determining, based on the first information, the detection time for the next detection of the target object in the video stream image frame comprises: if it is determined based on the first information that the degree to which the target object is occluded is greater than or equal to a first preset degree, setting the time after a first time length as the detection time; and if it is determined based on the first information that the degree to which the target object is occluded is less than the first preset degree, setting the time after a second time length as the detection time, wherein the first time length is less than the second time length.

4. The method according to any one of claims 1 to 3, wherein performing effect processing on the target object based on the detection result comprises: determining position information corresponding to the target object based on the detection result; determining a target effect intensity based on the first information; and performing effect processing on the target object according to the target effect intensity, based on the position information.

5. The method according to claim 4, wherein the detection result further includes second information representing the position of the target object, and determining the position information corresponding to the target object based on the detection result comprises: if it is determined based on the detection result that a first condition is met, setting the second information as the position information corresponding to the target object, and updating pre-stored position reference information using the second information; and if it is determined based on the detection result that a second condition is met, setting the pre-stored position reference information as the position information corresponding to the target object.

6. The method according to claim 5, wherein the first condition includes that the degree to which the target object is occluded is less than a second preset degree, or that the positional difference of the target object between preceding and succeeding image frames is greater than or equal to a preset difference; and the second condition includes that the degree to which the target object is occluded is greater than or equal to the second preset degree, and that the positional difference of the target object between the preceding and succeeding image frames is less than the preset difference.

7. The method according to any one of claims 4 to 6, wherein determining the target effect intensity based on the first information comprises: determining an effect intensity adjustment parameter based on the first information; and adjusting a preset effect intensity using the effect intensity adjustment parameter to obtain the target effect intensity, such that the target effect intensity is negatively correlated with the degree to which the target object is occluded.

8. A device for processing a video stream, the device comprising: a detection module configured to detect a target object in a current video stream image frame and to obtain a detection result that includes first information representing the degree to which the target object is occluded; a determination module configured to determine, based on the first information, a detection time for the next detection of the target object in the video stream image frame; and a processing module configured to perform effect processing on the target object based on the detection result.

9. A computer-readable storage medium in which a computer program is stored, wherein when the computer program is executed on a computer, the computer program causes the computer to execute the method according to any one of claims 1 to 7.

10. An electronic device comprising memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the method according to any one of claims 1 to 7 is implemented.