An audio playing method, device, equipment and storage medium
By generating a virtual video stream on the terminal device and adjusting the audio using background video interface controls, the problem of bandwidth and performance consumption in video rendering is solved, and the smoothness of audio playback and device performance are optimized.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD
- Filing Date: 2023-06-25
- Publication Date: 2026-06-30
AI Technical Summary
When users are using other applications on their terminal devices, video rendering in existing technologies consumes bandwidth and reduces the performance of the terminal devices, resulting in choppy audio playback.
By responding to application switching commands, the rendering of the video stream is stopped and a background video interface is loaded. Visual image data is obtained and a virtual video stream is generated. Audio is adjusted using the target controls in the background video interface, avoiding requests for video data from the server and reducing bandwidth consumption and terminal computing resource usage.
Optimize terminal device performance, improve audio playback smoothness, reduce bandwidth consumption, and enhance user experience and audio operation convenience.
Figure CN116708903B_ABST
Description
Technical Field
[0001] This invention relates to the field of audio and video processing technology, and in particular to an audio playback method, device, and storage medium.
Background Technology
[0002] As video content in video applications becomes increasingly abundant, some users, while operating other applications on their devices, still want to keep the video playing so they can hear some high-quality audio even when they cannot focus on the video content.
[0003] In existing technologies, picture-in-picture video mode is typically used in the aforementioned business scenarios. This involves switching the video application to the background and then opening a small window to continue playing the video and audio. However, in this case, video rendering consumes bandwidth and reduces the performance of the terminal device.
Summary of the Invention
[0004] The purpose of this invention is to provide an audio playback method, device, and storage medium to reduce the bandwidth required for users to listen to audio from videos in picture-in-picture format, thereby optimizing terminal device performance. The specific technical solution is as follows:
[0005] In a first aspect of the present invention, an audio playback method is provided, applied in a terminal device, the method comprising:
[0006] In response to the application switching command, stop rendering the video stream of the currently playing video and load a background video interface in the terminal interface;
[0007] Acquire visual image data, which includes multiple target controls for audio and video adjustment;
[0008] The visual image data is output in a loop at regular intervals to obtain the virtual video stream corresponding to the visual image data;
[0009] The virtual video stream is input into the background video interface for display, so that an audio adjustment request can be initiated through the target control in the background video interface.
[0010] In another aspect of the present invention, a computer-readable storage medium is also provided, wherein instructions are stored therein, which, when executed on a computer, cause the computer to perform any of the audio playback methods described above.
[0011] In another aspect of the present invention, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform any of the audio playback methods described above.
[0012] This invention provides an audio playback method that, in response to an application switching command, first stops rendering the video stream of the currently playing video while maintaining the rendering of the corresponding audio stream, and loads a background video interface on the terminal interface. Then, visual image data is acquired, including multiple target controls for adjusting the audio and video. Next, the visual image data is periodically and cyclically output to obtain a virtual video stream corresponding to the visual image data. Finally, the virtual video stream is input to the background video interface for display, allowing users to initiate audio adjustment requests through the target controls in the background video interface. Thus, a virtual video stream is generated locally on the terminal based on the visual image data, and the audio of the video before the application switch continues to play. Although it uses a picture-in-picture display method, it avoids the bandwidth consumption of requesting video data from the server and the excessive computing resources required for the terminal to parse video data. Furthermore, users can adjust the audio through the target controls in the visual image data, which has beneficial effects such as optimizing terminal device performance and improving audio playback smoothness.
Attached Figure Description
[0013] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.
[0014] Figure 1 is a flowchart illustrating the steps of an audio playback method provided in an embodiment of the present invention;
[0015] Figure 2 is a flowchart illustrating the steps of another audio playback method provided in an embodiment of the present invention;
[0016] Figure 3 is a schematic diagram illustrating the audio playback steps provided in an embodiment of the present invention;
[0017] Figure 4 is a structural block diagram of an audio playback device provided in an embodiment of the present invention;
[0018] Figure 5 is a schematic diagram of the device architecture of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0019] The technical solutions of the present invention will now be described with reference to the accompanying drawings in the embodiments of the present invention.
[0020] Referring to Figure 1, a flowchart of the steps of an audio playback method provided by an embodiment of the present invention is shown. The method is applied in a terminal device and may include:
[0021] S101. In response to the application switching command, stop rendering the video stream of the currently playing video and keep rendering the corresponding audio stream, and load a background video interface in the terminal interface.
[0022] In this embodiment of the invention, the application switching instruction is obtained based on a switching operation of the video application. In response to the application switching instruction, the rendering of the video stream corresponding to the currently playing video stops, while the rendering of the audio stream of the currently playing video continues, and a background video interface is loaded on the terminal interface. Thus, while the video application is running in the background, the audio stream corresponding to the currently playing video continues to be rendered and played, while the corresponding video stream stops rendering and displaying.
[0023] S102. Acquire visual image data, wherein the visual image data includes multiple target controls for audio and video adjustment.
[0024] S103. The visual image data is output in a timed loop to obtain the virtual video stream corresponding to the visual image data.
[0025] The visual image data includes multiple target controls for audio and video adjustment. The target controls include at least one of the following: play, pause, fast forward, rewind, speed adjustment, and episode switching. The visual image data can then be output periodically and cyclically to obtain the corresponding virtual video stream.
[0026] S104. The virtual video stream is input into the background video interface for display, so as to initiate an audio adjustment request through the target control in the background video interface.
[0027] In this embodiment of the invention, the virtual video stream is input to the background video interface for display. Thus, a virtual video stream is generated locally on the terminal based on visual image data, and the audio of the video before the application switch continues to play. Although it uses a picture-in-picture display method, it avoids the bandwidth consumption of requesting video data from the server and the problem of the terminal needing to consume a lot of computing resources to parse video data. Therefore, while saving user data and reducing bandwidth, it can improve the smoothness of the user's audio listening and facilitate the user's adjustment of the audio based on the target controls displayed in the background video interface.
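The four steps S101 to S104 can be sketched as a small state transition. The following Python sketch is illustrative only: `PlayerState`, `on_app_switch`, the control names, and the frame dictionary layout are hypothetical stand-ins, not part of the disclosed implementation, and the "virtual video stream" is modeled as a short list of timestamped copies of the same image.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical control names; the source only requires "at least one" of its listed controls.
TARGET_CONTROLS = ["play", "pause", "fast_forward", "rewind", "speed"]


@dataclass
class PlayerState:
    video_rendering: bool = True
    audio_rendering: bool = True
    background_interface_loaded: bool = False
    virtual_frames: List[dict] = field(default_factory=list)


def on_app_switch(state: PlayerState, frame_count: int = 3,
                  interval_s: float = 0.1) -> PlayerState:
    # S101: stop video rendering, keep audio rendering, load the background interface.
    state.video_rendering = False
    state.audio_rendering = True
    state.background_interface_loaded = True
    # S102: acquire visual image data that carries the target controls.
    image = {"controls": list(TARGET_CONTROLS)}
    # S103: repeat the same image at a fixed interval to form the virtual stream.
    state.virtual_frames = [
        {"pts": round(i * interval_s, 3), "image": image}
        for i in range(frame_count)
    ]
    # S104: the frames would now be handed to the background interface for display.
    return state


state = on_app_switch(PlayerState())
print(state.video_rendering, state.audio_rendering, len(state.virtual_frames))
```

The point of the sketch is that no step after S101 touches the network for video data: every frame is derived from the locally held image.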
[0028] Referring to Figure 2, a flowchart of the steps of another audio playback method provided by an embodiment of the present invention is shown. The method is applied in a terminal device and may include:
[0029] S201. In response to the application switching command, stop rendering the video stream of the currently playing video and keep rendering the corresponding audio stream, and load a background video interface in the terminal interface.
[0030] In this embodiment of the invention, the method can be applied to a video application. When the video application starts, a video playback command is obtained based on the user's click on the playback control in the display interface. In response to the video playback command, the system requests audio and video data of the currently playing video from the server. The video stream of the currently playing video is then sent to the video rendering layer located in the display interface. As shown in Figure 3, the video data is rendered and displayed after passing through the video rendering layer.
[0031] In one example, as shown in Figure 3, during the playback of the currently playing video, a picture-in-picture (PiP) instance and a virtual rendering layer can be created. These two instances are then attached to a PiP controller, and the controller's automatic background playback attribute is enabled. The PiP controller is used to initiate the PiP video playback mode when the video application is detected to be switching to the background. The PiP instance creates the background video interface, which can be understood as the video playback interface in PiP mode. The virtual rendering layer is the video rendering layer in PiP mode, rendering a virtual video stream converted locally on the terminal. Therefore, based on monitoring of the video application, it can be determined whether to activate the PiP controller when the application is running in the background.
[0032] The application switching command is obtained based on the exit operation of the video application. In response to the application switching command, the rendering of the video stream corresponding to the currently playing video stops, while the rendering of the audio stream corresponding to the currently playing video continues. That is, the terminal stops requesting video data of the currently playing video from the server, but continues to request audio data of the currently playing video. Thus, the audio data of the currently playing video can continue to be rendered and played, while the corresponding video data stops being displayed. In response to the application switching command, the picture-in-picture controller is started, and a picture-in-picture instance is started through the picture-in-picture controller, so that a background video interface can be loaded and generated on the terminal interface through the picture-in-picture instance.
[0033] S202. Acquire visual image data, wherein the visual image data includes multiple target controls for audio and video adjustment.
[0034] In this embodiment of the invention, the visual image data can be preset, and may include two types of content: images and controls. For example, the images may include interface visual images, and the controls may include multiple target controls for audio and video adjustment. The target controls include at least one of the following: play, pause, fast forward, rewind, speed adjustment, and episode switching. The target controls are displayed in the background video interface during background audio playback, making it easy for the user to adjust the audio of the currently playing video through them.
[0035] In one example, the visual image data can be preset. For example, in response to the control adjustment operation of the target control, the position of the target control in the interface visual image is changed, and so on. Based on the control adjustment operations of multiple target controls, the position of each target control in the interface visual image is determined, so as to present the current visual effect to the user.
[0036] In one optional embodiment of the invention, the plurality of target controls are arranged side by side, and the ratio between the length and width of the default interface of the background video interface is greater than or equal to a preset first ratio. The preset first ratio can be selected by those skilled in the art based on the screen size of different terminal devices, and is not limited thereto. The plurality of target controls can be arranged horizontally or vertically side by side. For example, the default interface of the background video interface obtained based on the preset first ratio can be a long, thin strip. When the plurality of target controls are arranged horizontally side by side, the background video interface can be displayed as a long, thin strip at the horizontal edge of the terminal interface. When the plurality of target controls are arranged vertically side by side, the background video interface can be displayed as a long, thin strip at the vertical edge of the terminal interface.
[0037] This allows users to adjust the audio using the target controls in the background video interface while minimizing the area occupied by the background video interface on the terminal screen. This improves the ease of operation for applications other than the background video application and enhances the user experience.
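As a rough illustration of the strip layout described above, the following Python sketch computes a default interface size from the number of side-by-side controls and a preset first ratio. The function name `strip_size`, the square control size, and the pixel values are assumptions made for the example; the source does not specify how the dimensions are derived.

```python
import math
from typing import Tuple


def strip_size(n_controls: int, control_px: int, first_ratio: float,
               horizontal: bool = True) -> Tuple[int, int]:
    """Return (width, height) of a strip holding n square controls side by side.

    The long side is stretched if needed so that long/short >= first_ratio.
    """
    short_side = control_px
    long_side = n_controls * control_px
    if long_side / short_side < first_ratio:
        long_side = math.ceil(first_ratio * short_side)
    # Horizontal arrangement gives a wide strip, vertical gives a tall one.
    return (long_side, short_side) if horizontal else (short_side, long_side)


print(strip_size(5, 40, 4.0))                    # wide strip for a horizontal edge
print(strip_size(2, 40, 4.0, horizontal=False))  # tall strip for a vertical edge
```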
[0038] In another optional embodiment of the invention, the method may further include:
[0039] Obtain adjustment instructions from the background video interface, wherein the adjustment instructions include at least size parameters.
[0040] Based on the aforementioned size parameters, the size of the background video interface is adjusted.
[0041] In this embodiment of the invention, an adjustment command for the background video interface can also be obtained based on the user's scaling operation on the interface border of the background video interface. The adjustment command includes at least size parameters. These size parameters can be the length and width of the background video interface. For example, the four vertices of the background video interface correspond to four adjustment controls. When a user touches an adjustment control, the adjusted size parameters of the background video interface are determined based on the control coordinates before and after the touch operation. This adjusted length and width can be determined using the spatial coordinates of the four adjustment controls. Therefore, the interface size of the background video interface can be adjusted based on the size parameters in the adjustment command, allowing users to adjust the background video interface to their preferred size for different terminal devices.
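One way to derive the size parameters from the four adjustment controls can be sketched as follows. This Python example assumes the vertices are stored clockwise starting at the top-left corner and that dragging one corner keeps the diagonally opposite corner fixed; both are assumptions for illustration, and `drag_corner` and `size_from_vertices` are hypothetical helper names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def size_from_vertices(vertices: List[Point]) -> Tuple[float, float]:
    """Length and width of the interface from the coordinates of its four vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (max(xs) - min(xs), max(ys) - min(ys))


def drag_corner(vertices: List[Point], index: int, new_pos: Point) -> List[Point]:
    """Move one corner to new_pos while the diagonally opposite corner stays fixed."""
    opposite = vertices[(index + 2) % 4]
    x0, x1 = sorted((opposite[0], new_pos[0]))
    y0, y1 = sorted((opposite[1], new_pos[1]))
    # Rebuild the rectangle clockwise from the top-left corner.
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


rect = [(0, 0), (100, 0), (100, 50), (0, 50)]
resized = drag_corner(rect, 2, (160, 90))  # drag the bottom-right control
print(size_from_vertices(resized))
```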
[0042] S203. Perform binary conversion on the visual image data to obtain the target image data.
[0043] S204. The target image data is output in a timed loop to determine the virtual video stream corresponding to the visual image data.
[0044] In this embodiment of the invention, the visual image data may include at least the following types: images and controls. For example, a virtual decoder may be preset, which converts the visual image data into target image data through binary conversion. The target image data can be understood as image data that supports a general data structure for transmitting video data in the current video processing pipeline. For example, the data structure corresponding to the target image data may be CMSampleBuffer, etc.
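The binary conversion step can be pictured with a minimal Python sketch. It is not the CMSampleBuffer format named above: `TargetImageData` is a hypothetical stand-in that simply packs RGBA pixels into a byte payload with its dimensions, to show the idea of turning an image into a buffer that a video pipeline could carry.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a), each 0..255


@dataclass(frozen=True)
class TargetImageData:
    width: int
    height: int
    payload: bytes  # row-major RGBA bytes


def to_target_image_data(pixels: List[List[Pixel]]) -> TargetImageData:
    """Pack rows of RGBA pixels into one contiguous byte buffer."""
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    payload = b"".join(struct.pack("4B", *px) for row in pixels for px in row)
    return TargetImageData(width, height, payload)


image = [[(255, 0, 0, 255), (0, 255, 0, 255)]]  # one row: a red and a green pixel
frame = to_target_image_data(image)
print(frame.width, frame.height, len(frame.payload))
```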
[0045] In another example, the virtual decoder may include a timer, and those skilled in the art can determine the timing interval corresponding to the timer based on the actual situation. For example, it may be a cyclic output of the corresponding target image data every 0.1 seconds. Thus, the visual image data can be converted into a continuous virtual video stream through the cyclical data output. The virtual video stream can be understood as a local video stream used to replace the currently playing video stream for picture-in-picture playback. It is obtained by the terminal based on the visual image data. For example, the data type of the virtual video stream can be a databuff, thereby supporting the loading of the virtual video stream into the background video interface. In other words, picture-in-picture playback requires a video stream to start. Playing a locally generated virtual video stream in the background video interface avoids the bandwidth consumption of requesting video data from the server and the problem of the terminal needing to consume a lot of computing resources to parse video data.
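The timer-driven loop can be sketched as a generator that re-emits the same target image data at a fixed interval. In this Python sketch, `virtual_video_stream`, the `max_frames` cutoff, and the injectable `sleep` parameter are assumptions added so the example terminates and can be run without waiting; the 0.1 second default follows the interval mentioned above.

```python
import time
from typing import Callable, Iterator, Optional, Tuple


def virtual_video_stream(frame: bytes,
                         interval_s: float = 0.1,
                         max_frames: Optional[int] = None,
                         sleep: Callable[[float], None] = time.sleep
                         ) -> Iterator[Tuple[float, bytes]]:
    """Yield (presentation_time, frame) pairs, repeating one frame on a timer."""
    index = 0
    while max_frames is None or index < max_frames:
        yield (index * interval_s, frame)
        index += 1
        sleep(interval_s)  # wait one timer tick before emitting the next copy


# Emit three frames without real waiting by passing a no-op sleep.
for pts, data in virtual_video_stream(b"\x00" * 8, max_frames=3, sleep=lambda s: None):
    print(pts, len(data))
```

Because every frame is the same locally held buffer, the loop needs neither network requests nor video decoding, which is the saving the paragraph above describes.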
[0046] S205. Input the virtual video stream into the virtual rendering layer in the background video interface.
[0047] S206. The virtual video stream is rendered and displayed through the virtual rendering layer, so as to initiate an audio adjustment request through the target control in the background video interface.
[0048] In this embodiment of the invention, the virtual video stream (which can also be understood as multiple consecutive target image data) is input into a virtual rendering layer in the background video interface. The virtual rendering layer can be an AVSampleBufferDisplayLayer, used to render and display the virtual video stream output by the virtual decoder. The virtual rendering layer can be pre-created. In response to the application switching command, the virtual rendering layer is activated in the background video interface. In one example, the virtual rendering layer can be overlaid on the background video interface and hidden. In response to the application switching command, after the virtual rendering layer is activated in the background video interface, it renders and displays the virtual video stream. In another example, in response to the application switching command, a pre-set virtual rendering layer is overlaid on the background video interface, and the rendering and display of the virtual video stream is activated. Then, the user can initiate an audio adjustment request to the server through the target controls in the background video interface.
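The first variant described above, a pre-created layer that stays hidden until the application switching command arrives, can be sketched in Python as follows. `VirtualRenderLayer` is a hypothetical stand-in for the rendering layer, not the AVSampleBufferDisplayLayer API; it only models the hidden and activated states.

```python
from typing import List


class VirtualRenderLayer:
    """Pre-created layer that ignores frames until it has been activated."""

    def __init__(self) -> None:
        self.hidden = True
        self.displayed: List[bytes] = []

    def activate(self) -> None:
        # Called in response to the application switching command.
        self.hidden = False

    def enqueue(self, frame: bytes) -> bool:
        """Display a frame if the layer is active; report whether it was shown."""
        if self.hidden:
            return False
        self.displayed.append(frame)
        return True


layer = VirtualRenderLayer()          # created ahead of time, still hidden
print(layer.enqueue(b"frame-0"))      # dropped while hidden
layer.activate()                      # application switched to the background
print(layer.enqueue(b"frame-1"))      # now rendered
```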
[0049] Upon receiving the adjustment request, the server issues an audio adjustment command based on the request, allowing the terminal to adjust the audio accordingly. This enables users to easily adjust the audio using target controls within the background video interface, while also improving the ease of operation for other applications besides the background video application, thus enhancing the user experience.
[0050] On the other hand, the server continues to send the audio stream of the currently playing video to the terminal while stopping the sending of the video stream. This improves the user experience and reduces data transmission bandwidth for the video application operator, thereby lowering data usage costs. In another example, when the terminal plays the audio stream of the currently playing video in the background, it can increase the total playback time of the video to some extent, which can also help increase the number of advertisements generated by the video application.
[0051] In an optional embodiment of the invention, the method may further include:
[0052] The number of touches on the interface area of the background video interface that is not the target control is detected.
[0053] If the number of touches is greater than or equal to a preset touch threshold, the background video interface is adjusted in the terminal interface according to preset interface adjustment rules.
[0054] In this embodiment of the invention, the background video interface can be monitored, and the number of touches on non-target control areas of the background video interface can be detected while the video application is running in the background. If the cumulative number of detected touches is greater than or equal to a preset touch threshold, it can be determined that the current display of the background video interface is affecting the user's ease of operation with other applications besides the video application. The preset touch threshold can be determined by those skilled in the art based on actual business scenarios, and is not limited here. Then, the background video interface can be adjusted on the terminal interface according to preset interface adjustment rules.
[0055] In one example, the interface adjustment rules may include: moving the current background video interface as close as possible to the border of the terminal interface; or further reducing the size of the current background video interface; or changing the current background video interface from landscape to portrait mode; or changing the current background video interface from portrait to landscape mode. Landscape mode refers to the background video interface being wider than it is tall, while portrait mode refers to the background video interface being taller than it is wide. This can further improve the ease of operation for applications other than the background video application, thereby enhancing the user experience.
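The touch-count check and the example adjustment rules can be sketched together. In this Python sketch, `TouchMonitor`, `Interface`, and the three rule functions are hypothetical names, and the shrink factor and left-edge snapping are arbitrary choices made for illustration; the source leaves the threshold and the concrete rules to the implementer.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    x: int
    y: int
    width: int
    height: int


class TouchMonitor:
    """Counts touches that land on the interface but outside any target control."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.count = 0

    def record_touch(self, on_target_control: bool) -> bool:
        """Return True once the cumulative non-control touches reach the threshold."""
        if not on_target_control:
            self.count += 1
        return self.count >= self.threshold


def snap_to_left_edge(ui: Interface) -> Interface:
    return Interface(0, ui.y, ui.width, ui.height)


def shrink(ui: Interface, factor: float = 0.5) -> Interface:
    return Interface(ui.x, ui.y, int(ui.width * factor), int(ui.height * factor))


def rotate(ui: Interface) -> Interface:
    # Swap landscape and portrait by exchanging width and height.
    return Interface(ui.x, ui.y, ui.height, ui.width)


monitor = TouchMonitor(threshold=3)
ui = Interface(50, 300, 200, 40)
for on_control in (False, True, False, False):
    if monitor.record_touch(on_control):
        ui = rotate(ui)
print(ui)
```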
[0056] In summary, this invention discloses an audio playback method. The method may include first stopping the rendering of the currently playing video stream and maintaining the rendering of the corresponding audio stream in response to an application switching command, and loading a background video interface on the terminal interface. Then, visual image data is acquired, including multiple target controls for audio and video adjustment. Next, the visual image data is periodically and cyclically output to obtain a virtual video stream corresponding to the visual image data. Finally, the virtual video stream is input to the background video interface for display, allowing users to initiate audio adjustment requests through the target controls in the background video interface. Thus, a virtual video stream is generated locally on the terminal based on visual image data, and the audio of the video before the application switch continues to play. Although it uses a picture-in-picture display method, it avoids the bandwidth consumption of requesting video data from the server and the excessive computing resources required for the terminal to parse video data. Furthermore, users can adjust the audio through the target controls in the visual image data, which has beneficial effects such as optimizing terminal device performance and improving audio playback smoothness.
[0057] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of this application are not limited to the described order of actions, because according to the embodiments of this application, some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of this application.
[0058] Referring to Figure 4, an audio playback device provided by an embodiment of the present invention is shown. The device may include:
[0059] The interface loading module 401 is used to respond to the application switching command, stop rendering the video stream of the currently playing video, and load a background video interface in the terminal interface.
[0060] The view acquisition module 402 is used to acquire visual image data, which includes multiple target controls for audio and video adjustment.
[0061] The virtual video stream output module 403 is used to periodically and cyclically output the visual image data to obtain the virtual video stream corresponding to the visual image data.
[0062] The video stream display module 404 is used to input the virtual video stream into the background video interface for display, and to initiate an audio adjustment request through the background video interface.
[0063] In an optional embodiment of the invention, the virtual video stream output module 403 can also be used for:
[0064] The visual image data is converted into binary to obtain the target image data.
[0065] The target image data is output in a timed loop to determine the virtual video stream corresponding to the visual image data.
[0066] In an optional embodiment of the invention, the video stream display module 404 may include:
[0067] The data transmission submodule is used to input the virtual video stream into the virtual rendering layer in the background video interface.
[0068] The video stream display submodule is used to render and display the virtual video stream through the virtual rendering layer.
[0069] In an optional embodiment of the invention, the device may further include:
[0070] The layer creation module is used to create virtual rendering layers;
[0071] It is also used to activate the virtual rendering layer in the background video interface in response to the application switching command.
[0072] In one optional embodiment of the invention, the target control includes at least one of the following: play, pause, fast forward, rewind, speed adjustment, and episode switching.
[0073] In one optional embodiment of the invention, the plurality of target controls are arranged side by side, and the ratio between the length and width of the default interface of the background video interface is greater than or equal to a preset first ratio.
[0074] In an optional embodiment of the invention, the device may further include:
[0075] The instruction acquisition module is used to acquire adjustment instructions from the background video interface, wherein the adjustment instructions include at least size parameters.
[0076] The interface adjustment module is used to adjust the interface size of the background video interface according to the size parameters.
[0077] In an optional embodiment of the invention, the device may further include:
[0078] The touch detection module is used to detect the number of touches on the interface area of the non-target control in the background video interface.
[0079] The interface adjustment module is used to adjust the background video interface in the terminal interface according to preset interface adjustment rules when the number of touches is greater than or equal to a preset touch threshold.
[0080] In summary, this invention discloses an audio playback device. The device may first, in response to an application switching command, stop rendering the video stream of the currently playing video while maintaining the rendering of the corresponding audio stream, and load a background video interface on the terminal interface. Then, visual image data is acquired, including multiple target controls for audio and video adjustment. Next, the visual image data is periodically and cyclically output to obtain a virtual video stream corresponding to the visual image data. Finally, the virtual video stream is input to the background video interface for display, allowing users to initiate audio adjustment requests through the target controls in the background video interface. Thus, a virtual video stream is generated locally on the terminal based on visual image data, and the audio of the video before the application switch continues to play. Although it uses a picture-in-picture display method, it avoids the bandwidth consumption of requesting video data from the server and the excessive computing resources required for the terminal to parse video data. Furthermore, users can adjust the audio through the target controls in the visual image data, which has beneficial effects such as optimizing terminal device performance and improving audio playback smoothness.
[0081] This invention also provides an electronic device which, as shown in Figure 5, includes a processor 501, a communication interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504.
[0082] Memory 503 is used to store computer programs.
[0083] When processor 501 executes the program stored in memory 503, it performs the following steps:
[0084] In response to the application switching command, stop rendering the video stream of the currently playing video while maintaining the rendering of the corresponding audio stream, and load a background video interface in the terminal interface.
[0085] Acquire visual image data, which includes multiple target controls for audio and video adjustment.
[0086] The visual image data is output in a loop at regular intervals to obtain the virtual video stream corresponding to the visual image data.
[0087] The virtual video stream is input into the background video interface for display, so that an audio adjustment request can be initiated through the target control in the background video interface.
[0088] The communication bus mentioned above can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used to represent it in the diagram, but this does not mean that there is only one bus or one type of bus.
[0089] The communication interface is used for communication between the aforementioned terminal and other devices.
[0090] The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
[0091] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc. They can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0092] In another embodiment of the present invention, a computer-readable storage medium is also provided, which stores instructions that, when executed on a computer, cause the computer to perform any of the audio playback methods described in the above embodiments.
[0093] In another embodiment of the present invention, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform any of the audio playback methods described in the above embodiments.
[0094] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).
[0095] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0096] The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are basically similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the descriptions of the method embodiments.
[0097] The above description is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention are included within the scope of protection of the present invention.
Claims
1. An audio playback method, characterized in that, when applied to a terminal device, the method comprises: in response to an application switching command, stopping rendering of the video stream of the currently playing video while continuing to render the corresponding audio stream, and loading a background video interface in the terminal interface; acquiring visual image data, which includes multiple target controls for audio and video adjustment; outputting the visual image data in a loop at regular intervals to obtain a virtual video stream corresponding to the visual image data; inputting the virtual video stream into the background video interface for display, so that an audio adjustment request can be initiated through the target controls in the background video interface; detecting the number of touches on an interface area of the background video interface other than the target controls; and, if the number of touches is greater than or equal to a preset touch threshold, adjusting the background video interface in the terminal interface according to preset interface adjustment rules.
2. The audio playback method of claim 1, wherein the step of outputting the visual image data in a loop at regular intervals to obtain the virtual video stream corresponding to the visual image data comprises: converting the visual image data into binary to obtain target image data; and outputting the target image data in a timed loop to determine the virtual video stream corresponding to the visual image data.
3. The audio playback method of claim 2, wherein the step of inputting the virtual video stream into the background video interface for display comprises: inputting the virtual video stream into a virtual rendering layer of the background video interface; and rendering and displaying the virtual video stream using the virtual rendering layer.
4. The audio playback method of claim 3, wherein, before inputting the virtual video stream into the virtual rendering layer of the background video interface, the method further comprises: creating the virtual rendering layer; and, in response to the application switching command, activating the virtual rendering layer in the background video interface.
5. The audio playback method of claim 1, wherein the target controls include at least one of the following: play, pause, fast forward, rewind, speed adjustment, and set switching.
6. The audio playback method of claim 4, wherein the multiple target controls are arranged side by side, and the ratio of the length to the width of the default interface of the background video interface is greater than or equal to a preset first ratio.
7. The audio playback method of claim 1, wherein the method further comprises: obtaining an adjustment instruction from the background video interface, wherein the adjustment instruction includes at least size parameters; and adjusting the size of the background video interface based on the size parameters.
8. An audio playback device, characterized in that the device comprises: an interface loading module, used to respond to an application switching command by stopping rendering of the video stream of the currently playing video while continuing to render the corresponding audio stream, and to load a background video interface in the terminal interface; a view acquisition module, used to acquire visual image data, which includes multiple target controls for audio and video adjustment; a virtual video stream output module, used to output the visual image data in a loop at regular intervals to obtain a virtual video stream corresponding to the visual image data; a video stream display module, used to input the virtual video stream into the background video interface for display, so that an audio adjustment request can be initiated through the target controls in the background video interface; a touch detection module, used to detect the number of touches on an interface area of the background video interface other than the target controls; and an interface adjustment module, used to adjust the background video interface in the terminal interface according to preset interface adjustment rules when the number of touches is greater than or equal to a preset touch threshold.
9. An electronic device, characterized in that it comprises a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the steps of the method described in any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the method described in any one of claims 1-7.
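The steps recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Control`, `encode_view`, `virtual_video_stream`, and `BackgroundInterface`, the text-based serialization, and the default threshold of 2 are all assumptions made for illustration. The "timed loop" is modelled by attaching evenly spaced timestamps to the same frame instead of sleeping, and the "preset interface adjustment rules" are reduced to a flag.

```python
import itertools
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Control:
    """A target control (hypothetical model): a name plus its rectangle."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def encode_view(controls: List[Control]) -> bytes:
    """Stand-in for converting the visual image data into binary (claim 2)."""
    text = ";".join(
        f"{c.name}:{c.x},{c.y},{c.width},{c.height}" for c in controls
    )
    return text.encode("utf-8")


def virtual_video_stream(frame: bytes,
                         interval_s: float) -> Iterator[Tuple[float, bytes]]:
    """Emit the same frame forever, one (timestamp, frame) pair per interval."""
    for i in itertools.count():
        yield (i * interval_s, frame)


@dataclass
class BackgroundInterface:
    """Routes touches to controls and counts touches outside every control."""
    controls: List[Control]
    touch_threshold: int = 2          # assumed value; the claims leave it open
    outside_touches: int = 0
    adjusted: bool = False            # placeholder for the adjustment rules
    requests: List[str] = field(default_factory=list)

    def on_touch(self, px: int, py: int) -> Optional[str]:
        for control in self.controls:
            if control.contains(px, py):
                # A touch on a target control initiates an adjustment request.
                self.requests.append(control.name)
                return control.name
        # Touch landed on the interface area outside all target controls.
        self.outside_touches += 1
        if self.outside_touches >= self.touch_threshold:
            self.adjusted = True
        return None


if __name__ == "__main__":
    controls = [Control("play", 0, 0, 10, 10), Control("pause", 10, 0, 10, 10)]
    stream = virtual_video_stream(encode_view(controls), interval_s=0.5)
    for timestamp, frame in itertools.islice(stream, 3):
        print(timestamp, len(frame))
```

Because every frame is built locally from the control layout, nothing in this sketch requests video data from a server, which is the bandwidth saving the description attributes to the virtual video stream.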