Distributed audio-video production system

By using a distributed audio and video production system, audio and video data are produced at multiple levels using sensors, sensing centers, and effectors, overcoming the limitations of internet audio production and achieving more flexible and comprehensive audio production.

CN117133268B · Active · Publication Date: 2026-06-23 · WAVARTS TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: WAVARTS TECH CO LTD
Filing Date: 2022-05-20
Publication Date: 2026-06-23

IPC: G10L13/033; H04L67/288; H04L67/2885; H04L67/60; G11B27/031

AI Tagging

Application Domain

Electronic editing digitised analogue information signals; Transmission

Technology Topics

Video production; Audio frequency

AI Technical Summary

Technical Problem

Existing internet-based audio production systems cannot mount plugins, output multi-channel audio, or add metadata, making audio production inconvenient.

Method used

A distributed audio and video production system is adopted, including sensors, sensing centers, and effectors, distributed across multiple levels. Audio and video data are produced through the concept of metaverse acoustic systems, enabling flexible and comprehensive production.

Benefits of technology

It improves the flexibility and comprehensiveness of audio and video data production, overcomes the limitations of internet-based methods, and provides more convenient audio production capabilities.

Abstract

The embodiment of the present application provides a distributed audio and video production system comprising a sensor, a sensing center and an effector. The sensor comprises at least one first subsystem; the sensing center comprises at least one second subsystem; the effector comprises at least one third subsystem. Any two of the first subsystem, the second subsystem and the third subsystem are identical, or all three are identical, or all three are different. Each subsystem is distributed in the same or different levels and comprises: a sub-sensor for acquiring audio information sent by a subsystem of any level other than the current level, or by another subsystem of the current level; a sub-sensing center for processing the audio information to obtain feedback information; and a sub-effector for feeding the feedback information back to other subsystems of the current level, or outputting it to a subsystem of any level other than the current level.

Description

Technical Field

[0001] The embodiments of the present invention relate to the field of audio production technology, and more particularly to a distributed audio and video production system.

Background Technology

[0002] With the development of technology, users can create audio content via the internet. For example, users can access a remote server through a webpage and input corresponding editing commands, which the remote server will then use to create the audio content.

[0003] However, this internet-based audio production method has many limitations, such as the inability to mount plugins, output multi-channel audio, or add metadata during the production process. These limitations cause numerous inconveniences for audio production. Therefore, there is an urgent need for a new audio production system that can more conveniently and comprehensively create audio data.

Summary of the Invention

[0004] Embodiments of the present invention provide a distributed audio and video production system, which aims to effectively improve the convenience and comprehensiveness of audio and video data production.

[0005] This invention provides a distributed audio and video production system, comprising: a sensor, a sensing center, and an effector; wherein,

[0006] The sensor includes at least one first subsystem; the sensing center includes at least one second subsystem; the effector includes at least one third subsystem; any two of the first subsystem, the second subsystem, and the third subsystem are identical, or all three are identical, or all three are different; and each subsystem is distributed in the same or different levels.

[0007] For each subsystem, the subsystem includes a sub-sensor, a sub-sensing center, and a sub-effector;

[0008] The sub-sensor is used to acquire audio and video information sent by subsystems at any level other than its own level, or to acquire audio and video information sent by subsystems at its own level other than the subsystem in question.

[0009] The sub-sensing center is used to process the audio and video information and obtain feedback information;

[0010] The sub-effector is used to feed back the feedback information to other subsystems in this level besides the subsystem itself, or to output the feedback information to any other subsystem in any level other than this level.

[0011] In one embodiment, the levels comprise one or more of the following, in any combination:

[0012] At least one service level, at least one client level, and at least one playback level;

[0013] The service level includes at least one service subsystem; the client level includes at least one client subsystem; and the playback level includes at least one playback subsystem.

[0014] The sensor, the sensing center, and the effector are all distributed at the service level, the client level, and / or the playback level.

[0015] In one embodiment, in the service subsystem, the sub-sensor includes at least one first server; the sub-sensing center includes at least one second server; the sub-effector includes at least one third server; and any two of the first server, the second server, and the third server are identical, or all three are identical, or all three are different.

[0016] And / or,

[0017] In the client subsystem, the sub-sensor includes at least one first client; the sub-sensing center includes at least one second client; the sub-effector includes at least one third client; and any two of the first client, the second client, and the third client are identical, or all three are identical, or all three are different.

[0018] And / or,

[0019] In the playback subsystem, the sub-sensor includes at least one first playback device; the sub-sensing center includes at least one second playback device; the sub-effector includes at least one third playback device; and any two of the first playback device, the second playback device, and the third playback device are identical, or all three are identical, or all three are different.

[0020] In one embodiment, the server comprises one or more of the following servers:

[0021] Audio and video data storage server, data processing server, user data storage server, transaction processing server, copyright management server, community forum server, codec server, AI processing server, and sound effect algorithm server.

[0022] In one embodiment, the client comprises one or more of the following: laptop computers, desktop computers, display devices, tablet computers, audio / video editing devices, and mobile devices;

[0023] or,

[0024] The client consists of one or more of the following units: user operation unit, interface display unit, data storage unit, data processing unit, audio and video editing and production unit, audio and video playback unit, encoding and decoding processing unit, rendering processing unit, AI processing unit, and encapsulation unit;

[0025] or,

[0026] The client comprises one or more of the following devices: laptop computers, desktop computers, display devices, tablet computers, audio / video editing devices, and mobile devices;

[0027] and the client further comprises one or more of the following units: user operation unit, interface display unit, data storage unit, data processing unit, audio and video editing and production unit, audio and video playback unit, encoding and decoding processing unit, rendering processing unit, AI processing unit, and encapsulation unit.

[0028] In one embodiment, the playback device comprises one or more of the following playback devices:

[0029] Televisions, sound cards, soundbars, speaker arrays, headphones, and cinema processors.

[0030] In one embodiment, the subsystem is configured to, upon determining that at least one subsystem it interacts with, at the current level or at another level, has failed and can no longer interact with it, process the audio and video data itself and / or switch the interaction to one or more other subsystems at the same level that have the same function as the failed subsystem;

[0031] or

[0032] The subsystem is configured to, upon determining that at least one subsystem it interacts with at the level above or below the current level is faulty or unable to interact with it, process the audio and video data itself and / or switch the interaction to one or more subsystems at the level above or below the current level that have the same function as the faulty subsystem.

[0033] In one embodiment, the server is configured to, upon determining that at least one server, or subsystem at another level, that it interacts with has malfunctioned and can no longer interact with it, process the audio and video data itself and / or switch the interaction to one or more other servers with the same function as the failed server, or switch the interaction to a corresponding device at any level other than the level where the server is located.

[0034] And / or,

[0035] The client is configured to, upon determining that at least one client, or subsystem at another level, that it interacts with has failed and can no longer interact with it, process the audio and video data itself and / or switch the interaction to one or more other clients with the same function as the failed client, or switch the interaction to a device at any level other than the level where the client is located.

[0036] And / or,

[0037] The playback device is configured to, upon determining that at least one playback device, or subsystem at another level, that it interacts with has malfunctioned and can no longer interact with it, process the audio and video data itself and / or switch the interaction to one or more other playback devices with the same function as the failed playback device, or switch the interaction to a corresponding device at any level other than the level where the playback device is located.

[0038] In one embodiment, the service subsystem of the service level is used to receive different types of data information from other subsystems of the service level, or from subsystems of other levels other than the service level.

[0039] The service subsystem is also used to perform corresponding processing based on different types of data information;

[0040] The service subsystem is also used to send the data obtained after operation processing to other subsystems at the service level, or to subsystems at other levels other than the service level.

[0041] In one embodiment, the different types of data information include one or more combinations of the following:

[0042] Audio and video editing commands, audio and video editing collaboration commands, audio and video editing relay commands, file read and write commands, audio and video data, user information, and authentication results.

[0043] In one embodiment, the service subsystem is specifically used to perform one or a combination of the following processes based on different types of data information:

[0044] File management and processing, audio and video data editing and processing, user information management and processing, copyright processing of audio and video data, transaction processing of audio and video data, and interactive processing of audio and video data.

[0045] In one embodiment, the client-level client subsystem is configured to acquire audio and video data from other client subsystems at the client level, or from subsystems at other levels besides the client level, and to perform operations on the audio and video data based on the acquired operation information.

[0046] In one embodiment,

[0047] the operation information includes one or more of the following, in any combination:

[0048] Information on audio and video data production, separation of audio and video data, collaborative production of audio and video data, relay production of audio and video data, display and processing of audio and video data, playback and processing of audio and video data, and reading and writing of audio and video data.

[0049] In one embodiment, the playback subsystem of the playback level is used to obtain audio and video data to be played from other playback subsystems of the playback level, or from subsystems of other levels other than the playback level, and to process the audio and video data to be played.

[0050] In one embodiment, the service subsystem of the service level is used to reallocate audio production tasks to the service subsystem of the service level and / or the subsystems of the other levels based on network status information, performance information from subsystems of other levels other than the service level, and / or the current working status of the subsystems of the other levels.
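The reallocation described in [0050] can be pictured with a minimal sketch. The text names the inputs (network status, performance information, current working status) but not the decision rule, so the ranking below, the `Candidate` fields and the round-robin assignment are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    bandwidth_mbps: float  # network status (hypothetical metric)
    cpu_score: float       # performance information (hypothetical metric)
    busy: bool             # current working status


def reallocate(tasks: List[str], candidates: List[Candidate]) -> Dict[str, str]:
    """Assign each task to an idle candidate, best-ranked first, round-robin."""
    idle = [c for c in candidates if not c.busy]
    if not idle:
        raise ValueError("no idle subsystem available")
    # Assumed rule: rank by performance first, then by bandwidth.
    ranked = sorted(idle, key=lambda c: (c.cpu_score, c.bandwidth_mbps), reverse=True)
    return {task: ranked[i % len(ranked)].name for i, task in enumerate(tasks)}
```

A real service subsystem would presumably weigh these inputs differently; the sketch only shows that the three kinds of information are enough to drive an assignment.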

[0051] In one embodiment, the client-level client subsystem is used to configure and process audio and video data, and output the processed audio and video data to a playback device in the playback subsystem of the playback level corresponding to the type of the processed audio and video data.
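As a rough illustration of [0051], the sketch below selects a playback device according to the type of the processed audio. The device names come from the list in [0029]; the type labels, the mapping itself and the fallback device are assumptions, since the text does not say which type goes to which device.

```python
from typing import Dict

# Hypothetical mapping from processed-audio type to a playback device.
ROUTES: Dict[str, str] = {
    "stereo": "headphones",
    "multichannel": "speaker array",
    "cinema": "cinema processor",
}


def route(audio_type: str, default: str = "television") -> str:
    """Return the playback device for a processed stream of the given type."""
    # Unknown types fall back to an assumed default device.
    return ROUTES.get(audio_type, default)
```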

[0052] This invention provides a distributed audio and video production system that includes a sensor, a sensing center, and an effector. The sensor includes at least one first subsystem; the sensing center includes at least one second subsystem; and the effector includes at least one third subsystem. Any two of the first, second, and third subsystems are identical, or all three are identical, or all three are different. Each subsystem is distributed in the same or different levels and includes a sub-sensor, a sub-sensing center, and a sub-effector. The sub-sensor acquires audio and video information sent by subsystems at any level other than its own, or by other subsystems at its own level. The sub-sensing center processes the audio and video information to obtain feedback information. The sub-effector feeds the feedback information back to other subsystems at its own level, or outputs it to subsystems at any level other than its own. Compared with existing technologies that can only produce audio via the internet, i.e., web-based methods, this application adopts the concept of a metaverse acoustic system, which consists of at least three components: a sensor, a sensing center, and an effector. These components are distributed across multiple levels of the system so that audio and video data are produced across multiple levels according to environmental changes and adaptive needs. This makes the production of audio and video data more flexible, convenient, and comprehensive.

Attached Figure Description

[0053] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.

[0054] Figure 1 is a schematic diagram of the structure of a distributed audio and video production system provided in Embodiment 1 of this application;

[0055] Figure 2 is a schematic diagram of the distributed audio and video production system provided in Embodiment 1 of this application, implemented through a two-level distributed architecture;

[0056] Figure 3 is a schematic diagram of the structure of the subsystem provided in Embodiment 1 of this application;

[0057] Figure 4 is a schematic diagram of the structure of a distributed audio and video production system provided in Embodiment 2 of this application;

[0058] Figure 5 is a schematic diagram of the structure of a service subsystem in a service layer of a distributed audio and video production system provided in Embodiment 3 of this application;

[0059] Figure 6 is a schematic diagram of the connection structure for a similar extension of a user data storage server;

[0060] Figure 7 is a schematic diagram of the structure of the client subsystem in the client layer of a distributed audio and video production system provided in Embodiment 4 of this application;

[0061] Figure 8 is a schematic diagram of the structure of a playback subsystem in a playback layer of a distributed audio and video production system provided in Embodiment 5 of this application;

[0062] Figure 9 is a schematic diagram illustrating the specific structure of the service layer in a distributed audio and video production system provided in Embodiment 6 of this application;

[0063] Figure 10 is a specific structural example diagram of the client layer in a distributed audio and video production system provided in Embodiment 6 of this application;

[0064] Figure 11 is a schematic diagram of the playback layer in a distributed audio and video production system provided in Embodiment 6 of this application.

Detailed Implementation

[0065] To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention will be described in more detail below with reference to the accompanying drawings of the preferred embodiments. In the drawings, the same or similar reference numerals denote the same or similar components or components having the same or similar functions throughout. The described embodiments are some, but not all, embodiments of the present invention. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain the present invention, and should not be construed as limiting the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort are within the scope of protection of the present invention. The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0066] The terms "first," "second," and "third" (if applicable) in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such terms can be interchanged where appropriate so that embodiments of the invention described herein can be implemented, for example, in orders other than those illustrated or described herein.

[0067] Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion, such that a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such process, method, product, or device.

[0068] In existing technologies, audio production is generally only possible through web-based methods, thus requiring a centralized server-side approach. Addressing this technical problem, the inventive concept of this application is to provide a novel distributed audio and video production system to achieve distributed audio production.

[0069] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.

[0070] Figure 1 is a schematic diagram of the structure of a distributed audio and video production system provided in Embodiment 1 of this application. As shown in Figure 1, this distributed audio and video production system, which can also be called the metaverse acoustic system, includes three components: a sensor 11, a sensing center 12, and an effector 13.

[0071] Furthermore, the sensor 11, sensing center 12, and effector 13 of this distributed audio and video production system can all be implemented hierarchically through different system structures. That is, the sensor 11 may include at least one first subsystem; the sensing center 12 may include at least one second subsystem; and the effector 13 may include at least one third subsystem. Any two of the first, second, and third subsystems may be identical, or all three may be identical, or all three may be different. In addition, each subsystem is distributed in the same or different levels.

[0072] For example, Figure 2 is a schematic diagram of the distributed audio and video production system provided in Embodiment 1, implemented through a two-level distributed architecture. As shown in Figure 2, the sensor 11 includes two first subsystems 111, the sensing center 12 includes three second subsystems 121, and the effector 13 includes two third subsystems 131. The first subsystems 111, the second subsystems 121, and the third subsystems 131 are all different. The first subsystems 111 are distributed across both the first and second levels; two of the second subsystems 121 are located at the first level, and the remaining one at the second level; the third subsystems 131 are distributed across both the first and second levels. It should be noted that the subsystems under the sensor 11, the sensing center 12, and the effector 13 can be the same or different. For example, the two first subsystems 111 can be the same device or different devices; the second subsystem 121 at the second level can be the same device as or a different device from one of the second subsystems 121 at the first level; and the two third subsystems 131 can be the same device or different devices.
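The two-level arrangement described for Figure 2 can be written down as a small data structure. This is only a sketch: the reference numerals and level placement follow the description, while the `-a`, `-b`, `-c` suffixes, the role labels and the helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Subsystem:
    name: str   # reference numeral plus a hypothetical suffix
    role: str   # "sensor", "sensing_center", or "effector"
    level: int  # 1 = first level, 2 = second level


# Placement taken from the Figure 2 description.
TOPOLOGY: List[Subsystem] = [
    Subsystem("111-a", "sensor", 1),
    Subsystem("111-b", "sensor", 2),
    Subsystem("121-a", "sensing_center", 1),
    Subsystem("121-b", "sensing_center", 1),
    Subsystem("121-c", "sensing_center", 2),
    Subsystem("131-a", "effector", 1),
    Subsystem("131-b", "effector", 2),
]


def count_by_role(topology: List[Subsystem]) -> Dict[str, int]:
    """Count how many subsystems belong to each top-level component."""
    counts: Dict[str, int] = {}
    for s in topology:
        counts[s.role] = counts.get(s.role, 0) + 1
    return counts


def names_at_level(topology: List[Subsystem], level: int) -> List[str]:
    """List the subsystems placed at one level, in declaration order."""
    return [s.name for s in topology if s.level == level]
```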

[0073] It should also be noted that this distributed audio and video production system can also be implemented through three-level, four-level, and five-level distributed architectures.

[0074] In addition, Figure 3 is a schematic diagram of the structure of the subsystem provided in Embodiment 1. As shown in Figure 3, each subsystem can be composed of three basic components: a sub-sensor 21, a sub-sensing center 22, and a sub-effector 23. Specifically, the sub-sensor 21 is used to acquire audio and video information sent by subsystems at any level other than its own, or by other subsystems at its own level; the sub-sensing center 22 is used to process the audio and video information and obtain feedback information; the sub-effector 23 is used to feed the feedback information back to other subsystems at its own level, or to output the feedback information to any subsystem at any level other than its own.
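The acquire, process and feed-back cycle of a single subsystem can be sketched as follows. The class and method names (`Message`, `sense`, `step`) are hypothetical, a string payload stands in for audio and video information, and the processing function is supplied by the caller because the text does not fix what the sub-sensing center computes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    source: str   # name of the sending subsystem
    payload: str  # stand-in for audio and video information


@dataclass
class Subsystem:
    name: str
    level: int
    process: Callable[[str], str]  # behaviour of the sub-sensing center
    inbox: List[Message] = field(default_factory=list)

    def sense(self, msg: Message) -> bool:
        """Sub-sensor: accept information from any subsystem except itself."""
        if msg.source == self.name:
            return False
        self.inbox.append(msg)
        return True

    def step(self, targets: List["Subsystem"]) -> int:
        """Process queued input, then let the sub-effector forward the feedback.

        Returns the number of feedback messages accepted by the targets.
        """
        delivered = 0
        for msg in self.inbox:
            feedback = Message(self.name, self.process(msg.payload))
            for target in targets:
                if target.sense(feedback):
                    delivered += 1
        self.inbox.clear()
        return delivered
```

Targets may sit at the same level or at another level; the only exclusion modelled here is that a subsystem does not feed its own output back to itself.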

[0075] In this embodiment, the distributed audio and video production system includes a sensor, a sensing center, and an effector. The sensor includes at least one first subsystem; the sensing center includes at least one second subsystem; and the effector includes at least one third subsystem. Any two of the first, second, and third subsystems are identical, or all three are identical, or all three are different. Each subsystem is distributed in the same or different levels and includes a sub-sensor, a sub-sensing center, and a sub-effector. The sub-sensor acquires audio and video information sent by subsystems at any level other than its own, or by other subsystems at its own level. The sub-sensing center processes the audio and video information to obtain feedback information. The sub-effector feeds the feedback information back to other subsystems at its own level, or outputs it to subsystems at any level other than its own. Compared with existing technologies that can only produce audio via the internet, i.e., web-based methods, this application adopts the concept of a metaverse acoustic system, which consists of at least three components: a sensor, a sensing center, and an effector. These components are distributed across multiple levels of the system so that audio and video data are produced across multiple levels according to environmental changes and adaptive needs. This makes the production of audio and video data more flexible, convenient, and comprehensive.

[0076] Figure 4 is a schematic diagram of a distributed audio and video production system provided in Embodiment 2 of this application. Based on Embodiment 1, the aforementioned levels may include one or more of the following, in any combination: at least one service level, at least one client level, and at least one playback level. Specifically, the service level includes at least one service subsystem; the client level includes at least one client subsystem; the playback level includes at least one playback subsystem; and the sensor, the sensing center, and the effector are all distributed across the service level, the client level, and / or the playback level.

[0077] In this embodiment, a three-level architecture is used as an example: the system includes at least a service level, a client level, and a playback level. Each level has at least one corresponding subsystem, and the sensor, the sensing center, and the effector are distributed across the service level, the client level, and / or the playback level. A specific example is shown in Figure 4.

[0078] It should be noted that Figure 4 is merely an example; the distributed audio and video production system can also have a two-level, four-level, or five-level architecture. For instance, in a four-level architecture, the system can include at least two service levels, one client level, and one playback level. The distributed audio and video production system is not limited to two-level, three-level, four-level, or five-level architectures, but can have more levels, and the levels are not limited to service, client, or playback levels, but can be of other types.

[0079] Optionally, a service subsystem can also consist of three components: a sub-sensor, a sub-sensing center, and a sub-effector. The sub-sensor may include at least one first server, the sub-sensing center may include at least one second server, and the sub-effector may include at least one third server. Any two of the first, second, and third servers may be identical, or all three may be identical, or all three may be different. For example, a sub-sensor may include a first server, a sub-sensing center may include a second server, and a sub-effector may include a third server, where the first and second servers are identical and the third server is different from both. That is, the first server is also the second server: the first server constitutes both the sub-sensor and the sub-sensing center, and the third server constitutes the sub-effector.

[0080] The client subsystem can also consist of three components: a sub-sensor, a sub-sensing center, and a sub-effector. The sub-sensor may include at least one first client, the sub-sensing center may include at least one second client, and the sub-effector may include at least one third client. Any two of the first, second, and third clients may be identical, or all three may be identical, or all three may be different. For example, the sub-sensor may include a first client, the sub-sensing center may include a second client, and the sub-effector may include a third client, where all three clients are identical; in this case, the first client constitutes the sub-sensor, the sub-sensing center, and the sub-effector.

[0081] The playback subsystem can also consist of three components: a sub-sensor, a sub-sensing center, and a sub-effector. The sub-sensor may include at least one first playback device, the sub-sensing center may include at least one second playback device, and the sub-effector may include at least one third playback device. Any two of the first, second, and third playback devices may be identical, or all three may be identical, or all three may be different. For example, the sub-sensor may include a first playback device, the sub-sensing center may include a second playback device, and the sub-effector may include a third playback device, where the first, second, and third playback devices are all different; that is, the first playback device constitutes the sub-sensor, the second playback device constitutes the sub-sensing center, and the third playback device constitutes the sub-effector.

[0082] Optionally, in one possible embodiment, the subsystem is configured to, upon determining that at least one subsystem it interacts with in the current layer or in other layers has failed and cannot interact with it, process the audio and video data itself and / or switch the interaction to one or more other subsystems in the same layer that have the same function as the failed subsystem.

[0083] Alternatively, upon determining that at least one subsystem it interacts with in the layer above or below its own layer is faulty or cannot interact with it, the subsystem may process the audio and video data itself and / or switch the interaction to one or more other subsystems in that upper or lower layer that have the same function as the faulty subsystem.

[0084] Optionally, in one possible embodiment, the server is configured to, upon determining that at least one server or subsystem at another level that it interacts with has failed and cannot interact with it, process the audio and video data itself and / or switch the interaction to one or more other servers with the same function as the failed server, or switch the interaction to a corresponding device at any level other than the level where the server is located.

[0085] And / or,

[0086] The client is configured to, upon determining that at least one client or subsystem at another level that it interacts with has failed and cannot interact with it, process the audio and video data itself and / or switch the interaction to one or more other clients with the same function as the failed client or subsystem.

[0087] And / or,

[0088] The playback device is configured to, upon determining that at least one playback device or subsystem at another level that it interacts with has malfunctioned and cannot interact with it, process the audio and video data itself and / or switch the interaction to one or more other playback devices with the same function as the failed playback device or subsystem.
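
The switching behavior described in [0082] to [0088] can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the `Node` class, its `function` and `healthy` fields, and the `pick_target` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A server, client, or playback device (hypothetical model)."""
    name: str
    function: str        # e.g. "data_processing" or "storage"
    healthy: bool = True


def pick_target(current: Node, peers: List[Node]) -> Optional[Node]:
    """Return the node to interact with.

    Keep the current partner while it is healthy. Otherwise switch to the
    first healthy peer with the same function. None means that no such peer
    exists and the caller should process the data itself.
    """
    if current.healthy:
        return current
    for peer in peers:
        if peer is not current and peer.healthy and peer.function == current.function:
            return peer
    return None


primary = Node("dp-1", "data_processing", healthy=False)
backup = Node("dp-2", "data_processing")
storage = Node("st-1", "storage")

target = pick_target(primary, [primary, storage, backup])
print(target.name if target else "process locally")
```

Returning `None` rather than raising an error keeps the "process the data itself" branch explicit at the call site.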

[0089] Optionally, in one possible embodiment, the service subsystem of this service level is configured to receive different types of data information from other subsystems within the same service level, or from subsystems at other levels. The service subsystem is also configured to perform corresponding processing based on the different types of data information. Furthermore, the service subsystem is configured to send the processed data to other subsystems within the same service level, or to subsystems at other levels.

[0090] Optionally, in one possible embodiment, the different types of data information include one or more combinations of the following:

[0091] Audio and video editing commands, audio and video editing collaboration commands, audio and video editing relay commands, file read and write commands, audio and video data, user information, and authentication results.

[0092] Optionally, in one possible embodiment, the service subsystem is specifically configured to perform one or more of the following processing methods based on different types of data information:

[0093] File management and processing, audio and video data editing and processing, user information management and processing, copyright processing of audio and video data, transaction processing of audio and video data, and interactive processing of audio and video data.
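
One way to picture the type-based processing of [0089] to [0093] is a dispatch table. The sketch below is illustrative only; the type tags (`"edit_command"`, `"file_rw_command"`, and so on), the handler functions, and the pairing of tags to handlers are assumptions and are not defined in this application.

```python
from typing import Callable, Dict


def manage_files(msg: dict) -> str:
    return f"file management: {msg['payload']}"


def edit_av(msg: dict) -> str:
    return f"audio/video editing: {msg['payload']}"


def manage_users(msg: dict) -> str:
    return f"user information management: {msg['payload']}"


# Hypothetical type tags; the pairing of tag and handler is an assumption.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "file_rw_command": manage_files,
    "edit_command": edit_av,
    "av_data": edit_av,
    "user_info": manage_users,
    "auth_result": manage_users,
}


def dispatch(msg: dict) -> str:
    """Route one message to the processing that matches its type."""
    try:
        handler = HANDLERS[msg["type"]]
    except KeyError:
        raise ValueError(f"unsupported data type: {msg.get('type')!r}") from None
    return handler(msg)


print(dispatch({"type": "edit_command", "payload": "trim clip 3"}))
```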

[0094] Optionally, in one possible embodiment, the client subsystem at the client level is used to obtain audio and video data from other client subsystems at the client level, or from subsystems at other levels other than the client level, and to operate on the audio and video data according to the obtained operation information.

[0095] The operational information includes one or more of the following combinations:

[0096] Information on audio and video data production, separation of audio and video data, collaborative production of audio and video data, relay production of audio and video data, display and processing of audio and video data, playback and processing of audio and video data, and reading and writing of audio and video data.

[0097] Optionally, in one possible embodiment, the playback subsystem of the playback level is used to obtain the audio and video data to be played from other playback subsystems of the playback level, or from subsystems of other levels other than the playback level, and to process the audio and video data to be played.

[0098] Optionally, in one possible embodiment, the service subsystem of the service level is configured to reallocate audio production tasks to the service subsystem of the service level and / or the subsystems of the other levels based on network status information, performance information from subsystems of other levels besides the service level, and / or the current working status of the subsystems of the other levels.

[0099] For example, this service subsystem can assign different computational tasks to clients based on network connection status, network bandwidth, client parameter information, and client computing power. For instance, if the service subsystem determines that the client's current network signal is poor or unstable, it will not assign audio production tasks or will assign a smaller audio production task to that client. Simultaneously, the service subsystem will continue to monitor the client's network status and, once the client's network status improves, will reassign audio production tasks or assign a larger audio production task to that client. Another example is that the service subsystem assigns corresponding audio production tasks based on client parameter information. If the service subsystem determines that the client's current performance is poor based on the parameter information, it will assign a smaller audio production task to the client; if the service subsystem determines that the client's current performance is good based on the parameter information, it will assign a larger audio production task to that client. Yet another example is that the service subsystem assigns audio production tasks to specific clients based on the division of labor among other terminals. For instance, if other clients have already produced dialogue and ambient sounds, the service subsystem will assign a specific audio production task to that specific client, such as an audio production task to enhance audio playback effects.
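
The allocation rule in [0099] can be sketched as below. The `ClientStatus` fields, the 0.0 to 1.0 performance scale, the 0.5 threshold, and the task sizes are all hypothetical values chosen for illustration; the application does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ClientStatus:
    name: str
    network_ok: bool     # False when the signal is poor or unstable
    performance: float   # 0.0 (weak) to 1.0 (strong), assumed scale


def task_size(status: ClientStatus, small: int = 2, large: int = 8) -> int:
    """Return the number of work units to assign; 0 means assign nothing."""
    if not status.network_ok:
        return 0
    return large if status.performance >= 0.5 else small


clients = [
    ClientStatus("tablet", network_ok=False, performance=0.9),
    ClientStatus("laptop", network_ok=True, performance=0.3),
    ClientStatus("workstation", network_ok=True, performance=0.9),
]
for c in clients:
    print(c.name, task_size(c))
```

Calling `task_size` again after a client's status changes models the continued monitoring described above: once `network_ok` becomes true, the client receives work again.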

[0100] Optionally, in one possible embodiment, the client-level client subsystem is used to configure and process the audio and video data, and output the processed audio and video data to the playback device in the playback subsystem of the playback level corresponding to the type of the processed audio and video data.

[0101] For example, many sound cards now support different types of output, such as analog output and digital output; digital output is further divided into several types, such as MADI, AES, ADAT, and S/PDIF. Some sound cards support multiple output types simultaneously (for example, 8 AES outputs + 2 analog outputs + 6 ADAT outputs), so the client can configure the output routing. For example, the client can send the 5.1.2 monitoring data it has computed, in PCM format, to the sound card's AES output port, while simultaneously downmixing the 5.1.2 signal to 2.0 PCM and sending it to the sound card's analog output port.
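
The routing in [0101] can be sketched for a single PCM frame as follows. The channel order, the -3 dB gain applied to the center, surround, and height channels, and the omission of the LFE channel from the stereo mix are assumptions made for this example; the application does not specify downmix coefficients, and the port names `"aes"` and `"analog"` are hypothetical labels.

```python
import math
from typing import Dict, List, Sequence

# Assumed 5.1.2 channel order for this sketch.
CHANNELS = ["L", "R", "C", "LFE", "Ls", "Rs", "Ltf", "Rtf"]
G = 1 / math.sqrt(2)  # about -3 dB


def downmix_to_stereo(frame: Sequence[float]) -> List[float]:
    """Fold one 5.1.2 frame into 2.0 (LFE is dropped in this sketch)."""
    ch = dict(zip(CHANNELS, frame))
    left = ch["L"] + G * (ch["C"] + ch["Ls"] + ch["Ltf"])
    right = ch["R"] + G * (ch["C"] + ch["Rs"] + ch["Rtf"])
    return [left, right]


def route(frame: Sequence[float]) -> Dict[str, List[float]]:
    """Send the full frame to the AES port and the downmix to the analog port."""
    if len(frame) != len(CHANNELS):
        raise ValueError(f"expected {len(CHANNELS)} samples, got {len(frame)}")
    return {"aes": list(frame), "analog": downmix_to_stereo(frame)}


outputs = route([0.5, 0.25, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0])
print(outputs["analog"])
```

A production downmix would normally also guard against clipping, which this sketch omits.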

[0102] Figure 5 is a schematic diagram of the service subsystem in the service layer of a distributed audio and video production system provided in Embodiment 3 of this application. Based on Embodiment 2 above, as shown in Figure 5, the service subsystem 31 includes a sub-sensor 311, a sub-sensing center 312, and a sub-effector 313; wherein the sub-sensor 311 includes a first server 3111 and a first server 3112; the sub-sensing center 312 includes the first server 3112 (which also serves as a second server) and a second server 3122; and the sub-effector 313 includes a third server 3131 and a third server 3132.

[0103] In this embodiment, there can be multiple service layers; this embodiment uses one service layer as an example for illustration. When multiple service layers exist, the service layers can also be merged, decomposed, extended in the same category, or extended in different categories according to deployment, business volume, process, and processing capacity requirements. For example, two service layers can be merged into one, or one service layer can be decomposed into two service layers, or at least one service layer can be extended in the same category based on a service layer, or at least one service layer can be extended in different categories based on a service layer.

[0104] Furthermore, the various service subsystems at the service level can be merged, decomposed, expanded in the same category, or expanded in different categories based on requirements such as deployment, business volume, processes, and processing capacity. For example, when the business volume in a certain region is small, the service subsystems deployed in that region can be merged or reduced.

[0105] Similarly, servers at the service level can be merged, decomposed, extended in the same category, or extended in different categories based on deployment, workload, process, and processing capacity requirements. For example, audio / video data storage servers and data processing servers can be merged to form a single processing and storage server, which then interacts with other servers, service subsystems, and / or client subsystems at the client level according to the original structure. Alternatively, a data processing server can be decomposed into codec servers, AI servers, and audio effect algorithm servers. As an example of same-category extension, user data storage servers can be extended so that user passwords, production projects, and audio / video data are managed separately, with the extended servers interconnected according to established data protocols; Figure 6 illustrates the connection structure of such a same-category extension of a user data storage server. As an example of different-category extension, a community forum server for user communication can be added. This extended community forum server could interact with an audio / video data storage server for user login, audio / video data sharing, and transactions; or interact with the internet for sharing and downloading network resources; or interact with a transaction processing server for selling and buying audio / video data.

[0106] Furthermore, when the service subsystem includes an AI processing server, this subsystem possesses self-learning capabilities, significantly accelerating response time and improving audio and video production efficiency. For example, the service layer can include an audio and video data storage server, a data processing server, and an AI processing server. While the data processing server reads and performs other operations on the audio and video data, and the audio and video data storage server stores the data, the AI processing server records the types of instructions received from the data processing server and the audio and video data storage server, as well as the processing procedures. This includes details such as the types of instructions received, the order of instructions, the steps involved in the audio and video production process, and the processing frequency. The AI processing server uses this recorded information for training, enabling it to mimic the client-level client subsystem or the client within the client subsystem, sending instructions to the audio and video data storage server and / or the data processing server to automatically complete the audio and video data production.
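
As a rough illustration of how recorded instruction order and frequency could be used, the sketch below counts which instruction most often follows another and uses that count to suggest the next step. This simple frequency model is a stand-in chosen for the example; the application does not state which learning method the AI processing server uses, and the class name, method names, and instruction names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


class InstructionModel:
    """Counts instruction-to-instruction transitions seen in past sessions."""

    def __init__(self) -> None:
        self._next: Dict[str, Counter] = defaultdict(Counter)

    def record(self, session: List[str]) -> None:
        """Record the order of instructions in one production session."""
        for current, following in zip(session, session[1:]):
            self._next[current][following] += 1

    def predict(self, current: str) -> Optional[str]:
        """Return the most frequent follower of `current`, or None if unseen."""
        followers = self._next.get(current)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


model = InstructionModel()
model.record(["import", "trim", "mix", "export"])
model.record(["import", "trim", "export"])
model.record(["import", "gain", "mix"])
print(model.predict("import"))
```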

[0107] For example, an AI processing server can be connected to an encoding processing server to capture patterns in encoding parameters and continuously train and learn, and then the learning results can be transmitted to the encoding processing server to optimize encoding efficiency.

[0108] Furthermore, the service subsystems also possess self-healing capabilities, meaning they can still function normally even when at least one service subsystem is missing from the service hierarchy. For example, a service subsystem might have three data processing servers for distributed computing processing of audio and video data. If two of these data processing servers fail, all servers, service subsystems, and / or client subsystems interacting with those two servers can automatically redirect to the third data processing server. If this third data processing server also fails, the work being processed by the three servers can be transferred to the internet, data processing servers in other service subsystems, and / or client subsystems, thus ensuring the continued successful completion of audio and video production.
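
The fallback order in [0108] can be sketched as an ordered list of tiers that is searched until a healthy server is found. The tier names, server names, and the `choose_server` function are hypothetical and serve only to illustrate the example above.

```python
from typing import Dict, List, Optional, Tuple

Tier = Tuple[str, Dict[str, bool]]  # (tier name, {server name: healthy?})


def choose_server(tiers: List[Tier]) -> Optional[Tuple[str, str]]:
    """Return (tier, server) for the first healthy server in tier order."""
    for tier_name, servers in tiers:
        for server, healthy in servers.items():
            if healthy:
                return tier_name, server
    return None


tiers: List[Tier] = [
    ("local_subsystem", {"dp-1": False, "dp-2": False, "dp-3": True}),
    ("other_service_subsystem", {"dp-9": True}),
    ("client_subsystem", {"client-a": True}),
]
print(choose_server(tiers))          # two local servers down: use the third

tiers[0][1]["dp-3"] = False          # the third local server also fails
print(choose_server(tiers))          # work moves to another subsystem
```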

[0109] Furthermore, in this embodiment, the aforementioned servers include, but are not limited to: audio and video data storage servers, data processing servers, user data storage servers, transaction processing servers, copyright management servers, community forum servers, encoding and decoding servers, AI processing servers, and sound effect algorithm servers.

[0110] In addition, for example, when the server is an audio and video data storage server, one specific implementation of Embodiment 3 is as follows:

[0111] When the first server 3111 and / or the first server 3112 in the sub-sensor 311 is an audio and video data storage server, the first server 3111 and / or the first server 3112 can receive different types of data from the client subsystem in the client layer or from the server in any service subsystem in the service layer, such as: audio and video operation instructions, file read and write instructions, audio and video data, user information, authentication results, etc.

[0112] Furthermore, the first server 3111 and / or the first server 3112 can further manage the input connections, that is, manage the source, size, and / or type of the received audio and video data. In this way, the first server 3111 and / or the first server 3112 can perform different transmission processes on the received audio and video data based on its source, size, and / or type. For example, under conditions of high traffic and high concurrency, the first server 3111 and / or the first server 3112 can send the received audio and video data to the sub-sensing center capable of processing it, based on its source, size, and / or type.
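
The input management in [0112] can be pictured as selecting, for each received item, a sub-sensing center that accepts its type and has room for its size. The `Center` class, its `capacity` metric, and the "most free capacity" tie-break are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Center:
    """A sub-sensing center that can process certain data types (hypothetical)."""
    name: str
    accepts: Set[str]
    capacity: int  # remaining bytes it can take, an assumed metric


def route_input(data_type: str, size: int, centers: List[Center]) -> Optional[Center]:
    """Pick a center that accepts the type and has room; prefer the least loaded."""
    candidates = [c for c in centers if data_type in c.accepts and c.capacity >= size]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.capacity)
    best.capacity -= size
    return best


centers = [
    Center("center-a", {"av_data"}, capacity=100),
    Center("center-b", {"av_data", "command"}, capacity=500),
]
chosen = route_input("av_data", 200, centers)
print(chosen.name if chosen else "no center available")
```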

[0113] And / or,

[0114] When the first server 3112 (i.e., the second server) and the second server 3122 included in the sub-sensing center 312 are audio and video data storage servers, the first server 3112 (i.e., the second server) and the second server 3122 are used to manage and maintain different types of received data. For example, they perform management operations on the received audio and video data, such as adding / deleting / modifying the audio and video data, adding / modifying / deleting network links for the stored audio and video data, managing user information identity, and managing user information permissions.

[0115] And / or,

[0116] When the third server 3131 and the third server 3132 included in the sub-effector 313 are audio and video data storage servers, the third server 3131 and the third server 3132 can output data to at least one client subsystem at the client level, or to at least one server in at least one service subsystem at the service level.

[0117] Furthermore, the third servers 3131 and 3132 can further manage the output queue, that is, output the output data selectively based on the destination, data size, and / or data type of the output data. For example, under conditions of high business volume and high concurrency, the third servers 3131 and 3132 can output the output data to at least one client subsystem at the appropriate client level, or at least one server in at least one service subsystem at the service level, according to the destination, data size, and / or data type of the output data.
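
A minimal sketch of the output queue in [0117] follows. It orders pending outputs by data type and then by size using a heap. The priority table (commands before bulk audio and video data) and the "smaller first" rule are assumptions for illustration; the application only states that output is selective by destination, size, and type.

```python
import heapq
from itertools import count
from typing import List, Tuple

# Assumed priorities: lower number is sent first.
TYPE_PRIORITY = {"command": 0, "auth_result": 1, "av_data": 2}


class OutputQueue:
    """Priority queue of pending outputs (hypothetical helper)."""

    def __init__(self) -> None:
        self._heap: List[tuple] = []
        self._seq = count()  # tie-breaker that preserves insertion order

    def put(self, destination: str, data_type: str, size: int) -> None:
        priority = TYPE_PRIORITY.get(data_type, len(TYPE_PRIORITY))
        heapq.heappush(
            self._heap, (priority, size, next(self._seq), destination, data_type)
        )

    def get(self) -> Tuple[str, str, int]:
        """Pop the next output as (destination, data_type, size)."""
        _, size, _, destination, data_type = heapq.heappop(self._heap)
        return destination, data_type, size

    def __len__(self) -> int:
        return len(self._heap)


queue = OutputQueue()
queue.put("client-a", "av_data", 500)
queue.put("server-b", "command", 10)
queue.put("client-c", "av_data", 100)
while queue:
    print(queue.get())
```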

[0118] When the server is a data processing server, another specific implementation of Embodiment 3 is as follows:

[0119] When the first server 3111 and / or the first server 3112 in the sub-sensor 311 is a data processing server, the first server 3111 and / or the first server 3112 can receive different types of data from the client subsystem in the client layer or any service subsystem in the service layer, such as audio and video operation instructions, processed audio and video data and / or unprocessed audio and video data.

[0120] And / or,

[0121] When the first server 3112 (i.e., the second server) and the second server 3122 included in the sub-sensing center 312 are data processing servers, the first server 3112 (i.e., the second server) and the second server 3122 are used to process audio and video data, such as encoding, decoding, mixing, sound effects processing, AI processing, and / or spatial rendering of audio and video data. Alternatively, the first server 3112 (i.e., the second server) and the second server 3122 are also used to forward data to the sub-effector 313, and the forwarded data can be audio and video operation instructions, file read and write instructions, etc.

[0122] And / or,

[0123] When the third server 3131 and the third server 3132 included in the sub-effector 313 are data processing servers, the third server 3131 and the third server 3132 can output data to at least one client subsystem at the client level or at least one server in at least one service subsystem at the service level. For example, they can send unprocessed audio and video data to other data processing servers at the service level, send file read and write instructions to user data storage servers at the service level, send processed audio and video data to client subsystems at the client level, and send data without registered copyright to copyright management servers at the service level.

[0124] When the server is a copyright management server, another specific implementation of Embodiment 3 is as follows:

[0125] The copyright management server can interact with at least one server in at least one service subsystem of the service tier. For example, it can interact with an audio / video storage server in the service tier to perform copyright verification and / or registration of audio / video data. Alternatively, it can interact with the Internet to perform copyright verification on network data. Or, it can interact with a user data storage server in the service tier to query and determine copyright information owned by a user.

[0126] When the server is a transaction processing server, another specific implementation of Embodiment 3 is as follows:

[0127] The transaction processing server can interact with at least one server in at least one service subsystem of the service tier. For example, it can interact with an audio / video storage server in the service tier to process transactions involving audio / video data. Alternatively, it can interact with a user data storage server in the service tier to query and determine the audio / video data and assets held by a user, as well as information such as recharges and point redemptions. Or, it can interact with the internet to process network data transactions.

[0128] When the server is a community forum server, another specific implementation of Embodiment 3 is as follows:

[0129] The community forum server can interact with at least one server in at least one service subsystem of the service tier. For example, it can interact with an audio / video storage server in the service tier to share and download audio / video data. Alternatively, it can interact with a user data storage server in the service tier to handle user login, posting, and online chat interactions.

[0130] It should also be noted that the Internet can interact with at least one server in at least one service subsystem within the service layer. For example, the Internet can interact with copyright management servers to verify the copyright of online data. Alternatively, the Internet can interact with data processing servers to directly process online resources. Or, the Internet can interact with audio and video data storage servers to store online resources.

[0131] Figure 7 is a schematic diagram of the client subsystem in the client layer of a distributed audio and video production system provided in Embodiment 4 of this application. Based on Embodiment 2 or Embodiment 3 above, as shown in Figure 7, the client subsystem 41 includes a sub-sensor 411, a sub-sensing center 412, and a sub-effector 413; wherein the sub-sensor 411 includes a first client 4111 and a first client 4112; the sub-sensing center 412 includes the first client 4112 (which also serves as a second client) and a second client 4122; and the sub-effector 413 includes the second client 4122 and a third client 4131.

[0132] In this embodiment, there can be multiple client layers; this embodiment uses one client layer as an example for explanation. Each client subsystem at the client layer, including its sub-sensor, sub-sensing center, and sub-effector, can be composed of one or more clients. Multiple clients can also jointly constitute a sub-sensor and a sub-sensing center, or a sub-sensor and a sub-effector, or a sub-sensing center and a sub-effector, or a sub-sensor, a sub-sensing center, and a sub-effector.

[0133] Additionally, the client can consist of one or more of the following: laptops, desktop computers, display devices, tablets, audio / video editing devices, and mobile devices. Alternatively, the client can consist of one or more of the following units: a user operation unit, an interface display unit, a data storage unit, a data processing unit, an audio / video editing and production unit, an audio / video playback unit, a codec processing unit, a rendering processing unit, an AI processing unit, and a packaging unit. Or, the client can be composed of any combination of the above-mentioned clients and units.

[0134] It should be noted that data storage units and data processing units can be merged to form a data storage and processing unit. The client subsystem comprising the data storage and processing unit can then be called a storage processing subsystem, continuing to interact with other client subsystems according to its original structure. Similarly, user operation units and interface display units can also be merged. The client subsystem comprising the user operation unit and interface display unit can form a graphical interface and perform audio editing, playback, etc. Furthermore, clients can be merged as needed to form a single, unified client.

[0135] In addition, the data processing unit can be decomposed into multiple units that each implement a specific function, such as a codec processing unit, a rendering processing unit, and an AI processing unit. The client subsystems formed by clients composed of these units can then each implement a specific function, and the client subsystems can also interact with each other. For example, a compressed audio stream can be decoded, rendered, encoded, and encapsulated in sequence by different client subsystems.
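
The sequential processing in the example above can be sketched as a chain of stages, each standing in for one client subsystem. The stage functions here are placeholders that only record their name; real decoding, rendering, encoding, and encapsulation are outside the scope of this sketch, and `make_stage` and `run_pipeline` are hypothetical names.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[dict], dict]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that appends its name to the stream history."""
    def stage(stream: dict) -> dict:
        return {**stream, "history": stream["history"] + [name]}
    return stage


# One stage per client subsystem, in the order given in the text.
PIPELINE: List[Stage] = [
    make_stage(name) for name in ("decode", "render", "encode", "encapsulate")
]


def run_pipeline(stream: dict, stages: List[Stage]) -> dict:
    """Pass the stream through every stage in order."""
    return reduce(lambda current, stage: stage(current), stages, stream)


result = run_pipeline({"id": "stream-1", "history": []}, PIPELINE)
print(result["history"])
```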

[0136] Furthermore, client subsystems can be of the same type and can be interconnected, or they can connect with other types of subsystems. For example, three client subsystems, each including a data storage unit, can store local files, SD card files, and USB flash drive files respectively. These three subsystems can interact with other types of subsystems, such as a client subsystem with a user operation unit (which changes the content and structure of files according to operation instructions) or a client subsystem with a data processing unit (which processes the file content).

[0137] Furthermore, one or more perceptual structural models and other functional clients can be added to the client subsystem, interacting with one or more subsystems of the existing system according to a specific connection method. For example, adding a modeling subsystem to capture and analyze the user's ear structure for more accurate sound processing and playback can allow this subsystem to interact with the client subsystem, which has a data storage unit, to input and save the user's model data; or it can interact with the client subsystem, which has a data processing unit, to utilize the model data when processing audio.

[0138] Furthermore, the client subsystem also possesses self-healing capabilities: even when one or more specific subsystems are missing, the client system can still perform all or part of its functions. For instance, a client subsystem might consist of two clients, each with a user operation unit and an interface display unit, one client used for editing and the other for playback. If the user operation unit in one client malfunctions, the user operation unit in the other client can still interact with the interface display unit to complete the production process. If one client is completely unusable, the other client can still perform its own tasks, such as listening without editing or editing without listening.

[0139] Finally, client subsystems can also be divided into different levels of client subsystems based on features such as functional attributes and production processes. For example, client subsystems with user operation units and client subsystems with interface display units have a strict execution order, so they can be divided into two different client levels.

[0140] Furthermore, this embodiment will describe the composition and functions of the client in more detail, as well as the functions of the sub-sensors, sub-sensing centers, and sub-effectors:

[0141] Optionally, the client can consist of one or more user operation units. In other words, user operation units can constitute sub-sensors, sub-sensing centers, and / or sub-effectors for performing specific client operations during the audio production process, including adding / deleting / modifying audio files, adjusting gain, mounting plugins, dragging audio object tracks, and playing audio data.

[0142] For example, as the initiator of audio editing actions, when the sub-sensor includes a user operation unit, its function is to generate user operation data and send it to the sub-sensing center. For instance, it senses screen touch, identifies user operation data, and sends the identified data to the sub-sensing center. Additionally, the sub-sensor can interact with other clients composed of user operation units, forwarding the processing results from those clients to the sub-sensing center. For example, when multi-finger dragging on the screen is inconvenient, a physical device such as a joystick can be used instead to perform the dragging, with the resulting user operation data identified and forwarded to the sub-sensing center.

[0143] And / or,

[0144] When the sub-sensing center includes a user operation unit, after receiving the operation data, the sub-sensing center processes the operation data (for example, through internal hardware circuit and algorithm processing or internal software instruction format conversion), generates operation results, and sends them to the sub-effector.

[0145] And / or,

[0146] When a sub-effector includes a user operation unit, the operation results received by the sub-effector are processed into operation instructions and sent to other subsystems. Since the sub-effector can interact directly with other subsystems, and the types of subsystems it interacts with can be various, the type of user operation unit included in the sub-effector will also differ. This difference can manifest in, but is not limited to, differences in data type, interface form, protocol type, and other processing methods. For example, when sending an operation result from a joystick to a PC, the sub-effector of the user operation unit (i.e., the joystick) must convert the operation result into operation instructions using the Bluetooth or USB protocol. As another example, when sending an operation result from a mobile phone to a server, the mobile phone acts as the user operation unit, and its sub-effector converts the operation result into operation instructions using the TCP protocol.
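
The protocol conversion performed by the sub-effector in [0146] can be sketched as a lookup from the source and destination pair to the transports named in the text. The dictionary layout, the `available` parameter, and the instruction format are hypothetical; only the pairings (joystick to PC over Bluetooth or USB, mobile phone to server over TCP) come from the examples above.

```python
from typing import Dict, List, Tuple

# Transports taken from the examples in the text.
TRANSPORTS: Dict[Tuple[str, str], List[str]] = {
    ("joystick", "pc"): ["bluetooth", "usb"],
    ("phone", "server"): ["tcp"],
}


def to_instruction(result: dict, source: str, destination: str,
                   available: List[str]) -> dict:
    """Wrap an operation result as an instruction for the first usable transport."""
    for transport in TRANSPORTS.get((source, destination), []):
        if transport in available:
            return {"transport": transport, "payload": result}
    raise ValueError(f"no usable transport from {source} to {destination}")


drag = {"action": "drag_track", "dx": 12, "dy": -3}
print(to_instruction(drag, "joystick", "pc", available=["usb"]))
print(to_instruction(drag, "phone", "server", available=["tcp"]))
```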

[0153] Optionally, the client can consist of one or more data storage units. In other words, data storage units can form sub-sensors, sub-sensing centers, and / or sub-effectors, used to store all relevant information during the production process and all data that may be used for audio production on the client, including current project information, audio data in the project, and other locally stored audio files. This data storage unit can be, for example, the internal storage of a phone or an SD card. Moreover, data storage units can interact with each other for data transfer.

[0154] When a sub-sensor includes a data storage unit, it can receive data from one or more subsystems (such as service subsystems, client subsystems, and playback subsystems), or more specifically, from user operation units, interface display units, and other data storage units. Furthermore, the sub-sensor needs to manage the received data, for example, according to dimensions such as data source, data size, and data type, so that the data can be sent to the sub-sensing center in a targeted manner. Moreover, because the sub-sensor can directly interact with other subsystems, it can receive different types of data, such as audio operation commands, file read / write commands, audio data, and spatial location information of playback devices.

[0155] And / or,

[0156] When the sub-sensing center includes a data storage unit, after processing by the preceding input queue, the sub-sensing center performs file management operations based on the received data type, such as adding / deleting / modifying files, adding / modifying / deleting configuration tables for stored files, etc.

[0157] And / or,

[0158] When a sub-effector includes a data storage unit, it can send data to one or more subsystems (such as a service subsystem, playback subsystem, other client subsystems, etc.), or, more specifically, to user operation units, interface display units, and other data storage units. The sub-effector can therefore apply different processing depending on the destination, data size, data type, protocol type, and other attributes. Because the sub-effector also interacts directly with other subsystems, it can send different types of data to different types of subsystems; for example, audio data to a data processing unit, file organization structure data to an interface display unit, and audio data to a playback device.
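The data management step in [0154] (organizing received data by source, size, and type before handing it to the sub-sensing center) could be sketched as follows. The item fields `source`, `data_type`, and `size`, and both function names, are assumptions made for illustration and are not defined by the specification.

```python
from collections import defaultdict


def group_received(items):
    """Group received items by (source, data_type) so each group can be forwarded separately."""
    groups = defaultdict(list)
    for item in items:
        groups[(item["source"], item["data_type"])].append(item)
    return dict(groups)


def total_size(group):
    """Sum the sizes (in bytes) of all items in one group."""
    return sum(item["size"] for item in group)
```

For instance, audio data arriving from a service subsystem and file read / write commands arriving from a user operation unit would end up in separate groups.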

[0159] Optionally, the client can consist of one or more data processing units. In other words, data processing units can form sub-sensors, sub-sensing centers, and / or sub-effectors that perform calculations and processing on the received data and then send the processed data to other subsystems, or that forward the data directly to other subsystems. Data processing units can interact with each other to transmit data.

[0160] Additionally, it should be noted that the data processing unit can be an integration of one or more of the following: audio and video editing and production unit, encoding and decoding processing unit, rendering processing unit, AI processing unit, and packaging unit.

[0161] When a sub-sensor includes a data processing unit, it can interact directly with other subsystems and receive different types of data, such as audio operation commands, processed audio data, and unprocessed audio data.

[0162] And / or,

[0163] When the sub-sensing center includes a data processing unit, it can process the data, for example by encoding, decoding, AI processing, or virtual rendering of audio and video data, to form processed data. Alternatively, it can forward data such as audio operation commands and text reading / writing commands directly to the sub-effector.

[0164] And / or,

[0165] When a sub-effector includes a data processing unit, it can interact directly with other subsystems and send different types of data to different types of subsystems. For example, it can send unprocessed audio and video data to other clients that include data processing units, send file read and write commands to clients that include data storage units, send processed audio and video data to playback devices, and submit audio data to service subsystems.
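The process-or-forward behavior of a data processing unit described in [0159] and [0163] might look like the sketch below. The type labels in `PROCESSABLE`, the message layout, and the `handle` function are hypothetical; the `process` argument stands in for any real operation such as encoding or rendering.

```python
# Hypothetical type labels for payloads that are processed rather than forwarded.
PROCESSABLE = {"audio_data", "video_data"}


def handle(message: dict, process) -> dict:
    """Process media payloads with `process`; forward everything else unchanged."""
    if message["type"] in PROCESSABLE:
        return {**message, "payload": process(message["payload"]), "processed": True}
    # Commands (e.g. audio operation commands) are passed straight through.
    return {**message, "processed": False}
```

The returned message carries a `processed` flag so that a downstream sub-effector can tell processed data from forwarded data.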

[0166] Figure 8 is a schematic diagram of the playback subsystem in the playback layer of a distributed audio and video production system provided in Embodiment 5 of this application. Based on any of Embodiments 2 to 4 above, and as shown in Figure 8, the playback subsystem 51 includes a sub-sensor 511, a sub-sensing center 512, and a sub-effector 513; the sub-sensor 511 includes a first playback device 5111 and a first playback device 5112; the sub-sensing center 512 includes a second playback device 5121 and a second playback device 5122; and the sub-effector 513 includes a third playback device 5131 and a third playback device 5132.

[0167] In this embodiment, the playback device comprises one or more of the following playback devices: a television, a sound card, a soundbar, a speaker array, headphones, and a cinema processor.

[0168] The sub-sensor 511 in the playback subsystem can receive audio data to be played from the client subsystem (such as regular playback operations on mobile phones, TVs, etc.), or from other playback subsystems or playback devices (such as a TV connected to a soundbar via HDMI). The same playback device can have one or more input methods; for example, a sound card can receive AES, DANTE, MADI, and other audio signals, while a cinema processor can receive analog, BNC, SPDIF, and other audio signals. Because the sub-sensor may receive data from different types of subsystems, and the data types may also differ, the received data must be protocol-parsed into processable audio data before being sent to the sub-sensing center.

[0169] The sub-sensing center 512 in the playback subsystem can process audio data, for example through DSP processing, equalization, delay, AD conversion, DA conversion, and even sound effect processing and encoding / decoding, and sends the processed audio data to the sub-effector.

[0170] The sub-effector 513 in the playback subsystem can play the processed audio data directly, for example on a mobile phone or tablet, or directly through a speaker or headphones; or it can send the processed audio data to other playback devices for final playback, such as casting a mobile phone screen to a TV, connecting a computer to a sound card, or connecting a computer to a soundbar. When sending data to other playback devices, the necessary protocol encapsulation must be performed according to the specific interface between the devices (wired or wireless, digital or analog, etc.).
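Two of the processing steps listed in [0169], gain adjustment and delay, are simple enough to sketch directly. The functions below operate on a plain list of floating-point samples; the function names and the sample representation are assumptions, and a real playback device would typically do this in dedicated DSP hardware or firmware.

```python
def apply_gain(samples, gain_db):
    """Scale samples by a gain given in decibels (amplitude factor = 10^(dB/20))."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def apply_delay(samples, delay_samples):
    """Delay a signal by prepending `delay_samples` samples of silence."""
    return [0.0] * delay_samples + list(samples)
```

Chaining the two, `apply_delay(apply_gain(samples, -6.0), 48)` would attenuate a signal by 6 dB and delay it by 48 samples (1 ms at a 48 kHz sample rate).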

[0171] In this embodiment, it should also be noted that multiple playback devices can be combined into one, such as a sound card and a cinema processor. Specifically, due to interface limitations, the cinema processor can only receive a limited type of audio data; for example, it can only receive analog audio data. The sound card, however, can convert the audio data type. For instance, it can convert AES audio data into analog audio data, allowing the cinema processor to receive the converted audio data. Therefore, through the sound card's type conversion, the cinema processor can receive and play any type of audio data.
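The combination described in [0171] amounts to finding a chain of converting devices between a source format and a sink device. A minimal sketch using breadth-first search follows; the tables `CONVERSIONS` and `ACCEPTS` and the function `route` are hypothetical, and the single AES-to-analog entry mirrors only the example given in the text.

```python
from collections import deque

# Hypothetical capability tables, for illustration only.
CONVERSIONS = {"sound_card": [("AES", "analog")]}   # device -> (input format, output format)
ACCEPTS = {"cinema_processor": {"analog"}}          # sink device -> accepted input formats


def route(source_format, sink):
    """Return the device chain that lets `sink` receive `source_format`, or None."""
    if source_format in ACCEPTS[sink]:
        return [sink]
    queue = deque([(source_format, [])])
    seen = {source_format}
    while queue:
        fmt, path = queue.popleft()
        for device, pairs in CONVERSIONS.items():
            for src, dst in pairs:
                if src == fmt and dst not in seen:
                    new_path = path + [device]
                    if dst in ACCEPTS[sink]:
                        return new_path + [sink]
                    seen.add(dst)
                    queue.append((dst, new_path))
    return None
```

With these tables, AES audio is routed through the sound card to the cinema processor, while analog audio reaches the cinema processor directly.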

[0172] In addition, a playback device can be broken down into multiple playback devices, each of which can perform its own playback function. For example, a playback device consisting of a TV and a soundbar can be broken down into separate devices to play audio through the TV or the soundbar. The separate playback devices can also interact with other devices or with the client subsystem. For instance, the detached soundbar can be connected to a computer to directly play the results created by the client subsystem.

[0173] Furthermore, multiple playback devices can connect to each other, as well as to one or more client subsystems in the client layer. For example, three sound cards can be connected in sequence if their interfaces match, and each sound card can output its own audio data; at the same time, each sound card can connect to a computer (i.e., a client subsystem with a data processing unit and a data storage unit) for audio data transmission, sound card driver configuration, and other operations.

[0174] Furthermore, playback levels can be divided into different levels based on the connection order of the playback devices. For example, the playback levels can be divided into two levels. The level with switching functionality can be classified as the upper level. After receiving audio data sent by the client subsystem, the playback device in the playback subsystem of this level can transfer the audio data to other devices for playback, such as sound cards and televisions. The level without switching functionality can be classified as the lower level. After receiving audio data, the playback device in the playback subsystem of this level can play it directly, such as speaker arrays, soundbars, and headphones.
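The two-level split in [0174] can be sketched by treating a device as upper level when it passes audio on to at least one other device. Using downstream connections as a proxy for "switching functionality" is an assumption of this sketch, as are the function name and the connection table format.

```python
def classify_levels(devices, connections):
    """Label each device 'upper' if it forwards audio to other devices, else 'lower'.

    `connections` maps a device name to the list of devices it sends audio to.
    """
    return {
        device: "upper" if connections.get(device) else "lower"
        for device in devices
    }
```

For example, a sound card feeding a speaker array would be classified as upper level and the speaker array as lower level.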

[0175] Figures 9 to 11 are schematic diagrams illustrating the specific structure of the service layer, client layer, and playback layer in a distributed audio and video production system provided in Embodiment 6 of this application.

[0176] In this embodiment, Figure 9 shows a specific implementation of the service level. As shown in Figure 9, the production process revolves around a service subsystem with an audio and video data storage server. In other words, other servers or other service subsystems need to return all or part of their processing results to the service subsystem with the audio and video data storage server.

[0177] Additionally, it should be noted that if the aforementioned service level and its service subsystems satisfy characteristics such as scalability and self-organization, then the hub in this embodiment (i.e., the service subsystem formed by the audio and video data storage server) can be replaced with a service subsystem consisting of a data processing server, or with a client subsystem that has processing capabilities; or it can be connected to such a service subsystem or client subsystem, without affecting the production process.

[0178] More specifically, this embodiment takes a service level as an example. The service level may include at least one service subsystem. Each service subsystem includes a sub-sensor, a sub-sensing center, and a sub-effector. These sub-sensors, sub-sensing centers, and sub-effectors may consist of 3 audio and video data storage servers, 4 data processing servers, and 4 other servers.

[0179] For data storage servers, the specific functions include, but are not limited to, the following:

[0180] The first function is to receive audio commands and / or audio data from the client subsystem. An audio command is user operation data sent by the client subsystem, such as audio cutting, audio object position editing, and gain adjustment in the Editor tool, or OSD control, user login, and project saving in the Player tool. Audio data is data that is directly audible (including audible after decoding) or indirectly audible (audible only when combined with other data), such as traditional channel data (2.0 AAC, 5.1 WAV, 7.1.4 WANOS, etc., directly audible), object metadata (spatial coordinates, rotation angles, etc., audible after being combined with channel data), and project data (start and end times, Mute / Solo, plugin mounting status, etc., audible after being combined with channel data).

[0181] The second function is to change the internal file organization structure. For example, if an audio command contains "file" related content, the internal file structure needs to be changed according to the command; for example, if the client drags the position of a certain audio track on the timeline during the editing process, the server needs to find the file corresponding to that audio track and modify the start and end times of that file in the configuration file.

[0182] The third function is to send internal data and / or audio commands from the client subsystem to data processing servers such as encoding / decoding, rendering, and separation servers. For example, if an audio command contains "processing" related content, then the command needs to be sent to the data processing server; if the client subsystem requests AI track separation of a stereo audio file to generate multiple mono audio files, then the command needs to be sent to the AI processing server.

[0183] The fourth function is to send audio commands to the project management server. For example, if the audio command contains content related to "project," then the command needs to be sent to the project management server; for instance, if the client subsystem wants to open a project under a certain user's name, or save the currently created project.

[0184] Fifth function: Receive and save processed data sent from the data processing server.

[0185] Sixth function: Save data sent by the client subsystem; for example, the client subsystem can upload audio files during the production process, and the data storage server can receive and save them.

[0186] Seventh function: Send the audio organization structure data to the project management server to save the project. Audio organization structure data refers to the audio data and its related configuration files, plugin file links, etc. Each production project must record this information in detail and update it in real time. It includes, for example, which files are in the project, which track each file is mounted on, its start and end times, whether Mute / Solo is enabled, and whether plugins are mounted.

[0187] Eighth function: Read projects from the project management server and receive audio organization structure data.
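The dispatch rules in the second to fourth functions above (file-related commands handled locally, processing-related commands sent to a data processing server, project-related commands sent to the project management server) can be summarized in a small routing table. The `category` field, the target labels, and the function name are hypothetical; the specification only says that a command "contains" file-, processing-, or project-related content.

```python
# Hypothetical category labels and routing targets.
ROUTES = {
    "file": "update_file_structure",           # second function: handled on the storage server
    "processing": "data_processing_server",    # third function
    "project": "project_management_server",    # fourth function
}


def route_command(command: dict) -> str:
    """Pick where a data storage server would send or handle an audio command."""
    category = command.get("category")
    if category not in ROUTES:
        raise ValueError(f"unrecognized command category: {category!r}")
    return ROUTES[category]
```

An AI track separation request, for instance, would be tagged as a processing command and routed to a data processing server.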

[0188] In addition, the aforementioned project management server, as a type of data server, has, but is not limited to, the following functions:

[0189] First function: Receive audio commands sent by the client subsystem.

[0190] The second function is to receive and save audio organization structure data. Commands related to "projects" are processed in a service subsystem with a project management server, which maintains the audio organization structure data corresponding to each project, such as saving / reading / adding / deleting projects.

[0191] The third function is to send audio organization structure data to the data storage server, so that the data storage server can locate the file locations in the project.

[0192] The fourth function is to send audio commands and / or project organization structure data to the user management server. Project organization structure data refers to the project information under a username, such as the number of projects, names, and attributes (game projects / film projects, etc.). Additionally, all commands related to the "user" must be sent to the user management server for processing. For example, if the client subsystem wants to synchronize the current project to a specific username, the project management server must send the current project organization structure data to the user management server.

[0193] Fifth function: Log in as a user from the user management server and obtain project organizational structure data.
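The audio organization structure data described in [0186] and maintained per project in [0190] could be modeled roughly as below. The class and field names are assumptions chosen to match the attributes the text lists (file, track, start and end times, Mute / Solo, plugins); `move_file` illustrates the timeline drag example from [0181].

```python
from dataclasses import dataclass, field


@dataclass
class TrackFile:
    path: str
    track: int
    start: float                 # start time on the timeline, in seconds
    end: float                   # end time on the timeline, in seconds
    mute: bool = False
    solo: bool = False
    plugins: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    files: dict = field(default_factory=dict)   # path -> TrackFile

    def add_file(self, entry: TrackFile) -> None:
        self.files[entry.path] = entry

    def move_file(self, path: str, new_start: float) -> None:
        """Shift a file on the timeline while keeping its duration."""
        entry = self.files[path]
        duration = entry.end - entry.start
        entry.start = new_start
        entry.end = new_start + duration
```

Dragging a four-second clip from 0 s to 2 s would then update its start and end times to 2 s and 6 s in the stored configuration.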

[0194] Furthermore, the aforementioned user management server, as a type of data server, has, but is not limited to, the following functions:

[0195] The first function is to receive audio commands from the client. Commands related to the "user" must be processed in a service subsystem with a user management server, such as logging a user in or out, or opening a project under that user's name.

[0196] The second function is to receive and save project organizational structure data. This function can be used when project organizational structure data needs to be synchronized to a user's account.

[0197] The third function is to send project organizational structure data to the project management server to open a project from the project list under a user's name.

[0198] Optionally, the data processing server may specifically include an audio encoding server. In this embodiment, its function is to encode audio data into a compressed bitstream and send it to a server for a specific purpose or return it to a data storage server. In this embodiment, the data source for the encoding system is the data storage server. In practice, depending on the specific usage scenario and the scalability of the system architecture, it can also interact with other servers. For example, it can obtain audio data from a network material library, encode it, and send it to the data storage server for storage. Alternatively, it can receive data from a decoding server and encode it, i.e., a transcoding process. Or, it can directly send the encoded data to the client subsystem for playback.

[0199] In addition, the data processing server can specifically include: an audio decoding server, an audio rendering server, and an AI audio separation server. These three can also interact with other servers according to the scalability of the system architecture. For example, directly performing AI processing on audio data from the internet and sending the processed results to a data storage server for storage would result in a server connection method of "data storage <---> AI <---> online media library"; or decoding and rendering a segment of audio stored during production and sharing it to the internet as pure PCM audio would result in a server connection method of "storage <---> decoding <---> rendering <---> online media library".
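The server connection methods in [0199], such as "storage <---> decoding <---> rendering <---> online media library", behave like a pipeline of stages. The sketch below composes stages left to right; `chain` and the stand-in stage functions are hypothetical and only record that each stage ran, rather than performing real decoding or rendering.

```python
from functools import reduce


def chain(*stages):
    """Compose stages left to right into a single callable."""
    def run(data):
        return reduce(lambda acc, stage: stage(acc), stages, data)
    return run


# Stand-in stages: each appends a label instead of doing real work.
def decode(trace):
    return trace + ["decoded"]


def render(trace):
    return trace + ["rendered"]


def publish(trace):
    return trace + ["published"]


# Storage -> decoding -> rendering -> online media library.
share_as_pcm = chain(decode, render, publish)
```

Reordering or swapping stages yields the other connection methods mentioned in the text, for example an AI stage between the online media library and storage.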

[0200] It should also be noted that the online material library belongs to the Internet server. Users can download online materials for production (which can be sent to multiple servers such as encoding, decoding, AI, and storage), or share the finished product to the Internet (which can be sent from multiple servers such as storage and encoding).

[0201] Furthermore, film production, streaming media production, and game engines are all specific-purpose servers. Currently, these production processes are all conducted locally, but theoretically, they can all be moved to the cloud. The client can not only send commands during the production process, as in this embodiment, but also interact directly with these three servers. Additionally, storage servers, processing servers, and other servers can directly retrieve data from these three servers, process the data, and return it. For example, if a game is temporarily modified before release, the processing server can be connected to the game engine server to directly process some data, quickly generating the modified game.

[0202] In this embodiment, Figure 10 shows a specific implementation of the client level. As shown in Figure 10, the client level includes at least one client subsystem, and each client subsystem further includes a sub-sensor, a sub-sensing center, and a sub-effector. Figure 10 specifically shows the clients and / or units that constitute the sub-sensor, sub-sensing center, and sub-effector. Specifically, the WANOS Editor is a combination of a user operation unit and a data processing unit. When users carry out audio project creation operations in the editor, such as adding files, adding plugins, and adjusting track equalizers, the editor reads the necessary audio data from the local data storage unit, processes it, returns the processed data to the local data storage unit for saving, and simultaneously sends the data to be displayed to the editing interface. Furthermore, in addition to obtaining audio and video data from the local data storage unit, the editor can also obtain data directly from the service subsystem.

[0203] Furthermore, the WANOS Player is a combination of a user operation unit and a data processing unit. During audio and video playback and OSD playback control, the player reads the necessary audio data from the local data storage unit, processes it, and returns the processed data to the local storage system for saving. Simultaneously, it sends the data to be displayed to the editing interface. Additionally, the player can obtain data from both the local storage unit (directly reading files) and the service subsystem (streaming media).

[0204] Furthermore, the editing and playback interfaces belong to the interface display units. After receiving the data to be displayed from the editing and playback tools, they are processed into displayable data and displayed on the interface, such as audio track waveforms and visualization of control plugins.

[0205] In addition, local data storage is a data storage unit responsible for transmitting and receiving audio and video data to and from the service subsystem, receiving audio operation instructions and processing them accordingly, and storing audio data generated during the production process.

[0206] In this embodiment, local data storage and editing tools and playback tools interact with each other, but they can interact with different subsystems depending on the actual situation, for example:

[0207] When a track is dragged on the editing tool, the data storage unit receives the operation command, modifies the audio data and project data related to that track, and then uploads it to the service subsystem for synchronization; or the service subsystem modifies the data first and then sends it to the data storage unit for synchronization.

[0208] And / or,

[0209] When monitoring and playing back on the playback tool, at least one of the following operations can be performed: First, the playback tool can send operation commands to the data storage unit, which reads the relevant audio track information and sends it to the playback tool for playback; second, the playback tool can directly read data from the service subsystem in streaming media format for playback. Due to the adaptive characteristics of the client subsystem, even if the local data storage unit is not working, both of the above situations still hold true, and the entire audio production process is completed.
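The two playback paths in [0209], and the fallback when the local data storage unit is unavailable, can be sketched as a try-local-then-stream function. The function name, the two reader callables, and the use of `OSError` to signal an unavailable local store are assumptions of this sketch.

```python
def fetch_track(track_id, local_read, stream_read):
    """Read a track from local storage; fall back to streaming from the service subsystem.

    Returns a (source, data) pair so the caller can see which path was used.
    """
    try:
        return "local", local_read(track_id)
    except OSError:
        # Local data storage unit unavailable: read from the service subsystem instead.
        return "stream", stream_read(track_id)
```

Either path returns playable data, which is what allows production to continue when one data source is not working.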

[0210] In this embodiment, Figure 11 shows a specific implementation of the playback level. As shown in Figure 11, the playback level includes at least one playback subsystem, and each playback subsystem further includes a sub-sensor, a sub-sensing center, and a sub-effector. Figure 11 shows the specific playback devices that constitute the sub-sensor, sub-sensing center, and sub-effector. These fall into two types, although the actual application scenarios can be many.

[0211] The playback device can be a regular audio device. A regular audio device refers to a device that only interacts with audio data and does not generate other data. The client subsystem sends the processed data to the regular audio device, which then plays it directly, or sends it to other audio devices cascaded with it. Typical devices include mobile phones, computers, regular headphones, speaker arrays, and in-vehicle infotainment systems.

[0212] Additionally, the playback device can be specifically a hybrid audio device. A standard audio device refers to a device that only interacts with audio data and does not generate other data. The client subsystem sends the processed data to the standard audio device for direct playback, or sends it to other cascaded audio devices. Typical devices include mobile phones, computers, standard headphones, speaker arrays, and in-vehicle infotainment systems.

[0213] Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The embodiments of the invention are intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein. The specification and embodiments are to be considered exemplary only, and the true scope and spirit of the invention are indicated by the following claims.

[0214] It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims

1. A distributed audio and video production system, characterized in that it comprises: a sensor, a sensing center, and an effector; wherein the sensor includes at least one first subsystem; the sensing center includes at least one second subsystem; the effector includes at least one third subsystem; the first subsystem, the second subsystem, and the third subsystem are identical to each other, or all three are identical, or all three are different; and each subsystem is distributed in the same or different levels; each subsystem includes a sub-sensor, a sub-sensing center, and a sub-effector; the sub-sensor is used to acquire audio and video information sent by subsystems at any level other than its own level, or to acquire audio and video information sent by subsystems at its own level other than the subsystem in question; the sub-sensing center is used to process the audio and video information and obtain feedback information; and the sub-effector is used to feed back the feedback information to other subsystems in its own level besides the subsystem itself, or to output the feedback information to any subsystem in any level other than its own level.

2. The distributed audio and video production system according to claim 1, characterized in that the levels are one or more of the following in combination: at least one service level, at least one client level, and at least one playback level; the service level includes at least one service subsystem; the client level includes at least one client subsystem; and the playback level includes at least one playback subsystem; the sensor, the sensing center, and the effector are all distributed at the service level, the client level, and / or the playback level.

3. The distributed audio and video production system according to claim 2, characterized in that, in the service subsystem, the sub-sensor includes at least one first server; the sub-sensing center includes at least one second server; the sub-effector includes at least one third server; and the first server, the second server, and the third server are either identical to each other, or all three are identical, or all three are different; and / or, in the client subsystem, the sub-sensor includes at least one first client; the sub-sensing center includes at least one second client; the sub-effector includes at least one third client; and the first client, the second client, and the third client are either identical to each other, or all three are identical, or all three are different; and / or, in the playback subsystem, the sub-sensor includes at least one first playback device; the sub-sensing center includes at least one second playback device; the sub-effector includes at least one third playback device; and the first playback device, the second playback device, and the third playback device are either identical to each other, or all three are identical, or all three are different.

4. The distributed audio and video production system according to claim 3, characterized in that the server consists of one or more of the following servers: an audio and video data storage server, a data processing server, a user data storage server, a transaction processing server, a copyright management server, a community forum server, a codec server, an AI processing server, and a sound effect algorithm server.

5. The distributed audio and video production system according to claim 3, characterized in that the client consists of one or more of the following: laptops, desktops, display devices, tablets, audio / video editing devices, and mobile devices; or, the client consists of one or more of the following units: user operation unit, interface display unit, data storage unit, data processing unit, audio and video editing and production unit, audio and video playback unit, encoding and decoding processing unit, rendering processing unit, AI processing unit, and packaging unit; or, the client consists of one or more of the following: laptops, desktops, display devices, tablets, audio / video editing devices, and mobile devices; and of one or more of the following units: user operation unit, interface display unit, data storage unit, data processing unit, audio and video editing and production unit, audio and video playback unit, encoding and decoding processing unit, rendering processing unit, AI processing unit, and packaging unit.

6. The distributed audio and video production system according to claim 3, characterized in that the playback device comprises one or more of the following playback devices: televisions, sound cards, soundbars, speaker arrays, headphones, and cinema processors.

7. The distributed audio and video production system according to any one of claims 1 to 6, characterized in that the subsystem is used to process audio and video data and / or switch the interaction to one or more other subsystems in the same level that have the same function as the at least one subsystem interacting with the current level or other levels when it is determined that at least one subsystem interacting with the current level or other levels has failed or is unable to interact with at least one subsystem interacting with the current level or other levels; or, the subsystem is used to process audio and video data and / or switch the interaction to one or more subsystems in the upper or lower level of the current level that have the same function as the at least one subsystem interacting with when it determines that at least one subsystem interacting with the upper or lower level of the current level is faulty or unable to interact with at least one subsystem interacting with the upper or lower level of the current level.

8. The distributed audio and video production system according to any one of claims 3 to 6, characterized in that the server is used to process audio and video data and / or switch the interaction to one or more other servers with the same function as the at least one server interacting with when it determines that at least one server or other subsystem at another level has failed and is unable to interact with the at least one server or other subsystem at another level, or to switch the interaction to the corresponding device at any other level besides the level where the server is located; and / or, the client is used to process audio and video data and / or switch the interaction to one or more other clients with the same function as the at least one client being interacted with, or to switch the interaction to a device at any other level other than the level where the client is located, when it is determined that at least one client or other subsystem at the level it is interacting with has failed and is unable to interact with the at least one client or other subsystem at the level it is interacting with; and / or, the playback device is used to process audio and video data and / or switch the interaction to one or more other playback devices with the same function as the at least one playback device being interacted with, or to switch the interaction to a corresponding device at any other level other than the level where the playback device is located, when it is determined that at least one playback device or other subsystem at a certain level has malfunctioned and is unable to interact with the at least one playback device or other subsystem at a certain level.

9. The distributed audio and video production system according to any one of claims 2 to 6, characterized in that the service subsystem of the service level is used to receive different types of data information from other subsystems of the service level, or from subsystems of levels other than the service level; the service subsystem is also used to perform corresponding processing based on the different types of data information; and the service subsystem is also used to send the data obtained after processing to other subsystems at the service level, or to subsystems at levels other than the service level.

10. The distributed audio and video production system according to claim 9, characterized in that the different types of data information include one or more of the following in combination: audio and video editing commands, audio and video editing collaboration commands, audio and video editing relay commands, file read and write commands, audio and video data, user information, and authentication results.

11. The distributed audio and video production system according to claim 9, characterized in that the service subsystem is specifically used to perform one or more of the following processing methods based on different types of data information: file management and processing, audio and video data editing and processing, user information management and processing, copyright processing of audio and video data, transaction processing of audio and video data, and interactive processing of audio and video data.

12. The distributed audio and video production system according to any one of claims 2 to 6, characterized in that the client subsystem of the client level is configured to obtain audio and video data from other client subsystems of the client level, or from subsystems of levels other than the client level, and to operate on the audio and video data according to obtained operation information.

13. The distributed audio and video production system according to claim 12, characterized in that the operation information includes one or more of the following: information on the production, separation, collaborative production, relay production, display and processing, playback and processing, and reading and writing of audio and video data.

14. The distributed audio and video production system according to any one of claims 2 to 6, characterized in that the playback subsystem of the playback level is configured to obtain audio and video data to be played from other playback subsystems of the playback level, or from subsystems of levels other than the playback level, and to process the audio and video data to be played.

15. The distributed audio and video production system according to any one of claims 2 to 6, characterized in that the service subsystem of the service level is configured to reallocate audio and video production tasks to the service subsystem of the service level and/or to the subsystems of the other levels, based on network status information, performance information of the subsystems of levels other than the service level, and/or the current working status of the subsystems of the other levels.
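
The task reallocation in claim 15 depends on three inputs: network status, subsystem performance, and current working status. The Python sketch below is one possible reading, not the method disclosed in the patent. The metrics (`latency_ms`, `capacity`, `busy`), the scoring formula, and the round-robin assignment are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubsystemStatus:
    """Hypothetical status report for one subsystem."""
    name: str
    latency_ms: float  # network status (assumed metric)
    capacity: float    # performance information (assumed relative units)
    busy: bool         # current working status


def reallocate(tasks: List[str],
               statuses: List[SubsystemStatus]) -> Dict[str, str]:
    """Assign each task to an idle subsystem, best-scoring subsystem first."""
    idle = [s for s in statuses if not s.busy]
    if not idle:
        raise RuntimeError("no idle subsystem available")
    # Assumed score: more capacity and lower latency rank higher.
    ranked = sorted(idle,
                    key=lambda s: s.capacity / (1.0 + s.latency_ms),
                    reverse=True)
    # Spread tasks over the ranked subsystems in round-robin order.
    return {task: ranked[i % len(ranked)].name
            for i, task in enumerate(tasks)}
```

A real scheduler would likely weigh task size and data locality as well; the point here is only that busy subsystems are skipped and the rest are ranked on reported status.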

16. The distributed audio and video production system according to any one of claims 2 to 6, characterized in that the client subsystem of the client level is configured to configure and process audio and video data, and to output the processed audio and video data to the playback device, in the playback subsystem of the playback level, that corresponds to the type of the processed audio and video data.