Intelligent audio generation system, method, device and medium supporting multi-entity interaction

By reading the unique identifiers of multiple NFC cards through a smart audio device and utilizing the story logic rule base and audio segment library on a cloud server, differentiated audio content is dynamically generated, solving the problem of the single interaction mode in existing technologies and realizing a rich audio experience.

CN122196223A · Pending · Publication Date: 2026-06-12 · SHENZHEN WELLDY TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: SHENZHEN WELLDY TECH CO LTD
Filing Date: 2026-03-13
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

In the existing interaction methods of smart audio devices, there is a fixed correspondence between physical objects and audio content, resulting in a single interaction mode, a lack of logical continuity in content, and a lack of personalized experience.

Method used

By reading the unique identifiers of multiple NFC cards through a smart audio device to generate a combined request command, and utilizing the story logic rule library and audio segment library of a cloud server, differentiated audio content is dynamically generated, supporting multi-entity interaction.

Benefits of technology

It enables dynamic content generation based on multi-entity combinations, enriching the interactive depth and variability of audio content, and meeting users' needs for in-depth interaction and personalized auditory experience.

✦ Generated by Eureka AI based on patent content.

Figure CN122196223A_ABST

Abstract

The application discloses an intelligent audio generation system, method, device, and medium supporting multi-entity interaction, relating to the field of audio devices. The system includes: multiple NFC cards, each having a unique identifier and defined as a story element attribute; a smart audio device configured to detect and read, via an NFC card reader module, the unique identifiers of the multiple NFC cards in the sensing area, combine them into a combination request instruction, and send that instruction to a cloud server; and the cloud server, configured to receive the combination request instruction, match the multiple unique identifiers in a story logic rule base to determine the corresponding storyline logic, retrieve audio segments from an audio segment library, generate an ordered audio segment playback sequence, and send it to the smart audio device. The smart audio device receives the audio segment playback sequence and plays its audio content in order. The application can generate differentiated, logically connected audio content, overcoming the limitation of a single fixed interaction mode and enriching the user's audio interaction experience.

Description

Technical Field

[0001] This invention relates to the field of audio devices, and more particularly to an intelligent audio generation system, method, device, and medium that supports multi-entity interaction.

Background Technology

[0002] Smart audio devices (such as children's story machines and smart speakers) are widely used as edutainment products in children's education and entertainment. These devices typically have Near Field Communication (NFC) functionality, enabling them to trigger audio playback by sensing external physical objects (such as cards or toys), thus facilitating interaction between the user and the device.

[0003] In existing technologies, the logic for playing audio on smart audio devices typically employs a "single trigger, fixed mapping" model. Specifically, when a user places a card with an NFC tag on the device, the device reads the card's identifier and searches for a specific audio file directly bound to that unique identifier in local storage or the cloud for playback. For example, if a user places a "tiger" card, the device will play a pre-set audio introduction or story about tigers; to play other content, the user must use a different card.

[0004] However, in the above interaction methods, there is a fixed correspondence between physical objects and audio content. This means that when a user interacts with the same physical object, the feedback received is always the same, and the device cannot generate differentiated or logically continuous content based on the user's specific operating context. This singular and fixed interaction mode limits the richness and variability of audio content, easily leading to user boredom and failing to meet users' needs for deep interaction and personalized auditory experiences.

Summary of the Invention

[0005] To address the aforementioned technical problems and deficiencies, the purpose of this invention is to provide an intelligent audio generation system, method, device, and medium that supports multi-entity interaction, which can alleviate the problem of limited and fixed interactive content in existing intelligent audio devices.

[0006] To achieve the above objectives, in a first aspect, the present invention provides an intelligent audio generation system supporting multi-entity interaction, comprising NFC cards, a smart audio device, and a cloud server. Each NFC card stores a unique identifier and is predefined as a specific story element attribute, which includes at least one of character attributes, scene attributes, prop attributes, or event attributes. The smart audio device includes an NFC card reader module, a network communication module, a control module, and an audio output module. The smart audio device is configured to: detect and read the unique identifiers of the NFC cards placed in the sensing area through the NFC card reader module; combine the multiple unique identifiers read to generate a combination request instruction; and send the combination request instruction to the cloud server via the network communication module. The cloud server stores a story logic rule base and an audio segment library. The cloud server is configured to: receive the combination request instruction; match the multiple unique identifiers contained in the combination request instruction against the story logic rule base to determine the corresponding storyline logic; retrieve the corresponding audio segments from the audio segment library according to the storyline logic and generate an ordered audio segment playback sequence; and send the audio segment playback sequence to the smart audio device. The smart audio device is also configured to: receive the audio segment playback sequence and play the audio content in the audio segment playback sequence in order through the audio output module.

[0007] This invention employs the aforementioned system architecture. The NFC card reader module reads the unique identifiers of multiple NFC cards (i.e., multiple entity objects), and the device generates a combination request instruction containing the set of identifiers, thereby capturing and parsing multi-entity interaction intent at the technical level. Using the story logic rule base and audio segment library on the cloud server, the system then generates content dynamically based on the multi-entity combination: it parses the set of identifiers, matches the corresponding storyline logic in the cloud, and retrieves segments to generate an ordered playback sequence. Users can therefore drive the system to generate differentiated and logically coherent new story content by changing the combination of entity objects involved in the interaction (e.g., changing the combination of characters and scenes). This mechanism moves from single-point triggering to combined creation, enriching the dimensions of audio content generation and meeting users' needs for deep interaction and personalized narrative experiences. It addresses the rigid interaction, uniform feedback, and lack of logical continuity caused by the single-entity, fixed-content triggering pattern of existing technologies.

[0008] In some implementations, the cloud server is configured to: identify the story element attributes corresponding to different unique identifiers in the combined request instruction, and determine a unique storyline logic or trigger a specific branch plot according to the preset attribute combination rules in the story logic rule base.

[0009] With the above approach, NFC cards are assigned specific attributes such as characters, scenes, props, or events, and the cloud server identifies and processes these attributes according to combination rules, enabling finer control of the story logic. The system can determine a unique storyline or trigger a specific branch plot based on the attribute combination (such as "character + scene + prop"), making the interaction logic more aligned with human narrative habits and improving the plausibility, richness, and logical rigor of the generated story.

[0010] In some implementations, the smart audio device is further configured to: when encountering a story branch point during the playback of an audio segment playback sequence, issue an interactive signal through voice prompts or light prompts, and restart the NFC card reader module to wait for the detection of a new NFC card; if the NFC card reader module detects the unique identifier of a new NFC card, the smart audio device generates an update request instruction containing the new unique identifier and sends it to the cloud server; the cloud server determines the subsequent story branch direction based on the update request instruction, generates the subsequent audio segment playback sequence, and sends it to the smart audio device for continued playback.

[0011] By adopting the above solution, real-time interactive functionality is added during story playback. By issuing prompts at story branching points and restarting the NFC card reader module to await new cards, the device allows users to determine the subsequent direction of the story by inputting new entities. This instant feedback mechanism transforms the passive "listening to the story" into the active "participation in the story," significantly enhancing the user's immersion and interactive enjoyment, and achieving a truly non-linear narrative experience.
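The branch-point interaction described above can be sketched as a simple playback loop. This is a minimal sketch under stated assumptions: the segment dictionaries, the `branch_point` flag, and the callables `play`, `wait_for_card`, and `request_update` are hypothetical names introduced for illustration and do not come from the application. The behavior when no new card is detected (continuing with the remaining segments) is also an assumption, since the text does not specify it.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical segment shape: {"audio_id": str, "branch_point": bool}
Segment = Dict[str, object]


def play_with_branches(
    sequence: Iterable[Segment],
    play: Callable[[str], None],
    wait_for_card: Callable[[], Optional[str]],
    request_update: Callable[[str], List[Segment]],
) -> None:
    """Play segments in order; at a branch point, wait for a new NFC card."""
    queue = list(sequence)
    while queue:
        segment = queue.pop(0)
        play(str(segment["audio_id"]))
        if segment.get("branch_point"):
            # The device would issue a voice or light prompt here and
            # restart the NFC card reader module.
            new_uid = wait_for_card()
            if new_uid is not None:
                # Update request: the follow-up sequence returned by the
                # cloud replaces whatever was still queued.
                queue = list(request_update(new_uid))
            # Assumption: with no new card, the remaining segments play on.
```

In a real device, `wait_for_card` would likely include a timeout so that playback does not stall indefinitely at a branch point.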

[0012] In some implementations, the smart audio device is also equipped with a local storage module; after reading the unique identifier of the NFC card, the smart audio device is configured to: firstly search in the local storage module for whether there is a cached audio segment corresponding to a combination of multiple unique identifiers; if a cached audio segment exists, it is played directly through the audio output module; if no cached audio segment exists, it executes the operation of sending a combination request instruction to the cloud server, and after receiving the audio segment playback sequence sent by the cloud server, it stores it in the local storage module.

[0013] With the above solution, the local storage module and its cache-first lookup improve the device's response speed and reduce its dependence on the network. For card combinations that have already been requested and cached, the device can play the audio directly from local storage without requesting the cloud again, which saves network traffic and server resources and also allows smooth, immediate playback in unstable network environments or without a network connection, thus improving the user experience.
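The cache-first lookup can be illustrated with a short sketch. The function names, the in-memory dictionary standing in for the local storage module, and the order-insensitive cache key are all assumptions made for illustration; the application does not define a key format.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def cache_key(uids: Iterable[str]) -> str:
    """Build an order-insensitive key for a card combination (an assumption)."""
    return "|".join(sorted(set(uids)))


def get_playback_sequence(
    uids: Iterable[str],
    cache: Dict[str, List[str]],
    fetch_from_cloud: Callable[[List[str]], List[str]],
) -> Tuple[List[str], bool]:
    """Return (sequence, served_from_cache) for the given card combination."""
    uid_list = list(uids)
    key = cache_key(uid_list)
    if key in cache:
        # Cached combination: play locally, no cloud request needed.
        return cache[key], True
    # Cache miss: send the combination request, then store the result.
    sequence = fetch_from_cloud(uid_list)
    cache[key] = sequence
    return sequence, False
```

If placement order is also used to select the storyline, the key would need to preserve the read order rather than sort the identifiers.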

[0014] In some implementations, the NFC card reader module also has a data writing function; the smart audio device is also configured to: after successfully receiving the audio segment playback sequence and completing the download, control the NFC card reader module to write status identification information into the storage chip of the NFC card participating in the interaction; the status identification information includes at least a status bit indicating that the card has been used, and the device identifier or user account information of the current smart audio device.

[0015] The above solution endows the NFC card reader module with data writing functionality, enabling the system to write back the card's usage status, bound device identifier, or user account information to the NFC card chip. This mechanism provides underlying data support for subsequent copyright protection, device binding verification, and prevention of unauthorized card misuse, facilitating effective management of the physical card's lifecycle, circulation scope, and activation status.

[0016] In some implementations, when the smart audio device reads an NFC card through the NFC card reader module, it is also configured to: read the status identifier information stored in each NFC card; the control module of the smart audio device is configured to execute verification logic: determine whether the device identifier or user account information in the status identifier information matches the device identifier of the current smart audio device or the currently logged-in user account; if they do not match, it is determined to be a non-bound device, and the smart audio device is prohibited from generating combination request commands or from playing audio segment playback sequences.

[0017] By adopting the above scheme, a comprehensive permission verification mechanism has been established. By comparing the status information stored in the card with the current device or account, the system can effectively identify and block usage requests from unbound devices. This not only achieves strict digital content copyright protection, preventing the disorderly spread of pirated or unauthorized content, but also safeguards the commercial interests based on the user account system and protects the rights of legitimate users.
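The verification logic can be sketched as follows. The `CardStatus` record and both function names are hypothetical, and treating a card with no "used" status bit as unbound (and therefore allowed) is an assumption; the text only describes the mismatch case.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CardStatus:
    """Hypothetical layout of the status identification information on a card."""
    used: bool = False
    device_id: Optional[str] = None
    account: Optional[str] = None


def is_card_allowed(status: CardStatus, device_id: str, account: Optional[str]) -> bool:
    """Check one card against the current device or logged-in account."""
    if not status.used:
        # Assumption: a card that was never written back is not yet bound.
        return True
    if status.device_id == device_id:
        return True
    return status.account is not None and status.account == account


def may_generate_request(
    statuses: Iterable[CardStatus], device_id: str, account: Optional[str]
) -> bool:
    """The combination request is blocked if any card fails verification."""
    return all(is_card_allowed(s, device_id, account) for s in statuses)
```

Because the check runs on the device before any request is built, a mismatched card blocks the interaction without a round trip to the cloud server.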

[0018] In some implementations, when generating a combination request instruction, the smart audio device is also configured to: detect the placement order information of each NFC card when it is read by the NFC card reader module, and encapsulate the placement order information into the combination request instruction; when determining the corresponding storyline logic, the cloud server is configured to: simultaneously match the corresponding storyline logic in the story logic rule base based on the set content of multiple unique identifiers and the placement order information.

[0019] By adopting the above approach, the key dimension of placement order is introduced, allowing the cloud server to consider not only the content of the card set but also the order in which the cards are read when matching story logic. This enables the same set of cards to generate different storylines depending on their placement order (e.g., obtaining the item before entering the scene versus entering the scene before obtaining the item), greatly expanding the possibilities for story combinations and further enhancing the exploratory and playable aspects of the interaction.
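A minimal sketch of order-sensitive matching is shown below. The identifiers, storyline names, and the two-tier lookup (ordered rules first, then an unordered fallback) are assumptions for illustration; the application states only that both the set content and the placement order are used for matching.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Tuple

# Hypothetical rules keyed by the exact placement order of the cards.
ORDERED_RULES: Dict[Tuple[str, ...], str] = {
    ("uid-sword", "uid-castle"): "obtain_prop_then_enter_scene",
    ("uid-castle", "uid-sword"): "enter_scene_then_obtain_prop",
}

# Hypothetical rules that depend only on which cards are present.
UNORDERED_RULES: Dict[FrozenSet[str], str] = {
    frozenset({"uid-prince", "uid-castle"}): "prince_in_castle",
}


def match_storyline(uids_in_order: Sequence[str]) -> Optional[str]:
    """Match on placement order first, then fall back to set content."""
    ordered = ORDERED_RULES.get(tuple(uids_in_order))
    if ordered is not None:
        return ordered
    return UNORDERED_RULES.get(frozenset(uids_in_order))
```

With these rules, the same two cards (sword and castle) select different storylines depending on which was placed first, while the prince and castle pair resolves to the same storyline in either order.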

[0020] In a second aspect, the present invention provides an intelligent audio generation method supporting multi-entity interaction, applied to the smart audio device of the first aspect. The method includes: detecting multiple NFC cards placed in the sensing area via an NFC card reader module; reading unique identifiers stored in the multiple NFC cards via the NFC card reader module; combining the read unique identifiers to generate a combination request instruction; sending the combination request instruction to a cloud server via a network communication module to request the cloud server to perform logical matching based on the combination request instruction; receiving an audio segment playback sequence returned by the cloud server via the network communication module, the audio segment playback sequence being generated by the cloud server in response to the combination request instruction; parsing the audio segment playback sequence; and playing the audio content contained in the audio segment playback sequence sequentially via an audio output module.

[0021] In a third aspect, the present invention provides a smart audio device, comprising: one or more processors and a memory; the memory being coupled to the one or more processors, the memory being used to store computer program code, the computer program code including computer instructions, the one or more processors invoking the computer instructions to cause the smart audio device to perform the method described in the second aspect and any possible implementation thereof.

[0022] In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions that, when executed on a smart audio device, cause the smart audio device to perform the method described in the second aspect and any possible implementation thereof.

[0023] In a fifth aspect, the present invention provides a computer program product including computer instructions that, when executed on a smart audio device, cause the smart audio device to perform the method described in the second aspect and any possible implementation thereof.

[0024] It can be understood that the method provided in the second aspect, the smart audio device provided in the third aspect, the storage medium provided in the fourth aspect, and the computer program product provided in the fifth aspect are all used with the system provided in the first aspect. For their beneficial effects, refer to the beneficial effects of the corresponding system; they are not repeated here.

Attached Figure Description

[0025] Figure 1 is a schematic diagram of the architecture of an intelligent audio generation system supporting multi-entity interaction according to an embodiment of the present invention.

[0026] Figure 2 is a technical roadmap diagram of an intelligent audio generation system supporting multi-entity interaction according to an embodiment of the present invention.

[0027] Figure 3 is a schematic diagram of the architecture of a smart audio device according to an embodiment of the present invention.

[0028] Figure 4 is a flowchart illustrating an intelligent audio generation method supporting multi-entity interaction according to an embodiment of the present invention.

[0029] Figure 5 is a schematic diagram of the electronic device hardware architecture of a smart audio device according to an embodiment of the present invention.

Detailed Implementation

[0030] The terminology used in the following embodiments of the present invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification and appended claims of the present invention, the singular expressions “a,” “an,” “the,” and “this” are intended to include the plural expressions as well, unless the context clearly indicates otherwise. It should also be understood that the term “and/or” as used in the present invention refers to any or all possible combinations comprising one or more of the listed items.

[0031] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as implying or suggesting relative importance or implicitly indicating the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature, and in the description of the embodiments of the present invention, unless otherwise stated, "a plurality of" means two or more.

[0032] To make the objectives, technical solutions, and effects of this invention clearer, the invention will be further described in detail below with reference to specific embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of protection of this invention.

[0033] First, some characteristic entities and technical terms involved in the embodiments of this invention will be explained:

[0034] A unique identifier is a piece of data information that can uniquely distinguish each NFC card. The unique identifier can be a serial number that is embedded in the chip when the NFC card is manufactured, or it can be a content identifier that is written into the NFC card's storage chip later. The unique identifier is used to identify and distinguish different NFC cards in the system.

[0035] Story element attributes are category labels predefined for NFC cards. Story element attributes are used to characterize the role or function of NFC cards in the story generation process. Story element attributes include at least one of the following: characters, scenes, props, and events.

[0036] The story logic rule base is a database pre-stored on a cloud server. The story logic rule base stores the mapping relationship between combinations of multiple unique identifiers and the corresponding storyline logic.

[0037] An audio segment library refers to a database of audio resources pre-stored on a cloud server. The audio segment library contains a large number of standardized story segment audio files, including opening audio, character dialogue audio, scene description audio, event description audio, and ending audio.

[0038] Storyline logic refers to the story development path and plot direction determined by a specific combination of NFC cards.

[0039] An audio segment playback sequence refers to an ordered set of multiple audio segments retrieved from the audio segment library by a cloud server according to the storyline logic and arranged in the order of the story's development.

[0040] This invention provides an intelligent audio generation system (hereinafter referred to as the system) that supports multi-entity interaction. As shown in Figure 1, the system includes multiple NFC cards 1, a smart audio device 2, and a cloud server 3.

[0041] The NFC card 1 contains an embedded NFC chip, which includes a radio frequency antenna and a storage chip. The radio frequency antenna is used for near-field wireless communication with the NFC card reader module 21. The storage chip stores a unique identifier, which can be a serial number embedded in the NFC chip at the factory or a content identifier written to the storage chip later. Each NFC card 1 is predefined with specific story element attributes, including at least one of characters, scenes, props, and events. In this embodiment, the NFC card 1 can be in the form of a flat card, a three-dimensional doll, or a nameplate. The surface of the NFC card 1 is printed with graphics, colors, or story theme patterns corresponding to the story element attributes. The graphics, colors, or story theme patterns printed on the surface of the NFC card 1 facilitate intuitive identification of the story element represented by the NFC card 1 by the user.

[0042] In this embodiment, the smart audio device 2 may include, but is not limited to, children's story machines, smart speakers, or other audio playback devices with NFC reading capabilities. As shown in Figure 2, the smart audio device 2 includes an NFC card reader module 21, a network communication module 22, a control module 23, and an audio output module 24.

[0043] The NFC card reader module 21 includes an NFC card reader chip and an NFC antenna. The NFC card reader chip is used to generate radio frequency signals and parse the received radio frequency signals. The NFC antenna is used to transmit and receive radio frequency signals. The NFC card reader module 21 is configured to detect NFC cards 1 placed in the sensing area and read the unique identifier stored in the NFC card 1 via near-field wireless communication. The sensing area refers to the physical area on the smart audio device 2 for placing the NFC card 1, and the sensing area is located within the effective sensing range of the NFC antenna. In this embodiment, the sensing area is designed to be a large area capable of accommodating at least two NFC cards 1 simultaneously, thereby supporting the simultaneous placement and recognition of multiple NFC cards 1. The NFC card reader module 21 can detect multiple NFC cards 1 simultaneously, or can rapidly and sequentially detect multiple NFC cards 1 within a very short time interval.

[0044] The network communication module 22 includes a wireless communication chip and a wireless communication antenna. The wireless communication chip is used to process network communication protocols and data transmission and reception. The wireless communication antenna is used to transmit and receive wireless signals. In this embodiment, the network communication module 22 adopts a Wi-Fi communication module, which connects to the Internet via the Wi-Fi protocol to achieve data communication with the cloud server 3. In other embodiments, the network communication module 22 may also adopt a cellular mobile communication module or other wireless communication modules.

[0045] The control module 23 is a functional module combining software and hardware. The hardware portion of the control module 23 includes a microcontroller or processor chip, which executes computer instructions and control logic. The software portion of the control module 23 includes firmware or application programs stored in the internal memory of the microcontroller or processor chip. These firmware or application programs are used to implement functions such as generating combined request instructions, controlling data transmission and reception, and controlling playback. The control module 23 is connected to the NFC card reader module 21, the network communication module 22, and the audio output module 24, respectively, and coordinates the operation between these modules.

[0046] The audio output module 24 includes an audio decoding chip, a power amplifier, and a speaker. The audio decoding chip decodes digital audio data into analog audio signals. The power amplifier amplifies the analog audio signals. The speaker converts the amplified analog audio signals into sound waves and outputs them.

[0047] In this embodiment, the operation of the smart audio device 2 is as follows:

[0048] The smart audio device 2 detects the NFC card 1 placed in the sensing area via the NFC card reader module 21. When the NFC card reader module 21 detects the presence of the NFC card 1 in the sensing area, it reads the unique identifier stored in the NFC card 1 and transmits the read unique identifier to the control module 23. When multiple NFC cards 1 are placed in the sensing area, the NFC card reader module 21 reads the unique identifiers stored in multiple NFC cards 1 sequentially or simultaneously. The control module 23 receives the multiple unique identifiers transmitted by the NFC card reader module 21 and combines the read multiple unique identifiers to generate a combination request command. The combination request command is a data packet containing the set information of multiple unique identifiers. The control module 23 transmits the combination request command to the network communication module 22, and the network communication module 22 sends the combination request command to the cloud server 3 via the Internet.
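As an illustration of the data packet described above, the sketch below serializes the read identifiers into a JSON message. The field names (`type`, `device_id`, `card_uids`), the use of JSON, and the de-duplication of repeated reads are assumptions; the application does not specify a wire format.

```python
import json
from typing import Iterable


def build_combination_request(device_id: str, uids: Iterable[str]) -> str:
    """Combine the read identifiers into one request packet (JSON text)."""
    # Drop duplicate reads of the same card while keeping the read order.
    unique_uids = list(dict.fromkeys(uids))
    packet = {
        "type": "combination_request",  # hypothetical message type
        "device_id": device_id,         # hypothetical field
        "card_uids": unique_uids,
    }
    return json.dumps(packet)
```

Keeping the identifiers in read order costs nothing here and leaves room for a server that also considers placement order.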

[0049] In this embodiment, cloud server 3 is a computing device or cluster of computing devices deployed on the Internet. Cloud server 3 includes a processor, memory, and network interface. The memory of cloud server 3 stores the story logic rule base and the audio segment library.

[0050] The story logic rule base is a database stored in the cloud server's storage. It contains multiple story logic rules. Each rule defines a mapping between a combination of unique identifiers and the corresponding storyline logic. For example, one rule might be defined as follows: when the received combination request instruction contains the unique identifiers corresponding to the character card "Prince" and the scene card "Castle," the corresponding storyline logic is the Prince's story in the castle. Another rule might be defined as follows: when the received combination request instruction contains the unique identifiers corresponding to the character card "Prince," the scene card "Castle," and the prop card "Sword," the corresponding storyline logic is the Prince using the sword to defeat the dragon.
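The two example rules above can be expressed as a small lookup table. This is a sketch only: the identifier strings, storyline names, and the choice of exact-set matching are assumptions made for illustration, not the rule format used by the application.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Hypothetical rule base: a set of card identifiers maps to one storyline.
STORY_RULES: Dict[FrozenSet[str], str] = {
    frozenset({"uid-prince", "uid-castle"}): "prince_story_in_castle",
    frozenset({"uid-prince", "uid-castle", "uid-sword"}): "prince_defeats_dragon_with_sword",
}


def match_rule(uids: Iterable[str]) -> Optional[str]:
    """Return the storyline for this exact combination, or None if no rule matches."""
    return STORY_RULES.get(frozenset(uids))
```

Because the key is a set, adding the Sword card to the Prince and Castle pair selects a different rule rather than extending the first one.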

[0051] The audio segment library is a database stored in the cloud server's storage. It contains a large number of standardized story segment audio files. These files are categorized and stored according to story elements and plot points, including opening audio, character dialogue audio, scene description audio, event description audio, and ending audio. Each story segment audio file has a unique audio identifier, which is used to locate and retrieve the corresponding story segment audio file in the library.

[0052] The working process of cloud server 3 is as follows:

[0053] Cloud server 3 receives the combination request instruction from smart audio device 2 via the network interface. The processor of cloud server 3 parses the combination request instruction and extracts the multiple unique identifiers. The processor then matches the combination of extracted identifiers against the story logic rules stored in the story logic rule base to determine the corresponding storyline logic. According to the determined storyline logic, the processor retrieves multiple story segment audio files from the audio segment library and arranges them in the order of the story's development to generate an ordered audio segment playback sequence. This sequence contains the audio identifiers of the story segment audio files and their corresponding playback order information, or it may directly contain the audio data of the story segment audio files. Finally, cloud server 3 sends the audio segment playback sequence to smart audio device 2 via the network interface.
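The retrieval and ordering step can be sketched as follows, assuming the storyline logic has already been determined. The table contents, file names, and dictionary fields are hypothetical; the sketch shows the variant in which the sequence carries audio identifiers and playback order rather than raw audio data.

```python
from typing import Dict, List

# Hypothetical storyline definition: audio identifiers in story order.
STORYLINE_SEGMENTS: Dict[str, List[str]] = {
    "prince_story_in_castle": [
        "opening_01", "scene_castle_01", "dialogue_prince_01", "ending_01",
    ],
}

# Hypothetical audio segment library: audio identifier -> stored file.
AUDIO_SEGMENT_LIBRARY: Dict[str, str] = {
    "opening_01": "opening_01.mp3",
    "scene_castle_01": "scene_castle_01.mp3",
    "dialogue_prince_01": "dialogue_prince_01.mp3",
    "ending_01": "ending_01.mp3",
}


def build_playback_sequence(storyline: str) -> List[Dict[str, object]]:
    """Retrieve each segment and attach its playback order (starting at 1)."""
    sequence = []
    for order, audio_id in enumerate(STORYLINE_SEGMENTS[storyline], start=1):
        sequence.append({
            "order": order,
            "audio_id": audio_id,
            "file": AUDIO_SEGMENT_LIBRARY[audio_id],
        })
    return sequence
```

A missing storyline or audio identifier raises `KeyError` in this sketch; a deployed server would need an explicit fallback for unmatched combinations.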

[0054] The smart audio device 2 receives an audio segment playback sequence from the cloud server 3. The control module 23 parses the audio segment playback sequence, obtaining multiple story segment audio files and their corresponding playback order. The control module 23 transmits the audio data of the story segment audio files sequentially to the audio output module 24 according to the playback order. The audio output module 24's audio decoding chip decodes the audio data, the power amplifier amplifies the decoded analog audio signal, and the speaker converts the amplified analog audio signal into sound waves for output, thus enabling the sequential playback of the audio content in the audio segment playback sequence.

[0055] In some embodiments, the story element attributes of NFC card 1 include at least one of character attributes, scene attributes, prop attributes, or event attributes. An NFC card 1 with a character attribute is called a character card, used to represent a person or animal character in the story, such as a prince, princess, dinosaur, Little Red Riding Hood, etc. An NFC card 1 with a scene attribute is called a scene card, used to represent the location or environment where the story takes place, such as a castle, forest, volcano, ocean, etc. An NFC card 1 with a prop attribute is called a prop card, used to represent items or tools appearing in the story, such as a sword, magic wand, hunting rifle, treasure chest, etc. An NFC card 1 with an event attribute is called an event card, used to represent an action or event that occurs in the story, such as attack, escape, calling for help, exploration, etc.

[0056] There is a predefined correspondence between the unique identifier of each NFC card 1 and the story element attribute of the NFC card 1. The correspondence is stored in the cloud server 3, and the cloud server 3 can identify the corresponding story element attribute of the NFC card 1 based on the unique identifier.

[0057] The cloud server 3 is configured to perform the following process: After receiving a combined request instruction from the smart audio device 2, the cloud server 3 extracts multiple unique identifiers from the combined request instruction. Based on a predefined correspondence between the unique identifiers and story element attributes, the cloud server 3 identifies the story element attributes corresponding to different unique identifiers in the combined request instruction. For example, the cloud server 3 identifies that the NFC card 1 corresponding to the first unique identifier in the combined request instruction has a character attribute, the NFC card 1 corresponding to the second unique identifier has a scene attribute, and the NFC card 1 corresponding to the third unique identifier has a prop attribute.

[0058] The story logic rule base contains preset attribute combination rules. These rules define the mapping relationship between different story element attributes and their corresponding storyline logic or branching plots. Based on these preset attribute combination rules and the story element attributes corresponding to each unique identifier identified, cloud server 3 determines a unique storyline logic or triggers a specific branching plot.

[0059] For example, an attribute combination rule can be defined as follows: when a combination request instruction contains a unique identifier with a character attribute and a unique identifier with a scene attribute, a unique storyline logic is determined, namely the standard storyline for that character in that scene. Another attribute combination rule can be defined as follows: when a combination request instruction contains a unique identifier with a character attribute, a unique identifier with a scene attribute, and a unique identifier with a prop attribute, a specific branching storyline is triggered, namely the storyline in which the character uses that prop in that scene.

[0060] To illustrate with a specific example, when the combined request instruction received by cloud server 3 contains a unique identifier for the prince and a unique identifier for the castle, cloud server 3 identifies the story element attribute of the unique identifier for the prince as a character attribute and the story element attribute of the unique identifier for the castle as a scene attribute.

[0061] Cloud server 3 determines the corresponding storyline logic as the standard storyline of the prince in the castle based on the attribute combination rules. Cloud server 3 generates the corresponding audio segment playback sequence, which includes the opening audio of the prince in the castle, the event audio of the dragon attack, and the ending audio of the prince going into battle.

[0062] When the combination request instruction received by cloud server 3 also contains a unique identifier corresponding to the sword, cloud server 3 identifies the story element attribute of that unique identifier as a prop attribute. Cloud server 3 triggers a specific branching storyline based on the attribute combination rules; in this branching storyline, the prince uses the sword to defeat the dragon. Cloud server 3 generates a corresponding audio segment playback sequence, which includes the opening audio of the prince in the castle, the event audio of the dragon attack, the branching audio of obtaining the sword, and the ending audio of the prince defeating the dragon.
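
The prince, castle and sword example in paragraphs [0060] to [0062] can be sketched in Python as follows. This is only an illustrative sketch, not the applicant's implementation: the identifier strings, the segment names, the table layout and the function name `build_playback_sequence` are all assumptions made for the illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical card identifiers and their story element attributes ([0056]).
CARD_ATTRIBUTES: Dict[str, str] = {
    "uid-prince": "character",
    "uid-castle": "scene",
    "uid-sword": "prop",
}

# Attribute combination rules ([0059]): attribute set -> kind of storyline.
ATTRIBUTE_RULES: Dict[FrozenSet[str], str] = {
    frozenset({"character", "scene"}): "standard",
    frozenset({"character", "scene", "prop"}): "prop_branch",
}

# Hypothetical audio segment library, keyed by storyline kind and card set.
SEGMENT_LIBRARY: Dict[Tuple[str, FrozenSet[str]], List[str]] = {
    ("standard", frozenset({"uid-prince", "uid-castle"})): [
        "prince_castle_opening",
        "dragon_attack_event",
        "prince_goes_to_battle_ending",
    ],
    ("prop_branch", frozenset({"uid-prince", "uid-castle", "uid-sword"})): [
        "prince_castle_opening",
        "dragon_attack_event",
        "obtain_sword_branch",
        "prince_defeats_dragon_ending",
    ],
}


def build_playback_sequence(uids: List[str]) -> List[str]:
    """Map card identifiers to an ordered list of audio segment names."""
    unknown = [u for u in uids if u not in CARD_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unknown card identifiers: {unknown}")
    # Step 1: identify the story element attribute of each identifier.
    attributes = frozenset(CARD_ATTRIBUTES[u] for u in uids)
    # Step 2: match the attribute set against the combination rules.
    kind = ATTRIBUTE_RULES.get(attributes)
    if kind is None:
        raise ValueError("no attribute combination rule matches")
    # Step 3: retrieve the ordered segments for this storyline.
    key = (kind, frozenset(uids))
    if key not in SEGMENT_LIBRARY:
        raise ValueError("no audio segments stored for this combination")
    return list(SEGMENT_LIBRARY[key])
```

Calling `build_playback_sequence(["uid-prince", "uid-castle"])` yields the three-segment standard storyline, while adding `"uid-sword"` yields the four-segment branching storyline.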

[0063] This embodiment effectively solves the technical problems of rigid interaction, monotonous content and lack of logical continuity caused by the "single trigger, fixed mapping" mode of existing smart audio devices.

[0064] In existing technologies, there is a fixed correspondence between physical objects and audio, making it impossible to generate differentiated content based on context. This embodiment addresses this by configuring a smart audio device to read the unique identifiers of multiple NFC cards and generate combination request instructions. Utilizing a story logic rule library and audio segment library on a cloud server, it achieves dynamic content generation based on multi-entity combinations. The system can parse the meaning of the set of multiple identifiers, match the corresponding storyline logic in the cloud, and retrieve segments to generate an ordered playback sequence. This means that users only need to change the way the cards are combined to obtain entirely new story content with differentiation and logical coherence. This breaks the monotonous, fixed feedback pattern of traditional devices, enriches the interactive depth and variability of audio content, and meets users' needs for personalized auditory experiences.

[0065] In some embodiments, as shown in Figure 3, the smart audio device 2 also includes an interactive prompt module 25. The interactive prompt module 25 includes a status indicator light and a voice prompt module. The status indicator light is used to provide visual prompts to the user through flashing lights or color changes. The voice prompt module is used to provide voice prompts to the user through a speaker.

[0066] The audio segment playback sequence generated by cloud server 3 can include story branch point markers. Story branch point markers are used to identify a specific position in the audio segment playback sequence as a story branch point. At the story branch point, the story can enter different branch plots based on the user's selection.

[0067] The smart audio device 2 is configured to perform the following process. During playback of an audio segment playback sequence, the control module 23 continuously monitors whether the current playback position has reached a story branch point. When the control module 23 detects that the current playback position has reached a story branch point, it controls the interactive prompt module 25 to issue an interactive signal. The interactive signal can be a voice prompt signal, output through the voice prompt module, and the content of the voice prompt signal can be, for example, "Warrior, do you want to attack or flee? Please select a card." The interactive signal can also be a light prompt signal, output through the status indicator light, and the light prompt signal can be a flashing status indicator light or a color change of the status indicator light. The interactive signal can also be a combination of voice prompt signals and light prompt signals.

[0068] After issuing the interactive signal, the control module 23 restarts the NFC card reader module 21. Here, restarting means that the control module 23 reactivates the detection function of the NFC card reader module 21, so that the NFC card reader module 21 enters a state of waiting to detect a new NFC card 1.

[0069] While waiting to detect a new NFC card 1, the NFC card reader module 21 continuously monitors the sensing area for the presence of a new NFC card 1. When the NFC card reader module 21 detects a new NFC card 1 placed in the sensing area, it reads the unique identifier stored in the new NFC card 1 and transmits the unique identifier of the new NFC card 1 to the control module 23.

[0070] After receiving the unique identifier of the new NFC card 1, the control module 23 generates an update request command. The update request command is a data packet containing the unique identifier of the new NFC card 1, and may also include the identification information of the current story branch point. The control module 23 transmits the update request command to the network communication module 22, which then sends it to the cloud server 3 via the Internet.

[0071] After receiving the update request command from the smart audio device 2, the cloud server 3 extracts the unique identifier of the new NFC card 1 and the identification information of the current story branch point from the update request command. Based on the unique identifier of the new NFC card 1 and the identification information of the current story branch point, the cloud server 3 matches them in the story logic rule base to determine the subsequent story branch direction. According to the determined subsequent story branch direction, the cloud server 3 retrieves the corresponding story segment audio file from the audio segment library and generates the subsequent audio segment playback sequence. The cloud server 3 then sends the subsequent audio segment playback sequence to the smart audio device 2.

[0072] The smart audio device 2 receives the subsequent audio segment playback sequence from the cloud server 3, and then plays the subsequent audio segment playback sequence continuously through the audio output module 24. Continuous playback means seamlessly connecting the audio content of the subsequent audio segment playback sequence at the current story branch point, thereby making the playback of the entire story coherent and smooth.

[0073] To illustrate with a specific example, during the playback of the story of the Prince's treasure hunt on the volcano, when the story reaches the branch point where the dragon is discovered, the control module 23 detects that the current playback position is a story branch point. The control module 23 controls the status indicator light to flash, and simultaneously controls the voice prompt module to output a voice prompt: "Warrior, do you want to attack or flee? Please select a card." The control module 23 restarts the NFC card reader module 21 to wait for the detection of a new NFC card 1. The user places the attack event card in the sensing area, and the NFC card reader module 21 detects the attack event card and reads its unique identifier. The smart audio device 2 generates an update request command containing the attack event card's unique identifier and sends it to the cloud server 3. Based on the update request command, the cloud server 3 determines that the subsequent story branch is the Prince attacking the dragon, and generates a subsequent audio segment playback sequence, which is then sent to the smart audio device 2. The smart audio device 2 receives the subsequent audio segment playback sequence and continues playback, so that the story proceeds along the branching plot in which the Prince attacks the dragon.
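
The branch point flow of paragraphs [0066] to [0073] can be sketched in Python as follows. This is a simplified sketch under stated assumptions: a branch point is encoded as a hypothetical `"branch:<id>"` entry at the end of the sequence, the cloud lookup is replaced by a local table `BRANCH_RULES`, and the names `resolve_branch` and `play_story` are illustrative only.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the story logic rule base on the cloud server:
# (branch point id, new card identifier) -> subsequent playback sequence.
BRANCH_RULES: Dict[Tuple[str, str], List[str]] = {
    ("dragon_found", "uid-attack"): ["prince_attacks_dragon"],
    ("dragon_found", "uid-escape"): ["prince_flees_volcano"],
}


def resolve_branch(branch_id: str, card_uid: str) -> List[str]:
    """Server side: handle an update request and return the next sequence."""
    return list(BRANCH_RULES[(branch_id, card_uid)])


def play_story(
    sequence: List[str],
    read_card: Callable[[], str],
    play: Callable[[str], None],
) -> None:
    """Device side: play segments in order and hand off at a branch point."""
    for item in sequence:
        if item.startswith("branch:"):
            branch_id = item.split(":", 1)[1]
            # Interactive signal (voice and/or light prompt).
            play("prompt:" + branch_id)
            # Reader re-enabled; blocks until a new card is placed.
            card_uid = read_card()
            # Update request carrying the new identifier and the branch id.
            follow_up = resolve_branch(branch_id, card_uid)
            # Continue with the subsequent sequence, which may branch again.
            play_story(follow_up, read_card, play)
            return
        play(item)
```

For example, with `played = []`, calling `play_story(["volcano_opening", "branch:dragon_found"], lambda: "uid-attack", played.append)` plays the opening, issues the prompt, and then continues with the attack branch.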

[0074] In some embodiments, as shown in Figure 3, the smart audio device 2 also includes a local storage module 26. The local storage module 26 is a hardware module, and it can be a flash memory chip or other non-volatile memory. The local storage module 26 is used to store downloaded audio segment playback sequences and corresponding audio data.

[0075] The smart audio device 2 is configured to perform the following process: After the NFC card reader module 21 detects and reads the unique identifiers of multiple NFC cards 1 placed in the sensing area, the NFC card reader module 21 transmits the multiple unique identifiers to the control module 23. After receiving the multiple unique identifiers, the control module 23 first performs a search operation in the local storage module 26. The search operation means that the control module 23 uses the combination of multiple unique identifiers as search conditions to search in the local storage module 26 for whether there is a cached audio segment corresponding to the combination of multiple unique identifiers. The local storage module 26 stores a correspondence index table between unique identifier combinations and cached audio segments. The control module 23 determines whether there is a matching cached audio segment by querying the correspondence index table.

[0076] If the control module 23 finds a cached audio segment corresponding to a combination of multiple unique identifiers in the local storage module 26, the control module 23 directly reads the cached audio segment from the local storage module 26 and transmits it to the audio output module 24 for playback. With the cached audio segment present, the smart audio device 2 does not need to send a combination request command to the cloud server 3 through the network communication module 22. The smart audio device 2 can directly play the cached audio segment offline, thereby improving playback response speed and saving network traffic.

[0077] If the control module 23 finds no cached audio segment corresponding to the combination of multiple unique identifiers in the local storage module 26, then the control module 23 sends a combination request instruction to the cloud server 3. The control module 23 combines the multiple unique identifiers to generate a combination request instruction, which is then transmitted to the network communication module 22. The network communication module 22 sends the combination request instruction to the cloud server 3 via the Internet. Upon receiving the combination request instruction, the cloud server 3 performs logical matching based on the multiple unique identifiers in the instruction and generates an audio segment playback sequence. The cloud server 3 then distributes the audio segment playback sequence to the smart audio device 2.

[0078] After receiving the audio segment playback sequence from the cloud server 3, the network communication module 22 transmits the audio segment playback sequence to the control module 23. The control module 23 stores the received audio segment playback sequence in the local storage module 26. The control module 23 establishes, in the local storage module 26, a correspondence index between the combination of multiple unique identifiers and the audio segment playback sequence, so that the cached audio segment playback sequence can be quickly retrieved later based on the same combination of unique identifiers. The control module 23 also transmits the audio segment playback sequence to the audio output module 24 for playback.

[0079] With local storage and caching capabilities, when a user interacts again using the same combination of NFC cards 1, the smart audio device 2 can directly read and play the cached audio segments from the local storage module 26 without sending a request to the cloud server 3 each time. This improves the system's response speed and user experience while reducing reliance on network connectivity.
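
The cache-first lookup of paragraphs [0075] to [0079] can be sketched in Python as follows. This is a minimal sketch, assuming an in-memory dictionary in place of the correspondence index table in the local storage module 26; the class name `SequenceCache` and the `fetch_from_cloud` callback are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List


class SequenceCache:
    """Correspondence index: identifier combination -> cached sequence."""

    def __init__(self, fetch_from_cloud: Callable[[List[str]], List[str]]):
        # Stand-in for sending a combination request instruction to the cloud.
        self._fetch = fetch_from_cloud
        self._index: Dict[FrozenSet[str], List[str]] = {}

    def get(self, uids: Iterable[str]) -> List[str]:
        # The combination is the search condition; here order is ignored.
        key = frozenset(uids)
        if key in self._index:
            # Cache hit: play locally, no request is sent.
            return self._index[key]
        # Cache miss: request from the cloud, then store and index the result.
        sequence = self._fetch(sorted(key))
        self._index[key] = sequence
        return sequence
```

The key here is an unordered set, which matches the embodiment in which only the combination matters. In an embodiment where placement order also affects the storyline, an ordered tuple of identifiers would be the more appropriate key.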

[0080] In some embodiments, the NFC card reader module 21 also has a data writing function. The data writing function means that the NFC card reader module 21 can not only read the data stored in the NFC card 1, but also write data to the storage chip of the NFC card 1. The NFC card reader module 21 sends a write command and the data to be written to the storage chip of the NFC card 1 located within the sensing area via near-field wireless communication. After receiving the write command and the data to be written, the storage chip of the NFC card 1 stores the data to be written in the writable storage area of the storage chip.

[0081] The smart audio device 2 is configured to perform the following process. After successfully receiving the audio segment playback sequence from the cloud server 3 and completing the download of the audio segment playback sequence, the control module 23 generates status identification information.

[0082] Status identification information is a type of data information. It includes at least a status bit indicating that the card has been used, and the device identifier or user account information of the current smart audio device 2. The status bit indicating that the card has been used is a flag bit used to indicate that the NFC card 1 has been used. The value of this status bit can be set to a specific numerical value or character, such as the number 1 or the character Y.

[0083] The device identifier of the current smart audio device 2 refers to an identification code that uniquely identifies the current smart audio device 2. The device identifier can be a serial number assigned to the smart audio device 2 at the factory or a unique number assigned when the device is registered in the system. The user account information refers to the account identifier of the user currently logged into the smart audio device 2. The user account information can be the username, user ID, or other information that can uniquely identify the user's identity when registered in the system.

[0084] After generating the status identification information, the control module 23 controls the NFC card reader module 21 to write the status identification information into the storage chip of each NFC card 1 participating in the interaction. An NFC card 1 participating in the interaction is an NFC card 1 that was placed in the sensing area and had its unique identifier read during this interaction. The NFC card reader module 21 sequentially writes the status identification information into the storage chip of each NFC card 1 participating in the interaction. After the status identification information is written to the writable storage area of the storage chip of the NFC card 1, it is persistently stored in the NFC card 1 until it is overwritten or cleared by subsequent write operations.

[0085] By writing status information to NFC card 1, the usage status of NFC card 1 and the device or user information bound to NFC card 1 when it is first used can be recorded. The status information can be used for subsequent usage verification and access control, thereby realizing content copyright protection and user device binding functions.

[0086] In some embodiments, when the smart audio device 2 reads the NFC card 1 via the NFC card reader module 21, it is also configured to read the status identification information stored in each NFC card 1. The NFC card reader module 21 reads the status identification information stored in the storage chip of the NFC card 1 simultaneously or after reading the unique identifier of the NFC card 1. The NFC card reader module 21 transmits the read status identification information to the control module 23.

[0087] Control module 23 is configured to execute verification logic. Verification logic is a type of software program logic, stored in the memory of control module 23 and executed by its processor. The execution process of the verification logic is as follows.

[0088] The control module 23 first determines whether the NFC card 1 stores status identification information. If the NFC card 1 does not store status identification information, it indicates that the NFC card 1 is a new card being used for the first time, and the control module 23 allows the subsequent combination request command generation and audio playback operations to continue.

[0089] If the NFC card 1 stores status identifier information, the control module 23 extracts the device identifier or user account information from the status identifier information. The control module 23 determines whether the device identifier or user account information in the status identifier information matches the device identifier of the current smart audio device 2 or the currently logged-in user account. The specific matching process is as follows: The control module 23 obtains the device identifier of the current smart audio device 2, which is stored in the memory of the control module 23 or in other storage components of the smart audio device 2. The control module 23 obtains the currently logged-in user account, which is stored in the memory of the control module 23 or obtained through communication with the cloud server 3. The control module 23 compares the device identifier in the status identifier information with the device identifier of the current smart audio device 2, or compares the user account information in the status identifier information with the currently logged-in user account.

[0090] If the device identifier in the status identifier information is the same as the device identifier of the current smart audio device 2, or the user account information in the status identifier information is the same as the currently logged-in user account, then a successful match is determined. A successful match indicates that the device or user bound to the NFC card 1 during previous use belongs to the same user system as the current device or user, and the control module 23 allows the continued execution of subsequent combination request command generation and audio playback operations.

[0091] If the device identifier in the status identifier information is different from the device identifier of the current smart audio device 2, and the user account information in the status identifier information is also different from the currently logged-in user account, then a mismatch is determined. A mismatch indicates that the device or user previously bound to the NFC card 1 does not belong to the same user system as the current device or user, and the control module 23 determines that the current smart audio device 2 is an unbound device. When the smart audio device 2 is determined to be an unbound device, it performs prohibited operations. Prohibited operations include prohibiting the generation of combined request instructions or prohibiting the playback of audio segment sequences. Prohibiting the generation of combined request instructions means that the control module 23 does not perform the operation of combining multiple unique identifiers to generate a combined request instruction, thereby preventing requests for audio content from the cloud server 3. Prohibiting the playback of audio segment sequences means that even if the smart audio device 2 receives an audio segment playback sequence, the control module 23 does not allow the audio output module 24 to play it.
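
The verification logic of paragraphs [0088] to [0091] can be sketched in Python as follows. This is an illustrative sketch only: the `StatusInfo` record and the function name `card_allowed` are hypothetical, and the actual on-card data layout is not specified in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StatusInfo:
    """Hypothetical status identification information stored on a card."""
    used: bool = True                    # status bit: card has been used
    device_id: Optional[str] = None      # device bound at first use, if any
    user_account: Optional[str] = None   # account bound at first use, if any


def card_allowed(
    status: Optional[StatusInfo],
    current_device_id: str,
    current_account: Optional[str],
) -> bool:
    """Return True if request generation and playback may proceed."""
    if status is None:
        # No status information stored: new card used for the first time.
        return True
    device_match = (
        status.device_id is not None
        and status.device_id == current_device_id
    )
    account_match = (
        status.user_account is not None
        and status.user_account == current_account
    )
    # A match on either the device identifier or the account is sufficient;
    # a mismatch requires both comparisons to fail.
    return device_match or account_match
```

When `card_allowed` returns `False`, the device would treat itself as unbound and skip generating the combination request instruction or playing the received sequence.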

[0092] The verification logic prevents the NFC card 1 from being used on unbound devices, thereby protecting digital content copyright and preventing unauthorized dissemination of content. The verification logic can also be integrated with the account system of the parent terminal 4: one parent account can be bound to multiple family devices, all of which are considered part of the same user system, so that the NFC card 1 can be used seamlessly across devices within the same user system.

[0093] In some embodiments, when generating a combination request instruction, the smart audio device 2 is also configured to detect the placement order information of each NFC card 1 when it is read by the NFC card reader module 21. The placement order information refers to the order in which multiple NFC cards 1 are placed in the sensing area and read by the NFC card reader module 21.

[0094] The NFC card reader module 21 detects the placement order information as follows: The NFC card reader module 21 continuously scans and detects the sensing area. When the NFC card reader module 21 first detects an NFC card 1 entering the sensing area, it records the unique identifier of the NFC card 1 and the timestamp or sequence number of the NFC card 1 being detected. When the NFC card reader module 21 subsequently detects another NFC card 1 entering the sensing area, it records the unique identifier of that other NFC card 1 and the timestamp or sequence number of that other NFC card 1 being detected. This process continues, with the NFC card reader module 21 recording the unique identifier and corresponding timestamp or sequence number of all NFC cards 1 entering the sensing area. The NFC card reader module 21 then transmits the unique identifier and corresponding timestamp or sequence number of each NFC card 1 to the control module 23.

[0095] The control module 23 determines the placement order information of each NFC card 1 when it is read by the NFC card reader module 21 based on the timestamp or sequence number of each NFC card 1. The placement order information can be represented as an ordered sequence, where the order of the unique identifiers in the ordered sequence is the order in which the corresponding NFC card 1 is placed in the sensing area. For example, if the user places the character card Prince, the scene card Castle, and the prop card Sword in the sensing area in sequence, the placement order information is that the unique identifier corresponding to Prince is in the first position, the unique identifier corresponding to Castle is in the second position, and the unique identifier corresponding to Sword is in the third position.

[0096] When generating the combination request instruction, the control module 23 encapsulates the placement order information into the combination request instruction. The combination request instruction contains not only a set of multiple unique identifiers but also the placement order information. The smart audio device 2 sends the combination request instruction containing the placement order information to the cloud server 3 via the network communication module 22.

[0097] When determining the corresponding storyline logic, cloud server 3 is configured to match based on the content of multiple unique identifier sets and their placement order information simultaneously. The processing procedure of cloud server 3 is as follows: After receiving a combination request instruction, cloud server 3 extracts the content of multiple unique identifier sets and their placement order information from the instruction. The story logic rules in the story logic rule base define not only the mapping relationship between the unique identifier sets and the storyline logic, but also the mapping relationship between the placement order and the storyline logic. Based on the content of the multiple unique identifier sets and their placement order information, cloud server 3 matches the corresponding story logic rules in the story logic rule base to determine the corresponding storyline logic.

[0098] To illustrate with a specific example, the story logic rule base can be configured with the following rules: When the detected combination of NFC cards 1 is the character card Prince, the scene card Castle, and the prop card Sword, and the placement order is Prince, Castle, Sword, the corresponding storyline logic is the storyline in which the Prince arrives at the castle and obtains the sword. When the detected combination of NFC cards 1 is likewise the character card Prince, the scene card Castle, and the prop card Sword, but the placement order is Sword, Prince, Castle, the corresponding storyline logic is the storyline in which the Prince carries the sword into the castle. By considering both the content of the set of unique identifiers and the placement order information, the same combination of NFC cards 1 can trigger different storyline logic under different placement orders, thereby further enriching the variability of the story content and the fun of interaction.
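
The order-sensitive matching of paragraphs [0094] to [0098] can be sketched in Python as follows. This is a minimal sketch, assuming each card read is reported as a hypothetical `(identifier, timestamp)` pair and that the rule base is a table keyed by the ordered tuple of identifiers; the identifier and storyline names are made up for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical order-sensitive rules: ordered identifiers -> storyline.
ORDERED_RULES: Dict[Tuple[str, ...], str] = {
    ("uid-prince", "uid-castle", "uid-sword"):
        "prince_arrives_at_castle_and_obtains_sword",
    ("uid-sword", "uid-prince", "uid-castle"):
        "prince_carries_sword_into_castle",
}


def placement_order(reads: List[Tuple[str, float]]) -> Tuple[str, ...]:
    """Device side: derive the placement order from detection timestamps."""
    return tuple(uid for uid, _ in sorted(reads, key=lambda read: read[1]))


def match_storyline(reads: List[Tuple[str, float]]) -> Optional[str]:
    """Server side: match on both set content and placement order."""
    return ORDERED_RULES.get(placement_order(reads))
```

The same three cards therefore resolve to different storylines depending only on their timestamps, and an order with no configured rule returns `None`.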

[0099] In some embodiments, as shown in Figure 3, the smart audio device 2 also includes a status indicator module 27. The status indicator module 27 is a hardware module, comprising a status indicator light and an indicator light driver circuit. The status indicator light can be a multi-color LED, capable of emitting different colors to represent different device states. The indicator light driver circuit receives control signals from the control module 23 and drives the status indicator light to illuminate. The status indicator module 27 displays the current operating status of the smart audio device 2, including network connection status, audio playback status, and card recognition status.

[0100] The control module 23 is configured to control the status indicator module 27 to display the corresponding status indicator information according to the current working status of the smart audio device 2.

[0101] The online state means that the network communication module 22 has successfully connected to the Internet and can communicate normally with the cloud server 3. When the smart audio device 2 is online, the control module 23 controls the status indicator light to display a first color, which can be green, indicating that the network connection is normal.

[0102] The offline state means that the network communication module 22 is not connected to the Internet or cannot communicate normally with the cloud server 3. When the smart audio device 2 is offline, the control module 23 controls the status indicator light to display a second color, which can be red, indicating that the network connection is abnormal.

[0103] When the smart audio device 2 is downloading audio content, the control module 23 controls the status indicator light to flash a third color, which can be blue. A flashing blue light indicates that data download is in progress.

[0104] When the smart audio device 2 is playing audio content, the control module 23 controls the status indicator light to display the third color steadily lit. Through the status display function of the status indicator module 27, the user can intuitively understand the current working status of the smart audio device 2.
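
The state-to-indicator mapping of paragraphs [0101] to [0104] can be summarized in a short Python sketch. The enum `DeviceState`, the table `INDICATOR` and the function `indicator_for` are hypothetical names, and the colors are the examples given in the text (green, red, blue) rather than required values.

```python
from enum import Enum
from typing import Dict, Tuple


class DeviceState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    DOWNLOADING = "downloading"
    PLAYING = "playing"


# State -> (color, flashing). Example colors from the description.
INDICATOR: Dict[DeviceState, Tuple[str, bool]] = {
    DeviceState.ONLINE: ("green", False),      # first color, steady
    DeviceState.OFFLINE: ("red", False),       # second color, steady
    DeviceState.DOWNLOADING: ("blue", True),   # third color, flashing
    DeviceState.PLAYING: ("blue", False),      # third color, steady
}


def indicator_for(state: DeviceState) -> Tuple[str, bool]:
    """Return the (color, flashing) pair the driver circuit should show."""
    return INDICATOR[state]
```

How the device prioritizes states that hold at the same time (for example, online while playing) is not specified in the text, so this sketch takes a single current state as input.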

[0105] In some embodiments, the sensing area of the smart audio device 2 supports multiple placement methods for the NFC card 1, including surface sensing, shape-matching insertion, and card slot insertion.

[0106] Surface sensing refers to the user placing the NFC card 1 directly on the sensing area surface of the smart audio device 2. When the NFC card 1 is in contact with or near the sensing area surface, the NFC card reader module 21 detects and reads the unique identifier stored in the NFC card 1 via near-field wireless communication. The advantage of surface sensing is its simplicity and intuitiveness; the user only needs to place the NFC card 1 on the sensing area surface to trigger audio playback. Surface sensing is suitable for younger children.

[0107] The shape-matching insertion method refers to the smart audio device 2 having a groove structure in its sensing area that matches the outline of the NFC card 1. The shape of the groove structure corresponds to the outline of the NFC card 1, and the user inserts the NFC card 1 into the groove structure according to its outline. The advantage of the shape-matching insertion method is that it can enhance children's hands-on ability and cognitive experience. Children need to recognize the shape of the NFC card 1 and correctly insert it into the corresponding groove structure to trigger audio playback, thus providing an educational function.

[0108] The card slot insertion method refers to the smart audio device 2 having a card slot structure, which is a slot-shaped space for accommodating the NFC card 1. The user inserts the NFC card 1 into the card slot structure. The smart audio device 2 may also have an automatic ejection mechanism, a mechanical structure that automatically ejects the NFC card 1 from the card slot structure when the user presses it or the control module 23 issues an ejection command. The advantage of the card slot insertion method is that it keeps the NFC card 1 fixed on the smart audio device 2, preventing it from shifting or falling off during playback. This method is suitable for scenarios requiring long-term audio playback. The automatic ejection mechanism automatically ejects the NFC card 1 after audio playback ends or when the user needs to replace it, improving convenience and user experience.

[0109] In some embodiments, the system is further configured with a parent terminal 4. The parent terminal 4 can be an application running on a mobile terminal, such as a smartphone or tablet. The parent terminal 4 communicates with a cloud server 3 via the internet and is used to remotely manage and monitor the smart audio device 2.

[0110] The parent terminal 4 includes a device binding module, a network configuration module, a playback history viewing module, a content management module, and a device status monitoring module.

[0111] The device binding module is a software module in the parent terminal 4 used to bind the smart audio device 2 to the parent's account. The working process of the device binding module is as follows: Parents register and log in to their parent account on the parent terminal 4. The parent account is the parent's unique identifier in the system. Parents scan the QR code on the smart audio device 2 or enter the device identifier of the smart audio device 2 through the device binding module. The device binding module associates and binds the parent account with the device identifier of the smart audio device 2. The device binding module sends the binding relationship information to the cloud server 3, and the cloud server 3 stores the binding relationship between the parent account and the device identifier of the smart audio device 2. One parent account can bind multiple smart audio devices 2. Multiple smart audio devices 2 bound to the same parent account are considered to be in the same user system, and the NFC card 1 can be used seamlessly among smart audio devices 2 within the same user system.

[0112] The network configuration module is a software module in the parent terminal 4 used to configure the network connection parameters of the smart audio device 2. The network configuration module works as follows: The parent enters the network name and password of the home wireless network through the network configuration module. The network configuration module sends the network name and password to the smart audio device 2 via Bluetooth or acoustic communication. After receiving the network name and password, the control module 23 controls the network communication module 22 to connect to the home wireless network using the network name and password. After the network communication module 22 successfully connects to the home wireless network, the smart audio device 2 can connect to the Internet and communicate with the cloud server 3 through the home wireless network.

[0113] The playback history viewing module is a software module in the parent terminal 4 used to view the playback history of the smart audio device 2. Playback history refers to the historical information of audio content played by the smart audio device 2, including information such as story name, number of plays, playback duration, and playback time.

[0114] During audio playback, the control module 23 records playback information and stores it in the local storage module 26. The control module 23 sends the playback information to the cloud server 3 via the network communication module 22, either periodically or after playback ends. The cloud server 3 receives and stores the playback information and associates it with the corresponding parent account. Parents can send a playback history query request to the cloud server 3 through the playback history viewing module on the parent terminal 4. The cloud server 3 returns the playback history associated with the parent account in response to the query request. The playback history viewing module receives the playback history returned by the cloud server 3 and displays it on the screen of the parent's mobile terminal, allowing parents to understand their child's use of the smart audio device 2.

[0115] The content management module is a software module in the parent terminal 4 used to manage the audio content stored in the smart audio device 2. The content management module's functions include viewing download records and deleting local content. Viewing download records means that parents can use the content management module to view the historical records of audio content downloaded by the smart audio device 2 from the cloud server 3. Download records include information such as the name of the downloaded audio content, download time, and audio file size. Deleting local content means that parents can remotely delete the audio content stored in the local storage module 26 of the smart audio device 2 through the content management module. Parents send a deletion command to the cloud server 3 through the content management module. The cloud server 3 forwards the deletion command to the corresponding smart audio device 2, and the control module 23 deletes the corresponding audio content from the local storage module 26 after receiving the deletion command.

[0116] The device status monitoring module is a software module in the parent terminal 4 used to monitor the operating status of the smart audio device 2. This module can display real-time information such as the online status, battery level, and current playback status of the smart audio device 2. The control module 23 periodically sends the device status information to the cloud server 3 via the network communication module 22. The cloud server 3 stores the device status information and returns it when requested by the parent terminal 4. Parents can remotely monitor the operating status of the smart audio device 2 through the device status monitoring module.

[0117] In some embodiments, the cloud server 3 is also configured to perform a content authorization verification function. The content authorization verification function refers to the function of the cloud server 3, upon receiving a combined request instruction from the smart audio device 2, to verify the permissions of the unique identifier contained in the combined request instruction.

[0118] The content authorization verification process of cloud server 3 is as follows: Cloud server 3 receives a combined request command sent by smart audio device 2. Cloud server 3 extracts multiple unique identifiers and the device identifier or user account information of smart audio device 2 from the combined request command. Cloud server 3 queries the authorization status of the corresponding NFC card 1 based on the unique identifier. The authorization status includes whether NFC card 1 has been activated, whether NFC card 1 is within its validity period, and whether NFC card 1 has been bound to a specific user account. Cloud server 3 determines whether the device identifier or user account information of smart audio device 2 has the permission to use the audio content corresponding to NFC card 1. If cloud server 3 determines that smart audio device 2 has the permission, cloud server 3 performs a logical matching operation and generates an audio segment playback sequence, which is then sent to smart audio device 2. If cloud server 3 determines that smart audio device 2 does not have the permission, cloud server 3 returns an authorization failure message to smart audio device 2. After receiving the authorization failure message, control module 23 plays a prompt tone through audio output module 24 or displays a prompt message through status indicator module 27 to inform the user that the authorization verification has failed.
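The verification steps above can be sketched as a per-card check followed by a check over the whole request. This is an illustrative sketch: the record fields, the function names, and the choice to treat an unbound card as usable by any account are assumptions made for the example, not requirements stated in this application.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class CardAuthorization:
    activated: bool
    valid_until: date
    bound_account: Optional[str]  # None means not bound to any account


def is_authorized(record: Optional[CardAuthorization], account_id: str, today: date) -> bool:
    # Unknown or inactive cards fail immediately.
    if record is None or not record.activated:
        return False
    # Cards past their validity period fail.
    if today > record.valid_until:
        return False
    # Assumption: an unbound card may be used by any account.
    return record.bound_account in (None, account_id)


def failed_cards(card_ids: List[str], account_id: str,
                 records: Dict[str, CardAuthorization], today: date) -> List[str]:
    """Return the identifiers that fail verification; an empty list means authorized."""
    return [cid for cid in card_ids
            if not is_authorized(records.get(cid), account_id, today)]
```

In this sketch the server would proceed to logical matching only when `failed_cards` returns an empty list, and would otherwise return the authorization failure message.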

[0119] In some embodiments, cloud server 3 is also configured to support integration with third-party content platforms. A third-party content platform refers to an external service platform that provides audio content resources; such a platform may be a professional children's story content provider or a digital content publisher. Cloud server 3 interacts with the third-party content platform via an application programming interface (API). When the audio segment library of cloud server 3 does not contain audio content corresponding to the combination request instruction, cloud server 3 can obtain the corresponding audio content from the third-party content platform through the API. The audio content obtained by cloud server 3 from the third-party content platform can be stored in the audio segment library of cloud server 3 for later use, or it can be directly forwarded to smart audio device 2 for playback. By integrating with third-party content platforms, cloud server 3 can provide users with a richer and more diverse range of audio content resources.
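The fallback to a third-party content platform resembles a fetch-on-miss lookup. The sketch below assumes a hypothetical `fetch_third_party` callable standing in for the platform's API, which this application does not specify; the `store` flag models the choice between keeping the content for later use and forwarding it directly.

```python
from typing import Callable, Dict, Optional


def get_audio(key: str,
              library: Dict[str, bytes],
              fetch_third_party: Callable[[str], Optional[bytes]],
              store: bool = True) -> Optional[bytes]:
    """Return audio for `key`, consulting the third-party platform on a miss."""
    if key in library:
        return library[key]
    audio = fetch_third_party(key)  # hypothetical API call
    if audio is not None and store:
        library[key] = audio        # keep for later use
    return audio
```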

[0120] In some embodiments, the smart audio device 2 also supports a variety of playback control operations during the playback of audio segment sequences, including pause operation, resume playback operation, card switching operation, and playback mode switching operation.

[0121] The pause operation refers to the user triggering the pause of audio playback by pressing the pause button on the smart audio device 2 or by removing the NFC card 1 from the sensing area. After detecting the pause command, the control module 23 stops transmitting audio data to the audio output module 24, the audio output module 24 stops outputting sound, and the control module 23 records the current playback position for subsequent continuation of playback.

[0122] The resume playback operation refers to the user resuming audio playback by pressing the play button on the smart audio device 2 or by placing the NFC card 1 back into the sensing area. After detecting the resume playback command, the control module 23 continues to transmit audio data to the audio output module 24 from the recorded playback position, and the audio output module 24 continues to output sound.

[0123] The card switching operation refers to the user changing the NFC card 1 in the sensing area during playback. After the NFC card reader module 21 detects the change in the NFC card 1 in the sensing area, it reads the unique identifier of the newly placed NFC card 1 and transmits it to the control module 23. The control module 23 stops the playback of the current audio content and, based on the new unique identifier, executes the generation and sending of a combined request command or retrieves the corresponding cached audio segment from the local storage module 26 for playback.

[0124] Playback mode switching refers to the user switching audio playback modes using the mode switching button on the smart audio device 2. Audio playback modes include sequential playback mode and single-track loop mode. Sequential playback mode means the smart audio device 2 plays each audio segment sequentially according to the playback order in the audio segment playback sequence, stopping playback or entering standby mode after all audio segments have been played. Single-track loop mode means the smart audio device 2 automatically starts playing the audio content of the audio segment playback sequence from the beginning again after completing the playback of the audio segment playback sequence.
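The pause, resume, and playback mode behavior described above can be summarized as a small state machine. This sketch is a simplification: it tracks the playback position only at segment granularity, and the names `PlaybackController` and `Mode` are illustrative assumptions.

```python
from enum import Enum


class Mode(Enum):
    SEQUENTIAL = "sequential"
    LOOP = "loop"


class PlaybackController:
    def __init__(self, segment_count: int, mode: Mode = Mode.SEQUENTIAL):
        self.segment_count = segment_count
        self.mode = mode
        self.index = 0         # segment currently playing (the recorded position)
        self.paused = False
        self.finished = False

    def pause(self) -> None:
        self.paused = True     # index is retained for later resumption

    def resume(self) -> None:
        self.paused = False

    def on_segment_finished(self) -> None:
        """Advance when a segment ends; ignored while paused or finished."""
        if self.paused or self.finished:
            return
        self.index += 1
        if self.index >= self.segment_count:
            if self.mode is Mode.LOOP:
                self.index = 0        # restart the sequence from the beginning
            else:
                self.finished = True  # stop or enter standby
```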

[0125] In some embodiments, the smart audio device 2 supports offline playback, which means that the smart audio device 2 can play audio content cached in the local storage module 26 without being connected to the Internet.

[0126] The offline playback process of the smart audio device 2 is as follows: After the NFC card reader module 21 detects and reads the unique identifier of the NFC card 1 placed in the sensing area, the control module 23 first checks the network connection status of the network communication module 22. If the network communication module 22 is offline, the control module 23 searches the local storage module 26 for a cached audio segment corresponding to the combination of the unique identifier. If a cached audio segment exists in the local storage module 26, the control module 23 directly reads the cached audio segment from the local storage module 26 and transmits it to the audio output module 24 for playback. If no cached audio segment exists in the local storage module 26, the control module 23 plays a prompt tone through the audio output module 24 or displays a prompt message through the status indicator module 27 to inform the user that it is currently offline and has no cached content locally. The offline playback function enables the smart audio device 2 to be used normally in environments without network access, such as outdoors, improving the ease of use of the smart audio device 2.
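The offline decision flow can be sketched as follows. The cache key scheme (sorted identifiers joined by a separator) and the `request_cloud` callable are assumptions made for illustration; this application does not state how cached segments are indexed.

```python
def cache_key(card_ids):
    # Assumed scheme: an order-independent key for a combination of identifiers.
    return "|".join(sorted(card_ids))


def handle_cards(card_ids, online, cache, request_cloud):
    """Return (action, payload) for the cards detected in the sensing area."""
    if online:
        # Online: request the playback sequence from the cloud server.
        return ("play", request_cloud(card_ids))
    cached = cache.get(cache_key(card_ids))
    if cached is not None:
        # Offline with cached content: play directly from local storage.
        return ("play", cached)
    # Offline without cached content: prompt the user.
    return ("prompt", "offline and no cached content")
```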

[0127] In some embodiments, the smart audio device 2 is also configured to automatically power off or enter standby mode after audio playback ends. The control module 23 starts a timer after detecting that all audio segments in the audio segment playback sequence have finished playing. If the timer reaches a preset idle time and no new NFC card 1 or user operation has been detected during that time, the control module 23 controls the smart audio device 2 to enter standby mode or perform an automatic power-off operation. Standby mode means the smart audio device 2 enters a low-power state. In standby mode, the NFC card reader module 21 keeps its detection function active so that it can wake up the smart audio device 2 when a new NFC card 1 is detected. Automatic power-off means the control module 23 cuts off the main power supply to the smart audio device 2, and the smart audio device 2 completely stops working. The automatic power-off and standby functions reduce the power consumption of the smart audio device 2 and extend its battery life.
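The idle timer can be sketched with explicit timestamps so that the logic is independent of any particular clock. The class name and the idea of passing `now` as a number of seconds are illustrative assumptions.

```python
class IdleTimer:
    def __init__(self, idle_limit_s: float):
        self.idle_limit_s = idle_limit_s
        self.started_at = None  # None means the timer is not running

    def start(self, now: float) -> None:
        # Called when the last audio segment has finished playing.
        self.started_at = now

    def on_activity(self) -> None:
        # A new card or a user operation cancels the pending transition.
        self.started_at = None

    def should_sleep(self, now: float) -> bool:
        # True once the preset idle time has elapsed without activity.
        return self.started_at is not None and now - self.started_at >= self.idle_limit_s
```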

[0128] Through the system architecture and functions described in this embodiment, the intelligent audio generation system supporting multi-entity interaction achieves the following technical effects:

[0129] First, the method of triggering audio playback via NFC card 1 is simple and intuitive, suitable for children aged three to six to use independently without touching the electronic screen.

[0130] Second, by combining cloud content downloads with local caching, the flexibility of content updates is improved, and both online and offline playback modes are supported to adapt to different usage scenarios such as home and outdoor.

[0131] Third, the parent terminal 4 enables remote management and monitoring of the smart audio device 2, allowing parents to stay informed about their child's use of the smart audio device 2 at any time.

[0132] Fourth, the NFC card 1 status identifier writing and verification mechanism realizes content copyright protection and user device binding functions, effectively preventing the disorderly dissemination of digital content.

[0133] Fifth, the multi-form NFC card interaction enhances children's hands-on skills and cognitive experience, increasing the product's interactive fun.

[0134] In some application scenarios, children may place multiple NFC cards vertically together in the sensing area, for example, placing a "blanket" card on top of a "kitten" card, or a "knight" card on top of a "horse" card.

[0135] Existing NFC card reading technology can typically only read the unique identifier list of all cards within the sensing area through anti-collision mechanisms, but it cannot distinguish the physical stacking order of the cards (i.e., Z-axis order). This results in the system being unable to distinguish between "a knight riding a horse" and "a horse riding a knight" (which can be excluded if it doesn't make logical sense, but is difficult to distinguish in abstract scenarios), or between "a blanket covering a kitten" (the kitten is invisible) and "a kitten on the blanket" (the kitten is visible), limiting the logical depth and realism of the interaction.

[0136] To address the spatial logic identification problem in multi-card stacking, this embodiment also provides a vertical stacking order identification scheme based on radio frequency signal feature analysis.

[0137] In this embodiment, the NFC card reader module 21 is configured not only to read unique identifiers but also to acquire radio frequency signal features. The hardware of the NFC card reader module 21 includes a high-sensitivity radio frequency front-end circuit, which can detect the load modulation amplitude or received signal strength indication (RSSI) of each NFC card 1 during communication.

[0138] The control module 23 integrates a spatial logic analysis module, which is a software module. This module is configured to execute a stacking order discrimination algorithm. The specific implementation principle is as follows: When multiple NFC cards 1 are stacked vertically, due to the coupling effect of electromagnetic induction and the different vertical distances between the antenna and the reader coil, the load modulation signal strength generated by the magnetic field emitted by the reader will differ among NFC cards 1 at different levels. Typically, the bottom card, closest to the reader's sensing area, has the highest coupling coefficient and the strongest signal characteristics; while the cards stacked on top, due to increased distance and the shielding or interference effects of the lower cards, will exhibit specific attenuation or distortion in their signal characteristics.

[0139] Smart audio device 2 is configured to perform the following processing procedure:

[0140] The NFC card reader module 21 activates and reads each NFC card 1 in turn within the anti-collision loop. While reading the unique identifier of each NFC card 1, the radio frequency front-end circuit samples the amplitude or energy value of the card's response signal in real time and converts the analog value into digital signal strength data.

[0141] The control module 23 collects the unique identifiers and corresponding signal strength data of all detected cards. The spatial logic analysis module sorts and performs differential analysis on the multiple signal strength data. Based on a preset signal attenuation model, the spatial logic analysis module determines the NFC card 1 with the largest signal strength data as the bottom layer card and the NFC card 1 with the smaller signal strength data as the top layer card, thereby constructing the vertical stacking order information of the NFC cards 1.

[0142] Control module 23 sends a combined request command containing vertical stacking order information to cloud server 3. Spatial relationship logic rules are added to the story logic rule base of cloud server 3.
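The stacking order determination described in the preceding steps reduces to sorting the cards by signal strength, strongest first. The sketch below adds a `min_gap` threshold as a stand-in for the preset signal attenuation model; that threshold, the function name, and the sample values in the usage note are assumptions for illustration only.

```python
from typing import List, Optional, Tuple


def stacking_order(readings: List[Tuple[str, float]],
                   min_gap: float = 0.0) -> Optional[List[str]]:
    """Return unique identifiers ordered bottom to top, or None if ambiguous.

    readings: (unique_id, signal_strength) pairs from the anti-collision loop.
    min_gap:  assumed minimum strength difference needed to separate two layers.
    """
    ordered = sorted(readings, key=lambda r: r[1], reverse=True)
    for (_, stronger), (_, weaker) in zip(ordered, ordered[1:]):
        if stronger - weaker < min_gap:
            return None  # layers cannot be told apart reliably
    return [uid for uid, _ in ordered]
```

For example, `stacking_order([("blanket", 0.41), ("kitten", 0.87)])` places the kitten card at the bottom and the blanket card on top, which is the vertical stacking order information that would accompany the combined request.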

[0143] For example, the spatial relationship logic rules are defined as follows: when the upper-level card is a concealing prop (such as a blanket or bushes) and the lower-level card is a character (such as a kitten), a hidden plot is triggered (such as searching for a hiding kitten); when the upper-level card is a vehicle or mount (such as a horse) and the lower-level card is a character (such as a knight), an incorrect logic prompt is triggered (because it is usually a person riding a horse, not a horse riding a person), or a comical special plot is triggered. In this way, the system not only knows which cards are available, but also understands the spatial physical relationships between the cards, greatly enhancing the physical realism of the interaction.
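The two example rules can be expressed as a lookup on the categories of the upper and lower cards. The category table, the plot labels, and the default branch are hypothetical; a real rule base would presumably be data-driven rather than hard-coded.

```python
# Hypothetical mapping from story element to category.
CATEGORY = {
    "blanket": "concealer",
    "bushes": "concealer",
    "horse": "mount",
    "kitten": "character",
    "knight": "character",
}


def spatial_plot(upper: str, lower: str) -> str:
    """Pick a plot branch from the vertical relationship between two cards."""
    up, low = CATEGORY.get(upper), CATEGORY.get(lower)
    if up == "concealer" and low == "character":
        return "hidden_plot"       # e.g. searching for a hiding kitten
    if up == "mount" and low == "character":
        return "illogical_prompt"  # e.g. a horse riding a knight
    return "default_plot"          # assumed fallback for all other pairs
```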

[0144] In related technologies, the unique identifier of NFC card 1 is usually fixed, or simply written with a "used" status. However, in continuous story interaction, the state of physical objects should change with the plot. For example, after a "battle" scene, the "sword" prop may be "damaged," or the "puppy" character may be "sick." If the system still identifies the card as an intact "sword" or a healthy "puppy" the next time the user uses it, it will cause a logical disconnect between the physical entity and the virtual storyline, reducing the user's immersion and the card's lifespan.

[0145] To address the disconnect between the fixed attributes of physical cards and the evolution of the storyline, this embodiment also provides a scheme for rewriting and evolving NFC dynamic attributes based on the storyline outcome.

[0146] In this embodiment, the storage chip of NFC card 1 is logically divided into a static identification area and a dynamic attribute area. The static identification area stores a unique identifier fixed at the factory, used to identify the card's basic identity (e.g., this is a sword). The dynamic attribute area is a read-write storage area used to store the card's real-time status parameters in the current story universe (e.g., durability, health, experience level, enchantment status, etc.).

[0147] The cloud server 3 integrates a storyline resolution engine, which is a software module. This engine is configured to calculate the attribute changes of each interactive story element based on the currently generated storyline while generating an audio segment playback sequence. For example, in a "Hero vs. Dragon" storyline, if the user chooses an aggressive attack strategy, the engine determines that the "sword" item's durability reaches zero, becoming a "broken sword"; simultaneously, it determines that the "hero" character's experience points increase, leveling them up to "veteran hero."

[0148] The cloud server 3 is configured to package and send the generated audio segment playback sequence and corresponding attribute update instructions to the smart audio device 2. The attribute update instructions include the unique identifier of the target card and new status data to be written to the dynamic attribute area (such as the broken sword status code and the new level code).

[0149] The smart audio device 2 is configured to control the NFC card reader module 21 to perform a write operation upon receiving an attribute update command, or during the settlement phase after audio playback ends. The NFC card reader module 21 sends a write command to the corresponding NFC card 1 placed in the sensing area, writing the new status data into the card's dynamic attribute area.

[0150] The next time the user uses the card, the NFC card reader module 21 will simultaneously read the data from the static identification area and the dynamic attribute area. If the control module 23 reads that the dynamic attribute area of ​​the "Sword" card is recorded as "Broken Sword," the generated combined request command will include the "Broken Sword" status feature. After the cloud server 3 receives the request containing the "Broken Sword" feature, the story logic rule base will no longer match the regular battle victory plot, but instead match the branch plot where the battle is difficult due to weapon damage or the need to repair the weapon at the blacksmith first.

[0151] Through this scheme, the NFC card 1 is no longer a simple trigger, but a digital life form with memory and growth attributes. The card's state will undergo irreversible or reversible physical changes (at the data level) as the user uses it and the story unfolds, making every interaction have a substantial impact on the card's future.

[0152] This embodiment also provides a smart audio generation method that supports multi-entity interaction, applied to the smart audio device in the system described in the foregoing embodiment. As shown in Figure 4, the method may include the following steps:

[0153] S401: Detect multiple NFC cards placed in the sensing area via an NFC card reader module.

[0154] An NFC card reader module is a hardware module in smart audio devices used to implement near-field wireless communication functionality. The NFC card reader module includes an NFC card reader chip and an NFC antenna. NFC is an abbreviation for Near Field Communication, a short-range, high-frequency wireless communication technology that allows contactless point-to-point data transfer between electronic devices over short distances. The sensing area refers to the physical area on the smart audio device used to place the NFC card. The sensing area is located within the effective sensing range of the NFC antenna, typically within ten centimeters. In this embodiment, the sensing area is designed to be large enough to accommodate at least two NFC cards at once, thus supporting the simultaneous placement of multiple NFC cards. An NFC card is a physical card with an embedded NFC chip. The NFC chip includes a radio frequency antenna and a memory chip, and the radio frequency antenna enables near-field wireless communication with the NFC card reader module.

[0155] The control module of the smart audio device controls the NFC card reader module to enter the detection state. The NFC card reader chip of the NFC card reader module generates a radio frequency (RF) signal, and the NFC antenna of the NFC card reader module transmits the RF signal to the sensing area. When an NFC card is placed in the sensing area, the NFC chip inside the NFC card receives the RF signal and is activated. The NFC chip then returns a response signal to the NFC card reader module through the RF antenna. After receiving the response signal, the NFC card reader chip analyzes it, thereby detecting the presence of an NFC card in the sensing area. When multiple NFC cards are placed in the sensing area, the NFC card reader module can detect the presence of multiple NFC cards simultaneously, or it can detect them serially within a very short time interval using an anti-collision algorithm. The anti-collision algorithm is a technique used to distinguish and identify each NFC card when multiple NFC cards are simultaneously in the sensing area. The NFC card reader module transmits the detection results of multiple NFC cards to the control module of the smart audio device.

[0156] S402: Read the unique identifiers stored in the multiple NFC cards via the NFC card reader module.

[0157] A unique identifier is data information that uniquely identifies each NFC card. This unique identifier can be a serial number embedded in the NFC chip at the factory, or a content identifier written later into the NFC card's storage chip. The unique identifier is stored in the internal storage chip of the NFC card and is unique; different NFC cards have different unique identifiers.

[0158] After the NFC card reader module detects the presence of multiple NFC cards in the sensing area, the control module of the smart audio device controls the NFC card reader module to perform a reading operation. The NFC card reader chip of the NFC card reader module sends a reading command to the NFC cards in the sensing area via the NFC antenna. Upon receiving the reading command, the NFC chip inside the NFC card reads the unique identifier from the storage chip and transmits the unique identifier to the NFC card reader module via the radio frequency antenna. The NFC card reader module's NFC antenna receives the radio frequency signal transmitted by the NFC card, and the NFC card reader chip demodulates and decodes the radio frequency signal to obtain the unique identifier stored in the NFC card. When multiple NFC cards are placed in the sensing area, the NFC card reader module communicates with each NFC card sequentially or in parallel, reading the unique identifier stored in each NFC card. The NFC card reader module transmits the multiple unique identifiers to the control module of the smart audio device. The control module of the smart audio device receives and temporarily stores the multiple unique identifiers for use in subsequent steps.

[0159] S403: Combine the read unique identifiers to generate a combined request instruction.

[0160] A combined request instruction is a data packet used to request audio content corresponding to a combination of multiple unique identifiers from a cloud server. The generation of the combined request instruction is executed by the control module of the smart audio device. This control module is a combination of software and hardware; its hardware includes a microcontroller or processor chip, while its software includes firmware or application programs stored in the internal memory of the microcontroller or processor chip.

[0161] After receiving multiple unique identifiers transmitted by the NFC card reader module, the control module of the smart audio device performs a combination operation. This combination operation involves integrating the multiple unique identifiers into a single data set according to a predetermined data format. The control module encapsulates this data set into a combined request instruction based on a predetermined data encapsulation protocol. The combined request instruction contains the set of unique identifiers and may also include additional information such as the smart audio device's identifier, the currently logged-in user account information, and a request timestamp. The combined request instruction is encoded using a predetermined data format, which can be JSON, XML, or another structured data format. After generating the combined request instruction, the control module temporarily stores it in its internal memory for later transmission to the cloud server. By combining multiple unique identifiers into one combined request instruction, the smart audio device can request the audio content corresponding to a combination of multiple NFC cards from the cloud server in a single request, thus enabling multi-entity interaction.
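As one possible illustration of the JSON option, the combination operation could look like the sketch below. The field names (`card_ids`, `device_id`, `account_id`, `timestamp`) and the choice to deduplicate and sort the identifiers are assumptions; this application does not define the actual data format.

```python
import json
import time


def build_combined_request(card_ids, device_id, account_id=None, timestamp=None):
    """Encode a set of unique identifiers and optional metadata as JSON."""
    payload = {
        # Assumption: duplicates are dropped and order is normalized.
        "card_ids": sorted(set(card_ids)),
        "device_id": device_id,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    if account_id is not None:
        payload["account_id"] = account_id
    return json.dumps(payload, sort_keys=True)
```

Normalizing the order here treats the combination as a set; an embodiment that transmits stacking order information would need to carry that order in a separate field.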

[0162] S404: Send the combined request instruction to the cloud server through the network communication module, requesting the cloud server to perform logical matching based on the combined request instruction.

[0163] The network communication module is a hardware module in a smart audio device used to implement network communication functions. It includes a wireless communication chip and a wireless communication antenna. The wireless communication chip handles network communication protocols and data transmission and reception, supporting either Wi-Fi or cellular communication protocols. The wireless communication antenna transmits and receives wireless signals. A cloud server is a computing device or cluster of computing devices deployed on the internet. It possesses data storage and processing capabilities, storing a story logic rule base and an audio segment library. Logic matching refers to the process by which the cloud server, based on multiple unique identifiers contained in a combined request instruction, searches the story logic rule base for story logic rules that match the combination of these unique identifiers, thereby determining the corresponding storyline logic.

[0164] The control module of the smart audio device transmits the generated combined request command to the network communication module. The wireless communication chip of the network communication module encapsulates the combined request command into a data frame conforming to the network communication protocol. The wireless communication chip of the network communication module then sends the data frame to a wireless router or base station via a wireless communication antenna. The wireless router or base station forwards the data frame to a cloud server via the internet. After receiving the data frame through its network interface, the cloud server parses the data frame and extracts the combined request command. Based on the multiple unique identifiers contained in the combined request command, the cloud server performs a logical matching operation, searches the story logic rule base for a story logic rule that matches the combination of multiple unique identifiers, determines the corresponding storyline logic, and retrieves the corresponding audio segment from the audio segment library based on the storyline logic to generate an audio segment playback sequence.
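The server-side logical matching can be sketched as a lookup keyed by the set of unique identifiers, followed by retrieval of each segment named by the matched storyline. The rule entries, segment names, and file addresses below are invented purely for illustration.

```python
# Hypothetical story logic rule base: identifier combination -> storyline.
RULE_BASE = {
    frozenset({"knight", "horse"}): ["opening", "knight_rides_out", "ending"],
    frozenset({"kitten", "blanket"}): ["opening", "kitten_naps", "ending"],
}

# Hypothetical audio segment library: segment name -> address.
SEGMENT_LIBRARY = {
    "opening": "audio/opening.mp3",
    "knight_rides_out": "audio/knight_rides_out.mp3",
    "kitten_naps": "audio/kitten_naps.mp3",
    "ending": "audio/ending.mp3",
}


def build_playback_sequence(card_ids, rule_base=RULE_BASE, library=SEGMENT_LIBRARY):
    """Match the identifier combination and return an ordered segment list."""
    storyline = rule_base.get(frozenset(card_ids))
    if storyline is None:
        return None  # no rule matches this combination
    return [
        {"order": i, "segment": name, "address": library[name]}
        for i, name in enumerate(storyline, start=1)
    ]
```

Using a `frozenset` as the key makes the match independent of the order in which the cards were read.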

[0165] S405: Receive the audio segment playback sequence returned by the cloud server through the network communication module. The audio segment playback sequence is generated by the cloud server in response to the combined request instruction.

[0166] An audio segment playback sequence is a data set generated by the cloud server based on the combined request instruction. It contains multiple audio segments and their playback order information. Audio segments are story segment audio files retrieved from the cloud server's audio segment library. These include opening audio, character dialogue audio, scene description audio, event description audio, and ending audio. Playback order information indicates the order in which the audio segments are played. Playing these segments in order forms a complete and coherent story. An audio segment playback sequence can contain the audio data of the audio segments themselves, or it can contain the download address of each audio segment together with the playback order information.

[0167] After generating the audio segment playback sequence, the cloud server sends the sequence to the internet via a network interface. The sequence is then transmitted to a wireless router or base station, which in turn transmits it wirelessly to the smart audio device. The smart audio device's network communication module receives the wireless signal via its antenna, and its wireless communication chip demodulates and decodes the signal to extract the audio segment playback sequence. The network communication module then transmits the sequence to the smart audio device's control module. If the sequence contains download addresses for multiple audio segments, the control module instructs the network communication module to download the corresponding audio data from the cloud server based on the download addresses. The control module receives and temporarily stores the audio segment playback sequence and its corresponding audio data for later parsing and playback.

[0168] S406: Parse the audio segment playback sequence and play the audio content it contains in order through the audio output module.

[0169] Parsing refers to the process by which the control module of the smart audio device analyzes and processes the data of the audio segment playback sequence. The audio output module is the hardware module in the smart audio device used to output sound. It includes an audio decoding chip, a power amplifier, and a speaker. The audio decoding chip decodes digital audio data into analog audio signals, the power amplifier amplifies the analog audio signals, and the speaker converts the amplified analog audio signals into sound waves for output. Sequential playback refers to playing each audio segment in the order indicated in the audio segment playback sequence.

[0170] The control module of the smart audio device parses the audio segment playback sequence, extracting audio data and playback order information for multiple audio segments from the sequence. Based on this playback order information, the control module determines the playback order of the audio segments and processes them sequentially.

[0171] The control module first acquires the audio data of the first audio segment in the playback sequence and transmits this audio data to the audio output module. The audio output module's audio decoding chip receives and decodes the audio data, converting the digital audio data into an analog audio signal. The audio output module's power amplifier receives and amplifies the analog audio signal. The audio output module's speaker receives the amplified analog audio signal, converts it into sound waves, and outputs it to the external space, allowing the user to hear the audio content.

[0172] After the first audio segment finishes playing, the control module acquires the audio data of the second audio segment in the playback sequence and transmits it to the audio output module for playback. This process continues, with the control module transmitting the audio data of each audio segment sequentially to the audio output module for playback, until all audio segments in the playback sequence have finished playing.
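The sequential playback loop of the control module can be sketched as follows. This is a simplified illustration under stated assumptions: the buffered audio is held in a dictionary keyed by playback order, and `play_blocking` is a hypothetical stand-in for handing one segment to the audio output module and returning only once that segment has finished playing.

```python
from typing import Callable, Dict, List


def play_in_order(buffered: Dict[int, bytes],
                  play_blocking: Callable[[bytes], None]) -> List[int]:
    """Play every buffered segment in ascending playback order.

    Returns the order numbers in the sequence they were played.
    """
    played: List[int] = []
    for order in sorted(buffered):
        # The next segment starts only after this call returns, which
        # mirrors "after the first audio segment finishes playing".
        play_blocking(buffered[order])
        played.append(order)
    return played
```

Because the loop relies on a blocking call, gapless stitching between segments would in practice also depend on how the audio output module buffers data, which this sketch does not model.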

[0173] By playing the multiple audio segments in sequence, the smart audio device joins them seamlessly into a complete, coherent, dynamically generated story, thereby providing users with a rich and interactive audio experience.

[0174] This embodiment provides a complete audio generation and processing workflow that supports multi-entity interaction, breaking the limitation of the traditional method's one-to-one static binding of physical objects and digital content. This method detects and reads the identifiers of multiple NFC cards step-by-step, integrating scattered entity information into a combined request command. It then utilizes cloud-based logic matching capabilities to respond to the user's multi-entity interactive operations. Through this process, the method achieves accurate conversion of multi-entity combined intents, enabling the device to dynamically construct a coherent audio narrative based on multiple entity objects input by the user. This grants users the authority to customize content through multi-entity interaction, enhancing the flexibility, engaging nature, and user participation of the audio generation method.

[0175] The following describes an exemplary smart audio device provided by an embodiment of the present invention. Figure 5 is a schematic diagram of an exemplary hardware architecture of a smart audio device provided in an embodiment of the present invention.

[0176] In some embodiments, the smart audio device may be an electronic device, or the smart audio device may include an electronic device. The electronic device includes a processor, memory, and a network interface connected via a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the electronic device stores data. The network interface of the electronic device is used to communicate with other external terminals or servers via a network connection. In some embodiments, the network interface may be a wired network interface; in some embodiments, the network interface may also be a wireless network interface. When the computer program is executed by the processor, it implements the methods in the embodiments of the present invention.

[0177] Those skilled in the art will understand that the architecture shown in Figure 5 is merely a block diagram of a portion of the architecture related to the present invention and does not constitute a limitation on the electronic device to which the present invention is applied. Specific electronic devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0178] The above-described embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

[0179] As used in the above embodiments, depending on the context, the term "when..." can be interpreted as meaning "if...", "after...", "in response to determining...", or "in response to detecting...". Similarly, depending on the context, the phrase "when determining..." or "if (the stated condition or event) is detected" can be interpreted as meaning "if determining...", "in response to determining...", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".

[0180] The above embodiments can be implemented entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, they can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on an electronic device, all or part of the processes or functions described in the embodiments of the present invention are generated. The electronic device may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive), etc.

[0181] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. This program can be stored in a computer-readable storage medium, and when executed, it can include the processes described in the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Claims

1. An intelligent audio generation system supporting multi-entity interaction, characterized in that it comprises: multiple NFC cards, each NFC card storing a unique identifier and being predefined as a specific story element attribute, the story element attribute including at least one of a character attribute, a scene attribute, a prop attribute, or an event attribute; a smart audio device comprising an NFC card reader module, a network communication module, a control module, and an audio output module, the smart audio device being configured to: detect and read, through the NFC card reader module, the unique identifier of each NFC card placed in the sensing area, combine the multiple unique identifiers read to generate a combined request instruction, and send the combined request instruction to a cloud server through the network communication module; and the cloud server, which stores a story logic rule base and an audio segment library, the cloud server being configured to: receive the combined request instruction, match the multiple unique identifiers contained in the combined request instruction against the story logic rule base to determine the corresponding storyline logic, retrieve the corresponding audio segments from the audio segment library according to the storyline logic, generate an ordered audio segment playback sequence, and send the audio segment playback sequence to the smart audio device; wherein the smart audio device is further configured to receive the audio segment playback sequence and play the audio content in the audio segment playback sequence in order through the audio output module.

2. The system according to claim 1, characterized in that the cloud server is configured to: identify the story element attributes corresponding to the different unique identifiers in the combined request instruction, and determine the unique storyline logic or trigger a specific branch plot according to the preset attribute combination rules in the story logic rule base.

3. The system according to claim 1, characterized in that the smart audio device is further configured to: when a story branch point is encountered during playback of the audio segment playback sequence, issue an interactive signal through a voice prompt or a light prompt, and restart the NFC card reader module to wait for the detection of a new NFC card; if the NFC card reader module detects the unique identifier of a new NFC card, the smart audio device generates an update request instruction containing the new unique identifier and sends it to the cloud server; and the cloud server determines the subsequent story branch direction based on the update request instruction, generates the subsequent audio segment playback sequence, and sends it to the smart audio device for continued playback.

4. The system according to claim 1, characterized in that the smart audio device further comprises a local storage module; after reading the unique identifiers of the NFC cards, the smart audio device is configured to: first search the local storage module for a cached audio segment corresponding to the combination of the multiple unique identifiers; if the cached audio segment exists, play it directly through the audio output module; and if the cached audio segment does not exist, send the combined request instruction to the cloud server and, after receiving the audio segment playback sequence sent by the cloud server, store it in the local storage module.

5. The system according to any one of claims 1-4, characterized in that the NFC card reader module also has a data writing function; the smart audio device is further configured to: after successfully receiving the audio segment playback sequence and completing the download, control the NFC card reader module to write status identification information into the storage chip of the NFC card participating in the interaction; and the status identification information includes at least a status bit indicating that the card has been used, and the device identifier of the current smart audio device or user account information.

6. The system according to claim 5, characterized in that, when reading the NFC cards through the NFC card reader module, the smart audio device is further configured to read the status identification information stored in each NFC card; and the control module of the smart audio device is configured to execute verification logic: determine whether the device identifier or user account information in the status identification information matches the device identifier of the current smart audio device or the currently logged-in user account; if they do not match, the device is determined to be a non-bound device, and the smart audio device is prohibited from generating the combined request instruction or from playing the audio segment playback sequence.

7. The system according to claim 1, characterized in that, when generating the combined request instruction, the smart audio device is further configured to: detect the placement order information of each of the NFC cards as it is read by the NFC card reader module, and encapsulate the placement order information into the combined request instruction; and, when determining the corresponding storyline logic, the cloud server is configured to match the corresponding storyline logic in the story logic rule base based on both the set of the multiple unique identifiers and the placement order information.

8. A method for generating intelligent audio that supports multi-entity interaction, characterized in that the method is applied to the smart audio device in the system according to any one of claims 1-7 and comprises: detecting, by the NFC card reader module, multiple NFC cards placed in the sensing area; reading, by the NFC card reader module, the unique identifiers stored in the multiple NFC cards; combining the multiple unique identifiers read to generate a combined request instruction; sending the combined request instruction to the cloud server via the network communication module to request the cloud server to perform logical matching based on the combined request instruction; receiving, via the network communication module, an audio segment playback sequence returned by the cloud server, the audio segment playback sequence being generated by the cloud server in response to the combined request instruction; and parsing the audio segment playback sequence and playing the audio content contained in the audio segment playback sequence in order through the audio output module.

9. A smart audio device, characterized in that it comprises one or more processors and a memory; the memory is coupled to the one or more processors and is used to store computer program code, the computer program code including computer instructions, and the one or more processors invoke the computer instructions to cause the smart audio device to perform the method according to claim 8.

10. A computer-readable storage medium storing computer instructions, characterized in that, when the computer instructions are executed on a smart audio device, the smart audio device is caused to perform the method according to claim 8.