Intangible cultural heritage dissemination system based on virtual scenes and narrative interaction

By combining a narrative logic engine and a multimodal interactive perception module with a skill logic simulator and a cognitive state tracker, a narrative-driven intangible cultural heritage dissemination system was constructed. This system solves the problem of insufficient integration of skill logic and cultural narrative in existing systems, and realizes in-depth dissemination and immersive learning of intangible cultural heritage skills.

CN122244294A · Pending · Publication Date: 2026-06-19 · 刘雨菲

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Application (China)
Current Assignee / Owner
刘雨菲
Filing Date
2026-02-10
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

The existing intangible cultural heritage dissemination system lacks a deep integration of the core logic of the skills and the cultural narrative, making it difficult for participants to deeply understand the inner essence of intangible cultural heritage and the logic of cultural generation. The interaction mode is monotonous and lacks coherence and a deep sense of participation.

Method used

A non-linear narrative framework driven by a narrative logic engine is adopted, combined with a multimodal interactive perception module, a skill logic simulator, and a cognitive state tracker to construct a narrative-driven, logic simulation system. Through multimodal interaction and cognitive state feedback mechanisms, a high-fidelity simulation and personalized dissemination of intangible cultural heritage skills are achieved.

Benefits of technology

It has enabled the in-depth dissemination of intangible cultural heritage skills, enhanced the immersion of participants and the personalized control of their cognitive state, improved the accuracy and effectiveness of cultural dissemination, and promoted the long-term memory and internalization of cultural cognition.

✦ Generated by Eureka AI based on patent content.


Abstract

This application relates to the technical field of intangible cultural heritage (ICH) dissemination systems based on virtual scenes and narrative interaction. Specifically, it discloses an ICH dissemination system based on virtual scenes and narrative interaction, aiming to address the problems of insufficient integration of skill logic and cultural narrative, and the lack of diverse interaction modes in the digital dissemination of ICH. The system includes a narrative logic engine, a virtual scene construction module, a multimodal interaction perception module, a skill logic simulator, a cognitive state tracker, a dynamic narrative regulator, and an auxiliary narrative branch generator. Through the synergy of these modules, the system can dynamically adjust the narrative path and content according to the user's real-time cognitive state, providing a personalized immersive interactive experience, thereby achieving a deep and effective dissemination of the inherent logic and cultural essence of ICH skills.

Description

Technical Field

[0001] This invention belongs to the field of computer application technology, and specifically relates to an intangible cultural heritage dissemination system based on virtual scenes and narrative interaction.

Background Technology

[0002] The protection and dissemination of intangible cultural heritage is a crucial issue in the field of cultural transmission. Its core lies in effectively recording, interpreting, and promoting cultural forms that embody profound historical significance and unique techniques, in order to address real-world challenges such as the loss of successors and the extinction of skills. Digital technology offers new possibilities for this field. By constructing virtual scenes and interactive experiences, it can transcend the limitations of traditional recording methods, achieving a more vivid and in-depth presentation of the connotations of intangible cultural heritage.

[0003] Among them, cultural dissemination systems based on virtual reality and interactive narratives are becoming a key technological direction for the digital protection and living transmission of intangible cultural heritage. These systems aim to create immersive virtual environments and design narrative-driven interactive processes, enabling users to transcend the limitations of time and space and intuitively perceive the techniques, cultural context, and artistic value of intangible cultural heritage projects, thereby achieving the dual goals of knowledge transfer and emotional resonance.

[0004] Current technologies typically focus on static digital archiving or simple 3D model displays of intangible cultural heritage (ICH) projects, and lack a deep integration of the core logic of the craftsmanship with the cultural narrative. Specifically, existing systems struggle to simulate the dynamic production processes, the interconnected mechanisms of mechanical structures, and the grand scenes and spatiotemporal narratives presented in complex ICH projects such as Qionghua puppetry. As a result, users receive only surface-level visual information and cannot grasp the intrinsic essence of the craftsmanship or the logic of cultural generation, so dissemination remains shallow. Furthermore, existing interaction modes are often simplistic and fail to organically integrate user actions, knowledge exploration, and narrative progression, so the experience lacks coherence and deep engagement, which hinders effective cultural understanding and retention. Therefore, constructing an intelligent system that deeply integrates virtual scenes, narrative logic, and multimodal interaction to systematically disseminate ICH skills and cultural connotations has become a pressing technical challenge in this field.

Summary of the Invention

[0005] The purpose of this invention is to provide an intangible cultural heritage dissemination system based on virtual scenes and narrative interaction, so as to solve the technical problems of existing digital dissemination systems for intangible cultural heritage: insufficient integration of the core logic of skills with the cultural narrative, monotonous interaction modes, and the difficulty users have in deeply understanding the inner essence and cultural generation logic of intangible cultural heritage.

[0006] The technical solution of this invention is an intangible cultural heritage (ICH) dissemination system based on virtual scenes and narrative interaction. The system includes: a narrative logic engine for parsing and driving a preset ICH narrative script, which defines the theme, plot development, key knowledge nodes, and corresponding virtual scenes and interactive task sequences for cultural dissemination; a virtual scene construction module for dynamically generating and rendering a multi-layered three-dimensional virtual environment matching the ICH skill process and cultural context based on scene instructions output by the narrative logic engine; a multimodal interaction perception module for real-time collection of embodied operation data, voice command data, and gaze focus data input by the user through interactive devices; a skill logic simulator for performing high-fidelity physical and logical simulations of the core production processes, structural linkage principles, or performance action sequences of ICH projects based on a preset digital twin model of ICH skills, according to skill simulation instructions scheduled by the narrative logic engine; a cognitive state tracker for real-time analysis of the interaction data collected by the multimodal interaction perception module, combined with the current narrative process and virtual scene state, to calculate and update the user's cognitive mastery of preset knowledge nodes; a dynamic narrative controller that receives cognitive mastery data from the cognitive state tracker and compares it with the narrative path thresholds preset in the narrative logic engine, instructing the narrative logic engine to advance to the next narrative node when the cognitive mastery reaches or exceeds the path threshold of the current narrative node, and triggering an auxiliary narrative branch generation instruction when the cognitive mastery is lower than that threshold; and an auxiliary narrative branch generator that responds to the auxiliary narrative branch generation instruction issued by the dynamic narrative controller: based on the core knowledge elements of the current narrative node, it extracts relevant multimedia materials, simplified interactive tasks, or principle breakdown animations from the preset auxiliary resource library and instructs the virtual scene construction module and the skill logic simulator to collaboratively construct a remedial learning scene focused on that knowledge point.

[0007] Furthermore, the narrative scripts preset in the narrative logic engine are organized using a non-linear narrative framework based on a directed graph structure. This non-linear narrative framework includes a main narrative thread and multiple narrative branches belonging to specific knowledge nodes. The main narrative thread defines the core experience process of cultural dissemination, while the narrative branches correspond to knowledge expansion content at different depths or from different perspectives. The narrative logic engine jumps and merges between the main narrative thread and each narrative branch according to the instructions of the dynamic narrative controller.

[0008] Furthermore, the virtual scene construction module includes a scene element library, a spatial relationship rule library, and a real-time rendering engine. The scene element library stores 3D models, texture maps, and sound effect resources related to the target intangible cultural heritage project. The spatial relationship rule library defines the positional relationships, scales, and display logic between scene elements at different narrative stages. The real-time rendering engine calls corresponding resources from the scene element library according to the scene instructions of the narrative logic engine, and performs real-time ray tracing and rendering on the graphics processor according to the constraints of the spatial relationship rule library, so as to output a coherent virtual scene including a workshop, natural ecological context, historical street scene, or performance stage.

[0009] Furthermore, the multimodal interactive perception module integrates a force feedback data glove, a spatial locator, a microphone array, and an eye tracker. The force feedback data glove is used to capture the posture, position, and applied force and torque data of the user's hands. The spatial locator is used to track the six-degree-of-freedom pose of the user's head. The microphone array is used to collect the user's voice commands and perform noise reduction and semantic recognition. The eye tracker is used to record the three-dimensional coordinates and dwell time of the user's gaze focus in the virtual scene. After synchronizing the above multi-source heterogeneous data with timestamps and coordinate systems, the module outputs a standardized interactive data stream.

[0010] Furthermore, the craft logic simulator embeds a digital twin model of intangible cultural heritage craftsmanship. This model adopts a multi-level component-based architecture. The first level is the geometric appearance layer, which defines the three-dimensional mesh model and surface material properties of each component of the intangible cultural heritage object. The second level is the physical property layer, which assigns physical parameters such as mass, elastic modulus, and coefficient of friction to each component. The third level is the constraint relationship layer, which defines kinematic and dynamic constraints between components, such as hinges, slide rails, gear meshing, and rope traction. The fourth level is the process logic layer, which encodes the standard procedures, tool usage specifications, and dependencies between steps in the form of a state machine or flowchart. When the craft logic simulator is running, it drives the physical property layer and the constraint relationship layer to perform joint calculations according to the instructions of the process logic layer, thereby simulating the entire process of cutting, carving, assembling, debugging, and even performance actions in real time, and generating corresponding visual, auditory, and force feedback signals.

[0011] Furthermore, the cognitive state tracker performs cognitive state analysis as follows: First, it extracts feature data related to the preset interaction goal of the current narrative node from the standardized interaction data stream of the multimodal interaction perception module. The feature data includes the completion order and accuracy of operation steps, the correctness of tool selection, the assembly success rate of key structural components, the accuracy of voice answers to principle-based questions, and the duration of gaze dwell on core visual elements. Then, the extracted feature data is input into a pre-trained cognitive assessment model. This cognitive assessment model maps multidimensional feature data into a scalar value between 0 and 1 through a multi-layer neural network, i.e., cognitive mastery. Before deployment, the cognitive assessment model has been trained using a large amount of user interaction data labeled with expert ratings under supervised learning.

[0012] Furthermore, the preset narrative path threshold in the dynamic narrative controller is a dynamic variable; its initial value is defined by the narrative script and can be adaptively adjusted according to the overall cognitive progress rate of the user; the adjustment logic is as follows: the system records the average time taken by the user through several previous narrative nodes. If the average time is lower than the preset benchmark value, the path threshold of subsequent nodes is appropriately increased to provide more challenging in-depth content; if the average time is higher than the preset benchmark value, the path threshold of subsequent nodes is appropriately decreased to ensure narrative fluency.

[0013] Furthermore, the auxiliary resource library invoked by the auxiliary narrative branch generator has a content structure that corresponds one-to-one with the knowledge nodes of the narrative script. For each knowledge node, the auxiliary resource library stores at least three types of auxiliary resources: the first type is principle breakdown animation, which dynamically displays key steps of skills or structural linkage principles in a slow-motion, perspective, and highlighting manner; the second type is simplified interactive tasks, which allow users to repeatedly practice core skill points while reducing operational complexity or the number of steps; the third type is multimedia materials related to cultural background, including historical pictures, oral videos of inheritors, and related folk song audios. Based on the specific weak link characteristics fed back by the cognitive state tracker, the auxiliary narrative branch generator selects the resource type with the highest matching degree from the resource library of the corresponding knowledge node and combines them to generate personalized auxiliary narrative branches.

[0014] Furthermore, the system also includes a narrative outcome generation and sharing module; after the user completes the main narrative thread or a specific narrative branch, this module automatically integrates the key operational moments during the experience, the final virtual model of the work, and the personalized learning report generated by the system to generate an interactive review video or a 3D outcome archive; the user can share the generated outcome to a designated social platform or community database through this module.

[0015] Compared with the prior art, the advantages and positive effects of the present invention are as follows: 1. This invention constructs a core technical framework driven by narrative logic and based on logical simulation through deep coupling of a narrative logic engine and a craft logic simulator. This system no longer simply piles up virtual scenes, 3D models, and interactive operations; instead, it uses the inherent procedural logic and cultural narrative of intangible cultural heritage crafts as its framework to drive the entire experience. The craft logic simulator, based on a high-fidelity digital twin model, can perform principle-level simulations of the dynamic processes and mechanical linkages of complex intangible cultural heritage projects such as Qionghua puppets. This allows users to directly perceive the inherent logic of "why it is made this way" and "how it works" through hands-on experience, thereby achieving a deep dissemination of the essence of intangible cultural heritage crafts rather than just its appearance.

[0016] 2. This invention achieves adaptive and personalized adjustment of the dissemination path through a closed-loop feedback mechanism consisting of a cognitive state tracker and a dynamic narrative controller. The system can quantitatively assess the cognitive state of the user in real time and dynamically adjust the narrative process accordingly. When the user fails to fully grasp the current knowledge, the system does not simply allow them to proceed or provide textual prompts, but automatically triggers auxiliary narrative branches, providing targeted principle breakdowns, simplified exercises, or background-deepening content. This dynamic narrative control based on cognitive state allows the cultural dissemination process to adapt to the learning pace and cognitive characteristics of different users, transforming one-way information transmission into a two-way, adaptive learning dialogue, significantly improving the accuracy and effectiveness of dissemination.

[0017] 3. This invention creates a highly immersive and coherent embodied learning environment through the synergy of a multimodal interactive perception module and a virtual scene construction module. The system integrates multiple interactive channels, including force feedback, spatial positioning, voice, and eye tracking, enabling users to operate virtual tools, assemble virtual components, and engage in natural dialogue with the virtual environment in a near-realistic manner. This multimodal interaction not only enhances immersion but, more importantly, provides a rich and objective data source for cognitive state tracking. Simultaneously, the virtual scene is dynamically generated and maintains high coherence according to the narrative progress, ensuring the integrity of the experience from learning skills to understanding cultural context. This allows knowledge transfer and emotional resonance to be deeply integrated within a unified spatiotemporal narrative, effectively promoting long-term memory and internalization of cultural cognition.

Brief Description of the Drawings

[0018] Figure 1 is a schematic diagram of the overall technical architecture of the intangible cultural heritage dissemination system based on virtual scenes and narrative interaction proposed in this invention; Figure 2 is a schematic diagram of the core principle framework coupling narrative driving and logic simulation in this invention; Figure 3 is a logical flowchart of multimodal interactive perception and cognitive state tracking in this invention; Figure 4 is a schematic diagram of the closed-loop feedback mechanism for dynamic narrative control and auxiliary branch generation in this invention; Figure 5 is a schematic diagram of the multi-level interaction between virtual scene construction and skill logic simulation in this invention.

Detailed Implementation

[0019] Example 1. The overall technical architecture of the intangible cultural heritage dissemination system based on virtual scenes and narrative interaction described in this invention is shown in Figure 1. The system uses a narrative logic engine as its core. Through close coupling with a virtual scene construction module, a multimodal interaction perception module, a craft logic simulator, a cognitive state tracker, a dynamic narrative controller, and an auxiliary narrative branch generator, it forms a digital intangible cultural heritage dissemination platform that is adaptive, highly immersive, and able to reproduce cultural logic in depth. The specific implementation of each component is described in detail below with reference to the accompanying drawings.

[0020] First, the narrative logic engine, as the core driver of the entire system, is responsible for parsing and executing the pre-defined intangible cultural heritage narrative script. This narrative script is not a traditional linear story text, but is organized using a non-linear narrative framework based on a directed graph structure. Within this framework, the main narrative thread defines the core experience flow of cultural dissemination, such as the main task chain from "getting to know the Qionghua puppet" to "understanding its mechanical structure," and then to "completing a full performance." Multiple narrative branches correspond to in-depth expansions of specific knowledge nodes, such as "the principle of puppet joint linkage," "the evolution of traditional carving techniques," or "the local opera music system." Each node is assigned a unique identifier in the directed graph, and its prerequisite dependencies and subsequent jump relationships are expressed through edges. Based on the advancement or regression instructions issued by the dynamic narrative controller, the narrative logic engine dynamically jumps and merges between the main narrative thread and the narrative branches, thereby generating non-linear, personalized narrative paths. Figure 2 illustrates how the narrative logic engine works in conjunction with the craft logic simulator to form a dual-core coupling mechanism of "narrative-driven - logic simulation".
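The directed-graph script described above can be illustrated with a minimal Python sketch. The class names, the node identifiers, and the particular edges are assumptions made for illustration; the source only states that nodes carry unique identifiers, prerequisites, and jump relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class NarrativeNode:
    node_id: str                      # unique identifier within the directed graph
    title: str
    on_main_thread: bool = True       # False for knowledge-expansion branches
    prerequisites: List[str] = field(default_factory=list)  # prerequisite nodes
    successors: List[str] = field(default_factory=list)     # outgoing jump edges


class NarrativeGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, NarrativeNode] = {}

    def add_node(self, node: NarrativeNode) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, dst: str) -> None:
        """Record a jump relationship from src to dst."""
        self.nodes[src].successors.append(dst)

    def available_next(self, current: str, visited: Set[str]) -> List[str]:
        """Successors of `current` whose prerequisites have all been visited."""
        return [
            nid for nid in self.nodes[current].successors
            if all(p in visited for p in self.nodes[nid].prerequisites)
        ]


# Hypothetical script: three main-thread nodes and one branch.
graph = NarrativeGraph()
graph.add_node(NarrativeNode("intro", "Getting to know the Qionghua puppet"))
graph.add_node(NarrativeNode("structure", "Understanding its mechanical structure",
                             prerequisites=["intro"]))
graph.add_node(NarrativeNode("joint_linkage", "The principle of puppet joint linkage",
                             on_main_thread=False, prerequisites=["structure"]))
graph.add_node(NarrativeNode("performance", "Completing a full performance",
                             prerequisites=["structure"]))

graph.add_edge("intro", "structure")            # main thread
graph.add_edge("structure", "performance")      # main thread
graph.add_edge("structure", "joint_linkage")    # jump into a branch
graph.add_edge("joint_linkage", "performance")  # merge back into the main thread

options = graph.available_next("structure", visited={"intro", "structure"})
```

In this sketch the choice among `options` would be made by the dynamic narrative controller; the graph itself only reports which jumps are currently legal.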

[0021] The virtual scene construction module dynamically generates and renders a multi-layered 3D virtual environment that highly matches the intangible cultural heritage (ICH) techniques and cultural context, based on scene instructions output by the narrative logic engine. This module contains three key sub-units: a scene element library, a spatial relationship rule library, and a real-time rendering engine. The scene element library stores a large number of 3D model resources related to the target ICH project, including but not limited to production tools (such as carving knives and chisels), raw materials (such as wood and cloth), semi-finished parts (such as puppet torsos and joint components), complete works (such as puppet characters in performances), and environmental elements (such as wooden workbenches in traditional workshops, Lingnan courtyards outside windows, and stage backgrounds in historical street scenes). All models are equipped with high-precision material textures and physical sound effect parameters to ensure visual and auditory realism. The spatial relationship rule library stores the spatial configuration logic between scene elements at different narrative stages. For example, in the "carving stage," the carving knife must be within the operator's right-hand reach, and the wood should be placed in the center of the workbench with a fixed orientation; while in the "assembly stage," each joint component must be arranged in a specific order in the work area and gradually disappear as assembly progresses. After receiving scene instructions from the narrative logic engine, the real-time rendering engine retrieves the necessary resources from the scene element library and strictly adheres to the constraints in the spatial relationship rule library. It then executes real-time ray tracing algorithms on the graphics processor to generate a coherent virtual scene with global illumination, soft shadows, and reflection and refraction effects. 
This scene not only includes static environments but also supports dynamic weather, day-night cycles, and crowd activities to enhance immersion, ensuring that the user is always in a virtual world with a complete cultural context and self-consistent spatiotemporal logic.

[0022] The multimodal interaction perception module is used to collect real-time data on the user's embodied operation, voice commands, and gaze focus input through the interactive device. This module integrates four core hardware devices: a force feedback data glove, a spatial locator, a microphone array, and an eye tracker. The force feedback data glove incorporates multiple flexible sensors and micro-motors, accurately capturing the user's 26 degrees of freedom in hand posture, including finger bending angles, palm opening and closing, and wrist rotation direction, while simultaneously recording the force and torque applied to virtual objects. The spatial locator employs a combination of infrared optical tracking and inertial measurement units, tracking the user's head posture (i.e., translational dimensions of forward / backward, left / right, and up / down, and rotational dimensions of pitch, yaw, and roll) at a frequency of over 90 frames per second, providing low-latency position updates for the virtual reality headset. The microphone array consists of eight high-sensitivity omnidirectional microphones arranged in a ring. Utilizing beamforming and adaptive noise reduction algorithms, it effectively suppresses environmental noise, accurately picks up the user's voice commands, and converts them into structured semantic commands through a locally deployed speech recognition engine, such as "show the joint structure," "replay the carving steps," or "explain the function of this mechanism." An eye tracker, installed inside the headset, uses a near-infrared light source and a high-speed camera to capture pupil position, calculating the three-dimensional coordinates of the gaze focus in the virtual scene and its dwell time in a specific area. 
This module aligns the four types of heterogeneous data streams using millisecond-level timestamps and uniformly transforms them to a world coordinate system based on the virtual scene origin, ultimately outputting a standardized interactive data packet. Its data structure includes the following fields: hand pose matrix (4×4), force vector (3D), head pose matrix (4×4), voice semantic tag (string), gaze focus coordinates (3D), and gaze duration (milliseconds). Figure 3 details how the multimodal interactive sensing module transforms raw sensor data into feature inputs usable by the cognitive state tracker.
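As a rough illustration of the packet layout listed above, the following Python sketch defines the six fields plus a timestamp and aligns four streams by nearest timestamp. The nearest-sample rule, the function names, and the sample values are assumptions; the source does not specify the alignment algorithm, and the transform into world coordinates is omitted here.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Any, List, Tuple

Matrix4 = List[List[float]]          # 4x4 homogeneous pose matrix
Vec3 = Tuple[float, float, float]
Stream = List[Tuple[int, Any]]       # (timestamp_ms, value), sorted by timestamp


@dataclass
class InteractionPacket:
    timestamp_ms: int
    hand_pose: Matrix4
    force: Vec3
    head_pose: Matrix4
    voice_tag: str
    gaze_point: Vec3
    gaze_duration_ms: float


def nearest_sample(stream: Stream, t_ms: int) -> Any:
    """Return the value whose timestamp is closest to t_ms (stream must be non-empty)."""
    times = [t for t, _ in stream]
    i = bisect_left(times, t_ms)
    candidates = stream[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda sample: abs(sample[0] - t_ms))[1]


def build_packet(t_ms: int, glove: Stream, head: Stream,
                 voice: Stream, gaze: Stream) -> InteractionPacket:
    """Assemble one packet from the four device streams (assumed alignment rule)."""
    hand_pose, force = nearest_sample(glove, t_ms)
    gaze_point, gaze_duration = nearest_sample(gaze, t_ms)
    return InteractionPacket(
        timestamp_ms=t_ms,
        hand_pose=hand_pose,
        force=force,
        head_pose=nearest_sample(head, t_ms),
        voice_tag=nearest_sample(voice, t_ms),
        gaze_point=gaze_point,
        gaze_duration_ms=gaze_duration,
    )


IDENTITY: Matrix4 = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

# Made-up samples for illustration only.
glove = [(0, (IDENTITY, (0.0, 0.0, 0.0))), (11, (IDENTITY, (0.0, 1.5, 0.0)))]
head = [(2, IDENTITY), (13, IDENTITY)]
voice = [(5, "show the joint structure")]
gaze = [(1, ((0.1, 0.2, 0.3), 120.0)), (12, ((0.1, 0.2, 0.35), 131.0))]

packet = build_packet(12, glove, head, voice, gaze)
```

A real pipeline would likely interpolate poses between samples and expire stale voice tags rather than reuse the nearest one; the sketch keeps only the field layout and the idea of timestamp alignment.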

[0023] The craft logic simulator is a key technological unit for achieving high-fidelity simulation of intangible cultural heritage crafts. Its embedded digital twin model of intangible cultural heritage crafts adopts a four-layer component-based architecture. The first layer is the geometric appearance layer, which stores the triangular mesh model of all components of the intangible cultural heritage object. The number of vertices is typically no less than 500,000, and the surface normals and UV coordinates are finely optimized to support high-resolution texture mapping and normal mapping. The second layer is the physical property layer, which assigns an independent set of physical parameters to each geometric component, including mass (unit: kg), density (unit: kg / m³), elastic modulus (unit: Pascal), Poisson's ratio, static friction coefficient, and dynamic friction coefficient. These parameters are obtained through actual material testing or literature research and are loaded into the physics engine during the simulation initialization phase. The third layer is the constraint relationship layer, which defines the kinematic and dynamic constraints between components in the form of procedural scripts. For example, in the Qionghua puppet model, the shoulder joint is defined as a ball-and-socket constraint, allowing three-axis rotation but restricting translation; the elbow joint is a hinge constraint, allowing only a single rotational axis of motion; and the control rope is modeled as a stretchable spring-damped system, whose tension dynamically adjusts with length. The fourth level is the process logic layer, which encodes the complete intangible cultural heritage production process in the form of a finite state machine. The state machine contains several state nodes, each corresponding to a standard operation step (such as "rough shaping," "detail carving," "joint drilling," and "rope threading"). 
Nodes are connected by transition conditions, which can be "operation completed," "tool switching," or "structural verification passed." When the craft logic simulator runs, the narrative logic engine first issues the simulation command for the current process, activating the corresponding state in the process logic layer; subsequently, this state triggers the joint calculation of the physical attribute layer and the constraint relationship layer, driving the 3D model to execute the corresponding action sequence. For example, when a user uses a virtual carving knife to carve a puppet's head, the system calculates the collision point between the knife tip and the wood surface in real time, generates the trajectory of flying wood chips based on the wood's physical properties, and applies a reaction force proportional to the cutting depth and speed to the user's hand through a force feedback data glove. Simultaneously, the system generates corresponding carving sound effects and wood chip particle effects, forming a complete multi-sensory feedback loop.
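The process logic layer can be pictured as a small finite state machine. The sketch below uses the state and condition names quoted in this paragraph, but which condition links which pair of states is an assumption made for illustration, as are the class and variable names.

```python
from typing import Dict, Tuple

# (current state, transition condition) -> next state.
# The pairing of conditions with states is hypothetical.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("rough shaping", "operation completed"): "detail carving",
    ("detail carving", "tool switching"): "joint drilling",
    ("joint drilling", "structural verification passed"): "rope threading",
}


class ProcessStateMachine:
    def __init__(self, transitions: Dict[Tuple[str, str], str], initial: str) -> None:
        self.transitions = transitions
        self.state = initial

    def fire(self, condition: str) -> bool:
        """Advance if the condition is valid in the current state; otherwise stay put."""
        key = (self.state, condition)
        if key not in self.transitions:
            return False
        self.state = self.transitions[key]
        return True


fsm = ProcessStateMachine(TRANSITIONS, initial="rough shaping")
rejected = fsm.fire("structural verification passed")  # not valid yet, state unchanged
accepted = fsm.fire("operation completed")             # moves to "detail carving"
```

Rejecting out-of-order conditions is one simple way to encode the dependencies between steps; in the described system, each accepted state change would then trigger the joint calculation of the physical property and constraint layers.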

[0024] The cognitive state tracker analyzes interaction data collected by the multimodal interaction perception module in real time, combining the current narrative progress and virtual scene state to calculate and update the user's cognitive mastery of preset knowledge nodes. Its implementation process consists of two stages: feature extraction and model inference. In the feature extraction stage, the system first selects, from the standardized interaction data stream, the feature dimensions directly related to the preset interaction goal of the current narrative node. For example, if the current node is "correctly assemble the puppet's right arm joint," relevant features include: whether the operation steps are performed in the order of "first assemble the upper arm - then the forearm - finally thread the rope," whether each component is aligned within the tolerance range (±2 mm), whether the rope threading is successful on the first attempt, whether the voice response to the system's question "Why can this joint achieve flexion and extension?" contains keywords such as "hinge" and "rotation axis," and whether the gaze is continuously focused on the rotation axis area for more than 3 seconds during the joint profile animation playback. All features are quantified into numerical indicators, such as a step sequence correctness score of 1.0 or 0.0, assembly deviation as a Euclidean distance value, and voice keyword matching degree as a similarity score between 0 and 1. In the model inference stage, the resulting multidimensional feature vector is input into a pre-trained cognitive assessment model. This model is a three-layer fully connected neural network. 
The number of nodes in the input layer equals the feature dimension (typically 8 to 12 dimensions), the hidden layer contains 64 neurons using the ReLU activation function, and the output layer is a single Sigmoid activation unit, mapping the input to a scalar value between 0 and 1, representing the cognitive mastery level. Before deployment, the model underwent supervised learning training using over 2000 hours of labeled data. The labeled data comes from frame-by-frame ratings of videos of participants' operations by multiple intangible cultural heritage inheritors. The rating criteria cover three dimensions: operational standardization, depth of understanding of principles, and cultural sensitivity. The cognitive state tracker performs an assessment every 500 milliseconds, outputting the current cognitive mastery value and passing it to the dynamic narrative controller.
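The forward pass of such a network can be sketched in plain Python. The layer sizes follow the description (input of 8 to 12 features, 64 ReLU hidden units, one sigmoid output), but the weights below are random placeholders rather than trained values, and the function names and the example feature vector are assumptions.

```python
import math
import random
from typing import Dict, List


def init_model(n_features: int, n_hidden: int = 64, seed: int = 0) -> Dict[str, list]:
    """Create placeholder (untrained) weights for a features -> 64 -> 1 network."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_features)
    return {
        "w1": [[rng.uniform(-scale, scale) for _ in range(n_features)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.125, 0.125) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def cognitive_mastery(model: Dict[str, list], features: List[float]) -> float:
    """Map a feature vector to a scalar in (0, 1) via ReLU hidden units and a sigmoid."""
    if len(features) != len(model["w1"][0]):
        raise ValueError("feature vector length does not match the input layer")
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)  # ReLU
        for row, b in zip(model["w1"], model["b1"])
    ]
    logit = sum(w * h for w, h in zip(model["w2"], hidden)) + model["b2"]
    return 1.0 / (1.0 + math.exp(-logit))                        # sigmoid


model = init_model(n_features=8)
# Hypothetical 8-dimensional vector: step order score, normalized assembly deviation,
# first-attempt threading flag, keyword similarity, normalized gaze dwell, and so on.
features = [1.0, 0.3, 1.0, 0.8, 0.6, 0.0, 0.5, 0.9]
mastery = cognitive_mastery(model, features)
```

With real trained weights the output would be the cognitive mastery value passed to the dynamic narrative controller at each assessment; here it only demonstrates the shape of the computation.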

[0025] The dynamic narrative controller receives the cognitive mastery data from the cognitive state tracker and compares it with the narrative path thresholds preset in the narrative logic engine. These thresholds are not fixed constants but dynamic variables. Their initial values are set by the narrative script when the nodes are created; for example, the initial threshold for the "joint assembly" node is 0.75. The system also maintains a cognitive progress rate indicator for the user, calculated with a sliding window: it records the average time (in seconds) the user took to pass through the last five narrative nodes and compares it with a preset baseline time (e.g., 120 seconds). If the average time is less than 80% of the baseline (i.e., less than 96 seconds), the user is considered to be learning efficiently, and the system raises the path threshold of subsequent nodes by 5% to 10%. If the average time is greater than 120% of the baseline (i.e., greater than 144 seconds), the user is considered to have comprehension difficulties, and the system lowers the path threshold of subsequent nodes by 5% to 10%. This adaptive mechanism keeps the narrative difficulty matched to the user's ability. When the cognitive mastery of the current node reaches or exceeds the adjusted path threshold, the dynamic narrative controller sends a "progress" command to the narrative logic engine, triggering the loading of the next narrative node; when the cognitive mastery is below the threshold, it sends an "assistance trigger" command to the auxiliary narrative branch generator, together with a feature summary of the current weak link (such as "insufficient understanding of the rope tension principle" or "disordered joint assembly sequence").
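The threshold adaptation can be sketched as follows. The class and function names are hypothetical, and three points are assumptions not fixed by the text: the 5% to 10% adjustment is read as a relative change (5% is used here), no adjustment is made until five node times have been recorded, and the threshold is capped at 1.0 so that it stays reachable.

```python
from collections import deque


class AdaptiveThreshold:
    """Sliding-window adjustment of the narrative path threshold."""

    def __init__(self, baseline_s=120.0, window=5, step=0.05):
        self.baseline_s = baseline_s
        self.times = deque(maxlen=window)  # times for the last `window` nodes
        self.step = step                   # assumed value within the 5%-10% range

    def record_node_time(self, seconds):
        self.times.append(seconds)

    def adjust(self, threshold):
        if len(self.times) < self.times.maxlen:
            return threshold               # assumption: wait for a full window
        average = sum(self.times) / len(self.times)
        if average < 0.8 * self.baseline_s:      # faster than 96 s: raise the bar
            threshold *= 1.0 + self.step
        elif average > 1.2 * self.baseline_s:    # slower than 144 s: lower the bar
            threshold *= 1.0 - self.step
        return min(threshold, 1.0)         # assumption: mastery never exceeds 1


def decide(mastery, threshold):
    """Return the command the dynamic narrative controller would send."""
    return "progress" if mastery >= threshold else "assistance trigger"


fast_learner = AdaptiveThreshold()
for seconds in (90, 88, 92, 91, 89):       # average 90 s, below 96 s
    fast_learner.record_node_time(seconds)
raised = fast_learner.adjust(0.75)         # about 0.7875
```

With the raised threshold, a mastery value of 0.76 that would have passed the initial 0.75 threshold now yields an "assistance trigger" command.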

[0026] The auxiliary narrative branch generator responds to the instructions of the dynamic narrative controller by extracting matching multimedia materials, simplified interactive tasks, or principle breakdown animations from a preset auxiliary resource library, based on the core knowledge elements of the current narrative node. The content structure of the auxiliary resource library corresponds one-to-one with the knowledge nodes of the narrative script. For each knowledge node, the library stores at least three types of auxiliary resources. The first type is principle breakdown animations, which use slow motion (0.5x speed), X-ray perspective, flashing highlights on key components (a red pulse effect), and dynamic arrow annotations to show the key mechanisms of the skill intuitively. For example, for the "puppet string control" node, the animation demonstrates frame by frame how the eight strings are pulled in different combinations to produce actions such as nodding, waving, and turning. The second type is simplified interactive tasks, which reduce the number of operation steps (for example, from the original 10 assembly steps to 5), widen the tolerance range (for example, from ±2 mm to ±5 mm), or provide intelligent snapping assistance (parts align automatically when they come close to the correct position), allowing the user to practice core skill points repeatedly in a low-pressure environment. The third type is multimedia materials on the cultural background, including high-definition historical photos (such as performance scenes of the Qionghua Puppet Troupe in the 1950s), oral history videos of inheritors (1 to 3 minutes long, recounting personal learning experiences and insights into the skill), and local folk song audio (such as excerpts from the Yuebei Tea-Picking Tune). The auxiliary narrative branch generator incorporates a resource matching algorithm. This algorithm calculates the semantic similarity between each resource and the current need, based on the weak-link features fed back by the cognitive state tracker, and selects the 1 to 2 highest-scoring resources for combination. For example, if the system determines that the user "is unclear about the path of mechanical transmission," it prioritizes the combination of a principle breakdown animation and a simplified interactive task; if it determines a "lack of cultural and emotional resonance," it prioritizes the combination of an inheritor oral history video and folk song audio. The generated auxiliary narrative branch is encapsulated as an independent micro-narrative unit and temporarily inserted into the main process by the narrative logic engine, which instructs the virtual scene construction module to load a dedicated scene (such as a "principle explanation room" or "inheritor's study") and simultaneously notifies the skill logic simulator to pause the main task simulation and instead support operation verification for the simplified task.
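The selection step of the resource matching algorithm can be sketched as below. The text does not specify how semantic similarity is computed, so this sketch substitutes a simple Jaccard word-overlap score as a stand-in; the resource names, tag strings, and function names are hypothetical.

```python
def jaccard(text_a, text_b):
    """Word-overlap score in [0, 1]; a stand-in for semantic similarity."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def match_resources(weak_link, resources, top_k=2):
    """Return the names of the 1 to `top_k` best-matching resources."""
    scored = [(jaccard(weak_link, res["tags"]), res["name"]) for res in resources]
    scored = [item for item in scored if item[0] > 0.0]  # drop unrelated resources
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical library entries for one knowledge node.
LIBRARY = [
    {"name": "principle breakdown animation",
     "tags": "mechanical transmission path rope tension"},
    {"name": "simplified interactive task",
     "tags": "mechanical transmission assembly practice"},
    {"name": "inheritor oral history video",
     "tags": "cultural emotional resonance personal story"},
    {"name": "folk song audio",
     "tags": "cultural emotional resonance music"},
]

mechanics = match_resources("unclear mechanical transmission path", LIBRARY)
culture = match_resources("lack of cultural emotional resonance", LIBRARY)
```

Here `mechanics` pairs the animation with the simplified task, and `culture` pairs the folk song audio with the oral history video, mirroring the two examples in the paragraph above. A production system would more plausibly replace `jaccard` with an embedding-based similarity.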

[0027] In addition, the system includes a narrative outcome generation and sharing module. This module is automatically activated after the user completes the main narrative thread or a specific narrative branch. Its workflow is as follows: First, it extracts timestamps and status snapshots of key events from the system logs, including the moment a process is successfully completed for the first time, the complete 3D model of the final work, and the peak point of the cognitive mastery curve. Second, it calls a preset template engine to integrate the above data with a personalized learning report generated by the system (including a heatmap of knowledge point mastery, statistics of operational errors, and cultural understanding scores) into a 3- to 5-minute interactive review video. This video uses a non-linear narrative structure, allowing viewers to click on any moment of operation to replay the scene and view corresponding annotations on the technical principles. Simultaneously, the system can export the final work model as a standard GLB format 3D file, with accompanying metadata tags (such as the intangible cultural heritage project name, user ID, completion time, and skill difficulty level). Users can use this module to share the review video or 3D file to designated social media platforms or community databases with a single click, achieving socialized dissemination and long-term archiving of cultural achievements.
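The first step of this workflow, extracting key events from the logs, together with the metadata tags attached to the exported model, can be sketched as follows. The log format, event names, field names, and sample values are hypothetical; the sketch does not write a GLB file, it only prepares the accompanying data.

```python
import json


def summarize_session(events, mastery_curve):
    """Pick out the first successful step and the peak of the mastery curve.

    events: list of (timestamp_s, kind) tuples (hypothetical log format).
    mastery_curve: list of (timestamp_s, mastery) tuples.
    """
    successes = [t for t, kind in events if kind == "step_success"]
    first_success = min(successes) if successes else None
    peak_time, peak_value = max(mastery_curve, key=lambda point: point[1])
    return {
        "first_success_s": first_success,
        "peak_mastery": peak_value,
        "peak_time_s": peak_time,
    }


def build_metadata(project_name, user_id, completion_time, difficulty):
    """Serialize the metadata tags named in the text as a JSON string."""
    return json.dumps({
        "project_name": project_name,
        "user_id": user_id,
        "completion_time": completion_time,
        "skill_difficulty": difficulty,
    }, ensure_ascii=False)


events = [(12.0, "step_error"), (30.5, "step_success"), (48.0, "step_success")]
curve = [(0.0, 0.20), (30.5, 0.60), (48.0, 0.90), (60.0, 0.85)]
summary = summarize_session(events, curve)
tags = build_metadata("Qionghua puppet", "user-001", "2026-02-10T10:00:00", "intermediate")
```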

[0028] In summary, this embodiment deeply integrates narrative logic, skill simulation, multimodal interaction, and cognitive assessment to construct an intelligent dissemination system that adapts dynamically to individual differences, faithfully reproduces the inherent logic of intangible cultural heritage, and supports immersive embodied learning. The system not only addresses the tendency of traditional digital dissemination to emphasize presentation over logic, but also, through a closed-loop feedback mechanism, shifts cultural dissemination from "information transmission" to "cognitive construction."

[0029] 2. Example 2. Building upon Example 1, this example further refines the collaborative mechanism between the virtual scene construction module and the skill logic simulator, and introduces knowledge transfer across intangible cultural heritage projects to enhance the system's versatility and scalability. Refer to Figure 5, which illustrates the multi-level interaction between virtual scene construction and skill logic simulation.

[0030] In this embodiment, the real-time rendering engine of the virtual scene construction module is enhanced into an intelligent rendering architecture that supports two-way "logic-visual" binding. Traditional rendering concerns only geometry and lighting, whereas this system requires the rendering results to reflect the internal state of the skill logic simulator. To this end, the system adds a logical state label to each interactive component in the scene element library, such as the "locked / unlocked" state of a puppet joint, the "sharp / dulled" state of a carving tool, and the "intact / cut" state of wood. After each physics calculation, the skill logic simulator broadcasts the latest logical state of all components to the virtual scene construction module. Upon receiving the state update, the real-time rendering engine immediately triggers the corresponding visual changes: if a joint is unlocked, a semi-transparent blue highlighted outline is displayed at the joint; if the carving tool becomes dull, the blade shows a gray worn texture; if the wood is cut, the mesh topology is updated in real time and a new normal map is generated for the cut surface. This two-way binding ensures that the virtual scene is not only a visual container but also an external interface of the skill logic, allowing users to perceive the logical changes within the system intuitively through visual cues.
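The broadcast from simulator to renderer can be sketched with a simple publish-subscribe arrangement. The class names, the rule table, and the effect descriptions are hypothetical simplifications; the sketch records which effect would be applied rather than rendering anything, and it covers only the logic-to-visual direction described in this paragraph.

```python
# Hypothetical mapping from (component kind, logical state) to a visual effect.
VISUAL_RULES = {
    ("joint", "unlocked"): "semi-transparent blue outline",
    ("carving_tool", "dulled"): "gray worn blade texture",
    ("wood", "cut"): "updated mesh topology and cut-surface normal map",
}


class SkillLogicSimulator:
    """Holds logical state labels and broadcasts them after each step."""

    def __init__(self):
        self._states = {}      # component id -> (kind, state)
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def set_state(self, component_id, kind, state):
        self._states[component_id] = (kind, state)

    def step(self):
        # Stand-in for one physics calculation, followed by the broadcast.
        snapshot = dict(self._states)
        for listener in self._listeners:
            listener(snapshot)


class RenderEngine:
    """Applies the visual change that corresponds to each logical state."""

    def __init__(self):
        self.active_effects = {}   # component id -> effect description

    def on_state_update(self, states):
        for component_id, (kind, state) in states.items():
            effect = VISUAL_RULES.get((kind, state))
            if effect is None:
                self.active_effects.pop(component_id, None)  # back to default look
            else:
                self.active_effects[component_id] = effect


simulator = SkillLogicSimulator()
renderer = RenderEngine()
simulator.subscribe(renderer.on_state_update)
simulator.set_state("right_elbow", "joint", "unlocked")
simulator.step()
```

After this step, `renderer.active_effects` maps `"right_elbow"` to the blue outline; locking the joint and stepping again removes the entry.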

[0031] Furthermore, this embodiment introduces a cross-project knowledge transfer mechanism. The system predefines a knowledge graph of intangible cultural heritage skills, organized as an ontology and containing abstract concept nodes such as "tool type," "material properties," "process mode," and "structural paradigm." For example, "hinge connection," as a general structural paradigm, applies not only to the elbow joint of Qionghua puppets but also to the mortise and tenon joints in the movable doors of Huizhou wood carving or in the covered bridges of the Dong ethnic group. When a new intangible cultural heritage project is deployed, developers only need to map the project's digital twin model to the existing concept nodes in the knowledge graph, and the project automatically inherits the corresponding auxiliary resources, cognitive assessment rules, and narrative branch templates. For example, because the manipulation mechanism of the newly added "shadow puppetry lever" project is also based on the "lever-rope" principle, it can automatically reuse the rope tension principle animation and interactive tasks of the Qionghua puppet project; only the scene elements and cultural background materials need to be replaced. This mechanism significantly reduces the integration cost of new projects and promotes cognitive analogy and knowledge transfer between different intangible cultural heritage categories, enabling users who have learned one project to understand other skills with a similar logical structure more quickly.
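The inheritance step can be sketched as a lookup over concept nodes. The dictionary layout, the node contents, and the function name are hypothetical; the sketch only shows that a new project supplies its own scene elements and cultural materials while the remaining assets come from the concept nodes it maps to.

```python
# Hypothetical concept nodes with the assets attached to each.
KNOWLEDGE_GRAPH = {
    "lever-rope principle": {
        "auxiliary_resources": ["rope tension principle animation",
                                "rope tension interactive task"],
        "assessment_rules": ["rope tension explanation keywords"],
        "branch_templates": ["rope tension remedial branch"],
    },
    "hinge connection": {
        "auxiliary_resources": ["hinge rotation animation"],
        "assessment_rules": ["hinge assembly order check"],
        "branch_templates": ["hinge remedial branch"],
    },
}

INHERITED_KEYS = ("auxiliary_resources", "assessment_rules", "branch_templates")


def deploy_project(name, concept_nodes, scene_elements, cultural_materials):
    """Build a project record that inherits assets from mapped concept nodes."""
    project = {key: [] for key in INHERITED_KEYS}
    for concept in concept_nodes:
        node = KNOWLEDGE_GRAPH[concept]    # raises KeyError for unknown concepts
        for key in INHERITED_KEYS:
            project[key].extend(node[key])
    project.update({
        "name": name,
        "concept_nodes": list(concept_nodes),
        "scene_elements": list(scene_elements),          # project-specific
        "cultural_materials": list(cultural_materials),  # project-specific
    })
    return project


shadow_puppetry = deploy_project(
    "shadow puppetry lever",
    ["lever-rope principle"],
    scene_elements=["shadow screen", "leather figure"],
    cultural_materials=["troupe photographs"],
)
```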

[0032] In terms of multimodal interaction perception, this embodiment adds high-level semantic inference of "operation intention." Building on the force feedback data glove, the system introduces an operation intention recognition model. The model is based on a long short-term memory (LSTM) network; it takes a 10-second sequence of hand pose and force data as input and outputs a probability distribution over the current operation intention, such as "preparing to carve," "attempting assembly," or "checking the structure." This intention information is fed into the cognitive state tracker as an additional feature dimension for calculating cognitive mastery. For example, if the user stares at a part for a long time without operating it, but the intention recognition model outputs a probability higher than 0.8 for "checking the structure," the system determines that the user is observing and thinking rather than failing to operate, thus avoiding a misjudgment of the cognitive level.
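How the intention probabilities might gate the interpretation of an idle period can be sketched as below. The LSTM itself is not shown; its output is represented by a plain dictionary. The function name, the returned labels, and the 10-second idle limit are assumptions, while the 0.8 probability threshold comes from the text.

```python
def interpret_idle(idle_seconds, intention_probs,
                   idle_limit_s=10.0, observe_threshold=0.8):
    """Classify a period without operation using the intention distribution.

    intention_probs: mapping from intention label to probability, standing in
    for the output of the intention recognition model.
    idle_limit_s: assumed length of "a long time"; not given in the text.
    """
    if idle_seconds < idle_limit_s:
        return "active"
    if intention_probs.get("checking the structure", 0.0) > observe_threshold:
        return "observing"   # thinking, so mastery should not be penalized
    return "stalled"         # likely unable to proceed


probs = {"preparing to carve": 0.05,
         "attempting assembly": 0.10,
         "checking the structure": 0.85}
state = interpret_idle(15.0, probs)
```

A tracker could then skip the inactivity penalty whenever the result is "observing" and treat "stalled" as evidence of difficulty.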

[0033] Through the above enhancements, this embodiment not only improves the depth and realism of the single-project experience, but also constructs a scalable, portable, and intelligent intangible cultural heritage dissemination platform architecture, providing a technical foundation for large-scale intangible cultural heritage digitization projects.

Claims

1. A system for disseminating intangible cultural heritage based on virtual scenes and narrative interaction, characterized in that the system comprises: a narrative logic engine, used to parse and drive a preset intangible cultural heritage narrative script, which defines the theme of cultural dissemination, plot development, key knowledge nodes, and the corresponding virtual scenes and interactive task sequences; a virtual scene construction module, used to dynamically generate and render, based on the scene instructions output by the narrative logic engine, a multi-layered three-dimensional virtual environment that matches the intangible cultural heritage skill process and cultural context; a multimodal interaction perception module, used to collect in real time the embodied operation data, voice command data, and gaze focus data input by the user through interactive devices; a skill logic simulator, used to perform, based on a preset digital twin model of intangible cultural heritage skills and according to the skill simulation instructions scheduled by the narrative logic engine, high-fidelity physical and logical simulation of the core production processes, structural linkage principles, or performance action sequences of intangible cultural heritage projects; a cognitive state tracker, used to analyze in real time the interaction data collected by the multimodal interaction perception module, and to calculate and update the experiencer's cognitive mastery of preset knowledge nodes by combining the current narrative progress and the virtual scene state; a dynamic narrative controller, used to receive the cognitive mastery data output by the cognitive state tracker and compare it with the narrative path threshold preset in the narrative logic engine, wherein, when the cognitive mastery reaches or exceeds the path threshold of the current narrative node, the narrative logic engine is instructed to advance to the next narrative node, and when the cognitive mastery is lower than the path threshold of the current narrative node, an auxiliary narrative branch generation instruction is triggered; and an auxiliary narrative branch generator, used to respond to the auxiliary narrative branch generation instruction issued by the dynamic narrative controller by extracting, based on the core knowledge elements of the current narrative node, related multimedia materials, simplified interactive tasks, or principle breakdown animations from a preset auxiliary resource library, and instructing the virtual scene construction module and the skill logic simulator to jointly construct a learning-reinforcement scene focused on the knowledge point.

2. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The narrative logic engine uses a non-linear narrative framework based on a directed graph structure to organize the pre-set narrative scripts. The non-linear narrative framework includes a main narrative thread and multiple narrative branches belonging to specific knowledge nodes. The main narrative thread defines the core experience process of cultural dissemination, while the narrative branches correspond to knowledge expansion content at different depths or from different perspectives. The narrative logic engine, according to the instructions of the dynamic narrative controller, jumps and merges between the main narrative thread and each of the narrative branches.

3. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The virtual scene construction module includes a scene element library, a spatial relationship rule library, and a real-time rendering engine; the scene element library stores 3D models, material textures, and sound effect resources related to the target intangible cultural heritage project; The spatial relationship rule library defines the positional relationships, scales, and display logic between scene elements at different narrative stages. The real-time rendering engine calls corresponding resources from the scene element library according to the scene instructions of the narrative logic engine, and performs real-time ray tracing and rendering on the graphics processor according to the constraints of the spatial relationship rule library, so as to output a coherent virtual scene including a workshop, natural ecological context, historical street scene, or performance stage.

4. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The multimodal interaction perception module integrates a force feedback data glove, a spatial locator, a microphone array, and an eye tracker. The force feedback data glove is used to capture the posture and position of the user's hands and the force and torque they apply. The spatial locator is used to track the six-degree-of-freedom pose of the user's head. The microphone array is used to collect the user's voice commands and perform noise reduction and semantic recognition. The eye tracker is used to record the three-dimensional coordinates and dwell time of the user's gaze focus in the virtual scene. The multimodal interaction perception module aligns the timestamps and unifies the coordinate systems of the aforementioned heterogeneous multi-source data, and then outputs a standardized interaction data stream.

5. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The skill logic simulator is embedded with the digital twin model of intangible cultural heritage skills. The digital twin model adopts a multi-level component architecture. The first level is the geometric appearance layer, which defines the three-dimensional mesh model and surface material properties of each component of the intangible cultural heritage object. The second level is the physical property layer, which assigns physical parameters such as mass, elastic modulus, and friction coefficient to each component. The third level is the constraint relationship layer, which defines the kinematic and dynamic constraints between components, such as hinges, slide rails, gear meshing, and rope traction. The fourth level is the process logic layer, which encodes the standard procedures, tool usage specifications, and dependencies between steps in the form of a state machine or flowchart. When the skill logic simulator is running, it drives the physical property layer and the constraint relationship layer to perform joint calculations according to the instructions of the process logic layer, thereby simulating in real time the entire process of cutting, carving, assembling, debugging, and even performance actions.

6. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The cognitive state tracker performs cognitive state analysis as follows: First, it extracts feature data related to the preset interaction goal of the current narrative node from the standardized interaction data stream output by the multimodal interaction perception module. The feature data includes the completion order and accuracy of operation steps, the correctness of tool selection, the assembly success rate of key structural components, the accuracy of voice answers to principle-based questions, and the duration of gaze dwell on core visual elements. Then, the extracted feature data is input into a pre-trained cognitive assessment model. The cognitive assessment model maps the multidimensional feature data into a scalar value between 0 and 1 through a multi-layer neural network, i.e., cognitive mastery.

7. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The preset narrative path threshold in the dynamic narrative controller is a dynamic variable; the initial value of the narrative path threshold is defined by the narrative script and can be adaptively adjusted according to the overall cognitive progress rate of the experiencer; the adjustment logic is as follows: the system records the average time taken by the experiencer through several previous narrative nodes. If the average time is lower than the preset benchmark value, the path threshold of subsequent nodes is increased; if the average time is higher than the preset benchmark value, the path threshold of subsequent nodes is decreased.

8. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The auxiliary narrative branch generator calls upon an auxiliary resource library whose content structure corresponds one-to-one with the knowledge nodes of the narrative script. For each knowledge node, the auxiliary resource library stores at least three types of auxiliary resources: the first type is principle breakdown animation, which dynamically displays key steps of skills or structural linkage principles in a slow-motion, perspective, and highlighted manner; the second type is simplified interactive tasks, which allow users to repeatedly practice core skill points while reducing operational complexity or the number of steps; the third type is multimedia materials related to cultural background, including historical pictures, oral videos of inheritors, and related folk song audios. The auxiliary narrative branch generator selects the resource type with the highest matching degree from the resource library of the corresponding knowledge node based on the specific weak link characteristics fed back by the cognitive state tracker, and combines them to generate personalized auxiliary narrative branches.

9. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 1, characterized in that, The system also includes a narrative outcome generation and sharing module; after the user completes the main narrative thread or a specific narrative branch, the narrative outcome generation and sharing module automatically integrates the key operation moments of the experience, the final completed virtual model of the work, and the personalized learning report generated by the system to generate an interactive review video or a three-dimensional outcome archive. Users can share the generated outcomes to designated social platforms or community databases through the narrative outcome generation and sharing module.

10. The intangible cultural heritage dissemination system based on virtual scenes and narrative interaction according to claim 5, characterized in that, The real-time rendering engine supports two-way binding between logical state and visual representation; after each physics calculation, the skill logic simulator broadcasts the latest logical state of the components to the virtual scene construction module; after receiving a state update, the real-time rendering engine triggers the corresponding visual changes.