A data fusion and intelligent recommendation method and system for digital economy empowerment in the cultural and tourism industry
By combining local differential privacy, cross-modal attention, dynamic graph neural networks, and twin simulation, the method addresses privacy leakage and inaccurate recommendations in cultural and tourism data processing, enabling secure and accurate personalized recommendations together with global risk prevention.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN INST OF INFORMATION TECH
- Filing Date
- 2026-03-26
- Publication Date
- 2026-06-26
AI Technical Summary
Existing cultural tourism data processing and recommendation systems suffer from data silos, privacy leaks, lack of semantic alignment of multimodal data, and lack of consequence prediction and global control of recommendation strategies, resulting in inaccurate recommendation results and potential overcrowding.
Local differential privacy and vertical federated learning techniques are used to generate desensitized gradients. Data is fused through a cross-modal attention mechanism and a dynamic graph neural network, and a twin simulation space is constructed for virtual pre-playing. Deep reinforcement learning is used to optimize the recommendation strategy.
It breaks down data silos without compromising user privacy, improves recommendation accuracy and system security, prevents overload, and achieves a balance between personalized experience and overall system optimization.
Figure CN122285992A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of big data processing technology, specifically to a method and system for data fusion and intelligent recommendation empowering the digital economy in the cultural tourism industry.
Background Technology
[0002] With the development of the digital economy, the cultural and tourism industry is shifting from a traditional resource-driven model to a data-driven model. However, existing cultural and tourism data processing and recommendation systems still face the following technical challenges in practical applications, making it difficult to meet the needs of high-quality development: First, there is a tension between data silos and privacy protection. Existing smart tourism systems (such as solutions based on centralized data collection) typically require tourists' trajectories, spending, and search records to be uploaded to a central server for unified processing. This centralized architecture not only faces heavy data transmission pressure but also poses a serious risk of privacy breaches. Meanwhile, because of commercial barriers among scenic spots, OTA platforms, and catering merchants, data cannot easily flow between these parties, resulting in severe data silos and making it impossible to build a complete tourist profile.
[0003] Second, there is a lack of semantic alignment for multimodal data. Most existing recommendation algorithms are based on text tags or collaborative filtering matrices (such as relying solely on historical click behavior). However, cultural and tourism scenarios have strong visual attributes (such as scenic images) and spatiotemporal dynamics (such as peak and off-peak seasons, and differences between morning and evening). Existing technologies often neglect semantic consistency verification between "visual images" and "text descriptions" (for example, recommending attractions that are beautifully described but differ greatly from the actual scenery), and also lack dynamic capture of spatiotemporal evolution features, resulting in inaccurate recommendation results and significant deviations between user experience and expectations.
[0004] Third, there is a lack of consequence simulation and global control for recommendation strategies. Most existing recommendation systems perform "static matching," meaning they directly generate lists based on user preferences and ignore the potential impact of the recommendation strategy on the physical world. For example, if a system simultaneously recommends the same popular tourist attraction to a large number of users, it can easily lead to overcrowding in the area within a short period, causing congestion or even safety incidents. Current technology lacks a simulation mechanism for "counterfactual reasoning" and "virtual pre-playing" before the strategy is implemented, making it difficult to ensure the overall operational safety and efficiency of the system while meeting personalized needs.
Summary of the Invention
[0005] In view of one or more technical defects in the prior art, the present invention proposes the following technical solution.
[0006] According to a first aspect of the present invention, a data fusion and intelligent recommendation method for digital economy empowerment in the cultural tourism industry is proposed, the method comprising:
S1: Obtain multi-source cultural tourism datasets and distinguish between user interaction data and resource attribute data; perform localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; without accessing the original data, perform encrypted aggregation on the desensitized gradients from the multiple sources and jointly update the parameters of the global feature extraction model.
S2: Construct a heterogeneous cultural tourism graph using the resource attribute data; semantically align the visual and textual features of points of interest in the graph using a cross-modal attention mechanism; combine the global feature extraction model and use a dynamic graph neural network to perform temporal convolution on the graph to generate dynamic node embedding vectors containing spatiotemporal evolution information.
S3: Construct a twin simulation space that maps to the physical scene, and map the dynamic node embedding vectors to spatial state parameters; based on historical state sequences, use a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space, and output the deduction results.
S4: Construct a deep reinforcement learning policy network and generate an initial recommendation policy using the dynamic node embedding vectors as input; input the initial recommendation policy into the twin simulation space to perform virtual pre-playing; based on the risk and reward indicators fed back from the deduction results, use the proximal policy optimization algorithm to iteratively update the policy network until convergence and output the optimal recommendation list.
[0007] Furthermore, in S1, the local differential privacy desensitization process includes: clipping the L2 norm of the local gradient and, using a randomized response or Gaussian mechanism, adding a noise vector whose parameters satisfy the preset privacy budget, thereby generating a desensitized gradient from which the original data cannot be recovered.
[0008] Furthermore, in S1, the encrypted aggregation adopts a federated averaging strategy: the uploaded desensitized gradients are averaged with weights based on the sample size of each distributed data source, the result is used to correct the weight parameters of the global model, and the updated parameters are then distributed to each data source for synchronization.
[0009] Furthermore, in S2, the calculation formula for the cross-modal attention mechanism is as follows: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$, where $Q$ is the query vector mapped from text features, $K$ and $V$ are the key and value vectors mapped from visual features, and $d_k$ denotes the feature dimension; this formula computes the weight with which visual content enhances text semantics.
[0010] Furthermore, in S2, the temporal convolution process of the dynamic graph neural network is as follows: the dynamic graph is divided into a sequence of time snapshots, spatial features are extracted using graph convolutional layers, and the spatial features of adjacent time snapshots are aggregated through gated recurrent units to update the dynamic embedding representation of the nodes.
[0011] Furthermore, the deduction in S3 is a counterfactual deduction. The specific process of counterfactual deduction includes: setting counterfactual intervention conditions that include specific diversion actions; calling a pre-trained group movement model to simulate the trajectory evolution under these conditions in a twin simulation space; if the predicted future congestion index exceeds the safety threshold, a negative penalty signal is generated; otherwise, a positive reward signal is generated.
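[0012] Furthermore, in S4, the reward function $R$ for deep reinforcement learning is constructed from a benefit term and a risk term: $R$ increases with $p_{\mathrm{ctr}}$ and $p_{\mathrm{cvr}}$, the predicted click-through rate and conversion rate, respectively, and decreases with $L_{\mathrm{risk}}$, the congestion risk loss based on the simulation results; $\alpha$ and $\lambda$ are adaptively adjustable weighting coefficients applied to the benefit term and the risk term.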
[0013] Furthermore, it also includes a cold start processing step, maintaining a global meta-model based on meta-learning initialization; when a new user with a lack of historical interaction data joins, the meta-model is fine-tuned with a few samples using their static attribute labels to generate an initial strategy.
[0014] According to a second aspect of the present invention, a data fusion and intelligent recommendation system for digital economy empowerment in the cultural tourism industry is proposed, the system comprising:
The distributed privacy computing module is configured to acquire multi-source cultural tourism datasets and distinguish between user interaction data and resource attribute data; perform localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; and, without accessing the original data, perform encrypted aggregation on the desensitized gradients from the multiple sources to jointly update the parameters of the global feature extraction model.
The dynamic graph construction module is configured to construct a heterogeneous cultural tourism graph using the resource attribute data; semantically align the visual and textual features of points of interest in the graph through a cross-modal attention mechanism; and combine the global feature extraction model with a dynamic graph neural network to perform temporal convolution on the graph to generate dynamic node embedding vectors containing spatiotemporal evolution information.
The twin simulation module is configured to construct a twin simulation space that maps to the physical scene and map the dynamic node embedding vectors to spatial state parameters; and, based on historical state sequences, use a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space and output the deduction results.
The intelligent decision optimization module is configured to construct a deep reinforcement learning policy network, generate an initial recommendation policy with the dynamic node embedding vectors as input, input the initial recommendation policy into the twin simulation space to perform virtual pre-playing, and iteratively update the policy network using the proximal policy optimization algorithm based on the risk and reward indicators fed back from the deduction results until convergence, then output the optimal recommendation list.
[0015] The present invention also proposes a computer-readable storage medium storing computer program code, which, when executed by a computer, performs any of the methods described above.
[0016] The technical advantages of this invention are as follows: This invention utilizes local differential privacy and vertical federated learning techniques to generate desensitized gradients locally on the client side, exchanging only encrypted parameters. This not only blocks, at the algorithmic level, attack paths that attempt to reconstruct the user's original trajectory through gradient inversion, maximizing the protection of tourist privacy, but also breaks down data barriers between different cultural and tourism entities, enabling multi-party joint modeling without sharing original data and significantly improving the model's generalization ability and feature richness.
[0017] This invention introduces a cross-modal attention mechanism and a dynamic graph neural network, achieving for the first time deep alignment of visual image features and textual semantic features in cultural tourism recommendations, effectively solving the recommendation bias caused by "mismatch between images and text". Simultaneously, by capturing the evolution of node states over time through temporal convolution, the system can perceive the switching between peak and off-peak seasons and changes in day and night views, achieving dynamic spatiotemporal adaptation of the recommendation strategy.
[0018] This invention uses digital twin technology as a "counterfactual inference sandbox" for deep reinforcement learning. Before the recommendation strategy is issued to the real world, a virtual pre-playing run is conducted in the twin simulation space to predict potential risks of passenger congestion or facility overload. This mechanism transforms the traditional "post-event remediation" into "pre-event prevention," using the proximal policy optimization algorithm to find an equilibrium between "user satisfaction" and "operational safety," and achieving intelligent decision-making that balances personalized experience with overall system optimization.
Attached Figure Description
[0019] Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings.
[0020] Figure 1 is a flowchart of a data fusion and intelligent recommendation method for digital economy empowerment in the cultural and tourism industry according to an embodiment of the present invention;
Figure 2 is a schematic diagram of the specific implementation process of step S1 according to a specific embodiment of the present invention;
Figure 3 is a schematic diagram of the specific implementation process of step S2 according to a specific embodiment of the present invention;
Figure 4 is a schematic diagram of the specific implementation process of step S3 according to a specific embodiment of the present invention;
Figure 5 is a schematic diagram of the specific implementation process of step S4 according to a specific embodiment of the present invention;
Figure 6 is a schematic diagram of the framework of a data fusion and intelligent recommendation system for digital economy empowerment in the cultural and tourism industry according to an embodiment of the present invention.
Detailed Implementation
[0021] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and not intended to limit it. Furthermore, it should be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings.
[0022] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. This application will now be described in detail with reference to the accompanying drawings and embodiments.
[0023] Figure 1 illustrates a data fusion and intelligent recommendation method for digital economy empowerment in the cultural and tourism industry according to this invention, comprising: S1: Obtain multi-source cultural tourism datasets and distinguish between user interaction data and resource attribute data; perform localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; without accessing the original data, perform encrypted aggregation on the desensitized gradients from the multiple sources and jointly update the global feature extraction model parameters.
[0024] In specific embodiments, as shown in Figure 2, the specific implementation steps of step S1 include: S11: Data source segmentation and preprocessing. Define a multi-source cultural tourism dataset, which is divided into two categories. The first category is user interaction data, stored on distributed client nodes (such as tourist mobile apps and mini-programs). This includes users' real-time GPS trajectories, historical browsing records, payment amounts, and stay durations. The second category is resource attribute data, stored on the server or in public databases. This includes non-private public information, such as the location of each scenic point of interest (POI), attraction description text, official promotional images, publicly available review images uploaded by tourists, and weather data. In specific applications, the distributed client (such as the tourist's mobile terminal) has a built-in permission verification module. Only when the system detects that the user has actively checked "Join the Personalized Experience Plan" and signed the electronic privacy agreement in the UI does it call the underlying sensor interfaces (such as GPS and accelerometer) to obtain the above data. If no valid authorization signature is detected, the client automatically enters "silent mode," providing only general basic services that do not require personal data.
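The permission check in S11 can be illustrated with a minimal sketch. The names `Consent`, `select_mode`, and `collect` are hypothetical and not part of the described system; the sketch only assumes that sensor access is gated on both the opt-in and a valid agreement signature.

```python
from dataclasses import dataclass


@dataclass
class Consent:
    opted_in: bool         # user ticked "Join the Personalized Experience Plan"
    signature_valid: bool  # electronic privacy agreement signed and verified


def select_mode(consent: Consent) -> str:
    """Return 'personalized' only when both conditions hold, else 'silent'."""
    if consent.opted_in and consent.signature_valid:
        return "personalized"
    return "silent"


def collect(consent: Consent, read_sensors):
    """Call the sensor interface only in personalized mode."""
    if select_mode(consent) == "personalized":
        return read_sensors()
    return None  # silent mode: no personal data is read
```

Passing the sensor reader as a callable keeps the gate in one place, so no code path can reach the sensors without going through `select_mode`.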
[0025] S12: Local feature extraction. On each client node $k$, deploy a lightweight neural network (such as MobileNet or a small Transformer), train it using local user interaction data to identify user preference features, and generate the model gradient $g_k$ during the computation process.
[0026] S13: Local differential privacy desensitization. To prevent the server from inferring user data from gradients, two steps are performed before uploading:
Step (a) Gradient clipping: compute the L2 norm $\|g_k\|_2$ of the gradient vector $g_k$ and scale the gradient to within a preset threshold $C$, that is, $\bar{g}_k = g_k / \max\left(1, \|g_k\|_2 / C\right)$. This bounds the influence of a single sample on the overall model.
Step (b) Noise injection: using a Gaussian mechanism, generate a noise vector $n_k \sim \mathcal{N}(0, \sigma^2 C^2 I)$, where $\sigma$ is the noise multiplier, a hyperparameter that controls the strength of privacy protection and whose value is inversely proportional to the preset privacy budget $\varepsilon$; $C$ is the gradient clipping threshold set in step (a), which limits the maximum norm of a single sample's influence on the model gradient; and $\mathcal{N}$ denotes the Gaussian (normal) distribution. The noise is added to the clipped gradient to obtain the desensitized gradient $\tilde{g}_k = \bar{g}_k + n_k$.
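The two steps of S13 can be sketched as follows. This is a minimal illustration, assuming the standard clip-then-add-Gaussian-noise scheme in which the noise standard deviation equals the noise multiplier times the clipping threshold; the function names and the example values are illustrative only.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Step (a): scale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = 1.0 / max(1.0, norm / clip_norm)
    return [g * scale for g in grad]


def desensitize(grad, clip_norm, noise_multiplier, rng):
    """Step (b): add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_gradient(grad, clip_norm)
    std = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, std) for g in clipped]


# Example: a gradient with norm 5 is clipped to norm 1, then perturbed.
rng = random.Random(0)  # fixed seed only to make the example repeatable
noisy = desensitize([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

A gradient whose norm is already below the threshold passes through clipping unchanged, so only outlier updates are scaled down before noise is added.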
[0027] S14: Ciphertext aggregation and parameter update. The server receives only the desensitized gradients $\tilde{g}_k$ and cannot access the raw data. A federated averaging strategy is adopted: $g_{\mathrm{global}} = \sum_{k=1}^{K} \frac{n_k}{n} \tilde{g}_k$, where $K$ is the total number of client nodes participating in this round of aggregation; $n_k$ is the number of data samples used for local training on the $k$-th client; and $n = \sum_{k=1}^{K} n_k$ is the total number of data samples across all clients participating in the aggregation. The server uses the aggregated global gradient $g_{\mathrm{global}}$ to update the weight parameters $w$ of the global model and then distributes them back to the clients, completing one round of collaborative training.
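The sample-weighted aggregation in S14 can be sketched as below. The update rule in `apply_update` (plain gradient descent with a `learning_rate` parameter) is an assumption for illustration; the text only states that the aggregated gradient is used to update the global weights.

```python
def federated_average(gradients, sample_counts):
    """Weight each client's desensitized gradient by its share of the samples."""
    total = sum(sample_counts)
    dim = len(gradients[0])
    aggregated = [0.0] * dim
    for grad, n_k in zip(gradients, sample_counts):
        weight = n_k / total
        for j in range(dim):
            aggregated[j] += weight * grad[j]
    return aggregated


def apply_update(weights, global_grad, learning_rate):
    """Assumed update rule: one gradient-descent step on the global model."""
    return [w - learning_rate * g for w, g in zip(weights, global_grad)]


# Two clients holding 1 and 3 samples respectively.
global_grad = federated_average([[1.0, 0.0], [3.0, 4.0]], [1, 3])
new_weights = apply_update([1.0, 1.0], global_grad, learning_rate=0.5)
```

Because the weights are proportional to sample counts, a client with three times as much local data contributes three times as much to the aggregated gradient.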
[0028] Step S1 achieves "data usable but not visible." Through local differential privacy technology, even if an attacker intercepts the uploaded gradient, the presence of noise prevents the reconstruction of the user's specific trajectory or consumption records. Simultaneously, federated learning breaks down data silos, enabling the model to learn the collective behavioral characteristics of all users across the network, thus solving the problem of single-point data sparsity.
[0029] S2: Construct a heterogeneous cultural tourism graph using the resource attribute data; semantically align the visual and textual features of points of interest in the graph using a cross-modal attention mechanism; combine the global feature extraction model and use a dynamic graph neural network to perform temporal convolution on the graph to generate dynamic node embedding vectors containing spatiotemporal evolution information.
[0030] In specific embodiments, as shown in Figure 3, the specific implementation steps of step S2 include: S21: Heterogeneous graph construction. The graph is constructed using the resource attribute data from S1. Nodes include: user nodes (U), POI nodes (I), and environment nodes (E, such as "Sunny Day" or "Weekend"). Edges represent interaction relationships (such as "Click" or "Located at").
[0031] S22: Cross-modal feature alignment. For POI nodes, there are often situations where "the image is beautiful but the text is mediocre" or "the text is exaggerated but the reality is poor." This application uses an attention mechanism to align visual and textual features. Text encoding: BERT is used to extract features from the POI description text, and the query vector $Q$ is generated through a linear transformation. Visual encoding: ResNet is used to extract features from POI images, and the key vector $K$ and value vector $V$ are generated through linear transformations. Attention calculation: a text-enhanced feature representation that incorporates visual weights is generated as $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$, where $d_k$ denotes the feature dimension. The formula calculates which regions of the visual features (such as details of mountains and rivers) are most relevant to the text description (such as "strange peaks and rocks"), and then weights and fuses them to generate a multimodal enhanced feature vector.
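The attention calculation in S22 can be sketched with scaled dot-product attention in plain Python. This is a minimal illustration: the BERT and ResNet encoders and the linear projections are omitted, and the small hand-written matrices stand in for already-projected text queries and visual keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(Q, K, V):
    """Q: text queries (n_q x d_k); K: visual keys (n_r x d_k); V: visual values (n_r x d_v)."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)  # relevance of each image region to this text query
        weights.append(w)
        outputs.append([sum(w[r] * V[r][j] for r in range(len(V)))
                        for j in range(len(V[0]))])
    return outputs, weights


# One text query that matches the first of two image regions.
fused, attn = cross_modal_attention(
    Q=[[1.0, 0.0]],
    K=[[10.0, 0.0], [0.0, 10.0]],
    V=[[1.0, 0.0], [0.0, 1.0]],
)
```

The returned attention weights make the alignment inspectable: a description whose weights are spread almost uniformly over all image regions is a candidate for the image-text mismatch the step is meant to catch.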
[0032] S23: Dynamic graph temporal convolution. The graph is sliced according to time windows (e.g., one snapshot per hour) to form a snapshot sequence. Spatial convolution: a graph convolutional network (GCN) aggregates neighbor information within each snapshot. Temporal aggregation: a gated recurrent unit (GRU) processes the time series, passing the node state at time $t-1$ to time $t$. The final output is a dynamic embedding vector for each node, which contains not only multimodal attributes (e.g., what the node is) but also spatiotemporal state (e.g., whether it is popular right now).
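The snapshot-by-snapshot procedure in S23 can be sketched as a GCN-style aggregation followed by a GRU cell. This is a simplified illustration under stated assumptions: the spatial step uses symmetric-normalised aggregation with self-loops but no learned weight matrix or nonlinearity, node features are held fixed across snapshots, the GRU omits bias terms, and the parameter dictionary keys (`Wz`, `Uz`, and so on) are hypothetical names.

```python
import math


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gcn_aggregate(adj, feats):
    """Symmetric-normalised neighbour aggregation with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    out = []
    for i in range(n):
        row = [0.0] * len(feats[0])
        for j in range(n):
            coef = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
            for d in range(len(row)):
                row[d] += coef * feats[j][d]
        out.append(row)
    return out


def gru_cell(x, h, p):
    """One GRU step; convention h_new = (1 - z) * h + z * candidate."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def dynamic_embeddings(snapshots, feats, params):
    """snapshots: one adjacency matrix per time window, oldest first."""
    hidden = [[0.0] * len(feats[0]) for _ in feats]
    for adj in snapshots:
        spatial = gcn_aggregate(adj, feats)  # space: within one snapshot
        hidden = [gru_cell(x, h, params) for x, h in zip(spatial, hidden)]  # time
    return hidden
```

The hidden state carried from one snapshot to the next is what lets the same node receive different embeddings at different times of day.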
[0033] Step S2 addresses the inaccuracy of single-modal recommendations through cross-modal alignment (e.g., avoiding recommendations of attractions whose descriptions are misleading). Dynamic graph temporal convolution enables time-aware recommendations. For example, the embedding vector generated for the same attraction differs between daytime and nighttime, so the system can accurately recommend attractions suited to nighttime visits.
[0034] S3: Construct a twin simulation space that maps to the physical scene, and map the dynamic node embedding vectors to spatial state parameters; based on historical state sequences, use a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space, and output the deduction results.
[0035] In specific embodiments, as shown in Figure 4, the specific implementation steps of step S3 include: S31: Twin simulation space mapping. Construct a virtual 3D space that maps to the physical scenic area at a 1:1 scale. Decode the dynamic node embedding vectors generated in S2 into parameters for the simulation system.
[0036] For example: User embedding vector → movement speed and consumption willingness probability of virtual agents; POI embedding vector → attraction radius and service capacity of virtual facilities.
[0037] S32: Counterfactual intervention setting. Define a hypothetical problem: "What would happen if we now recommended the 'back mountain trail' route to these 5,000 users who enjoy hiking?" Here, the "recommendation action" has not yet occurred in reality; it is only used as an instruction input to the simulation system.
[0038] S33: Sequence prediction and deduction. The simulation system calls a pre-trained group movement model, combined with an LSTM (Long Short-Term Memory) network, to simulate the evolution over the next 30 minutes to 2 hours. Output: the predicted real-time passenger flow density map, queue waiting time, and expected facility load rate for each area.
[0039] In specific embodiments, risk assessment and signal generation are also included: the system compares the predicted real-time passenger flow density with a preset safety threshold (e.g., 4 people per square meter). If the predicted congestion index exceeds the safety threshold, the twin simulation space generates a negative penalty signal, which is passed as strong negative feedback to the subsequent decision network; conversely, if the predicted passenger flow is evenly distributed and the facility load is in the optimal range, a positive reward signal is generated to encourage this strategy.
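The threshold comparison can be sketched as a small function. This is a simplified illustration: it assumes the congestion index is the peak predicted density over all areas and uses fixed signal magnitudes (`penalty`, `reward`), neither of which is specified in the text; the facility-load condition is left out.

```python
def feedback_signal(predicted_density, safety_threshold, penalty=-1.0, reward=1.0):
    """Return the penalty if any area exceeds the threshold, else the reward."""
    if max(predicted_density) > safety_threshold:
        return penalty
    return reward


# Densities in people per square meter for three areas, threshold of 4.
signal_congested = feedback_signal([1.2, 4.5, 0.8], safety_threshold=4.0)
signal_safe = feedback_signal([1.2, 3.0, 0.8], safety_threshold=4.0)
```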
[0040] In this step, counterfactual reasoning allows us to assess the potential consequences of thousands of recommendation combinations without interfering with the real world, and to select the safest and most efficient solution. Specifically, unlike traditional recommendation systems that only know if there will be traffic congestion after sending recommendations, this solution knows this before sending them.
[0041] S4: Construct a deep reinforcement learning policy network and generate an initial recommendation policy using the dynamic node embedding vectors as input; input the initial recommendation policy into the twin simulation space to perform virtual pre-playing; based on the risk and reward indicators fed back from the deduction results, use the proximal policy optimization algorithm to iteratively update the policy network until convergence and output the optimal recommendation list.
[0042] In specific embodiments, as shown in Figure 5, the specific implementation steps of step S4 include: S41: Construct a reinforcement learning environment. This learning environment may include: agent (the recommendation algorithm model), environment (the twin simulation space from S3, not the real world), state (the dynamic node embedding vectors output in S2, including current user interests and attraction popularity), and action (generating a candidate recommendation list).
[0043] S42: Virtual Preview. After the agent generates actions (recommendation list), it directly inputs them into the twin simulation space of S3. The twin simulation space runs the simulation and provides feedback on the prediction results (e.g., predicting a high click rate, but predicting that the area will be severely congested).
[0044] S43: Calculation of the comprehensive reward function. The system calculates the reward $R$ based on the simulation results: $R$ increases with $p_{\mathrm{ctr}}$ and $p_{\mathrm{cvr}}$, the predicted click-through rate and conversion rate, respectively, and decreases with $L_{\mathrm{risk}}$, the congestion risk loss based on the simulation results; $\alpha$ and $\lambda$ are adaptively adjustable weighting coefficients applied to the benefit term and the risk term. Here, $\lambda$ is dynamic: when the simulated predicted density exceeds the safety threshold, $\lambda$ increases exponentially, forcing the model to abandon high-return but high-risk recommendations.
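The interaction between the benefit term and the dynamically weighted risk term can be sketched as follows. Several choices here are assumptions made only for illustration: the benefit term is taken as the plain sum of predicted click-through and conversion rates, the risk weight grows as an exponential of the relative density excess, and the names `risk_weight`, `growth`, and `alpha` are hypothetical.

```python
import math


def risk_weight(base_weight, peak_density, safe_density, growth=5.0):
    """Risk coefficient that stays at base_weight when safe and grows
    exponentially with the relative excess over the safe density."""
    excess = max(0.0, peak_density / safe_density - 1.0)
    return base_weight * math.exp(growth * excess)


def reward(p_ctr, p_cvr, risk_loss, alpha, lam):
    """Assumed form: weighted benefit minus weighted congestion risk."""
    return alpha * (p_ctr + p_cvr) - lam * risk_loss


# Same predicted engagement, evaluated with and without predicted congestion.
safe = reward(0.3, 0.1, risk_loss=0.0, alpha=1.0,
              lam=risk_weight(1.0, peak_density=1.0, safe_density=2.0))
risky = reward(0.3, 0.1, risk_loss=0.5, alpha=1.0,
               lam=risk_weight(1.0, peak_density=3.0, safe_density=2.0))
```

With these example numbers the congested scenario yields a negative reward despite identical predicted engagement, which is the behaviour the dynamic coefficient is meant to produce.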
[0045] In a specific embodiment, the deep reinforcement learning policy network adopts a multi-task learning architecture. Besides outputting actions (recommendation lists), this network also includes a click-through rate (CTR) prediction tower and a conversion rate (CVR) prediction tower. The CTR prediction tower receives a state vector (the user's dynamic node embedding vector) and an action vector (the recommended item's embedding vector) as inputs. Through a multilayer perceptron (MLP) and a sigmoid activation function, it outputs the estimated probability that the user clicks on the recommendation list. The CVR prediction tower is also based on the state and action vectors and uses an independent MLP to predict the probability that a user makes an actual purchase (such as buying tickets or placing an order) after clicking. During training, the two prediction towers are jointly supervised using historical real-world interaction data (click labels and purchase labels) through a binary cross-entropy loss function to ensure the accuracy of the predictions.
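The two prediction towers and their joint supervision can be sketched as below. This is a minimal forward-pass illustration, assuming a single hidden layer with ReLU activation in each tower and an unweighted sum of the two binary cross-entropy losses; the layer sizes, the activation, and the loss weighting are not specified in the text.

```python
import math


def tower(state, action, hidden_w, hidden_b, out_w, out_b):
    """One-hidden-layer MLP with a sigmoid output (probability in (0, 1))."""
    x = state + action  # concatenate state and action vectors
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def joint_loss(ctr_preds, click_labels, cvr_preds, purchase_labels):
    """Sum of the mean BCE of the CTR tower and the mean BCE of the CVR tower."""
    n = len(click_labels)
    ctr_loss = sum(bce(p, y) for p, y in zip(ctr_preds, click_labels)) / n
    cvr_loss = sum(bce(p, y) for p, y in zip(cvr_preds, purchase_labels)) / n
    return ctr_loss + cvr_loss
```

Each tower would be instantiated with its own weights, so the CTR and CVR predictions share inputs but not parameters, matching the "independent MLP" wording.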
[0046] In a specific embodiment, the congestion risk loss $L_{\mathrm{risk}}$ is calculated from the twin simulation deduction in step S3. The twin simulation space divides the scenic area into $M$ grid regions; at simulation time $t$, it counts the number of virtual agents $N_i(t)$ within each grid region $i$ and calculates the real-time density $\rho_i(t)$. The safe carrying capacity threshold for each region is set to $\rho_{\max}$ (for example, 2 people per square meter). The exceedance levels of all grid regions are normalized and summed: $L_{\mathrm{risk}} = \sum_{i=1}^{M} \max\left(0, \frac{\rho_i(t) - \rho_{\max}}{\rho_{\max}}\right)$, where the $\max(0, \cdot)$ function ensures that a positive loss value is generated only when the density exceeds the safe threshold; when all regions are uncongested, $L_{\mathrm{risk}} = 0$. This calculation method ensures that the risk penalty can accurately identify local congestion hotspots, rather than simply reflecting the average density.
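The grid-based loss can be sketched as follows. The sketch assumes that density is the agent count divided by the cell area and that the exceedance is normalised by the safe density; both are plausible readings rather than confirmed details, and `cell_areas` is a hypothetical input.

```python
def congestion_risk_loss(agent_counts, cell_areas, safe_density):
    """Sum of normalised density exceedances over all grid cells."""
    loss = 0.0
    for count, area in zip(agent_counts, cell_areas):
        density = count / area
        loss += max(0.0, (density - safe_density) / safe_density)
    return loss


# Two 100 m^2 cells holding 100 and 300 agents; safe density is 2 per m^2.
hotspot_loss = congestion_risk_loss([100, 300], [100.0, 100.0], safe_density=2.0)
even_loss = congestion_risk_loss([200, 200], [100.0, 100.0], safe_density=2.0)
```

Both layouts have the same average density of 2 people per square meter, but only the first has a cell above the threshold, so only the first produces a positive loss. This is the hotspot sensitivity that an average-density measure would miss.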
[0047] S44: Policy update. Using the proximal policy optimization algorithm, the parameters of the policy network are updated based on the reward $R$. By limiting the step size of policy updates, the stability of the learning process is ensured, preventing the model from oscillating drastically between "pursuing profits" and "avoiding congestion".
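The step-size limit in proximal policy optimization comes from its clipped surrogate objective, sketched below. The clipping range of 0.2 is a commonly used default and an assumption here; `ratios` stands for the probability ratio between the new and old policy for each sampled action, and `advantages` for the advantage estimates derived from the reward.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximised)."""
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to move the ratio
        # outside [1 - clip_eps, 1 + clip_eps] in the favourable direction.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)
```

For a positive advantage, pushing the ratio beyond the upper clip bound yields no additional objective value, which is how a single update is kept from swinging the policy too far toward either profit-seeking or congestion avoidance.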
[0048] S45: Optimal Strategy Output. Once the model converges (i.e., it finds the strategy that maximizes profit without congestion), the final recommendation list is sent to the clients of real tourists.
[0049] Step S4 addresses the short-sighted problem of traditional recommendation systems that "only consider individual preferences and ignore the overall impact." It achieves an automatic balance of interests among three parties: "tourist experience (avoiding traffic congestion)," "merchant revenue (increased consumption)," and "management (safe operation)," thus realizing true digital economy empowerment.
[0050] In specific embodiments, a cold start handling mechanism for new users is also included. For newly registered users or those lacking historical interaction data, this invention introduces a meta-learning-based initialization mechanism. The server maintains a globally shared meta-model, which consists of general recommendation parameters extracted from the common features of massive amounts of historical users. When a new user joins, the system uses a small number of static attributes (such as age group, gender, city of origin, and interest tags) provided during registration as a support set. On the client side, the downloaded meta-model is fine-tuned using these static attributes with a few samples (rapid gradient iteration of model parameters using only a very small amount of sample data provided by the new user). Only a few gradient updates are needed to quickly generate an initial personalized recommendation strategy adapted to the new user, thus solving the problem of traditional recommendation systems being unable to accurately serve new users.
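The few-sample fine-tuning step can be sketched as a handful of gradient updates starting from the shared meta-model. This is a simplified illustration: the meta-model is reduced to a logistic-regression weight vector, the support set is assumed to pair static attribute features with binary interest labels, and the step count and learning rate are arbitrary. How the meta-model itself is trained is not shown.

```python
import math


def predict(weights, features):
    """Probability that the user is interested, under a logistic model."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(meta_weights, support_set, steps=3, lr=0.5):
    """Adapt a copy of the meta-model with a few gradient steps."""
    w = list(meta_weights)  # the shared meta-model is left untouched
    for _ in range(steps):
        grad = [0.0] * len(w)
        for features, label in support_set:
            err = predict(w, features) - label
            for j, xj in enumerate(features):
                grad[j] += err * xj / len(support_set)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Two one-hot attribute features with assumed interest labels.
meta = [0.0, 0.0]
support = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
personal = fine_tune(meta, support)
```

Because adaptation runs on a copy, the fine-tuning can happen on the client without modifying the globally shared parameters.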
[0051] To better illustrate the practical application effects of this invention, the following uses the "XX Ancient Town" scenic area as an example to simulate the operation of this invention during a peak Golden Week holiday. The data in this embodiment (such as visitor flow, congestion index, etc.) are merely exemplary data used to demonstrate the input-output relationship of the algorithm and do not represent actual measurement results for a specific physical scenic area.
[0052] Assume 50,000 tourists in the scenic area use the "XX Smart Travel" APP. The APP collects tourists' real-time location (currently at the south gate of the ancient town), browsing history (likes "ancient style photography" and "local snacks"), and spending records on their phones. The mobile model calculates the feature gradient indicating the tourist's interest in the "nighttime lantern festival." Before uploading, Gaussian noise is added to the gradient using local differential privacy technology. The scenic area's cloud server receives the noisy gradient and cannot deduce the tourist's specific itinerary, but it can aggregate the group characteristic that "tourists in the south gate area are generally interested in nighttime activities."
[0053] The scenic area database contains descriptive text for a particular attraction (panoramic view overlooking the ancient town, unobstructed vista) and recently uploaded photos from tourists (showing heavy fog and low visibility). Using a cross-modal attention mechanism, the system identifies a semantic conflict between the text's "unobstructed vista" and the image's "low visibility." The system automatically reduces the attraction's recommendation weight in the viewing dimension and generates a new feature vector incorporating "rainy and foggy atmosphere," thus avoiding recommending inappropriate viewing experiences to tourists.
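The cross-modal matching step can be illustrated with a small sketch. It assumes the standard scaled dot-product form of attention, with the text feature acting as the query and the visual features acting as keys and values; the tiny two-dimensional vectors and the function names are hypothetical and serve only to show how attention weights are computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(text_query, visual_keys, visual_values):
    """Attend from one text query over several visual key/value vectors."""
    d = len(text_query)
    scores = [
        sum(q * k for q, k in zip(text_query, key)) / math.sqrt(d)
        for key in visual_keys
    ]
    weights = softmax(scores)
    dim = len(visual_values[0])
    fused = [
        sum(w * value[i] for w, value in zip(weights, visual_values))
        for i in range(dim)
    ]
    return weights, fused


# Hypothetical features: the first image region agrees with the text query,
# the second does not.
text_query = [1.0, 0.0]
visual_keys = [[1.0, 0.0], [0.0, 1.0]]
visual_values = [[1.0, 0.0], [0.0, 1.0]]
weights, fused = cross_modal_attention(text_query, visual_keys, visual_values)
```

A low weight on every visual region that should support the text description is one possible signal of the kind of text-image conflict described above; how the system turns such weights into a recommendation weight is not specified here.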
[0054] The recommendation engine originally plans to push the suggestion "go to the central square to watch the performance" to 5,000 tourists located at the south gate. Before the strategy is issued, the system inputs this instruction into the twin simulation space. The 5,000 agents in the virtual world move towards the central square according to the instruction. The simulation model, combined with LSTM prediction, shows that if the strategy were implemented, the congestion index of the route leading to the square could reach 9.5 (severe congestion) after 20 minutes, with a risk of stampede. The simulation environment therefore generates a severe negative penalty signal.
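The threshold check at the end of this deduction can be sketched as below. The congestion model is a deliberately crude stand-in for the agent simulation and LSTM prediction: it assumes a 0 to 10 index proportional to visitors over route capacity. The capacity of 6,000, the safety threshold of 7.0, and the reward and penalty magnitudes are hypothetical values chosen for illustration only.

```python
def congestion_index(incoming_visitors, path_capacity, scale=10.0):
    """Toy congestion estimate on a 0..scale range (stand-in for the simulator)."""
    return scale * min(1.0, incoming_visitors / path_capacity)


def counterfactual_signal(predicted_index, safety_threshold, penalty=-1.0, reward=1.0):
    """Return a negative penalty above the safety threshold, else a positive reward."""
    return penalty if predicted_index > safety_threshold else reward


SAFETY_THRESHOLD = 7.0  # hypothetical
PATH_CAPACITY = 6000    # hypothetical

# Intervention A: send all 5,000 tourists to the central square.
signal_all = counterfactual_signal(
    congestion_index(5000, PATH_CAPACITY), SAFETY_THRESHOLD
)
# Intervention B: send only 2,000 tourists to the central square.
signal_split = counterfactual_signal(
    congestion_index(2000, PATH_CAPACITY), SAFETY_THRESHOLD
)
```

With these assumed numbers, intervention A is penalized and intervention B is rewarded, which is the feedback the policy network consumes.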
[0055] Upon receiving the negative penalty signal, the deep reinforcement learning policy network immediately triggers a proximal policy optimization update. The policy is modified as follows: 3,000 of the 5,000 tourists are recommended to visit a stage on the east side, with an additional 20% discount coupon for tea and drinks at that stage as an incentive, and only 2,000 tourists are directed to the central square. A second simulation in the twin simulation space shows that congestion on both routes stays at a comfortable level and that the overall estimated spending increases to some extent. The system then pushes this validated "diversion + discount" strategy to the mobile phones of the real tourists.
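The update rule behind proximal policy optimization can be illustrated with its clipped surrogate objective. The sketch below evaluates that objective for a batch of actions; it does not train a network. The probabilities, advantages, and clip range of 0.2 are hypothetical, and the mapping of the two actions to the "all to the square" and "divert to the east stage" recommendations is only illustrative.

```python
def ppo_clipped_objective(new_probs, old_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective used by proximal policy optimization."""
    terms = []
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


# Action 0: "divert to the east stage" (positive advantage after simulation).
# Action 1: "send everyone to the central square" (negative advantage).
objective = ppo_clipped_objective(
    new_probs=[0.6, 0.2],
    old_probs=[0.3, 0.4],
    advantages=[1.0, -1.0],
)
```

The clipping caps how much credit a single update can claim for raising the probability of the diversion action, so that one simulated episode, however strongly penalized, cannot swing the policy arbitrarily far.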
[0056] Figure 6 illustrates the data fusion and intelligent recommendation system of this invention for digital economy empowerment in the cultural tourism industry. The system includes: a distributed privacy computing module 601, a dynamic graph construction module 602, a twin simulation and deduction module 603, and an intelligent decision optimization module 604. The distributed privacy computing module 601 is configured to acquire multi-source cultural tourism datasets and distinguish between user interaction data and resource attribute data; perform localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; and perform encrypted aggregation on the multi-path desensitized gradients to jointly update the global feature extraction model parameters without exchanging the original data. The dynamic graph construction module 602 is configured to construct a heterogeneous cultural tourism graph using the resource attribute data; perform semantic alignment of the visual and textual features of points of interest in the graph through a cross-modal attention mechanism; and, in combination with the global feature extraction model, use a dynamic graph neural network to perform temporal convolution on the graph to generate dynamic node embedding vectors containing spatiotemporal evolution information. The twin simulation and deduction module 603 is configured to construct a twin simulation space mapped to the physical scene and map the dynamic node embedding vectors to spatial state parameters; and, based on historical state sequences, use a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space, outputting the deduction results.
The intelligent decision optimization module 604 is configured to construct a deep reinforcement learning policy network and generate an initial recommendation policy with the dynamic node embedding vectors as input; input the initial recommendation policy into the twin simulation space to perform virtual pre-playing; and, based on the risk and benefit indicators fed back from the deduction results, use the proximal policy optimization algorithm to iteratively update the policy network until convergence, outputting the optimal recommendation list.
[0057] For ease of description, the above system is described by dividing it into various functional modules. Of course, in implementing this application, the functions of the modules can be implemented in one or more pieces of software and/or hardware. This system can perform all the detailed steps of the aforementioned method.
[0058] As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of this application, in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the various embodiments, or in some parts of the embodiments, of this application.
[0059] Finally, it should be noted that the above embodiments are for illustration only and not for limiting the technical solutions of the present invention. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the present invention without departing from the spirit and scope of the present invention. Any modifications or partial substitutions should be covered within the scope of the claims of the present invention.
Claims
1. A method for data fusion and intelligent recommendation empowering the cultural tourism industry through the digital economy, characterized in that the method includes:
S1: obtaining multi-source cultural tourism datasets and distinguishing between user interaction data and resource attribute data; performing localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; and, without exchanging the original data, performing encrypted aggregation on the multi-path desensitized gradients to jointly update the parameters of the global feature extraction model;
S2: constructing a heterogeneous cultural tourism graph using the resource attribute data; performing semantic alignment between the visual features and textual features of points of interest in the graph using a cross-modal attention mechanism; and, in combination with the global feature extraction model, using a dynamic graph neural network to perform temporal convolution on the graph to generate dynamic node embedding vectors containing spatiotemporal evolution information;
S3: constructing a twin simulation space that maps to the physical scene, and mapping the dynamic node embedding vectors to spatial state parameters; based on historical state sequences, using a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space, and outputting the deduction results;
S4: constructing a deep reinforcement learning policy network and generating an initial recommendation policy using the dynamic node embedding vectors as input; inputting the initial recommendation policy into the twin simulation space to perform virtual pre-playing; and, based on the risk and reward indicators fed back from the deduction results, using the proximal policy optimization algorithm to iteratively update the policy network until convergence and outputting the optimal recommendation list.
2. The method according to claim 1, characterized in that, in step S1, the local differential privacy desensitization process includes: clipping the L2 norm of the local gradient, and adding a noise vector generated by a randomized response mechanism or a Gaussian mechanism that satisfies a preset privacy budget, so as to generate a desensitized gradient from which the original data cannot be recovered.
3. The method according to claim 1 or 2, characterized in that, in S1, the encrypted aggregation adopts a federated averaging strategy: the uploaded desensitized gradients are averaged with weights given by the sample size of each distributed data source, the result is used to correct the weight parameters of the global model, and the updated parameters are then distributed to each data source for synchronization.
4. The method according to claim 1, characterized in that, in S2, the calculation formula for the cross-modal attention mechanism is: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$, where $Q$ is the query vector mapped from the text features, $K$ and $V$ are the key and value vectors mapped from the visual features, and $d_k$ denotes the feature dimension; the weight with which visual content enhances text semantics is calculated using this formula.
5. The method according to claim 1 or 4, characterized in that, in S2, the temporal convolution process of the dynamic graph neural network is as follows: the dynamic graph is divided into a sequence of time snapshots, spatial features are extracted from each snapshot using graph convolutional layers, and the spatial features of adjacent time snapshots are aggregated through gated recurrent units to update the dynamic embedding representation of the nodes.
6. The method according to claim 1, characterized in that the deduction in S3 is a counterfactual deduction, the specific process of which includes: setting counterfactual intervention conditions that include specific diversion actions; calling a pre-trained group movement model to simulate the trajectory evolution under these conditions in the twin simulation space; and, if the predicted future congestion index exceeds the safety threshold, generating a negative penalty signal, otherwise generating a positive reward signal.
7. The method according to claim 1, characterized in that, in step S4, the reward function for deep reinforcement learning is constructed from the predicted click-through rate and the predicted conversion rate, a congestion risk loss based on the deduction results, and adaptively adjustable weighting coefficients.
8. The method according to claim 1, characterized in that the method further includes a cold-start processing step: maintaining a global meta-model initialized on the basis of meta-learning; and, when a new user lacking historical interaction data joins, performing few-shot fine-tuning of the meta-model using the user's static attribute labels to generate an initial strategy.
9. A digital economy-enabled data fusion and intelligent recommendation system for the cultural tourism industry, characterized in that the system includes:
a distributed privacy computing module, configured to acquire multi-source cultural tourism datasets and distinguish between user interaction data and resource attribute data; perform localized feature extraction and local differential privacy desensitization processing on the distributed user interaction data to generate desensitized gradients; and perform encrypted aggregation on the multi-path desensitized gradients to jointly update the parameters of the global feature extraction model without exchanging the original data;
a dynamic graph construction module, configured to construct a heterogeneous cultural tourism graph using the resource attribute data; perform semantic alignment of the visual features and textual features of points of interest in the graph through a cross-modal attention mechanism; and, in combination with the global feature extraction model, perform temporal convolution on the graph using a dynamic graph neural network to generate dynamic node embedding vectors containing spatiotemporal evolution information;
a twin simulation and deduction module, configured to construct a twin simulation space that maps to the physical scene and map the dynamic node embedding vectors to spatial state parameters; and, based on historical state sequences, use a sequence prediction model to simulate future passenger flow and facility status under different intervention conditions in the twin simulation space and output the deduction results;
an intelligent decision optimization module, configured to construct a deep reinforcement learning policy network and generate an initial recommendation policy using the dynamic node embedding vectors as input; input the initial recommendation policy into the twin simulation space to perform virtual pre-playing; and iteratively update the policy network using a proximal policy optimization algorithm, based on the risk and reward indicators fed back from the deduction results, until convergence, and output the optimal recommendation list.
10. A computer-readable storage medium storing computer program code that, when executed by a computer, performs the method of any one of claims 1-8.