Intelligent processing system for cultural gene data

By constructing an intelligent processing system for cultural gene data, we have achieved intelligent processing across the entire chain, solved the problems of data fragmentation and untapped value, improved processing efficiency and data application value, and ensured data security and timeliness.

CN122240902A · Pending · Publication Date: 2026-06-19 · NINGBO INST OF TECH ZHEJIANG UNIV ZHEJIANG

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NINGBO INST OF TECH ZHEJIANG UNIV ZHEJIANG
Filing Date
2026-03-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing cultural gene data processing technologies suffer from fragmented data collection, processing, analysis, and display processes, lack of unified standards, difficulty in achieving end-to-end automated processing, and lack of in-depth value mining and data transformation, resulting in low processing efficiency and insufficient release of application potential.

Method used

A smart data processing system for cultural genes is constructed, including modules for data acquisition, preprocessing, storage, intelligent analysis, feature extraction, judgment, processing execution, and visualization. It adopts multi-source data acquisition terminals, distributed databases, artificial intelligence algorithms, and access control to achieve intelligent processing across the entire chain.

Benefits of technology

It has achieved intelligent processing across the entire data chain from data collection to display, improving processing efficiency, solving the problem of heterogeneous data management, ensuring data security, deeply mining the value of cultural gene data, and ensuring data timeliness through a timed incremental update mechanism.



Abstract

This invention provides an intelligent processing system for cultural gene data, comprising a data acquisition module, a data preprocessing module, a data storage module, an intelligent analysis module, a feature extraction module, a judgment module, a processing execution module, a visualization module, and a permission management module. The data acquisition module is communicatively connected to the data preprocessing module, which in turn is communicatively connected to the data storage module. The data storage module is also communicatively connected to the intelligent analysis module, which is communicatively connected to the feature extraction module, and the feature extraction module is communicatively connected to the judgment module. This invention constructs a fully intelligent system from data acquisition to visualization, replacing fragmented, manually-driven processing, standardizing processes, and significantly improving processing efficiency. The data acquisition module covers diverse data types, solving the management challenges of heterogeneous data sources and inconsistent formats, and achieving unified data control.

Description

Technical Field

[0001] This invention relates to the field of cultural gene data processing technology, and more specifically to an intelligent processing system for cultural gene data.

Background Technology

[0002] Cultural gene data, as a core carrier of national cultural heritage and regional characteristics, encompasses diverse types such as written documents, audio folklore, image artifacts, and digital archives. Its effective processing is a crucial prerequisite for the protection, inheritance, and innovative application of cultural resources. An intelligent processing system for cultural gene data refers to a technological system that integrates functions such as data collection, organization, analysis, mining, application, and display. It replaces traditional manual operations with technological means to achieve standardized management and value transformation of cultural gene data throughout its entire lifecycle. Currently, the technology for processing cultural gene data is gradually transforming towards digitalization and intelligence. Existing technologies employ single-function modules for partial data processing. For example, they use scanners, recording devices, and other terminals for data collection, basic algorithms for data format conversion and simple cleaning, databases for data storage, and simple charting tools for data visualization. Other technologies attempt to introduce artificial intelligence algorithms to extract keywords from textual data or perform feature recognition on image data, achieving preliminary intelligent analysis of some data. Simultaneously, a number of cultural resource databases have been established within the industry, enabling centralized storage and sharing of some cultural gene data by connecting to existing digital archives. These existing technologies have alleviated the reliance on manual processing of cultural gene data to some extent and promoted the digitization process, but overall, they remain in a decentralized and fragmented development stage. This leads to the following shortcomings in the processing of cultural gene data:

1. Existing technologies mostly make only single modules or partial processes intelligent. Data collection, processing, analysis, and display are fragmented and rely on manual intervention to connect the steps, resulting in low processing efficiency, low process standardization, and an inability to achieve end-to-end automated data processing. Furthermore, there is a lack of unified collection standards and processing links for diverse data types such as text, audio, images, and archives. Data from different sources differs greatly in format and is poorly compatible, resulting in insufficient heterogeneous data management capabilities and difficulty in achieving unified data control.

2. Existing technologies mostly remain at the level of basic data processing and storage, lacking the collaborative design of intelligent analysis, feature extraction, and processing execution modules. This makes it difficult to uncover the deeper value hidden in the data, such as inheritance relationships and regional connections, and also fails to effectively transform the data into high-value products such as restoration results, cultural and creative materials, and display resources. Consequently, the application potential of cultural gene data has not been fully released.

Summary of the Invention

[0003] To address the shortcomings of existing technologies, this invention provides an intelligent processing system for cultural gene data, which solves the problems mentioned in the background section.

[0004] To achieve the above objectives, the present invention provides the following technical solution: An intelligent processing system for cultural gene data includes a data acquisition module, a data preprocessing module, a data storage module, an intelligent analysis module, a feature extraction module, a judgment module, a processing execution module, a visualization module, and a permission management module. The data acquisition module is communicatively connected to the data preprocessing module, the data preprocessing module is communicatively connected to the data storage module, the data storage module is communicatively connected to the intelligent analysis module, the intelligent analysis module is communicatively connected to the feature extraction module, the feature extraction module is communicatively connected to the judgment module, the judgment module is communicatively connected to the processing execution module, the processing execution module is communicatively connected to the visualization module, and the permission management module is communicatively connected to the data storage module, the intelligent analysis module, and the processing execution module. The data acquisition module is used to collect raw data of diverse cultural genes, including textual document data, audio folk custom data, image cultural relic data, and digital archive data. The data acquisition module is a multi-source data acquisition terminal, including a scanner, recording equipment, a high-definition camera, and a data interface unit. The data preprocessing module is used to receive the raw data collected by the data acquisition module and to clean, deduplicate, and standardize the format of the raw data. The data storage module is used to store preprocessed cultural gene data and intermediate data during system operation, and adopts a distributed database architecture.
The intelligent analysis module is used to perform semantic parsing, association analysis, and temporal feature mining on the preprocessed cultural gene data. The feature extraction module is used to extract core feature information from cultural gene data, including cultural symbol features, inheritance lineage features, and regional association features. The judgment module is used to determine the processing requirements of cultural gene data based on the feature extraction results, including repair and optimization, classification and archiving, association recommendation, and revitalized application. The processing execution module is used to perform the corresponding data processing operations based on the judgment result. The visualization module is used to display the processed cultural gene data in the form of charts, 3D models, or interactive interfaces. The permission management module is used to hierarchically control system access permissions and data operation permissions.

[0005] Furthermore, the specific data acquisition process of the data acquisition module is as follows: For textual document data: the document image is captured by a scanner, and the text content is extracted by OCR to form two sets of data, namely image and text; For audio folk data: folk voices and traditional music audio data are collected using recording equipment, while the collection time, geographical information, and inheritor information are also recorded; For image-based cultural relic data: high-definition cameras are used to capture panoramic and detailed images of cultural relics, while metadata such as the size, material, and age of the relics is collected at the same time; For digitized archival data: the data interface unit connects to existing cultural databases to collect digitized cultural gene data in batches.

[0006] Furthermore, the specific processing procedure of the data preprocessing module is as follows: Data cleaning: Rule-based filtering algorithms are used to remove noisy, incomplete, and abnormal data from the original data; Data deduplication: By calculating and comparing data fingerprints, completely duplicate data is eliminated, while more complete versions of similar data are retained, and relationships are marked. Format standardization: Converting data from different sources into a unified standard format.

[0007] Furthermore: the specific analysis process of the intelligent analysis module is as follows: S1: Semantic parsing: Using the BERT pre-trained model to perform semantic understanding on text-based cultural gene data, extracting keywords, themes and sentiments, and generating semantic labels; S2: Association Analysis: Based on knowledge graph technology, a cultural gene association network is constructed. Using cultural symbols, regions, eras, and inheritors as nodes, the association strength between nodes is analyzed to uncover hidden inheritance relationships and integration paths. S3: Temporal Feature Mining: For cultural gene data with a time dimension, the LSTM time series model is used to analyze the changes in data over time and predict the development trend of cultural genes.

[0008] Furthermore: the specific extraction process of the feature extraction module is as follows: For textual document data: extract core keywords, document genre, author's school of thought, and core ideological features, and use the TF-IDF algorithm to calculate feature weights; For audio folk data: extract the spectral features, rhythm features, pitch features and language features of the audio, and convert the audio signal into a feature vector using the MFCC algorithm; For image-based cultural relic data: extract the shape features, texture features, color features, and ornamentation features of the cultural relic, and use the SIFT algorithm for feature point detection and description; For digitized archival data: extract the category, creation time, storage unit, and core content characteristics of the archives, and establish a feature index library.

[0009] Furthermore, the specific judgment process of the judgment module is as follows: SS1: Receive the feature vectors and weight values output by the feature extraction module and establish a processing requirement judgment model; SS2: When the integrity of the core features is below a preset threshold, a "repair and optimization" requirement is determined; SS3: When the category label in the feature vector is clear and there are fewer than 5 associated nodes, a "classification and archiving" requirement is determined; SS4: When there are ≥5 associated nodes in the feature vector and cross-regional or cross-era associations exist, an "association recommendation" requirement is determined; SS5: When the feature vector contains visual and interactive core elements, a "revitalized application" requirement is determined; SS6: Transmit the judgment result and the corresponding feature basis to the processing execution module.

[0010] Furthermore, the specific execution process of the processing execution module is as follows: For the "repair and optimization" requirement: generative adversarial networks are used to repair incomplete data. Specifically, missing content in text data is supplemented through context prediction, damaged areas in image data are repaired through texture transfer, and distorted parts of audio data are restored through spectral restoration; For the "classification and archiving" requirement: archives are created according to a three-level classification system of cultural type, region, and era, and a unique file number is generated and linked to the data storage module; For the "association recommendation" requirement: based on the node strength of the association network, relevant cultural gene data is recommended and an association recommendation list is generated, including data name, association basis, and similarity score; For the "revitalized application" requirement: cultural characteristics are transformed into applicable digital resources, including cultural and creative design materials, interactive scripts for folk activities, and 3D display models of cultural relics.
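The "association recommendation" branch above can be illustrated with a minimal sketch. The patent does not specify how the list is ranked or how the similarity score is derived, so this example assumes the score is simply the association strength of each candidate and ranks by it. The names `Recommendation` and `recommend` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    name: str     # data name
    basis: str    # association basis, e.g. "same inheritor"
    score: float  # similarity score (assumed equal to association strength)


def recommend(candidates: dict[str, tuple[str, float]], top_k: int = 3) -> list[Recommendation]:
    """Rank candidates by strength and return the top_k as a recommendation list.

    `candidates` maps a data name to (association basis, association strength).
    """
    ranked = sorted(candidates.items(), key=lambda kv: kv[1][1], reverse=True)
    return [Recommendation(name, basis, score) for name, (basis, score) in ranked[:top_k]]
```

Each entry carries the three fields the text lists (data name, association basis, similarity score), so the list can be passed to the visualization module without further lookup.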

[0011] Furthermore, the visualization module includes a text display unit, an image browsing unit, an audio playback unit, and a 3D interactive unit. The text display unit supports pagination, keyword highlighting, and annotation association. The image browsing unit supports zooming, rotation, detail magnification, and comparison viewing functions; The audio playback unit supports playback, pause, fast forward, and sound quality adjustment functions, while displaying audio waveforms and semantic tags; The 3D interactive unit supports rotating, scaling, and cross-sectional viewing of 3D models of cultural relics, and supports virtual roaming of folk custom scenes.

[0012] Furthermore, the specific control process of the permission management module is as follows: Three permission levels are set: administrator privileges, operation privileges, and browsing privileges; Administrator privileges: full permissions for system configuration, data import / export, permission assignment, and data deletion; Operation privileges: permission to collect, preprocess, analyze, and execute processing on data, without permission to delete data; Browsing privileges: viewing permission for the visualization module only, with no data operation permissions; Permission verification uses two-factor authentication combining an account password with a dynamic verification code, and operation logs are recorded in real time and stored in the data storage module.
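The three-level scheme above can be sketched as a role-to-action table with a logged permission check. The action names, the `Role` enum, and `is_allowed` are hypothetical. The sketch assumes that the administrator's "full permissions" cover every action, and it grants operators only the four operations the text names. Two-factor authentication is outside its scope.

```python
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    OPERATOR = "operator"
    BROWSER = "browser"


OPERATOR_ACTIONS = {"collect", "preprocess", "analyze", "execute"}
ADMIN_ONLY_ACTIONS = {"configure", "import", "export", "assign_permissions", "delete"}

PERMISSIONS = {
    # Assumption: "full permissions" means every action in the system.
    Role.ADMINISTRATOR: OPERATOR_ACTIONS | ADMIN_ONLY_ACTIONS | {"view"},
    # Operators may process data but may not delete it.
    Role.OPERATOR: OPERATOR_ACTIONS,
    # Browsers may only view the visualization module.
    Role.BROWSER: {"view"},
}


def is_allowed(role: Role, action: str, audit_log: list[tuple[str, str, bool]]) -> bool:
    """Check a permission and append the attempt to the operation log."""
    allowed = action in PERMISSIONS[role]
    audit_log.append((role.value, action, allowed))
    return allowed
```

Logging denied attempts as well as granted ones is a design choice of this sketch. The text only requires that operation logs be recorded in real time.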

[0013] Furthermore, the system also includes a data update module, which is communicatively connected to the data acquisition module and the data storage module. The data update module has a built-in timed acquisition unit and an incremental update unit. The timed acquisition unit can be configured with a collection cycle and automatically triggers the data acquisition module to update the cultural gene data of a specified category. The incremental update unit compares newly collected data with stored data and updates only the changed data, reducing data transmission and storage pressure. Once the data update is complete, the intelligent analysis module and feature extraction module are automatically triggered to reprocess the updated data, and the content displayed in the visualization module is updated at the same time.
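The incremental update unit can be illustrated with a minimal sketch that writes only new or changed records and reports which ones changed. The patent does not describe how differences are detected, so this example assumes records are keyed by an identifier and compared through a SHA-256 fingerprint of their canonical JSON form. The names `fingerprint` and `incremental_update` are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a record's canonical JSON form so equal content gives equal hashes."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def incremental_update(stored: dict[str, dict], incoming: dict[str, dict]) -> list[str]:
    """Write only new or changed records into `stored`; return their ids."""
    changed = []
    for record_id, record in incoming.items():
        old = stored.get(record_id)
        if old is None or fingerprint(old) != fingerprint(record):
            stored[record_id] = record
            changed.append(record_id)
    return changed
```

The returned ids are the natural input for re-running analysis and feature extraction on updated data only. Deletions at the source are not handled in this sketch.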

[0014] This invention provides an intelligent processing system for cultural gene data. Compared with existing technologies, it has the following advantages:

1. A fully intelligent system is built, from data collection to visualization, replacing fragmented processing dominated by manual intervention, standardizing processes, and significantly improving processing efficiency. The data acquisition module covers diverse data types, solving the management problems of heterogeneous data sources and inconsistent formats, and achieving unified data control.

2. The permission management module is linked with the core business modules to establish a hierarchical control mechanism, which structurally ensures the security of data access and operation and reduces the risk of leakage and misoperation. The collaborative design of the intelligent analysis, feature extraction, and processing execution modules enables in-depth mining and value transformation of cultural gene data, maximizing the application value of the data.

3. By adding a data update module, a dual update mechanism of timed and incremental updates is constructed to ensure the timeliness of cultural gene data and solve the problem of data lag. The incremental update unit only updates the changed data, reducing the amount of data transmission and storage pressure, and improving update efficiency. After the update, the intelligent analysis and feature extraction modules are automatically triggered to reprocess the data, and the visualization module and data storage module are updated synchronously to keep the updated data consistent with the original data and maintain the accuracy of system analysis and display.

Attached Figure Description

[0015] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0016] Figure 1 shows a system block diagram of the present invention.

Detailed Implementation

[0017] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0018] Example: To address the technical problems in the background section, the following intelligent processing system for cultural gene data is proposed. As shown in Figure 1, the present invention provides an intelligent processing system for cultural gene data, including a data acquisition module, a data preprocessing module, a data storage module, an intelligent analysis module, a feature extraction module, a judgment module, a processing execution module, a visualization module, and a permission management module. The data acquisition module is communicatively connected to the data preprocessing module, the data preprocessing module is communicatively connected to the data storage module, the data storage module is communicatively connected to the intelligent analysis module, the intelligent analysis module is communicatively connected to the feature extraction module, the feature extraction module is communicatively connected to the judgment module, the judgment module is communicatively connected to the processing execution module, the processing execution module is communicatively connected to the visualization module, and the permission management module is communicatively connected to the data storage module, the intelligent analysis module, and the processing execution module. The data acquisition module is used to collect raw data of diverse cultural genes, including textual document data, audio folk custom data, image cultural relic data, and digital archive data. The data acquisition module is a multi-source data acquisition terminal, including a scanner, recording equipment, a high-definition camera, and a data interface unit. The data preprocessing module is used to receive the raw data collected by the data acquisition module and to clean, deduplicate, and standardize the format of the raw data.
The data storage module is used to store preprocessed cultural gene data and intermediate data during system operation, and adopts a distributed database architecture. The intelligent analysis module is used to perform semantic parsing, association analysis, and temporal feature mining on the preprocessed cultural gene data. The feature extraction module is used to extract core feature information from cultural gene data, including cultural symbol features, inheritance lineage features, and regional association features. The judgment module is used to determine the processing requirements of cultural gene data based on the feature extraction results, including repair and optimization, classification and archiving, association recommendation, and revitalized application. The processing execution module is used to perform the corresponding data processing operations based on the judgment result. The visualization module is used to display the processed cultural gene data in the form of charts, 3D models, or interactive interfaces. The permission management module is used to hierarchically control system access permissions and data operation permissions.

[0019] The above solution constructs a fully intelligent system from data acquisition to visualization by setting up a data acquisition module, a data preprocessing module, a data storage module, an intelligent analysis module, a feature extraction module, a judgment module, a processing execution module, a visualization module, and a permission management module. This replaces fragmented processing dominated by manual intervention, standardizes processes, and significantly improves processing efficiency. The data acquisition module covers diverse data types, solving the management challenges of heterogeneous data sources and inconsistent formats, and achieving unified data control. The permission management module works in conjunction with the core business modules to establish a hierarchical control mechanism, structurally ensuring data access and operation security and reducing the risk of leakage and misoperation. The collaborative design of the intelligent analysis, feature extraction, and processing execution modules enables in-depth mining and value transformation of cultural gene data, maximizing the application value of the data.

[0020] In this embodiment, the specific data acquisition process of the data acquisition module is as follows: For textual document data: The document image is captured by a scanner, and the text content is extracted by OCR recognition technology to form two sets of data, namely image and text. For audio folk data: collect folk voice and traditional music audio data through recording equipment, with a sampling rate of not less than 44.1kHz, and record the collection time, geographical information and information of the inheritors; For image-based cultural relic data: panoramic and detailed images of cultural relics are acquired using high-definition cameras, with a resolution of no less than 20 million pixels and an RGB color mode. At the same time, metadata such as the size, material, and age of the cultural relics are also acquired. For digitized archival data: Data interface units are used to connect to existing cultural databases, enabling batch collection of digitized cultural gene data. Multiple formats, including XML, JSON, and CSV, are supported for import. The proposed solution establishes dedicated collection processes for four data types: text, audio, images, and archives. This ensures a unified collection standard and guarantees the integrity and standardization of different data types. Metadata such as region, inheritor, and artifact age are recorded simultaneously during collection, providing foundational data support for subsequent intelligent analysis modules' correlation analysis and time-series mining, thus enhancing data relevance. The digitized archival data is imported in batches through data interface units, achieving seamless integration with existing cultural databases, reducing redundant collection work, and improving data collection efficiency. 
The standardized collection process ensures uniform raw data format, facilitating subsequent preprocessing modules' cleaning, deduplication, and format standardization, and reducing preprocessing difficulty.
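The batch import of digitized archives in XML, JSON, and CSV, together with the minimum capture specifications stated above, can be sketched as follows. This is a minimal illustration. The function names, the flat record layout, and the assumed XML shape (each child of the root element is one record whose sub-elements are fields) are hypothetical and not taken from the patent.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Minimum capture specifications stated in the embodiment.
MIN_SPEC = {"audio_sample_rate_hz": 44_100, "image_pixels": 20_000_000}


def meets_capture_spec(kind: str, value: float) -> bool:
    """Return True when a captured item reaches the stated minimum."""
    return value >= MIN_SPEC[kind]


def load_records(payload: str, fmt: str) -> list[dict]:
    """Parse one batch of archive records into a list of flat dicts."""
    fmt = fmt.lower()
    if fmt == "json":
        data = json.loads(payload)
        # Accept either a single object or a list of objects.
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(payload))]
    if fmt == "xml":
        root = ET.fromstring(payload)
        # Assumed layout: root -> record elements -> field elements.
        return [{field.tag: (field.text or "") for field in rec} for rec in root]
    raise ValueError(f"unsupported format: {fmt}")
```

Normalizing all three formats to the same list-of-dicts shape at the interface keeps the preprocessing module independent of the source format.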

[0021] In this embodiment, the specific processing procedure of the data preprocessing module is as follows: Data cleaning: Rule-based filtering algorithms are used to remove noisy, incomplete, and abnormal data from the original data. Data with more than 30% incompleteness is directly removed, while data with less than 30% incompleteness is filled by interpolation. Data deduplication: By calculating and comparing MD5 data fingerprints, completely duplicate data is eliminated. For similar data, meaning data with a similarity of ≥85%, the version with more complete information is retained and the association relationship is marked. Format standardization: Data from different sources is uniformly converted into a standard format. Text data is uniformly encoded in UTF-8, image data is converted to PNG format, audio data is converted to WAV format, and metadata fields are uniformly specified according to the Dublin Core standard. In the above scheme, data cleaning adopts a rule-based filtering algorithm, deduplication adopts data fingerprint comparison, and format standardization adopts a unified conversion rule, ensuring the stability and consistency of preprocessing results at the process level. The cleaning stage removes noise and abnormal data, and completes or removes incomplete data. The deduplication stage retains complete data and marks associations, effectively improving data quality and providing high-quality input data for the intelligent analysis module, which supports accurate analysis results. The format standardization stage uniformly converts heterogeneous data into a standard format, removing barriers to data flow and ensuring that the preprocessed data can be passed seamlessly to the data storage module and the intelligent analysis module, improving the overall flow efficiency of the system.
The noise-free, duplicate-free, and uniformly formatted preprocessed data reduces the storage pressure on the data storage module and lays the foundation for accurate extraction by the feature extraction module.
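The cleaning and deduplication rules above can be sketched with the standard library. This is an illustration under stated assumptions, not the patented implementation. Similarity is measured with `difflib.SequenceMatcher`, since the patent does not name a similarity measure. The "more complete" version is taken to be the longer text. Interpolation of partially incomplete records is omitted. All function names are hypothetical.

```python
import hashlib
from difflib import SequenceMatcher

MAX_MISSING_RATIO = 0.30     # records above this are removed
SIMILARITY_THRESHOLD = 0.85  # at or above this, two texts count as similar


def missing_ratio(record: dict) -> float:
    """Fraction of fields that are empty or None."""
    if not record:
        return 1.0
    return sum(1 for v in record.values() if v in (None, "")) / len(record)


def clean(records: list[dict]) -> list[dict]:
    """Drop records whose incompleteness exceeds the 30% limit."""
    return [r for r in records if missing_ratio(r) <= MAX_MISSING_RATIO]


def fingerprint(text: str) -> str:
    """MD5 fingerprint, used only for duplicate detection."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def deduplicate(texts: list[str]) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (kept texts, links from each kept version to a similar dropped one)."""
    kept: list[str] = []
    links: list[tuple[str, str]] = []
    seen: set[str] = set()
    for text in texts:
        fp = fingerprint(text)
        if fp in seen:
            continue  # exact duplicate
        seen.add(fp)
        for i, other in enumerate(kept):
            if SequenceMatcher(None, text, other).ratio() >= SIMILARITY_THRESHOLD:
                # Assumption: the longer text is the more complete version.
                longer, shorter = (text, other) if len(text) > len(other) else (other, text)
                kept[i] = longer
                links.append((longer, shorter))
                break
        else:
            kept.append(text)
    return kept, links
```

The pairwise similarity scan is quadratic in the number of kept items, which is acceptable for a sketch. A production system would likely need an indexing scheme for near-duplicate search.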

[0022] In this embodiment, the specific analysis process of the intelligent analysis module is as follows: S1: Semantic parsing: Using the BERT pre-trained model to perform semantic understanding on text-based cultural gene data, extracting keywords, themes, and sentiments, and generating semantic labels; S2: Association analysis: Based on knowledge graph technology, a cultural gene association network is constructed. Using cultural symbols, regions, eras, and inheritors as nodes, the association strength between nodes is analyzed to uncover hidden inheritance relationships and integration paths; S3: Temporal feature mining: For cultural gene data with a time dimension, including but not limited to the evolution of folk activities and records of cultural relic restoration, the LSTM time series model is used to analyze the changes in data over time and predict the development trend of cultural genes. In the above scheme, a combination of the BERT pre-trained model, a knowledge graph, and the LSTM time series model is used, with a dedicated process for each analysis need, to improve the depth and accuracy of analysis. The BERT model performs semantic parsing of text data, extracting keywords, themes, and sentiments, and the generated semantic labels provide text features for the feature extraction module. Knowledge graph technology constructs a cultural gene association network with four core node types to mine hidden inheritance relationships and integration paths, enriching the association value of the data. The LSTM time series model performs trend prediction for data with a time dimension, providing data support for the "revitalized application" handled by the processing execution module. The structured analysis process makes the analysis reproducible, improves system stability, and provides comprehensive input for the feature extraction module.
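Step S2 can be illustrated with a minimal association network built from co-occurrence. The patent does not define how association strength is computed, so this sketch assumes it is the number of records in which two nodes appear together. The node labels and the function name `association_strengths` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def association_strengths(records: list[set[str]]) -> Counter:
    """Count how often each pair of nodes appears in the same record.

    Each record is the set of nodes (symbol, region, era, inheritor)
    attached to one piece of cultural gene data. Pairs are stored in
    sorted order so (a, b) and (b, a) share one edge.
    """
    strengths: Counter = Counter()
    for nodes in records:
        for a, b in combinations(sorted(nodes), 2):
            strengths[(a, b)] += 1
    return strengths
```

For example, if two records both carry `symbol:dragon` and `region:Zhejiang`, the edge between those nodes has strength 2. Calling `strengths.most_common()` then lists the strongest associations first.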

[0023] In this embodiment, the specific extraction process of the feature extraction module is as follows: For textual document data: extract core keywords, document genre, author's school of thought, and core ideological features, and use the TF-IDF algorithm to calculate feature weights; For audio folk data: extract the spectral features, rhythm features, pitch features and language features of the audio, and convert the audio signal into a feature vector using the MFCC algorithm; For image-based cultural relic data: extract the shape features, texture features, color features, and ornamentation features of the cultural relic, and use the SIFT algorithm for feature point detection and description; For digitized archival data: extract the category, creation time, storage unit, and core content features of the archives, and establish a feature index library. In the above scheme, dedicated extraction schemes are designed for four types of data: text, audio, images, and archives, to achieve accurate feature extraction and improve feature quality. The extracted features cover the core attributes of the data, and feature weights are calculated through algorithms to establish an index library, providing a quantitative basis for the judgment module's needs. Feature vectorization and weight calculation processing make feature data easier for the judgment module's model to receive and analyze, improving judgment efficiency. The structured extraction process ensures that the feature format of different types of data is unified, facilitating unified processing by the subsequent judgment module. At the same time, the feature index library supports efficient storage and retrieval by the data storage module.
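The TF-IDF weighting used for textual features can be sketched as below. This is the classic unsmoothed form, with term frequency normalized by document length and an inverse document frequency of log(N / df). The patent does not state which variant it uses, so the exact formula is an assumption, and documents are assumed to be already tokenized.

```python
import math
from collections import Counter


def tf_idf(documents: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document receives weight 0 under this variant, so only terms that distinguish a document contribute to its feature vector.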

[0024] In this embodiment, the specific judgment process of the judgment module is as follows: SS1: Receive the feature vectors and weight values output by the feature extraction module and establish a processing requirement judgment model; SS2: When the integrity of core features is lower than a preset threshold, which is 60% in this embodiment, a "repair and optimization" requirement is determined; SS3: When the category label in the feature vector is clear and there are fewer than 5 associated nodes, a "classification and archiving" requirement is determined; SS4: When there are ≥5 associated nodes in the feature vector and there are cross-regional / cross-era associations, an "association recommendation" requirement is determined; SS5: When the feature vector contains visual and interactive core elements, including but not limited to decorative patterns and folk customs, a "revitalized application" requirement is determined; SS6: Transmit the judgment result and the corresponding feature basis to the processing execution module. In the above scheme, a judgment model based on feature vectors and weight values is established, and quantitative judgment rules are formulated to match requirement types accurately, replacing subjective judgment and improving judgment accuracy. The judgment rules for the four requirement types correspond one-to-one with the four processing schemes of the processing execution module, so that the judgment results can directly guide processing execution, improving the efficiency of process connection. The judgment process is structured and reproducible, ensuring that the judgment standards applied to different batches of data are consistent and improving system stability. 
The judgment results are accompanied by feature basis, which facilitates targeted operations by the subsequent processing execution module and provides a basis for operation log recording, thereby improving traceability.
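
The judgment rules SS2 to SS5 can be expressed as a small rule function. The sketch below takes 60% as the preset integrity threshold, as in the rule summary of the working process, and 5 as the associated-node threshold. The source does not say how overlapping rules are resolved, so returning every matching requirement is an assumption, as are the `FeatureSummary` fields and the function name `judge_requirements`.

```python
from dataclasses import dataclass

INTEGRITY_THRESHOLD = 0.60  # preset threshold for core feature integrity
NODE_THRESHOLD = 5          # associated-node count separating SS3 from SS4


@dataclass
class FeatureSummary:
    """Hypothetical summary of what the feature extraction module outputs."""
    core_integrity: float                  # fraction in [0, 1]
    has_clear_category: bool
    associated_nodes: int
    cross_region_or_era: bool
    has_visual_interactive_elements: bool


def judge_requirements(f):
    """Return every processing requirement whose rule matches (assumption)."""
    requirements = []
    if f.core_integrity < INTEGRITY_THRESHOLD:                           # SS2
        requirements.append("repair and optimization")
    if f.has_clear_category and f.associated_nodes < NODE_THRESHOLD:     # SS3
        requirements.append("classification and archiving")
    if f.associated_nodes >= NODE_THRESHOLD and f.cross_region_or_era:   # SS4
        requirements.append("association recommendation")
    if f.has_visual_interactive_elements:                                # SS5
        requirements.append("revitalized application")
    return requirements
```

For example, a record with 45% core feature integrity, a clear category label, and two associated nodes would be routed to both repair and archiving under this reading.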

[0025] In this embodiment, the specific execution process of the processing execution module is as follows: For the "repair and optimization" requirement: a Generative Adversarial Network (GAN) is used to repair incomplete data; missing text content is supplemented through context prediction, damaged image areas are repaired through texture transfer, and distorted audio segments are recovered through spectral restoration; For the "classification and archiving" requirement: data is archived according to a three-level classification system of cultural type, region, and era, and a unique archive number is generated and linked to the data storage module, where cultural types include but are not limited to folklore, cultural relics, documents, and art, regional information includes province, city, and county, and eras include ancient, modern, and contemporary; For the "association recommendation" requirement: based on the node strength of the association network, relevant cultural gene data is recommended and an association recommendation list is generated, including data name, association basis, and similarity score; For the "revitalized application" requirement: cultural gene features are transformed into usable digital resources, including cultural and creative design materials, interactive scripts for folk activities, and 3D display models of cultural relics. In the above scheme, a dedicated processing scheme is provided for each of the four requirement types, so that processing is targeted and results improve. The repair and optimization scheme uses a GAN to complete missing text, repair damaged images, and restore distorted audio, addressing damaged cultural gene data and restoring its integrity and usability. 
The classification and archiving solution employs a three-tiered system of "cultural type-region-era," generating unique archive numbers and storing them in a linked manner to achieve orderly data management and facilitate subsequent retrieval. The system enhances data management efficiency through efficient data access and invocation. The associated recommendation scheme generates a recommendation list with similarity scores based on the strength of associated network nodes, uncovering hidden connections between data to provide users with accurate data references and deepen data application. The revitalization application scheme transforms cultural gene characteristics into digital resources such as cultural and creative materials, interactive scripts, and 3D models, enabling innovative applications of cultural gene data and promoting cultural inheritance and dissemination. Processed high-value data is directly transmitted to the visualization module, while categorized and archived data is associated with the data storage module, ensuring smooth data flow and achieving seamless integration of "processing-storage-display," thus improving overall system collaboration efficiency.
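
One way to generate a unique archive number under the three-level "cultural type-region-era" system is sketched below. The number format, the short category codes, and the class name `ArchiveRegistry` are assumptions made for the example; the source only requires that the number be unique and linked to stored data. The sketch also assumes that the codes themselves contain no hyphens.

```python
from collections import defaultdict


class ArchiveRegistry:
    """Hypothetical in-memory stand-in for the link to the data storage module."""

    def __init__(self):
        self._counters = defaultdict(int)  # (type, region, era) -> last sequence
        self.index = {}                    # archive number -> record id

    def file(self, record_id, cultural_type, region, era):
        """Assign the next archive number within the three-level category."""
        key = (cultural_type, region, era)
        self._counters[key] += 1
        number = f"{cultural_type}-{region}-{era}-{self._counters[key]:06d}"
        self.index[number] = record_id
        return number


registry = ArchiveRegistry()
first = registry.file("rec-001", "FOLK", "ZJ.NB", "MODERN")
second = registry.file("rec-002", "FOLK", "ZJ.NB", "MODERN")
```

Keeping a separate counter per category means the number itself encodes the classification path, which supports retrieval by prefix.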

[0026] In this embodiment, the visualization module includes a text display unit, an image browsing unit, an audio playback unit, and a 3D interaction unit. The text display unit supports pagination, keyword highlighting, and annotation association. The image browsing unit supports zooming, rotation, detail magnification, and comparison viewing functions; The audio playback unit supports playback, pause, fast forward, and sound quality adjustment functions, while displaying audio waveforms and semantic tags; The 3D interactive unit supports rotating, scaling, and cross-sectional viewing of 3D models of cultural relics, and supports virtual roaming of folk custom scenes. In the above solution, the visualization module has built-in four functional units: text, image, audio, and 3D. It designs exclusive display functions for different types of data to improve the relevance of the display and the user experience. The rich interactive functions enhance the interaction between users and data, making cultural gene data easier to understand and apply, and improving the data dissemination effect. The visualization module directly receives high-value data from the processing and execution module, realizes the instant presentation of processing results, and improves the efficiency of process connection. The standardized display functions ensure that users with different permissions can quickly get started, and at the same time, the displayed content corresponds one-to-one with the processing results to ensure the accuracy of data presentation.

[0027] In this embodiment, the specific control process of the permission management module is as follows: three levels of permissions are set: administrator permissions, operation permissions, and browsing permissions; Administrator permissions: full permissions for system configuration, data import / export, permission assignment, and data deletion; Operation permissions: permissions for data collection, preprocessing, analysis, and processing execution, without permission to delete data; Browsing permissions: viewing permissions for the visualization module only, without data operation permissions; permission verification uses dual authentication combining an account password with a dynamic verification code, and operation logs are recorded in real time and stored in the data storage module. In the above scheme, the three permission levels of administrator, operator, and browsing user have clearly separated operational scopes, which structurally prevents both operations beyond a user's authority and gaps in authorization. Dual authentication with account password and dynamic verification code strengthens login security and reduces the risk of unauthorized access. Recording operation logs in real time and storing them in the data storage module establishes a complete operation traceability mechanism, which facilitates accountability for security issues and improves system maintainability. The permission management module is linked to the core business modules, so that users with different permissions can only operate the modules corresponding to their level, safeguarding data security throughout the process and supporting stable system operation.
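
The three permission levels and the operation log can be sketched as a simple lookup with logging. The action names, the `Role` enumeration, and the in-memory `operation_log` list are assumptions made for the example; in the described system the log is stored in the data storage module, and authentication by password and dynamic verification code is outside the scope of this sketch.

```python
from datetime import datetime, timezone
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    OPERATOR = "operator"
    BROWSER = "browser"


# Action names are hypothetical labels for the permissions listed in the text.
PERMISSIONS = {
    Role.ADMINISTRATOR: {
        "configure", "import_export", "assign_permissions", "delete",
        "collect", "preprocess", "analyze", "execute", "view",
    },
    # The text lists only these four permissions for operators.
    Role.OPERATOR: {"collect", "preprocess", "analyze", "execute"},
    Role.BROWSER: {"view"},
}

operation_log = []  # stand-in for log storage in the data storage module


def authorize(user, role, action):
    """Check an action against the role's permissions and log the attempt."""
    allowed = action in PERMISSIONS[role]
    operation_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role.value,
        "action": action,
        "allowed": allowed,
    })
    return allowed
```

Logging denied attempts as well as permitted ones is a design choice of this sketch; it keeps the trace useful when investigating attempted operations beyond a user's authority.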

[0028] In this embodiment, a data update module is also included, which is communicatively connected to the data acquisition module and the data storage module; the data update module has a built-in timed acquisition unit and an incremental update unit; the timed acquisition unit can be configured with a collection cycle and automatically triggers the data acquisition module to update cultural gene data of a specified category; the incremental update unit compares newly collected data with stored data and updates only the changed data, reducing data transmission and storage pressure; after the data update is completed, the intelligent analysis module and the feature extraction module are automatically triggered to reprocess the updated data, and the content displayed by the visualization module is updated synchronously. In the above scheme, adding the data update module and connecting it to the data acquisition module and the data storage module builds a dual update mechanism, timed and incremental, that keeps cultural gene data current and addresses data lag. Because the incremental update unit updates only the differing data, the volume of transmitted data and the storage pressure are reduced and update efficiency improves. After an update, the intelligent analysis and feature extraction modules are automatically triggered to reprocess the data, and the visualization module and data storage module are updated synchronously, so that the updated data stays consistent across modules and the accuracy of system analysis and display is maintained. The configurable collection cycle of the timed acquisition unit allows update tasks to run automatically, reducing manual intervention, raising the system's level of intelligence, and keeping data fresh during long-term stable operation of the system.
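
The comparison performed by the incremental update unit can be sketched with content fingerprints. The source does not state how differences are detected for updates; comparing SHA-256 digests per record, the dictionary-based store, and the function names are assumptions made for the example.

```python
import hashlib


def fingerprint(payload: bytes) -> str:
    """Content fingerprint of a record (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(payload).hexdigest()


def incremental_update(stored, collected):
    """Update only records whose content changed and return their ids.

    stored:    {record_id: fingerprint} for data already in storage (mutated)
    collected: {record_id: bytes} for newly collected data
    """
    changed = []
    for record_id, payload in collected.items():
        digest = fingerprint(payload)
        if stored.get(record_id) != digest:
            stored[record_id] = digest
            changed.append(record_id)
    return changed


stored = {"rec-001": fingerprint(b"dragon boat festival")}
changed = incremental_update(
    stored,
    {"rec-001": b"dragon boat festival", "rec-002": b"lantern festival"},
)
```

The returned list of changed ids is what would then be handed to the intelligent analysis and feature extraction modules for reprocessing, leaving unchanged records untouched.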

[0029] Working principle and usage process of this invention: Data collection: a data collection task is initiated manually or triggered on schedule by the data update module; the scanner captures document images and OCR recognizes the text, producing two sets of data, image and text; the recording device captures audio and simultaneously records metadata such as the collection time, region, and inheritor; the high-definition camera captures panoramic and detailed images and records information such as the size, material, and age of the cultural relics; the data interface unit connects to existing databases to import data in batches; the raw data of diverse cultural genes is transmitted to the data preprocessing module. Data preprocessing: the data preprocessing module receives raw data from the data acquisition module; it filters noise and abnormal data based on rules, discarding data that is more than 30% incomplete and completing data that is less than 30% incomplete by interpolation; it calculates MD5 data fingerprints, removes completely duplicate data, and for similar data with ≥85% similarity retains the more complete version and marks the association; it converts the data into unified standard formats, namely UTF-8 for text, PNG for images, and WAV for audio, with metadata following the Dublin Core standard; the cleaned, de-duplicated, and format-standardized data is then transmitted to the data storage module for backup. Intelligent analysis and feature extraction: after the data storage module completes storage of the standardized data, the analysis process is triggered automatically; the BERT model extracts text keywords and themes to generate semantic labels; a knowledge graph is constructed that links cultural genes to four types of network nodes (cultural symbols, regions, eras, and inheritors) and explores inheritance relationships; and the LSTM model analyzes time-dimension data and predicts the development trend of cultural genes. 
Feature extraction: for text, features such as keywords, genre, and core ideas are extracted and weighted using the TF-IDF algorithm; for audio, the MFCC algorithm converts the signal into spectral, rhythm, and pitch feature vectors; for images, the SIFT algorithm extracts feature points describing shape, texture, and ornamentation; for archives, features such as category, creation time, and core content are extracted and a feature index library is built; the feature vectors, weight values, and association network data are transmitted to the judgment module. Processing requirement judgment: the judgment module receives feature data from the feature extraction module; it establishes a processing requirement judgment model that takes feature vectors and weight values as input; it determines the requirement type according to the rules: core feature integrity < 60% → "repair and optimization"; clear category label and < 5 associated nodes → "classification and archiving"; ≥ 5 associated nodes with cross-regional / cross-era associations → "association recommendation"; contains visual / interactive core elements → "revitalized application"; it then outputs the judgment result and feature basis and transmits them to the processing execution module. 
Processing and execution: The processing and execution module receives the requirement type instructions from the judgment module; it uses a GAN network to complete incomplete text content, repair damaged areas of images, and restore distorted audio; it archives the data according to a three-level system of "culture type-region-era", generates a unique archive number, and stores it in association; based on the strength of the associated network nodes, it generates a recommendation list containing data names, association criteria, and similarity scores; it transforms the data into digital resources such as cultural and creative materials, folk interaction scripts, and 3D models of cultural relics; the processed, standardized, and high-value cultural gene data is then transmitted to the visualization module. Visualization and access control: After the processing module completes the data processing, it synchronizes it to the visualization module; Visual presentation: Text: Pagination browsing, keyword highlighting, and annotation association; Image: Zoom, rotate, zoom in on details, compare and view; Audio: Play / Pause / Fast Forward, sound quality adjustment, and simultaneous display of waveform and semantic tags; 3D: Rotation / section of cultural relic models, virtual tour of folk custom scenes; Access control: Upon login, dual authentication is performed using an account and password along with a dynamic verification code; during this process, administrators can perform all operations, operators can only process data, and visitors can only view data; log entries record operations in real time and store them in the data storage module; and an interactive cultural gene data display interface is provided for users with different permissions. 
Dynamic data updates: The system is updated either by scheduled data collection or manually; the incremental update unit compares the newly collected data with the stored data and updates only the changed parts; the intelligent analysis and feature extraction modules are automatically triggered to reprocess the updated data; the visualization module's display content and the data storage module's backup are updated synchronously; and the system data is synchronized in real time to ensure the timeliness and integrity of the cultural gene data.
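
The cleaning and de-duplication rules in the data preprocessing step of the working process above (the 30% incompleteness rule and MD5 fingerprints) can be sketched as follows. Treating a record as a dictionary of fields, measuring incompleteness as the fraction of empty fields, and fingerprinting a canonical JSON serialization are assumptions made for the example; interpolation of partially incomplete records and the ≥85% similarity check are omitted.

```python
import hashlib
import json

MAX_MISSING_RATIO = 0.30  # records more incomplete than this are discarded


def missing_ratio(record):
    """Fraction of fields that are empty (assumed incompleteness measure)."""
    if not record:
        return 1.0
    missing = sum(1 for value in record.values() if value in (None, ""))
    return missing / len(record)


def md5_fingerprint(record):
    """MD5 over a canonical JSON form, so field order does not matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def clean_and_deduplicate(records):
    """Drop overly incomplete records, then drop exact duplicates."""
    seen = set()
    kept = []
    for record in records:
        if missing_ratio(record) > MAX_MISSING_RATIO:
            continue  # too incomplete to repair
        digest = md5_fingerprint(record)
        if digest in seen:
            continue  # completely duplicate data
        seen.add(digest)
        kept.append(record)
    return kept


raw = [
    {"title": "Boat song", "region": "Zhejiang", "era": None, "type": "folk"},
    {"type": "folk", "era": None, "region": "Zhejiang", "title": "Boat song"},
    {"title": "Fragment", "region": None, "era": None, "type": "folk"},
]
cleaned = clean_and_deduplicate(raw)
```

In this example the second record is an exact duplicate of the first with its fields in a different order, and the third is 50% incomplete, so only the first record survives.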

[0030] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0031] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An intelligent processing system for cultural gene data, characterized in that: It includes a data acquisition module, a data preprocessing module, a data storage module, an intelligent analysis module, a feature extraction module, a judgment module, a processing execution module, a visualization module, and a permission management module; The data acquisition module is communicatively connected to the data preprocessing module, the data preprocessing module is communicatively connected to the data storage module, the data storage module is communicatively connected to the intelligent analysis module, the intelligent analysis module is communicatively connected to the feature extraction module, the feature extraction module is communicatively connected to the judgment module, the judgment module is communicatively connected to the processing execution module, the processing execution module is communicatively connected to the visualization module, and the permission management module is communicatively connected to the data storage module, the intelligent analysis module, and the processing execution module. The data acquisition module is used to collect raw data of diverse cultural genes, including textual document data, audio folk custom data, image cultural relic data and digital archive data. The data acquisition module is a multi-source data acquisition terminal, including a scanner, recording equipment, high-definition camera and data interface unit. The data preprocessing module is used to receive the raw data collected by the data acquisition module and to clean, deduplicate, and standardize the format of the raw data. The data storage module is used to store preprocessed cultural gene data and intermediate data during system operation, and adopts a distributed database architecture. The intelligent analysis module is used to perform semantic parsing, correlation analysis, and temporal feature mining on the preprocessed cultural gene data. 
The feature extraction module is used to extract core feature information from cultural gene data, including cultural symbol features, inheritance lineage features, and regional association features; The judgment module is used to determine the processing requirements of cultural gene data based on the feature extraction results, including repair and optimization, classification and archiving, association recommendation, and revitalized application; The processing execution module is used to perform corresponding data processing operations based on the judgment result; The visualization module is used to display the processed cultural gene data in the form of charts, 3D models, or interactive interfaces; The permission management module is used to hierarchically control system access permissions and data operation permissions.

2. The intelligent processing system for cultural gene data according to claim 1, characterized in that: The specific data acquisition process of the data acquisition module is as follows: For textual document data: The document image is captured by a scanner, and the text content is extracted by OCR recognition technology to form two sets of data, namely image and text. Regarding audio folk data: folk voices and traditional music audio data are collected using recording equipment, while also recording the collection time, geographical information, and information of the inheritors; For image-based cultural relic data: High-definition cameras are used to capture panoramic and detailed images of cultural relics, while simultaneously collecting metadata such as the size, material, and age of the relics; For digitized archival data: Connect to existing cultural databases through data interface units to collect digitized cultural gene data in batches.

3. The intelligent processing system for cultural gene data according to claim 2, characterized in that: The specific processing procedure of the data preprocessing module is as follows: Data cleaning: Rule-based filtering algorithms are used to remove noisy, incomplete, and abnormal data from the original data; Data deduplication: By calculating and comparing data fingerprints, completely duplicate data is eliminated, while more complete versions of similar data are retained, and relationships are marked. Format standardization: Converting data from different sources into a unified standard format.

4. The intelligent processing system for cultural gene data according to claim 3, characterized in that: The specific analysis process of the intelligent analysis module is as follows: S1: Semantic parsing: Using the BERT pre-trained model to perform semantic understanding on text-based cultural gene data, extracting keywords, themes and sentiments, and generating semantic labels; S2: Association Analysis: Based on knowledge graph technology, a cultural gene association network is constructed. Using cultural symbols, regions, eras, and inheritors as nodes, the association strength between nodes is analyzed to uncover hidden inheritance relationships and integration paths. S3: Temporal Feature Mining: For cultural gene data with a time dimension, the LSTM time series model is used to analyze the changes in data over time and predict the development trend of cultural genes.

5. The intelligent processing system for cultural gene data according to claim 4, characterized in that: The specific extraction process of the feature extraction module is as follows: For textual document data: extract core keywords, document genre, author's school of thought, and core ideological features, and use the TF-IDF algorithm to calculate feature weights; For audio folk data: extract the spectral features, rhythm features, pitch features and language features of the audio, and convert the audio signal into a feature vector using the MFCC algorithm; For image-based cultural relic data: extract the shape features, texture features, color features, and ornamentation features of the cultural relic, and use the SIFT algorithm for feature point detection and description; For digitized archival data: extract the category, creation time, storage unit, and core content characteristics of the archives, and establish a feature index library.

6. The intelligent processing system for cultural gene data according to claim 5, characterized in that: The specific judgment process of the judgment module is as follows: SS1: Receive the feature vectors and weight values output by the feature extraction module and establish a processing requirement judgment model; SS2: When the integrity of core features is below a preset threshold, a "repair and optimization" requirement is determined; SS3: When the category label in the feature vector is clear and there are fewer than 5 associated nodes, a "classification and archiving" requirement is determined; SS4: When there are ≥5 associated nodes in the feature vector and there are cross-regional / cross-era associations, an "association recommendation" requirement is determined; SS5: When the feature vector contains visual and interactive core elements, a "revitalized application" requirement is determined; SS6: Transmit the judgment result and the corresponding feature basis to the processing execution module.

7. The intelligent processing system for cultural gene data according to claim 6, characterized in that: The specific execution process of the processing execution module is as follows: For the "repair and optimization" requirement: a Generative Adversarial Network is used to repair incomplete data, wherein missing text content is supplemented through context prediction, damaged image areas are repaired through texture transfer, and distorted audio segments are recovered through spectral restoration; For the "classification and archiving" requirement: data is archived according to a three-level classification system of cultural type, region, and era, and a unique archive number is generated and linked to the data storage module; For the "association recommendation" requirement: based on the node strength of the association network, relevant cultural gene data is recommended and an association recommendation list is generated, including data name, association basis, and similarity score; For the "revitalized application" requirement: cultural gene features are transformed into usable digital resources, including cultural and creative design materials, interactive scripts for folk activities, and 3D display models of cultural relics.

8. The intelligent processing system for cultural gene data according to claim 7, characterized in that: The visualization module includes a text display unit, an image browsing unit, an audio playback unit, and a 3D interactive unit. The text display unit supports pagination, keyword highlighting, and annotation association. The image browsing unit supports zooming, rotation, detail magnification, and comparison viewing functions; The audio playback unit supports playback, pause, fast forward, and sound quality adjustment functions, while displaying audio waveforms and semantic tags; The 3D interactive unit supports rotating, scaling, and cross-sectional viewing of 3D models of cultural relics, and supports virtual roaming of folk custom scenes.

9. The intelligent processing system for cultural gene data according to claim 8, characterized in that: The specific control process of the permission management module is as follows: three levels of permissions are set: administrator permissions, operation permissions, and browsing permissions; Administrator permissions: full permissions for system configuration, data import / export, permission assignment, and data deletion; Operation permissions: permissions for data collection, preprocessing, analysis, and processing execution, without permission to delete data; Browsing permissions: viewing permissions for the visualization module only, without data operation permissions; Permission verification uses dual authentication combining an account password with a dynamic verification code, and operation logs are recorded in real time and stored in the data storage module.

10. The intelligent processing system for cultural gene data according to claim 9, characterized in that: It also includes a data update module, which is communicatively connected to the data acquisition module and the data storage module; The data update module has a built-in timed acquisition unit and an incremental update unit; The timed acquisition unit can be configured with a collection cycle and automatically triggers the data acquisition module to update cultural gene data of a specified category; The incremental update unit is used to compare newly collected data with stored data and to update only the changed data, reducing data transmission and storage pressure; After the data update is completed, the intelligent analysis module and the feature extraction module are automatically triggered to reprocess the updated data, and the content displayed by the visualization module is updated synchronously.