A multi-level security collaborative training method and system for large models

By constructing a secure network and a multi-layered protection mechanism, the contradiction between privacy protection and model performance in large model training is resolved, achieving reduced computational overhead and improved training results while ensuring data privacy and security.

CN122247702A · Pending · Publication Date: 2026-06-19 · NAN FANG DIAN WANG GONG YING LIAN (GUANG XI) YOU XIAN GONG SI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NAN FANG DIAN WANG GONG YING LIAN (GUANG XI) YOU XIAN GONG SI
Filing Date
2026-03-30
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In the training of existing large models, it is difficult to balance data privacy protection and model performance. Traditional encryption methods have high computational overhead and lack differentiated protection, resulting in a contradiction between the strength of privacy protection and training effect.

Method used

A secure network is built for local data aggregation. Homomorphic encryption or data synthesis is used to encrypt the data into a usable but invisible state. The data is protected by a multi-layered collaborative protection mechanism, including differential privacy, data desensitization, and anonymization.

Benefits of technology

While ensuring privacy and security, we can reduce data computation overhead, balance privacy protection and model performance, and obtain safe and compliant training data for large-scale model iterative optimization.

✦ Generated by Eureka AI based on patent content.

Abstract

This application discloses a multi-layered secure collaborative training method and system for large models, relating to the field of data encryption. The method includes: constructing a secure network and using it to obtain raw operational data from the enterprise's local environment; aggregating model parameters to obtain aggregated training data; encrypting the aggregated training data into a usable but invisible state using homomorphic encryption or data synthesis to obtain encrypted training data; classifying the encrypted training data by category and then applying a multi-layered secure collaborative protection mechanism to the encrypted training data to obtain the final training data; and using the final training data to train the model to obtain the trained target large model. By constructing a secure network, this application achieves local data aggregation and encrypted transmission, and it combines multi-layered collaborative protection mechanisms such as differential privacy, data desensitization, and anonymization to balance privacy protection and model performance, reduce data computation overhead, and provide secure and compliant training data for large model training.

Description

Technical Field

[0001] This application relates to the field of data encryption, and in particular to a multi-level secure collaborative training method and system for large models.

Background Technology

[0002] The in-depth application of large-scale models in fields such as natural language processing, computer vision, and scientific computing has led to an exponential increase in the demand for high-quality, large-scale data for model training. The raw operational data accumulated locally by enterprises contains rich business knowledge and real-world scenario characteristics, becoming a key resource for enhancing the capabilities of large-scale models. However, this type of data often involves user privacy, trade secrets, or sensitive information subject to regulation. Throughout the entire data collection, transmission, aggregation, and training process, privacy leaks, data misuse, and security compliance risks are becoming increasingly prominent. Traditional data protection methods, such as simple encryption or basic anonymization, often struggle to strike a balance between the strength of privacy protection, model training effectiveness, and computational efficiency: overly strong privacy protection can degrade model performance, while weak security measures fail to meet compliance requirements.

[0003] How can we shift from a "data blocking" mentality to "data governance" to avoid completely prohibiting data flow and stifling AI development? How can we ensure that data is used under the premise of being "usable but not visible" and "controllable and measurable"? In current large-scale model training, data privacy protection and model performance are often difficult to balance. Single encryption methods have high computational costs, traditional desensitization methods are prone to information loss, and there is a lack of differentiated protection mechanisms for different data categories, resulting in a contradiction between the strength of privacy protection and training effectiveness.

Summary of the Invention

[0004] The purpose of this application is to provide a multi-layered secure collaborative training method and system for large models. By constructing a secure network, it achieves local data aggregation and encrypted transmission. Combined with multi-layered collaborative protection mechanisms such as differential privacy, data desensitization, and anonymization, it balances privacy protection and model performance, reduces data computation overhead, and provides secure and compliant training data for training large models.

[0005] To achieve the above objectives, this application provides the following solution. In a first aspect, this application provides a multi-layered secure collaborative training method for large models. The method includes: constructing a secure network, using the secure network to obtain raw operational data from the enterprise's local environment, and aggregating model parameters on the raw operational data to obtain aggregated training data, where the model parameters include at least gradient updates; encrypting the aggregated training data into a usable but invisible state using homomorphic encryption or data synthesis to obtain encrypted training data; classifying the encrypted training data into categories, and using a multi-layered secure collaborative protection mechanism to protect the encrypted training data according to its category to obtain the final training data, where the multi-layered secure collaborative protection mechanism includes at least differential privacy protection, data desensitization protection, and data anonymization protection; and using the final training data to iteratively optimize and train the target large model to obtain a trained target large model.

[0006] In a second aspect, this application also provides a computer system, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the multi-level secure collaborative training method for large models described in the first aspect.

[0007] According to the specific embodiments provided in this application, the following technical effects are disclosed: This application constructs a secure network to acquire raw operational data and aggregate model parameters locally within the enterprise, avoiding the risk of data leakage. Subsequently, homomorphic encryption or data synthesis methods are used to transform the aggregated training data into a usable but invisible form, ensuring privacy and security during transmission and use. Furthermore, the encrypted training data is categorized, and a multi-layered security collaborative protection mechanism, comprising differential privacy, data anonymization, and other techniques, is applied to protect the data according to its category. This application balances privacy protection and model performance while reducing computational overhead, ultimately yielding secure and compliant final training data for iterative optimization training of the target large-scale model, thereby obtaining a well-trained target large-scale model with a balance between performance and privacy.

Attached Figure Description

[0008] To more clearly illustrate the technical solutions in the embodiments of this application or related technologies, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0009] Figure 1 is a flowchart illustrating a multi-level secure collaborative training method for large models, provided in an embodiment of this application.

[0010] Figure 2 is a schematic diagram of a multi-layered security collaboration mechanism provided in an embodiment of this application.

[0011] Figure 3 is a schematic diagram of the large-model reinforcement learning process provided in an embodiment of this application.

[0012] Figure 4 is an internal structure diagram of a computer system provided in an embodiment of this application.

Detailed Implementation

[0013] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0014] To make the above-mentioned objectives, features and advantages of this application more apparent and understandable, the application will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0015] Example 1: As shown in Figures 1-3, this embodiment provides a multi-layered secure collaborative training method for large models, the method including: S1. Construct a secure network, use the secure network to obtain raw operational data from the enterprise's local environment, and aggregate model parameters from the raw operational data to obtain aggregated training data; the model parameters include at least gradient updates.

[0016] Furthermore, step S1 specifically includes: S11. Build a secure network.

[0017] S12. Use the secure network to obtain raw operational data from the enterprise's local environment in a non-shared manner; the enterprise stores the raw operational data only locally, does not share the raw operational data during training, and shares only the trained target large model.

[0018] S13. Model parameters are aggregated from the raw operational data using model inversion attack defense, data poisoning attack defense, and multi-layer privacy protection methods to obtain aggregated training data.
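As a minimal sketch of the aggregation in step S13, assuming that each participant sends a gradient update and a local sample count (the function name, the weighting rule, and the example values are illustrative assumptions, not taken from this application), a sample-weighted average could look like this:

```python
from typing import List, Sequence


def aggregate_updates(updates: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Sample-weighted average of per-participant gradient updates."""
    if not updates or len(updates) != len(sample_counts):
        raise ValueError("need exactly one sample count per update")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each coordinate is averaged with weights proportional to local data size.
    return [
        sum(u[i] * n for u, n in zip(updates, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical participants: only updates leave the enterprise, never raw records.
local_updates = [[0.10, -0.20], [0.30, 0.00], [0.20, 0.40]]
local_sizes = [100, 300, 600]
global_update = aggregate_updates(local_updates, local_sizes)
```

Only the updates and the counts are exchanged here, which matches the non-shared handling of raw operational data in step S12; the defenses named in S13 would be applied around this step.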

[0019] Furthermore, as shown in Table 1, the model inversion attack defense specifically includes: data strengthening encryption and differential privacy.

[0020] The data poisoning attack defense specifically includes: data verification and cleaning, and Byzantine fault tolerance.

[0021] The multi-layered privacy protection specifically includes: security audit monitoring and security assessment based on multi-party cross-validation.

[0022] Table 1. Privacy Protection Features Comparison Table

[0023] In practical applications, large model training requires data aggregation. However, the transmission of confidential data during aggregation poses security risks, necessitating security defenses. The main security concerns in data aggregation are as follows.

1. Model Inversion Attacks: Attackers attempt to deduce the original training data of the participants by observing the model's output (such as prediction results) and the publicly available model structure. For example, in an aggregated training model for image recognition, an attacker may be able to gradually reconstruct the image data in the training set by carefully constructing input data and observing the model's output.

[0024] 2. Data Poisoning Attacks: Malicious actors inject carefully crafted "toxic" data samples into the training data, attempting to influence the training results of the global model and cause it to deviate or err on specific tasks. For example, in an aggregated training model for spam classification, malicious actors can add a large number of spam samples disguised as legitimate email to the training data, reducing the accuracy of the trained model in identifying spam.

[0025] 3. Multi-layered privacy protection: While the aggregated training employs various privacy protection technologies, vulnerabilities in the technical implementation or improper configuration may still pose a risk of privacy leakage. For example, during encrypted communication, if the encryption key is poorly managed and obtained by an attacker, the privacy of the data during transmission cannot be guaranteed.

[0026] Data verification and cleaning are crucial: before participating in aggregated training, each participant rigorously verifies and cleans their data to remove outliers and potentially malicious data. Simultaneously, the central server can employ data quality detection algorithms to indirectly assess the data quality of model updates uploaded by each participant. For example, by analyzing the statistical characteristics of model updates, it can determine whether there are abnormal fluctuations, thus identifying potential data poisoning behavior.
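The statistical check on model updates described above can be sketched as follows. This is an illustrative assumption rather than the method of this application: it flags updates whose L2 norm is a robust outlier, using the median absolute deviation (MAD) and a hypothetical threshold of 3.5.

```python
import math
import statistics
from typing import List, Sequence


def flag_suspicious_updates(updates: Sequence[Sequence[float]],
                            threshold: float = 3.5) -> List[int]:
    """Return indices of updates whose L2 norm is a robust outlier."""
    norms = [math.sqrt(sum(x * x for x in u)) for u in updates]
    median = statistics.median(norms)
    mad = statistics.median([abs(n - median) for n in norms])
    if mad == 0:
        # More than half of the norms are identical; flag every norm that differs.
        return [i for i, n in enumerate(norms) if n != median]
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, n in enumerate(norms)
            if 0.6745 * abs(n - median) / mad > threshold]


# Four plausible updates and one with an abnormal magnitude (hypothetical values).
updates = [[0.10, 0.20], [0.12, 0.18], [0.09, 0.21], [0.11, 0.19], [5.0, -4.0]]
suspicious = flag_suspicious_updates(updates)
```

A norm check of this kind only catches updates of abnormal magnitude; a poisoned update with an ordinary norm would pass, so it complements rather than replaces the other defenses.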

[0027] Byzantine Fault Tolerance Algorithm: This algorithm ensures the aggregation training system can function correctly even in the presence of malicious participants (Byzantine nodes). These algorithms eliminate interference from malicious nodes and guarantee the correctness of the global model aggregation result by performing consistency checks and voting mechanisms on model updates uploaded by participants. For example, during model aggregation, the central server can perform multiple rounds of comparison and filtering of model updates uploaded by each participant, accepting only updates from the majority of legitimate participants, thereby reducing the impact of data poisoning attacks on the global model.
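One simple way to limit the influence of malicious participants during aggregation is a coordinate-wise median. The sketch below swaps this rule in for the consistency checks and voting described above; it is not claimed to be the algorithm of this application, and the names and values are hypothetical.

```python
import statistics
from typing import List, Sequence


def median_aggregate(updates: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise median of participant updates."""
    if not updates:
        raise ValueError("no updates to aggregate")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    # The median of each coordinate ignores extreme values from a minority.
    return [statistics.median([u[i] for u in updates]) for i in range(dim)]


honest = [[0.10, 0.20], [0.12, 0.18], [0.11, 0.19]]
poisoned = [[100.0, -100.0]]  # one hypothetical malicious participant
robust = median_aggregate(honest + poisoned)
```

With a plain average the poisoned update would dominate the result, whereas the median stays close to the honest values as long as fewer than half of the participants are malicious.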

[0028] Security Audit and Monitoring: Establish a comprehensive security audit and monitoring mechanism to monitor the operation of the aggregated training system in real time. Regularly check the system's encryption settings, data transmission logs, model training parameters, etc., to promptly identify potential privacy leakage risks. Once an anomaly is detected, take swift measures to repair and handle it.

[0029] Multi-party cross-validation security assessment: Before and during the deployment of the aggregated training system, professional security organizations or multiple parties are invited to participate in cross-validation and security assessment. By simulating various attack scenarios, the system's security is comprehensively tested to identify and fix potential security vulnerabilities. Simultaneously, the system's security policies and technical solutions are regularly updated and optimized to adapt to constantly evolving security threats.

[0030] S2. Use homomorphic encryption or data synthesis to encrypt the aggregated training data into a usable but invisible state to obtain encrypted training data.

[0031] Furthermore, step S2 specifically includes: S21. When homomorphic encryption is used, the aggregated training data is encrypted into a usable but invisible state from three levels: algorithm, hardware, and ecosystem, to obtain encrypted training data.

[0032] Furthermore, in step S21, the aggregated training data is encrypted into a usable but invisible state from three levels: algorithm, hardware, and ecosystem, to obtain encrypted training data. Specifically, this includes: at the algorithm level, using lightweight solutions for parameter fine-tuning, batch processing, and approximate calculation; at the hardware level, using FPGA acceleration, dedicated ASIC chips, and GPU acceleration for hardware support; and at the ecosystem level, using framework abstraction, compiler optimization, and blockchain integration for optimization, all of which together encrypt the aggregated training data into a usable but invisible state to obtain encrypted training data. Among these, the lightweight solutions include at least the Paillier encryption method.

[0033] S22. When using the data synthesis method, virtual training data is generated based on the aggregated training data, and the virtual training data is used as encrypted training data.

[0034] In practical applications, homomorphic encryption allows computations to be performed on encrypted data, with the result also being encrypted. The decrypted result is identical to the result of performing the same computation on the plaintext data. This achieves the ideal state of "data usable but not visible." Because homomorphic encryption has high computational overhead, it is best suited to scenarios with extremely high security requirements, such as processing encrypted corporate financial data or medical records in the cloud.
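The "compute on ciphertext, decrypt to the plaintext result" property can be illustrated with a toy version of the Paillier scheme, which this application names as a lightweight option. The sketch below is for illustration only: the primes 1009 and 1013 are far too small for real use, the function names are hypothetical, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                 # common simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse; correct because g = n + 1
    return (n, g), (lam, mu)


def encrypt(public_key, message):
    n, g = public_key
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(public_key, private_key, ciphertext):
    n, _ = public_key
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


def mul_plain(public_key, ciphertext, k):
    """Raising a ciphertext to k multiplies the plaintext by k."""
    n, _ = public_key
    return pow(ciphertext, k, n * n)


public_key, private_key = keygen(1009, 1013)
c1 = encrypt(public_key, 1200)
c2 = encrypt(public_key, 345)
total = decrypt(public_key, private_key, add_encrypted(public_key, c1, c2))
scaled = decrypt(public_key, private_key, mul_plain(public_key, c1, 3))
```

The party that adds `c1` and `c2` never sees 1200 or 345, yet the key holder recovers their sum. Only addition and multiplication by a known constant are supported, which is why the scheme fits aggregation of training parameters.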

[0035] The current encryption and decryption process has the following problems: Ciphertext inflation: The encrypted data volume can be thousands to millions of times larger than the original data.

[0036] Noise accumulation: Each operation adds noise, requiring frequent "bootstrapping" operations to refresh the ciphertext, a process that accounts for more than 80% of the computational cost of FHE.

[0037] Complex computations: In schemes such as CKKS / BGV, the complexity of polynomial multiplication reaches O(n^2 log n), which is much higher than that of plaintext computation.

[0038] To address these problems, this application adopts hardware-software collaboration and full-stack acceleration.

[0039] 1. Algorithm-level optimization.

[0040] 1) Choose a lightweight solution: such as Paillier (additive homomorphism), which is suitable for scenarios that only require addition / scalar multiplication (such as aggregation of training parameters), and is several orders of magnitude more efficient than fully homomorphic encryption.

[0041] 2) Parameter fine-tuning: Reduce the polynomial modulus n, or adjust the modulus width of the coefficients, to strike a balance between security and performance.

[0042] 3) Batch processing: Utilizing SIMD features, multiple plaintexts are encoded into a single ciphertext for parallel processing.

[0043] 4) Approximate calculation: CKKS supports floating-point approximation, which is suitable for scenarios that tolerate errors, such as machine learning.

[0044] 2. Hardware acceleration.

[0045] 1) FPGA acceleration: Parallelization of Paillier and other algorithms is achieved based on HLS (High-Level Synthesis), and the throughput is significantly improved by using optimizations such as Montgomery modular multiplication and Karatsuba multiplication.

[0046] 2) Dedicated ASIC chips: Intel released the HERACLES SoC, with FHE performance 5547 times higher than high-end CPUs. Fudan University, KAIST, and others launched the low-power Torus FHE processor, achieving high energy efficiency at 128-bit security level.

[0047] 3) GPU acceleration: Microsoft SEAL and other libraries support CUDA, which can accelerate matrix operations of CKKS / BGV by 8-10 times.

[0048] 3. Innovation at the system and ecosystem levels.

[0049] 1) Framework abstraction: MiLiu Intelligence open-sources LattiSense and LattiAI, shielding cryptographic details and supporting automatic hardware scheduling.

[0050] 2) Compiler optimization: Automatically converts PyTorch models into homomorphic cryptographic inference processes.

[0051] 3) Blockchain integration: INIChain integrates TFHE into EVM, alleviating computational pressure through dynamic difficulty adjustment (DDA) and parallel block production.

[0052] In practical applications, for sensitive data such as synthesized audio and video information, data synthesis only needs to retain the statistical characteristics and structure of the original data while masking sensitive fields such as ID numbers or legally defined sensitive data; such data therefore requires data synthesis filtering. The data synthesis method in this application is applied to the early testing and verification phase of model development, using synthetic data to avoid contact with real sensitive data. The specific method is as follows:

1) Synthetic data: Synthetic data is data generated artificially by algorithms (such as GANs, VAEs, or rule engines). It does not contain any sensitive information about real individuals, but retains the statistical characteristics and structure of the original data. Synthetic data is a form of "high-dimensional anonymization": it cannot be traced back to real data, and it complies with the requirements of privacy regulations such as the Personal Information Protection Law and GDPR.

[0053] 2) Avoid contact with real sensitive data.
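To make "retains the statistical characteristics" concrete, the sketch below fits a mean and standard deviation per numeric column and samples new rows from independent normal distributions. This is a deliberately simple stand-in for the GAN, VAE, or rule-engine generators named above; it ignores correlations between columns, and the column names and values are hypothetical.

```python
import random
import statistics
from typing import Dict, List, Tuple


def fit_columns(rows: List[Dict[str, float]]) -> Dict[str, Tuple[float, float]]:
    """Record the mean and sample standard deviation of every numeric column."""
    stats = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        stats[column] = (statistics.fmean(values), statistics.stdev(values))
    return stats


def synthesize(stats: Dict[str, Tuple[float, float]], count: int,
               rng: random.Random) -> List[Dict[str, float]]:
    """Sample each column independently from a normal distribution."""
    return [
        {column: rng.gauss(mean, std) for column, (mean, std) in stats.items()}
        for _ in range(count)
    ]


# Hypothetical real records; only their summary statistics reach the generator.
real_rows = [
    {"amount": 120.0, "items": 3.0},
    {"amount": 80.0, "items": 1.0},
    {"amount": 100.0, "items": 2.0},
    {"amount": 95.0, "items": 2.0},
    {"amount": 105.0, "items": 2.0},
]
column_stats = fit_columns(real_rows)
synthetic_rows = synthesize(column_stats, 1000, random.Random(42))
```

The synthetic rows have roughly the same per-column mean and spread as the real ones, but none of them is a copy of a real record.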

[0054] S3. Classify the encrypted training data into categories, and use a multi-layered security collaborative protection mechanism to protect the encrypted training data according to the category to obtain the final training data; the multi-layered security collaborative protection mechanism includes at least: differential privacy protection, data desensitization protection and data anonymization protection.

[0055] In practical applications, confidential information in the aggregated training data must be encrypted throughout the data transmission process and decrypted after it is transmitted to the server. This inevitably consumes a lot of resources, so a high-performance data encryption system needs to be established to achieve the best results through hardware and software collaboration.

[0056] Furthermore, step S3 specifically includes: S31. Classify the encrypted training data.

[0057] S32. When the encrypted training data belongs to the category of medical case data or medical image data, the encrypted training data shall be anonymized.

[0058] S33. When the encrypted training data belongs to the category of financial statistics or customer transaction data, differential privacy protection shall be applied to the encrypted training data.

[0059] In practical applications, differential privacy is a cryptographic technique that primarily aims to maximize the accuracy of data queries when retrieving data from statistical databases, while minimizing the chance of identifying records. It involves adding mathematically calculated "noise" to the data query or summary results, making it impossible to deduce any individual's information from the output. The magnitude of this "noise" is controlled to balance data availability and privacy protection. When sharing user behavior analysis reports externally, differential privacy techniques are used to ensure that the statistical value of the data is preserved without disclosing individual information.

[0060] Differential privacy ensures that attackers cannot determine whether an individual exists in the dataset by adding controlled noise to the query results. Its formal definition is: if, for any two adjacent datasets D and D' that differ by only one record and for any set S of possible outputs, a randomized algorithm M satisfies

$$\Pr[M(D) \in S] \le e^{\varepsilon} \cdot \Pr[M(D') \in S] + \delta$$

[0061] then M is said to satisfy (ε, δ)-differential privacy.

[0062] ε (privacy budget): Controls the strength of privacy protection. The smaller the value, the stronger the protection, but the lower the data availability.

[0063] Typical scenario: ε = 0.5.

[0064] Highly sensitive scenarios (such as healthcare and finance): ε = 0.1.

[0065] Relaxed scenarios (such as user behavior trends): ε = 1.

[0066] δ (relaxation term): Allows a very small probability that the privacy guarantee fails; it is usually set to a value of very small magnitude.

[0067] For example, an app's statistics show that 1 million users clicked the app 356,800 times. After adding Gaussian noise (standard deviation ≈ 5.3), the released number of clicks was 356,788, thus preserving the trend while protecting individual cases.
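The click-count example can be sketched in code. The noise step below uses the standard deviation of 5.3 quoted above. The helper `gaussian_sigma` shows how a noise scale is commonly derived from ε and δ with the classic Gaussian mechanism formula (valid for 0 < ε < 1); the sensitivity of 1 and the value of δ are assumptions, and the result is not meant to reproduce the 5.3 figure.

```python
import math
import random


def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise scale of the classic Gaussian mechanism (requires 0 < epsilon < 1)."""
    if not 0 < epsilon < 1:
        raise ValueError("the classic bound assumes 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon


def noisy_count(true_count: int, sigma: float, rng: random.Random) -> int:
    """Release a count after adding zero-mean Gaussian noise."""
    return round(true_count + rng.gauss(0.0, sigma))


rng = random.Random(7)  # seeded only to make the sketch repeatable
released = noisy_count(356_800, 5.3, rng)

# Assumed parameters: one click per user (sensitivity 1), epsilon 0.5, delta 1e-5.
sigma = gaussian_sigma(sensitivity=1.0, epsilon=0.5, delta=1e-5)
```

The released value differs from the true count by a few clicks at most in typical runs, which is negligible against a total of 356,800 but enough to mask any single contribution.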

[0068] Differential privacy provides mathematically provable privacy protection that does not depend on the attacker's background knowledge. Its parameters can be adjusted to balance privacy and data utility. Because repeated queries consume the privacy budget cumulatively, the budget must be tracked and managed.
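Budget management across repeated queries can be sketched with a small accountant that applies basic sequential composition, under which the ε values of successive queries simply add up. The class name and the total budget of 1.0 are hypothetical, and tighter composition rules exist.

```python
class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float) -> None:
        if total_epsilon <= 0:
            raise ValueError("total budget must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return max(self.total - self.spent, 0.0)

    def spend(self, epsilon: float) -> None:
        """Reserve budget for one query, or refuse if it would overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.5)  # first query
budget.spend(0.5)  # second query uses up the rest
```

A third call to `budget.spend` would raise `RuntimeError`, so further queries on the same data are refused once the agreed budget is used up.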

[0069] S34. When the encrypted training data belongs to the category of e-commerce platform transaction data, data desensitization protection and data anonymization protection shall be performed on the encrypted training data.
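The category rules in steps S32 to S34 amount to a lookup from data category to protection measures. A minimal sketch, with hypothetical category keys and measure names chosen for illustration:

```python
from typing import Dict, List

# Mapping taken from steps S32 to S34; the string identifiers are assumptions.
PROTECTIONS_BY_CATEGORY: Dict[str, List[str]] = {
    "medical_case": ["anonymization"],
    "medical_image": ["anonymization"],
    "financial_statistics": ["differential_privacy"],
    "customer_transaction": ["differential_privacy"],
    "ecommerce_transaction": ["desensitization", "anonymization"],
}


def select_protections(category: str) -> List[str]:
    """Return the protection measures to apply to one data category."""
    try:
        return list(PROTECTIONS_BY_CATEGORY[category])
    except KeyError:
        raise ValueError(f"no protection policy for category {category!r}") from None


measures = select_protections("ecommerce_transaction")
```

Rejecting unknown categories, instead of falling back to a default, keeps unclassified data from passing through unprotected.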

[0070] In practical applications, data anonymization is a key technology for protecting data privacy. Its core objective is to remove or obfuscate direct and indirect identifiers in data through technical means, ensuring that data cannot be traced back to a specific individual. The core methods used to anonymize large model training data are as follows: 1. Direct identifier removal / replacement; Operation: delete or replace information directly associated with an individual, such as name, ID number, and mobile phone number. Example: replace user ID "user_123" with the hash value "a3b7c9…", or replace the real name with a pseudonym such as "UID_001".

[0071] 2. Indirect identifier generalization / suppression; Operation: generalize indirect identifiers such as date of birth, occupation, and geographical location, or directly delete low-value fields. Examples: generalize "30-year-old male" to "25-35-year-old male"; generalize "a certain community" to "a certain street"; delete the house number from the address, keeping only the street name.

[0072] 3. Advanced anonymization algorithms; k-anonymity: ensures that at least k individuals share the same values on key attributes, hiding information about any single individual. l-diversity: sensitive attributes (such as disease type) within the same anonymization group must contain at least l different values, to prevent identity from being inferred through sensitive attributes. Algorithms based on permutation / clustering: replace sensitive attribute values through permutation or clustering analysis, ensuring that the data cannot be reversed.
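The generalization step and the k-anonymity and l-diversity conditions above can be checked mechanically. The sketch below assumes hypothetical records with an age, a gender, and a sensitive disease field, and uses 10-year age buckets as an illustrative generalization rule.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Record = Dict[str, str]


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def group_records(records: List[Record], quasi_ids: Sequence[str]):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[q] for q in quasi_ids)].append(record)
    return groups


def is_k_anonymous(records: List[Record], quasi_ids: Sequence[str], k: int) -> bool:
    """Every quasi-identifier group must contain at least k records."""
    return all(len(g) >= k for g in group_records(records, quasi_ids).values())


def is_l_diverse(records: List[Record], quasi_ids: Sequence[str],
                 sensitive: str, l: int) -> bool:
    """Every group must contain at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in g}) >= l
               for g in group_records(records, quasi_ids).values())


raw = [(31, "M", "flu"), (34, "M", "asthma"), (38, "M", "flu"),
       (42, "F", "diabetes"), (45, "F", "flu")]
released = [{"age": generalize_age(a), "gender": g, "disease": d}
            for a, g, d in raw]
quasi = ["age", "gender"]
two_anonymous = is_k_anonymous(released, quasi, k=2)
two_diverse = is_l_diverse(released, quasi, "disease", l=2)
```

After generalization the five records fall into two groups, each with at least two members and two distinct diseases, so this release is 2-anonymous and 2-diverse but not 3-anonymous.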

[0073] The data anonymization techniques used are as follows: 1. Hash encryption; Application: Use secure hash algorithms such as SHA-256 on string-type PII (such as name, mobile phone number), combined with salt to prevent rainbow table cracking.

[0074] 2. Data Masking; Operation: Transform sensitive data, preserving the format but hiding the content. Example: Mobile phone number: 135****5678; ID card number: 330101****1211; Text: Replace "User Zhang San purchased a mobile phone in March 2026" with "User purchased a mobile phone in *year**month".

[0075] 3. Differential Privacy; Principle: Controlled noise is added to the data, making it impossible for attackers to distinguish between model training results that "contain a certain individual's data" and those that "do not contain that data." Application: A privacy budget ε is set during the data collection phase, and individual privacy is protected through mechanisms such as Laplace noise.
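The first two techniques above, salted SHA-256 hashing and format-preserving masking, can be sketched with the Python standard library. The function names are hypothetical, and the masking rule assumes an 11-digit mobile number as in the example above.

```python
import hashlib
import secrets


def pseudonymize(value: str, salt: bytes) -> str:
    """Salted SHA-256 digest of a string identifier, as a hex string."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def mask_phone(phone: str) -> str:
    """Keep the first three and last four digits of an 11-digit number."""
    if len(phone) != 11 or not phone.isdigit():
        raise ValueError("expected an 11-digit phone number")
    return phone[:3] + "****" + phone[-4:]


# The salt must stay secret and stable so equal inputs map to equal pseudonyms.
salt = secrets.token_bytes(16)
user_token = pseudonymize("user_123", salt)
masked = mask_phone("13512345678")
```

Because the same salt always yields the same digest, records belonging to one user can still be joined after pseudonymization, while an attacker without the salt cannot precompute a lookup table.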

[0076] The methods for data desensitization are as follows: 1. Substitution Method: The substitution method is a direct data desensitization method. It involves replacing the values of sensitive fields with fictitious values that have similar characteristics. For example, for names, common pseudonyms can be used: "Zhang San" can be replaced with "Li Hua," maintaining the form of the name without revealing the real identity. A similar method can be used for phone numbers. For example, replacing the real phone number "138____5678" with "138____1234" maintains the same format and length, even though the numbers are different.

[0077] 2. Masking Method: The masking method partially hides sensitive data according to certain rules. For example, in ID card numbers, the birth date portion can be masked: "110105____321X" displays only part of the number, hiding key information such as the birth date while retaining the overall format of the ID card number. The masking method can also be used for bank card numbers. For example, in the card number "6222____1234____5678", the middle digits are masked, which keeps enough of the card number for necessary verification operations while preventing the full card number from leaking.

[0078] 3. Encryption Method: The encryption method encrypts sensitive data so that even if the data is acquired, its true content cannot be known without the decryption key. For example, symmetric encryption algorithms, such as AES, can be used to encrypt personal information. Suppose there is a record containing a name and ID number; it is first converted into a string in a specific format, and this string is then encrypted using the AES algorithm to obtain a ciphertext. During large-scale model training, this ciphertext is used instead of the original sensitive data. Asymmetric encryption algorithms can also be used for data desensitization. For example, when a company collaborates with an external organization on large-scale model training, the company can encrypt sensitive data with the external organization's public key, so that only the external organization can decrypt it with the corresponding private key. This ensures data security during data transmission and use.

[0079] 4. Synthesis Method: The synthesis method generates fictitious but similar data based on the features of the original data. For example, when training a large-scale image recognition model on sensitive image data containing human faces, image processing techniques can be used to analyze facial features, such as facial proportions and shapes, and these features can then be used to synthesize new facial images. These new images look similar to the original images but do not contain the identity information of a real person. The synthesis method can also be used in text data processing. For example, for a text containing sensitive business information, its language style, word frequency, and other features can be analyzed, and a text with similar features but completely different content can then be generated for use in training the large-scale model.

[0080] Furthermore, the differential privacy protection specifically includes: adding controllable noise to the query results to prevent attackers from determining whether an individual exists in the dataset.

[0081] Furthermore, the data desensitization protection specifically includes: using at least one of the following methods for data desensitization: substitution, masking, encryption, and synthesis.

[0082] Furthermore, the data anonymization protection specifically includes: using at least one of direct identifier removal and replacement, indirect identifier generalization and suppression, and advanced anonymization algorithms for data anonymization; wherein the advanced anonymization algorithms specifically include: k-anonymity, l-diversity, and permutation-based or clustering-based algorithms.

[0083] S4. Use the final training data to iteratively optimize and train the target large model to obtain the trained target large model.

[0084] The technical effects of this application are as follows: This application constructs a secure network to acquire raw operational data and aggregate model parameters locally within the enterprise, avoiding the risk of data leakage. Subsequently, homomorphic encryption or data synthesis methods are used to transform the aggregated training data into a usable but invisible form, ensuring privacy and security during transmission and use. Furthermore, the encrypted training data is categorized, and a multi-layered security collaborative protection mechanism, comprising differential privacy, data anonymization, and other techniques, is applied to protect the data according to its category. This application balances privacy protection and model performance while reducing computational overhead, ultimately yielding secure and compliant final training data for iterative optimization training of the target large-scale model, thereby obtaining a well-trained target large-scale model with a balance between performance and privacy.

[0085] Example 2: This example provides a computer system, which can be a server or a terminal, and its internal structure can be as shown in Figure 4. The computer system includes a processor, memory, input/output (I/O) interfaces, and a communication interface. The processor, memory, and I/O interfaces are connected via a system bus, and the communication interface is connected to the system bus via the I/O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides the running environment for the operating system and computer programs stored in the non-volatile storage media. The database stores forced oscillation samples and sub/supersynchronous oscillation samples. The I/O interfaces are used for information exchange between the processor and external devices. The communication interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements the aforementioned multi-level secure collaborative training method for large models.

[0086] Those skilled in the art will understand that the structure shown in Figure 4 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the computer system to which the present application is applied. A specific computer system may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0087] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

[0088] The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0089] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

[0090] This document uses specific examples to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this application. Furthermore, those skilled in the art will recognize that, based on the ideas of this application, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A multi-level secure collaborative training method for large models, characterized in that the method includes: constructing a secure network and using it to obtain raw operational data from the enterprise's local machine, and aggregating the raw operational data with model parameters to obtain aggregated training data, wherein the model parameters include at least gradient updates; encrypting the aggregated training data into a usable but invisible state by using homomorphic encryption or data synthesis methods to obtain encrypted training data; categorizing the encrypted training data and adopting a multi-layered security collaborative protection mechanism to protect the encrypted training data according to its category, so as to obtain the final training data, wherein the multi-layered security collaborative protection mechanism includes at least differential privacy protection, data desensitization protection, and data anonymization protection; and iteratively optimizing and training the target large model using the final training data to obtain a trained target large model.

2. The multi-level secure collaborative training method for large models according to claim 1, characterized in that constructing the secure network, using it to obtain the raw operational data from the enterprise's local machine, and aggregating the raw operational data with model parameters to obtain the aggregated training data specifically includes: building a secure network; obtaining the raw operational data from the enterprise's local machine over the secure network in a non-shared manner, wherein the enterprise stores the raw operational data only locally, does not share the raw operational data during training, and shares only the trained target large model; and aggregating the raw operational data by employing model inversion attack defense, data poisoning attack defense, and multi-layer privacy protection methods to obtain the aggregated training data.

3. The multi-level secure collaborative training method for large models according to claim 1, characterized in that, The model inversion attack defense specifically includes: data strengthening encryption and differential privacy; The data poisoning attack defense specifically includes: data verification and cleaning, and Byzantine fault tolerance; The multi-layered privacy protection specifically includes: security audit monitoring and security assessment based on multi-party cross-validation.

4. The multi-level secure collaborative training method for large models according to claim 1, characterized in that, The aggregated training data is encrypted into a usable but invisible state using homomorphic encryption or data synthesis methods, resulting in encrypted training data, specifically including: When homomorphic encryption is used, the aggregated training data is encrypted into a usable but invisible state from three levels: algorithm, hardware and ecosystem, to obtain encrypted training data. When using the data synthesis method, virtual training data is generated based on the aggregated training data, and the virtual training data is used as encrypted training data.

5. The multi-level secure collaborative training method for large models according to claim 4, characterized in that, The aggregated training data is encrypted into a usable but invisible state from three levels: algorithm, hardware, and ecosystem. This results in encrypted training data, specifically including: At the algorithm level, lightweight solutions are used for parameter fine-tuning, batch processing, and approximate calculations. At the hardware level, hardware support is provided through FPGA acceleration, dedicated ASIC chips, and GPU acceleration. At the ecosystem level, optimization is achieved through framework abstraction, compiler optimization, and blockchain integration. Together, the aggregated training data is encrypted into a usable but invisible state, resulting in encrypted training data. The lightweight solution includes at least Paillier encryption.

6. The multi-level secure collaborative training method for large models according to claim 1, characterized in that categorizing the encrypted training data and adopting the multi-layered security collaborative protection mechanism to protect the encrypted training data according to its category, so as to obtain the final training data, specifically includes: classifying the encrypted training data; when the encrypted training data belongs to the category of medical case data or medical image data, applying data anonymization protection to the encrypted training data; when the encrypted training data belongs to the category of financial statistics or customer transaction data, applying differential privacy protection to the encrypted training data; and when the encrypted training data belongs to the category of e-commerce platform transaction data, applying data anonymization protection and data desensitization protection to the encrypted training data.

7. The multi-level secure collaborative training method for large models according to claim 1, characterized in that, The differential privacy protection specifically includes: adding controllable noise to the query results to prevent attackers from determining whether an individual exists in the dataset.

8. The multi-level secure collaborative training method for large models according to claim 1, characterized in that, The data desensitization protection specifically includes: using at least one of the following methods for data desensitization: substitution, masking, encryption, and synthesis.

9. The multi-level secure collaborative training method for large models according to claim 1, characterized in that the data anonymization protection specifically includes: performing data anonymization using at least one of the following: removal and replacement of direct identifiers, generalization and suppression of indirect identifiers, and advanced anonymization algorithms; wherein the advanced anonymization algorithms specifically include k-anonymity, l-diversity, and permutation-based clustering algorithms.

10. A computer system, comprising: A memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the multi-level secure collaborative training method for large models according to any one of claims 1-9.