Machine learning device, machine learning method, and machine learning program

The machine learning device aggregates trained and reverse-trained models to prevent information leakage in large language models, ensuring model performance and privacy without significant degradation.

WO2026140272A1 · PCT designated stage · Publication Date: 2026-07-02 · MITSUBISHI ELECTRIC CORP

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
MITSUBISHI ELECTRIC CORP
Filing Date
2025-04-09
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing methods to prevent information leakage in large language models, such as those using differential privacy, degrade model performance, and there is a need for a solution that maintains model performance while preventing information leakage.

Method used

A machine learning device that aggregates a trained model with a reverse-trained model to generate a trained model, using a model aggregation unit to calculate parameter values based on weighted averages or adjusted weights, thereby preventing information leakage without significantly degrading performance.

Benefits of technology

The solution effectively prevents information leakage from large language models while maintaining model performance, allowing for stable training and early convergence, with adjustable privacy measures and minimal performance degradation.



Abstract

A model training unit (120) trains a target model using training data and generates a post-learning model. An inverse training unit (130) inversely trains the target model using the training data and generates a post-inverse learning model. A model aggregation unit (140) aggregates the post-learning model and the post-inverse learning model to generate a learned model.

Description

Machine learning device, machine learning method, and machine learning program

[0001] The present disclosure relates to a technique for generating a learned model with measures against information leakage.

[0002] In various industrial fields, learning models with a large number of model parameters, such as generative AI, are expected to come into use. AI is an abbreviation for artificial intelligence. Hereinafter, the large language model is described as a representative example of such learning models. However, the description also applies to learning models for other tasks, such as image generation models.

[0003] Constructing and training a large language model from scratch requires an enormous computational cost and an enormous dataset. One approach is therefore to additionally train an existing trained large language model on new training data. Such additional training is called fine-tuning. For example, a pre-trained large language model can be supplemented with company-specific knowledge through additional training.

[0004] On the other hand, because a large language model has a very large model structure (number of model parameters), training data is easily memorized inside the model. It has been pointed out that private or confidential information contained in the training data may therefore leak from the large language model.

[0005] Non-Patent Document 1 proposes a countermeasure based on differential privacy technology as a countermeasure against information leakage for a fine-tuned large language model. This countermeasure adds noise to the model parameters to scramble the information related to the learning data stored in the model parameters. However, since the model parameters contain noise, the model performance deteriorates in a trade-off relationship with the information leakage countermeasure (privacy protection). Therefore, the countermeasure of Non-Patent Document 1 could not be carried out without degrading the model performance.

[0006] Non-Patent Document 1: Da Yu et al., “Differentially Private Fine-tuning of Language Models”, arXiv:2110.06500 (ICLR 2022), 2022

[0007] This disclosure aims to enable the generation of trained models that incorporate measures to prevent information leakage without degrading model performance.

[0008] The machine learning device of this disclosure includes a model aggregation unit that aggregates a trained model, which is a target model trained using training data, and a reverse-trained model, which is the target model reverse-trained using the training data, to generate a trained model.

[0009] According to this disclosure, it is possible to generate a trained model that incorporates measures to prevent information leakage without degrading the model's performance.

[0010] Figure 1 is a configuration diagram of the machine learning device 100 in Embodiment 1. Figure 2 is a flowchart of the machine learning method in Embodiment 1. Figure 3 is a diagram showing an overview of the trained model 194 in Embodiment 1. Figure 4 is a flowchart of the machine learning method in Embodiment 2. Figure 5 is a hardware configuration diagram of the machine learning device 100 in the embodiments.

[0011] In the embodiments and drawings, the same or corresponding elements are denoted by the same reference numeral. The descriptions of elements denoted by the same reference numeral as the described elements are omitted or simplified as appropriate. The arrows in the figures mainly indicate the flow of data or processing.

[0012] Embodiment 1. A method for generating a trained model with measures to prevent information leakage without degrading model performance will be described based on Figures 1 to 3.

[0013] ***Configuration Description*** The configuration of the machine learning device 100 will be described based on Figure 1. The machine learning device 100 is a computer equipped with hardware such as a processor 101, memory 102, auxiliary storage device 103, communication device 104, and input / output interface 105. These hardware components are connected to each other via signal lines.

[0014] The processor 101 is an integrated circuit (IC) that performs arithmetic processing and controls other hardware. For example, the processor 101 is a CPU, DSP, GPU, or a combination of these. IC is an abbreviation for Integrated Circuit. CPU is an abbreviation for Central Processing Unit. DSP is an abbreviation for Digital Signal Processor. GPU is an abbreviation for Graphics Processing Unit.

[0015] Memory 102 is a volatile or non-volatile storage device. Memory 102 is also called main memory. For example, memory 102 is RAM. Data stored in memory 102 is saved to auxiliary storage device 103 as needed. RAM is an abbreviation for Random Access Memory.

[0016] The auxiliary storage device 103 is a non-volatile storage device. For example, the auxiliary storage device 103 is a ROM, HDD, flash memory, or a combination thereof. Data stored in the auxiliary storage device 103 is loaded into memory 102 as needed. ROM is an abbreviation for Read Only Memory. HDD is an abbreviation for Hard Disk Drive.

[0017] The communication device 104 is a receiver and transmitter. For example, the communication device 104 is a communication chip or NIC. Communication of the machine learning device 100 is performed using the communication device 104. NIC is an abbreviation for Network Interface Card.

[0018] The input / output interface 105 is a port to which input and output devices are connected. For example, the input / output interface 105 is a USB terminal, the input devices are a keyboard and mouse, and the output device is a display. Input and output of the machine learning device 100 are performed via the input / output interface 105. USB is an abbreviation for Universal Serial Bus.

[0019] The machine learning device 100 comprises elements such as a data acquisition unit 110, a model learning unit 120, an inverse learning unit 130, a model aggregation unit 140, a model evaluation unit 150, and a model output unit 160. These elements are implemented in software.

[0020] The auxiliary storage device 103 stores a machine learning program that causes the computer to function as the data acquisition unit 110, the model learning unit 120, the inverse learning unit 130, the model aggregation unit 140, the model evaluation unit 150, and the model output unit 160. The machine learning program is loaded into memory 102 and executed by the processor 101. The auxiliary storage device 103 also stores an operating system (OS). At least a portion of the OS is loaded into memory 102 and executed by the processor 101. The processor 101 executes the machine learning program while executing the OS. OS is an abbreviation for Operating System.

[0021] The data for the machine learning program (input data, output data, etc.) is stored in the storage unit 190. Memory 102 functions as the storage unit 190. However, storage devices such as auxiliary storage device 103, registers in the processor 101, and cache memory in the processor 101 may function as the storage unit 190 instead of memory 102, or together with memory 102.

[0022] Machine learning programs can be recorded (stored) in a computer-readable format on non-volatile recording media such as optical discs or flash memory.

[0023] ***Explanation of Operation*** The procedure for operating the machine learning device 100 corresponds to the machine learning method. Furthermore, the procedure for operating the machine learning device 100 corresponds to the procedure for processing by the machine learning program.

[0024] The machine learning method will be explained based on Figure 2. In step S110, the data acquisition unit 110 acquires training data 191.

[0025] Training data 191 is the data used to train the target model.

[0026] The target model is a machine learning model to be trained. The target model may be an untrained model or a trained model.

[0027] Machine learning models are also called machine learning computations. An example of machine learning is deep learning. Machine learning models may include NN, CNN, RNN, VAE, GAN, Diffusion model, LSTM, Transformer, BERT, GPT, and CLIP. Note that these models are not mutually exclusive. For example, BERT and GPT are included in the Transformer model, and the Transformer model is included in the NN model. Algorithms and models for machine learning may be combinations of multiple types. NN is an abbreviation for Neural Network. CNN is an abbreviation for Convolutional Neural Network. RNN is an abbreviation for Recurrent Neural Network. VAE stands for Variational Autoencoder. GAN stands for Generative Adversarial Network. LSTM stands for Long Short-Term Memory. BERT stands for Bidirectional Encoder Representations from Transformers. GPT stands for Generative Pre-trained Transformer. CLIP stands for Contrastive Language-Image Pre-training.

[0028] The target model is, for example, a large-scale language model. When the target model is a large-scale language model, the feature vectors generated from a large amount of text become the training data 191. Feature vectors can be generated using known methods such as Word2Vec.

[0029] For example, the target model and training data 191 are stored in the storage unit 190, and the data acquisition unit 110 reads the target model and training data 191 from the storage unit 190.

[0030] In step S120, the model learning unit 120 trains the target model using the training data 191. This generates the trained model 192, which is the target model after training. The trained model 192 has all the model parameters of the target model.

[0031] If the target model is a pre-trained model, the model learning unit 120 performs, for example, additional training on the target model using the training data 191. Additional training is the process of training a machine learning model so that it acquires the knowledge contained in the training data.

[0032] Additional training is also called fine-tuning. An example of fine-tuning is PEFT, which stands for Parameter-Efficient Fine-Tuning. PEFT does not update the model parameters of the pre-trained model itself.

[0033] If the target model is a large-scale language model, the model learning unit 120 additionally trains, for example, only the small number of additional model parameters used in PEFT. These additional model parameters may be input to the model learning unit 120 as feature vectors, or by means of prompts (instruction information for the large-scale language model). Prompt types include hard prompts and soft prompts, and the additional model parameters may be input using hard prompts, soft prompts, or a combination of both.

[0034] The model learning unit 120 may also update all model parameters of the target model.

[0035] In step S130, the inverse learning unit 130 inversely trains the target model using the training data 191. This generates the inversely trained model 193, which is the target model after inverse training. The inversely trained model 193 has all the model parameters of the target model.

[0036] Inverse training is a learning method in which a machine learning model is trained so as to forget the knowledge it has gained from the training data.

[0037] Inverse training is also known as machine unlearning.

[0038] If the target model is a large-scale language model, the inverse learning unit 130 inversely trains, for example, only the small number of additional model parameters used in PEFT.

[0039] The inverse learning unit 130 may also update all model parameters of the target model.
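The contrast between training (step S120) and inverse training (step S130) can be illustrated on a toy model. The following is a minimal sketch, not the method of this disclosure: it assumes a one-parameter model y = w * x, a mean-squared-error loss, and inverse training implemented as gradient ascent on the same loss, which is one commonly used machine unlearning approach. The names `mse`, `mse_gradient`, `train`, and `inverse_train`, and all numeric settings, are hypothetical.

```python
from typing import List, Tuple

Data = List[Tuple[float, float]]  # (x, y) pairs; stands in for training data 191


def mse(w: float, data: Data) -> float:
    """Mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def mse_gradient(w: float, data: Data) -> float:
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def train(w: float, data: Data, lr: float = 0.1, steps: int = 20) -> float:
    """Gradient descent: move w so that the loss on the data decreases."""
    for _ in range(steps):
        w -= lr * mse_gradient(w, data)
    return w


def inverse_train(w: float, data: Data, lr: float = 0.1, steps: int = 3) -> float:
    """Assumed form of inverse training: gradient ascent on the same loss."""
    for _ in range(steps):
        w += lr * mse_gradient(w, data)
    return w


data = [(1.0, 2.0), (2.0, 4.0)]      # generated by y = 2x
w_target = 1.0                       # parameter of the target model
w_f = train(w_target, data)          # plays the role of model 192
w_u = inverse_train(w_target, data)  # plays the role of model 193
print(mse(w_f, data), mse(w_target, data), mse(w_u, data))
```

In this sketch the loss on the training data falls after training and rises after inverse training. Gradient ascent is unbounded, so the number of ascent steps is kept small here.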

[0040] In steps S141 to S143, the model aggregation unit 140 aggregates the trained model 192 and the inversely trained model 193. This generates the trained model 194. The trained model 194 is a trained model generated by aggregating the trained model 192 and the inversely trained model 193. Specifically, the model aggregation unit 140 calculates the parameter values of the trained model 194 using the parameter values of the trained model 192 and the parameter values of the inversely trained model 193. Figure 3 shows an example of the generation of the trained model 194. In Figure 3, the parameter values of some model parameters of the trained model 192 and the parameter values of some model parameters of the inversely trained model 193 are aggregated to calculate the parameter values of some model parameters of the trained model 194.

[0041] In step S141, the model aggregation unit 140 obtains the parameter value of the target parameter from the trained model 192. The target parameter is the model parameter to be aggregated. There may be one target parameter or multiple target parameters. Alternatively, each of the model parameters may be used as a target parameter.

[0042] In step S142, the model aggregation unit 140 obtains the parameter value of the target parameter from the post-inverse learning model 193.

[0043] In step S143, the model aggregation unit 140 aggregates the parameter value of the target parameter of the post-learning model 192 and the parameter value of the target parameter of the post-inverse learning model 193 to calculate the parameter value of the target parameter of the learned model 194.

[0044] For example, the model aggregation unit 140 calculates the average of the parameter value of the target parameter of the post-learning model 192 and the parameter value of the target parameter of the post-inverse learning model 193. The calculated average becomes the parameter value of the target parameter of the learned model 194. Note that the parameter value of the target parameter of the learned model 194 may be calculated based on a statistic other than the average (such as the median or the mode).

[0045] As another example, the model aggregation unit 140 aggregates the parameter values of the target parameter as follows. First, the model aggregation unit 140 adjusts the ratio between the weight applied to the parameter value of the target parameter of the post-learning model 192 and the weight applied to the parameter value of the target parameter of the post-inverse learning model 193. Then, the model aggregation unit 140 adds the weighted parameter value of the post-learning model 192 to the weighted parameter value of the post-inverse learning model 193. The resulting sum becomes the parameter value of the target parameter of the learned model 194.

[0046] The parameter value $M$ of the target parameter of the learned model 194 is given by the following formula, where $\alpha$ is a privacy parameter whose value is greater than 0 and less than 1, $M_F$ is the parameter value of the target parameter of the post-learning model 192, and $M_U$ is the parameter value of the target parameter of the post-inverse learning model 193.

[0047] $M = \alpha M_F + (1 - \alpha) M_U$
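The weighted aggregation of the formula above can be sketched in a few lines. This is an illustrative sketch only: the dictionary layout (parameter name mapped to a flat list of values), the function name `aggregate`, and the example numbers are assumptions, not part of the disclosure.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # hypothetical layout: parameter name -> values


def aggregate(post_learning: Params, post_inverse: Params, alpha: float) -> Params:
    """Compute M = alpha * M_F + (1 - alpha) * M_U for every parameter value."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be greater than 0 and less than 1")
    if post_learning.keys() != post_inverse.keys():
        raise ValueError("both models must have the same parameters")
    return {
        name: [
            alpha * f + (1.0 - alpha) * u
            for f, u in zip(post_learning[name], post_inverse[name])
        ]
        for name in post_learning
    }


m_f = {"w": [1.0, 2.0], "b": [0.5]}   # stands in for post-learning model 192
m_u = {"w": [0.0, 4.0], "b": [-0.5]}  # stands in for post-inverse learning model 193
merged = aggregate(m_f, m_u, alpha=0.75)
print(merged)  # {'w': [0.75, 2.5], 'b': [0.25]}
```

A larger alpha keeps the result closer to the post-learning model 192, and a smaller alpha moves it toward the post-inverse learning model 193.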

[0048] In step S151, the model evaluation unit 150 evaluates the information leakage risk of the trained model 194.

[0049] For example, the model evaluation unit 150 evaluates the trained model 194 using a membership inference attack.
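The disclosure does not specify which membership inference attack is used. As one possible illustration, the sketch below assumes a simple loss-threshold attack: a sample is guessed to be a training member when the model's loss on it is below a threshold, and the balanced accuracy of that guess is taken as a risk score. The function name, the threshold, and the loss values are all hypothetical.

```python
from typing import Sequence


def membership_attack_accuracy(
    member_losses: Sequence[float],
    nonmember_losses: Sequence[float],
    threshold: float,
) -> float:
    """Balanced accuracy of the guess 'member if loss < threshold'."""
    # Fraction of true training samples correctly flagged as members.
    true_positive_rate = sum(l < threshold for l in member_losses) / len(member_losses)
    # Fraction of held-out samples correctly flagged as non-members.
    true_negative_rate = sum(l >= threshold for l in nonmember_losses) / len(nonmember_losses)
    return 0.5 * (true_positive_rate + true_negative_rate)


# Hypothetical per-sample losses of the aggregated model.
member_losses = [0.1, 0.2, 0.15, 0.9]     # samples that were in the training data
nonmember_losses = [0.8, 1.1, 0.3, 0.95]  # samples that were not
risk = membership_attack_accuracy(member_losses, nonmember_losses, threshold=0.5)
print(risk)  # 0.75
```

Under this assumed metric, a score near 0.5 means the attack does no better than chance, while a score near 1.0 means membership in the training data is easy to infer.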

[0050] In step S152, the model evaluation unit 150 determines whether re-aggregation is necessary based on the evaluation results of the information leakage risk of the trained model 194. The criteria for this determination are predetermined.

[0051] If re-aggregation is necessary, the process returns to step S143. The model aggregation unit 140 readjusts the weights according to the evaluation result of the information leakage risk and updates the parameter values of the trained model 194 using the readjusted weights (step S143). For example, the model aggregation unit 140 readjusts the weights by changing the privacy parameter α. The model evaluation unit 150 evaluates the information leakage risk of the updated trained model 194 (step S151) and determines whether re-aggregation is necessary based on the evaluation result (step S152). For example, the model evaluation unit 150 may present the evaluation result and the weights (privacy parameter α) to the user using an output device (e.g., a display).

[0052] If re-aggregation is not required, the process proceeds to step S160.
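The loop over steps S143, S151, and S152 can be sketched as follows. This is a simplified illustration under stated assumptions: the risk evaluator is passed in as a callable, the privacy parameter is chosen from a fixed list of candidates tried in order, and the acceptance criterion is a single risk limit. None of these choices (nor the names `aggregate_until_acceptable`, `candidate_alphas`, `risk_limit`, `toy_risk`) come from the disclosure.

```python
from typing import Callable, List, Sequence, Tuple


def aggregate(m_f: Sequence[float], m_u: Sequence[float], alpha: float) -> List[float]:
    """M = alpha * M_F + (1 - alpha) * M_U, element by element."""
    return [alpha * f + (1.0 - alpha) * u for f, u in zip(m_f, m_u)]


def aggregate_until_acceptable(
    m_f: Sequence[float],
    m_u: Sequence[float],
    evaluate_risk: Callable[[List[float]], float],
    candidate_alphas: Sequence[float],
    risk_limit: float,
) -> Tuple[List[float], float, float]:
    """Try each alpha in order and stop at the first acceptable risk.

    If no candidate is acceptable, the result for the last candidate is returned.
    """
    if not candidate_alphas:
        raise ValueError("at least one candidate alpha is required")
    for alpha in candidate_alphas:
        merged = aggregate(m_f, m_u, alpha)  # step S143
        risk = evaluate_risk(merged)         # step S151
        if risk <= risk_limit:               # step S152: re-aggregation not needed
            break
    return merged, alpha, risk


def toy_risk(params: List[float]) -> float:
    """Hypothetical evaluator: risk grows with the first parameter value."""
    return 0.5 + 0.4 * params[0]


merged, alpha, risk = aggregate_until_acceptable(
    [1.0], [0.0], toy_risk, candidate_alphas=[0.9, 0.7, 0.5, 0.3], risk_limit=0.75
)
print(alpha, merged, risk)
```

Trying the candidates from large to small keeps as much weight as possible on the post-learning model 192 while still meeting the assumed risk limit. In the toy run above, the candidates 0.9 and 0.7 are rejected and 0.5 is accepted.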

[0053] In step S160, the model output unit 160 outputs the trained model 194.

[0054] The output trained model 194 is a trained model with measures in place to prevent information leakage.

[0055] For example, the model output unit 160 stores the trained model 194 in the storage unit 190.

[0056] ***Effects of Embodiment 1*** Embodiment 1 aims to implement measures against information leakage from, for example, a fine-tuned pre-trained model (large-scale language model) without degrading the model's performance. The machine learning device 100 generates a trained model 194 that does not leak information about the training data 191. For example, the machine learning device 100 combines a post-learning model 192, which has gained knowledge from the training data 191 through additional training, with a post-inverse learning model 193, which has forgotten the knowledge of the training data 191 through inverse training. This suppresses excessive memorization by the trained model 194 and, as a result, makes it possible to prevent information leakage from the trained model 194. Furthermore, because the machine learning device 100 performs ordinary (additional) training and ordinary inverse training, it can be introduced without modifying the training procedure itself. Embodiment 1 therefore causes less degradation of model performance than existing methods. The machine learning device 100 can also perform additional training and inverse training independently of each other, so each can be carried out stably and early convergence of learning can be expected. Embodiment 1 can also be applied to PEFT, which trains only some of the model parameters. Furthermore, Embodiment 1 allows the strength of the information leakage countermeasure to be changed simply by adjusting the privacy parameter α.

[0057] Embodiment 2. An embodiment that further improves the accuracy of the trained model 194 will be described, mainly based on the differences from Embodiment 1, with reference to Figure 4.

[0058] ***Configuration Description*** The configuration of the machine learning device 100 is the same as the configuration in Embodiment 1.

[0059] ***Explanation of Operation*** The machine learning method will be explained based on Figure 4. Steps S210 to S251 are the same as steps S110 to S151 in Embodiment 1.

[0060] In step S252, the model evaluation unit 150 determines whether or not re-aggregation is necessary based on the evaluation results of the information leakage risk.

[0061] If re-aggregation is necessary, the process returns to step S210. The data acquisition unit 110 acquires new training data 191 (step S210). The model learning unit 120 trains the trained model 194 using the new training data 191, which generates a new trained model 192 (step S220). The inverse learning unit 130 inversely trains the trained model 194 using the new training data 191, which generates a new inversely trained model 193 (step S230). The model aggregation unit 140 aggregates the new trained model 192 and the new inversely trained model 193 to generate a new trained model 194 (steps S241 to S243). The model evaluation unit 150 evaluates the information leakage risk of the new trained model 194 (step S251) and determines whether re-aggregation is necessary based on the evaluation result (step S252). For example, the model evaluation unit 150 may present the evaluation result and the weights (privacy parameter α) to the user using an output device (e.g., a display).

[0062] If re-aggregation is not required, the process proceeds to step S260.

[0063] In step S260, the model output unit 160 outputs the trained model 194 generated in the last execution of step S243.

[0064] The output trained model 194 is a trained model with measures in place to prevent information leakage.

[0065] ***Effects of Embodiment 2*** Embodiment 2 aims to further improve the accuracy of the trained model 194 with information leakage countermeasures. In Embodiment 1, the machine learning device 100 aggregates the further trained model 192 and the inversely trained model 193, and generates the trained model 194 with information leakage countermeasures while adjusting the privacy parameter α according to the evaluation results of the trained model 194. On the other hand, in Embodiment 2, the machine learning device 100 performs model aggregation during the training process of additional training and inverse training. In other words, the machine learning device 100 performs model aggregation and model evaluation at a stage when additional training and inverse training have progressed to a certain extent, and feeds the aggregated trained model 194 back into the next additional training and the next inverse training. This mechanism improves the accuracy of the trained model 194 with information leakage countermeasures.

[0066] In Embodiment 2, the training process involves additional learning and inverse learning. Model aggregation within the training process may be performed at each epoch or iteration of the training process. Model aggregation can be achieved, for example, by an approach similar to that of federated learning.
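The feedback loop of Embodiment 2 can be sketched on a toy model. The sketch assumes a one-parameter model y = w * x with a mean-squared-error loss, one gradient-descent step for additional training, one gradient-ascent step as a stand-in for inverse training, and aggregation after every batch. The function names, the `needs_reaggregation` callback, and all numeric settings are hypothetical and only illustrate the control flow of steps S210 to S260.

```python
from typing import Callable, Iterable, List, Tuple

Data = List[Tuple[float, float]]  # (x, y) pairs; stands in for training data 191


def mse_gradient(w: float, data: Data) -> float:
    """Derivative of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def run_rounds(
    w: float,
    batches: Iterable[Data],
    needs_reaggregation: Callable[[float], bool],
    alpha: float = 0.8,
    lr: float = 0.1,
) -> float:
    """Train, inversely train, and aggregate once per batch of new data."""
    for data in batches:                       # step S210: new training data
        g = mse_gradient(w, data)
        w_f = w - lr * g                       # step S220: additional training
        w_u = w + lr * g                       # step S230: assumed inverse training
        w = alpha * w_f + (1.0 - alpha) * w_u  # steps S241 to S243: aggregation
        if not needs_reaggregation(w):         # steps S251 and S252
            break
    return w                                   # step S260: output


batch = [(1.0, 2.0), (2.0, 4.0)]  # generated by y = 2x
w_final = run_rounds(1.0, [batch] * 10, needs_reaggregation=lambda w: True)
w_once = run_rounds(1.0, [batch] * 10, needs_reaggregation=lambda w: False)
print(w_final, w_once)
```

In this particular sketch, with single gradient steps, the aggregated update equals a damped training step of size (2 * alpha - 1) * lr, so the model keeps learning only when alpha is above 0.5. That is a property of this toy setup, not a claim about the disclosed method.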

[0067] ***Modified Embodiment*** The machine learning device 100 can be applied to purposes other than preventing information leakage of the trained model 194. The machine learning device 100 is also effective, for example, in removing bias from the trained model 194. For example, if the training data 191 includes information such as preferences, gender, race, religion, and regional differences, biases such as discriminatory expressions may be introduced into the trained model 194 due to this information. In this case, the model evaluation unit 150 evaluates whether or not bias has been removed from the trained model 194. This makes it possible to apply the machine learning device 100 for bias removal.

[0068] ***Supplement to the Embodiment*** The machine learning operation is not limited to deep learning. The machine learning operation may be an operation such as regression, decision tree learning, Bayesian methods, random forests, genetic algorithms, or clustering.

[0069] Machine learning models are not limited to large-scale language models. Machine learning models may also be image models or multimodal models (models that include language and images).

[0070] The privacy parameter α is not limited to just one. For example, a machine learning model based on deep learning consists of multiple layers. Therefore, the privacy parameter α may be set for each layer. Since the privacy parameter α can be adjusted in detail for each layer, it is possible to generate a trained model with measures to prevent information leakage without degrading the model's performance, and it also has the effect of improving the efficiency of generating trained models.
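A per-layer privacy parameter can be sketched as a lookup from layer name to α. The layer names, the `default_alpha` fallback, and the dictionary layout below are assumptions made for illustration only.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # hypothetical layout: layer name -> parameter values


def aggregate_per_layer(
    m_f: Params,
    m_u: Params,
    alphas: Dict[str, float],
    default_alpha: float = 0.5,
) -> Params:
    """Apply M = alpha * M_F + (1 - alpha) * M_U with a separate alpha per layer."""
    merged: Params = {}
    for layer, f_values in m_f.items():
        alpha = alphas.get(layer, default_alpha)  # fall back when no alpha is set
        if not 0.0 < alpha < 1.0:
            raise ValueError(f"alpha for layer {layer!r} must be between 0 and 1")
        merged[layer] = [
            alpha * f + (1.0 - alpha) * u for f, u in zip(f_values, m_u[layer])
        ]
    return merged


m_f = {"embed": [1.0, 1.0], "head": [2.0]}  # stands in for post-learning model 192
m_u = {"embed": [0.0, 0.0], "head": [0.0]}  # stands in for post-inverse learning model 193
merged = aggregate_per_layer(m_f, m_u, alphas={"embed": 0.75, "head": 0.25})
print(merged)  # {'embed': [0.75, 0.75], 'head': [0.5]}
```

Here the hypothetical "embed" layer stays close to the post-learning model while the "head" layer is pulled more strongly toward the post-inverse learning model.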

[0071] In the embodiments, for the sake of simplicity, the case of one post-learning model and one post-inverse learning model has been illustrated. However, the number of each is not limited to one; there may be multiple post-learning models and multiple post-inverse learning models. In that case, the parameter value $M_F$ of the target parameter of the post-learning model can be apportioned among the multiple post-learning models according to their importance. Similarly, the parameter value $M_U$ of the target parameter of the post-inverse learning model can be apportioned among the multiple post-inverse learning models according to their importance. In other words, the more important a post-learning model (or post-inverse learning model) is, the greater its contribution to the parameter value of the target parameter. Importance can be set, for example, based on the size of the training data: the larger the amount of training data, the higher the importance.
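The importance-based allocation described above can be sketched with training-data size as the importance measure. The normalization used here (each model's share equals its data size divided by the total) is an assumption; the disclosure only states that more important models contribute more. All names and numbers are hypothetical.

```python
from typing import Sequence


def importance_weighted(values: Sequence[float], sizes: Sequence[int]) -> float:
    """Combine one parameter value per model, weighted by training-data size."""
    if len(values) != len(sizes) or not values:
        raise ValueError("one positive data size is required per model")
    total = sum(sizes)
    return sum(v * s / total for v, s in zip(values, sizes))


# Target-parameter values from two post-learning and two post-inverse learning models.
m_f = importance_weighted([1.0, 3.0], sizes=[100, 300])   # combined M_F
m_u = importance_weighted([0.0, -2.0], sizes=[100, 300])  # combined M_U

alpha = 0.5
m = alpha * m_f + (1.0 - alpha) * m_u  # M = alpha * M_F + (1 - alpha) * M_U
print(m_f, m_u, m)  # 2.5 -1.5 0.5
```

The models trained on 300 samples contribute three times as much as those trained on 100 samples, after which the combined values are aggregated with the privacy parameter as before.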

[0072] The hardware configuration of the machine learning device 100 will be described based on Figure 5. The machine learning device 100 includes a processing circuit 109. The processing circuit 109 is hardware that implements the data acquisition unit 110, the model learning unit 120, the inverse learning unit 130, the model aggregation unit 140, the model evaluation unit 150, and the model output unit 160. The processing circuit 109 may be dedicated hardware, or it may be the processor 101 that executes a program stored in memory 102.

[0073] If the processing circuit 109 is dedicated hardware, the processing circuit 109 may be, for example, a single circuit, a composite circuit, a programmed processor, a parallel programmed processor, an ASIC, an FPGA, or a combination thereof. ASIC is an abbreviation for Application Specific Integrated Circuit. FPGA is an abbreviation for Field Programmable Gate Array.

[0074] The machine learning device 100 may include multiple processing circuits that replace the processing circuit 109.

[0075] In the processing circuit 109, some functions may be implemented by dedicated hardware, while the remaining functions may be implemented by software or firmware.

[0076] Thus, the functions of the machine learning device 100 can be realized through hardware, software, firmware, or a combination thereof.

[0077] Each embodiment is an example of a preferred form and is not intended to limit the technical scope of this disclosure. Each embodiment may be implemented in part or in combination with other embodiments. Procedures described using flowcharts, etc., may be modified as appropriate.

[0078] The "unit" in the name of each element of the machine learning device 100 may be read as "processing," "process," "circuit," or "circuitry."

[0079] 100 Machine learning device, 101 Processor, 102 Memory, 103 Auxiliary storage device, 104 Communication device, 105 Input / output interface, 109 Processing circuit, 110 Data acquisition unit, 120 Model learning unit, 130 Inverse learning unit, 140 Model aggregation unit, 150 Model evaluation unit, 160 Model output unit, 190 Storage unit, 191 Training data, 192 Post-learning model, 193 Post-inverse learning model, 194 Learned model.

Claims

1. A machine learning device comprising a model aggregation unit that generates a learned model by aggregating a post-learning model, which is a target model trained using training data, and a post-inverse learning model, which is the same target model inversely trained using the training data.

2. The machine learning device according to claim 1, wherein the model aggregation unit calculates the parameter values of the learned model using the parameter values of the post-learning model and the parameter values of the post-inverse learning model.

3. The machine learning device according to claim 2, wherein the model aggregation unit adjusts the ratio of the weight for the parameter values of the post-learning model to the weight for the parameter values of the post-inverse learning model to calculate the parameter values of the learned model.

4. The machine learning device according to claim 3, comprising a model evaluation unit that evaluates the information leakage risk or bias removal of the learned model, wherein the model aggregation unit readjusts the weights according to the evaluation result of the information leakage risk or bias removal, and updates the parameter values of the learned model using the readjusted weights.

5. The machine learning device according to claim 4, wherein the model evaluation unit presents the evaluation result and the ratio of the weights to the user.

6. The machine learning device according to any one of claims 1 to 5, comprising: a model learning unit that trains the target model using the training data and generates the post-learning model; and an inverse learning unit that inversely trains the target model using the training data and generates the post-inverse learning model.

7. The machine learning device comprises: a model learning unit that trains the target model using the training data and generates the post-learning model; an inverse learning unit that inversely trains the target model using the training data and generates the post-inverse learning model; and a model evaluation unit that evaluates the information leakage risk or bias removal of the learned model, wherein the model learning unit trains the learned model using new training data according to the evaluation result of the information leakage risk or bias removal and generates a new post-learning model; the inverse learning unit inversely trains the learned model using the new training data and generates a new post-inverse learning model; and the model aggregation unit aggregates the new post-learning model and the new post-inverse learning model to generate a new learned model.

8. The machine learning device according to claim 7, wherein the model aggregation unit adjusts the ratio of the weight for the parameter values of the new post-learning model to the weight for the parameter values of the new post-inverse learning model to calculate the parameter values of the new learned model.

9. The machine learning device according to claim 8, wherein the model evaluation unit presents the evaluation result and the ratio of the weights to the user.

10. A machine learning method that generates a learned model by aggregating a post-learning model, which is a target model trained using training data, and a post-inverse learning model, which is the same target model inversely trained using the training data.

11. A machine learning program that causes a computer to perform a model aggregation process of generating a learned model by aggregating a post-learning model, which is a target model trained using training data, and a post-inverse learning model, which is the same target model inversely trained using the training data.