A cross-domain federated learning model and method based on a value iteration network

A cross-domain federated learning model and method based on a value iteration network (VIN). It addresses three problems of existing approaches: the strong generalization ability of VINs goes unused, the number of parameters to be learned is not reduced, and transfer is difficult when domain similarity is low. The stated effects are protecting data privacy, reducing computational complexity and cost, and improving prediction accuracy.

Active Publication Date: 2019-05-03
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

[0009] However, existing Actor-Mimic transfer learning is built on a traditional DRL network and does not exploit the strong generalization ability of VINs. The Actor-Mimic transfer strategy only initializes the target-domain network: all target-domain parameters still have to be retrained, so the number of parameters to be learned is not reduced. Training in either the source domain or the target domain also requires a large dataset for each domain. During training the data of the two domains are mutually visible and shared, so the privacy of the original data is not protected.
[0010] Transfer learning has further problems. It does not consider privacy protection for the source model or the source data. When the feature spaces of the two domains are completely different (no feature-space mapping is possible), transfer learning cannot be performed, and it may incur a performance loss. When the model is migrated from the source domain to the target domain only as an initialization, much of the knowledge learned about the source domain may be lost, leaving only the part shared with the target domain. Federated learning, by contrast, can still be applied when domain similarity is not high, and it can use the data of both parties to improve the models in both domains without sharing data.



Examples


Embodiment Construction

[0037] The following describes the implementation of the present invention through specific examples in conjunction with the accompanying drawings. Those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other specific examples, and the details in this specification can be modified and changed for different viewpoints and applications without departing from the spirit of the present invention.

[0038] Before introducing the present invention, the abbreviations and key terms involved in the present invention are defined as follows:

[0039] Deep learning: Deep learning was proposed by Hinton et al. in 2006 and is a new field of machine learning. It was introduced into machine learning to bring it closer to its original goal: artificial intelligence. Dee...


Abstract

The invention discloses a cross-domain federated learning model and method based on a value iteration network (VIN). The model comprises: a data preparation unit, which takes path planning on a grid map as the training environment and uses the observation states of two different parts of the same map as the inputs of the two federated learning domains; a Federated-VIN network establishing unit, which establishes a Federated-VIN network structure based on the value iteration network, where the structure builds a full connection between the value iteration (VI) modules of the source domain and the target domain and defines a new joint loss function over the two domains for the newly constructed network; an iteration unit, which performs forward computation through the VI modules of the two domains during training, with each VI module carrying out multiple rounds of value iteration; and a backward updating unit, which performs backward computation to update the network parameters, updating the VIN parameters and the full-connection parameters of both domains according to the joint loss function.
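The repeated value iteration that a VI module performs on a grid map can be illustrated with a minimal NumPy sketch. This is not the patented Federated-VIN network: it is plain tabular value iteration with a known map, and the names `vi_module` and `MOVES`, the step reward of -1, the discount `gamma`, and the 4-connected move set are all assumptions made for illustration. The full connection between the two domains and the joint loss function are not sketched, because this excerpt does not specify their form.

```python
import numpy as np

# Assumed 4-connected move set: up, down, left, right.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def vi_module(obstacles, goal, k=20, gamma=1.0, step_reward=-1.0):
    """Run k synchronous value-iteration sweeps on a grid map.

    obstacles: (H, W) boolean array, True where a cell is blocked.
    goal: (row, col) of the terminal goal cell, whose value is fixed at 0.
    Returns the value map V of shape (H, W) and the action values Q of
    shape (len(MOVES), H, W). Values at obstacle cells are not meaningful.
    """
    h, w = obstacles.shape
    v = np.zeros((h, w))
    q = np.zeros((len(MOVES), h, w))
    for _ in range(k):
        for a, (di, dj) in enumerate(MOVES):
            for i in range(h):
                for j in range(w):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < h and 0 <= nj < w
                    # A move off the map or into an obstacle leaves the agent in place.
                    if not inside or obstacles[ni, nj]:
                        ni, nj = i, j
                    q[a, i, j] = step_reward + gamma * v[ni, nj]
        # Maximising over the action axis plays the role of the max step in a VI module.
        v = q.max(axis=0)
        v[goal] = 0.0
    return v, q


# Hypothetical 3x3 map with one obstacle in the centre and the goal in a corner.
obstacles = np.zeros((3, 3), dtype=bool)
obstacles[1, 1] = True
values, action_values = vi_module(obstacles, goal=(2, 2), k=10)
```

With `gamma=1.0` and a step reward of -1, the value of each free cell converges to the negative length of its shortest path to the goal, so taking the arg-max over `action_values` at a cell gives a greedy planning direction.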

Description

Technical field

[0001] The present invention relates to the technical field of machine learning, in particular to a cross-domain federated learning model and method based on a value iteration network.

Background technique

[0002] Reinforcement learning (RL) is learning in which an agent learns by trial and error. Its behavior is guided by the rewards obtained by interacting with the environment, and the goal is for the agent to obtain the greatest reward. Reinforcement learning differs from supervised learning in connectionist learning mainly in the teacher signal: in reinforcement learning, the reinforcement signal provided by the environment is an evaluation of the quality of the generated action, rather than an instruction telling the reinforcement learning system how to generate the correct action. Since the information provided by the external environment is scarce, the agent must learn from its own experience. In this way, knowledge is gained in an action-evaluation environmen...
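The trial-and-error loop described in the background can be sketched with tabular Q-learning on a toy environment. Q-learning is chosen here purely for illustration and is not named in this passage. The five-state corridor, the function names `step` and `train`, and the hyperparameters are hypothetical. Note that the environment only returns a reward that evaluates the action taken; it never tells the agent which action was correct.

```python
import random

N_STATES = 5        # states 0..4 on a line; state 4 is the goal (assumed toy setup)
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action_idx):
    """Environment: returns (next_state, reward, done). The reward is only an evaluation."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else 0.0
    return nxt, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience gathered with uniformly random actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap per episode
            action = rng.randrange(2)  # trial: try an action at random
            nxt, reward, done = step(state, action)
            # Error signal: compare the estimate with the evaluated outcome.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = train()
# Greedy action for each non-terminal state after training.
greedy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy in `greedy` moves right in every non-terminal state, even though the agent was only ever given reward values and never the correct action.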


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06N3/04, G06N3/08
Inventor: 申珺怡, 卓汉逵
Owner: SUN YAT SEN UNIV