Abstract

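1. Differentiable plasticity is optimized through gradient descent.

2. At test time, the network can reconstruct sets of natural images that were never seen during training.

3. The approach can solve general meta-learning tasks.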

Introduction

Many of the recent spectacular successes in machine learning involve learning one complex task very well, through extensive training over thousands or millions of training examples (Krizhevsky et al., 2012; Mnih et al., 2015; Silver et al., 2016). After learning is complete, the agent's knowledge is fixed and unchanging; if the agent is to be applied to a different task, it must be re-trained (fully or partially), again requiring a very large number of new training examples. By contrast, biological agents exhibit a remarkable ability to learn quickly and efficiently from ongoing experience: animals can learn to navigate and remember the location of (and quickest way to) food sources, discover and remember rewarding or aversive properties of novel objects and situations, etc., often from a single exposure.
Endowing artificial agents with lifelong learning abilities is essential to allowing them to master environments with changing or unpredictable features, or specific features that are unknowable at the time of training. For example, supervised learning in deep neural networks can allow a neural network to identify letters from a specific, fixed alphabet to which it was exposed during its training; however, autonomous learning abilities would allow an agent to acquire knowledge of any alphabet, including alphabets that are unknown to the human designer at the time of training.
An additional benefit of autonomous learning abilities is that in many tasks (e.g. object recognition, maze navigation, etc.), the bulk of fixed, unchanging structure in the task can be stored in the fixed knowledge of the agent, leaving only the changing, contingent parameters of the specific situation to be learned from experience. As a result, learning the actual specific instance of the task at hand (that is, the actual latent parameters that do vary across multiple instances of the general task) can be extremely fast, requiring only a few or even a single experience with the environment.
Several meta-learning methods have been proposed to train agents to learn autonomously (reviewed shortly). However, unlike in current approaches, in biological brains long-term learning is thought to occur (Martin et al., 2000; Liu et al., 2012) primarily through synaptic plasticity: the strengthening and weakening of connections between neurons as a result of neural activity, as carefully tuned by evolution over millions of years to enable efficient learning during the lifetime of each individual. While multiple forms of synaptic plasticity exist, many of them build upon the general principle known as Hebb's rule: if a neuron repeatedly takes part in making another neuron fire, the connection between them is strengthened (often roughly summarized as "neurons that fire together, wire together") (Hebb, 1949).
Designing neural networks with plastic connections has long been explored with evolutionary algorithms (see Soltoggio et al. 2017 for a recent review), but has been so far relatively less studied in deep learning. However, given the spectacular results of gradient descent in designing traditional non-plastic neural networks for complex tasks, it would be of great interest to expand backpropagation training to networks with plastic connections, optimizing through gradient descent not only the base weights, but also the amount of plasticity in each connection.
We previously demonstrated the theoretical feasibility and analytical derivability of this approach (Miconi, 2016). Here we show that this approach can train large (millions of parameters) networks for non-trivial tasks. To demonstrate our approach, we apply it to three different types of tasks: complex pattern memorization (including natural images), one-shot classification (on the Omniglot dataset), and reinforcement learning (in a maze exploration problem). We show that plastic networks provide competitive results on Omniglot, improve performance in maze exploration, and outperform advanced non-plastic recurrent networks (LSTMs) by orders of magnitude in complex pattern memorization. This result is interesting not only for opening up a new avenue of investigation in gradient-based neural network training, but also for showing that meta-properties of neural structures normally attributed to evolution or a priori design are in fact amenable to gradient descent, hinting at a whole class of heretofore unimagined meta-learning algorithms.
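
Hebb's rule as quoted in the introduction ("neurons that fire together, wire together") can be illustrated with a small NumPy sketch. The specific update used here, a running average of products of pre- and post-synaptic activity with rate `eta`, is an assumption chosen for illustration, and the names `hebbian_update`, `pre`, `post`, and `eta` are hypothetical rather than taken from the text.

```python
import numpy as np

def hebbian_update(hebb, pre, post, eta=0.1):
    """Move the trace toward the outer product of pre- and post-synaptic activity.

    hebb[i, j] grows when input neuron i and output neuron j are active together
    and decays toward zero otherwise (illustrative rule, not the paper's code).
    """
    return (1.0 - eta) * hebb + eta * np.outer(pre, post)

hebb = np.zeros((3, 2))            # 3 input neurons, 2 output neurons
pre = np.array([1.0, 0.0, 1.0])    # inputs 0 and 2 fire
post = np.array([1.0, 0.0])        # output 0 fires
for _ in range(5):
    hebb = hebbian_update(hebb, pre, post)
```

After repeated co-activation, only the entries linking the active inputs to the active output are non-zero; connections involving a silent neuron stay at zero.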

Related work

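Another approach is to augment each weight with a plastic component that automatically grows and decays as a function of inputs and outputs. In our framework, this approach is essentially equivalent to a plastic network in which all connections have the same, non-trainable plasticity (i.e. identical and non-learnable α, η, etc.): only the non-plastic weights of the network are trained. Schmidhuber (1993a) points out that such homogeneous-plasticity networks can in principle learn to generate any desired trajectory. The recent "fast weights" approach (Ba et al., 2016), published concurrently with the initial report on differentiable plasticity (Miconi, 2016), uses fast-changing Hebbian weights (all connections having the same, non-trainable plasticity) and computes the activations iteratively at each time step (initializing each such loop with the output of the slow-weight, non-plastic network). The overall effect is to emphasize recently encountered patterns, making the network "attend to the recent past" (Ba et al., 2016).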
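
To make the roles of α and η concrete, here is a minimal NumPy sketch of one step of a layer with plastic connections. It assumes a formulation in which each connection's effective weight is a fixed part `w` plus `alpha` times a Hebbian trace, and the trace is a running average updated with rate `eta`. The function name `plastic_step`, the `tanh` nonlinearity, and all shapes and values are illustrative assumptions, not details given in these notes. Passing one scalar `alpha` corresponds to the homogeneous, non-trainable plasticity described above; passing a matrix gives each connection its own plasticity coefficient. Gradient-based training of `w` and `alpha` is not shown.

```python
import numpy as np

def plastic_step(x_prev, hebb, w, alpha, eta):
    """One forward step of a layer with plastic connections (illustrative sketch).

    x_prev: (n,) activations from the previous step
    hebb:   (n, n) Hebbian trace, typically reset at the start of an episode
    w:      (n, n) fixed (slow) weights
    alpha:  scalar or (n, n) plasticity coefficients
    eta:    rate at which the trace follows recent activity
    """
    x = np.tanh(x_prev @ (w + alpha * hebb))
    hebb = (1.0 - eta) * hebb + eta * np.outer(x_prev, x)
    return x, hebb

rng = np.random.default_rng(0)
n = 4
w = rng.normal(scale=0.1, size=(n, n))
x0 = rng.normal(size=n)

# Homogeneous plasticity: one shared, fixed alpha for every connection.
x_h, hebb_h = x0, np.zeros((n, n))
for _ in range(3):
    x_h, hebb_h = plastic_step(x_h, hebb_h, w, alpha=0.5, eta=0.1)

# Per-connection plasticity: alpha is a matrix that could be trained alongside w.
alpha_matrix = rng.normal(scale=0.5, size=(n, n))
x_p, hebb_p = x0, np.zeros((n, n))
for _ in range(3):
    x_p, hebb_p = plastic_step(x_p, hebb_p, w, alpha=alpha_matrix, eta=0.1)
```

Setting `alpha` to zero reduces the step to an ordinary non-plastic layer, which is a convenient sanity check when experimenting with this kind of sketch.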

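Conclusions:

The exploration of plastic networks began with evolutionary algorithms: designing neural networks with plastic connections has long been explored with evolutionary algorithms (see Soltoggio et al. 2017 for a recent review).
Meta-properties of neural structures that are normally attributed to evolution or a priori design are in fact amenable to gradient descent, hinting at a whole class of heretofore unimagined meta-learning algorithms.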

Paper: Differentiable plasticity: training plastic neural networks with backpropagation
