Two concepts to distinguish:
1) Online/incremental learning: using a small number of newly arrived data samples to update model parameters that have already been learned. The parameters here may include intermediate quantities at different moments (different moments of the data, that is, not different moments of the learned parameters). What defines this case is that the model parameters themselves change, through a new round of gradient computation and learning.
2) Using new data together with the already-learned model parameters to perform some new-input-aware action (e.g., recommendation or prediction).
The key to telling the two apart: whether the already-learned parameters undergo a permanent change, as opposed to merely producing an intermediate representation.
This paper focuses on two problems within the first notion, incremental learning:
1) Scalability: deep models and shallow models each have their strengths: one represents the data more powerfully, the other learns faster. The paper therefore uses an attention mechanism to dynamically weight the different layers (depth vs. shallowness shows up in the number of layers). Experiments show that early in training, layers closer to the input receive larger weights, while later in training the weights shift toward layers closer to the output. The paper's explanation: when samples are few, the first few layers already suffice to fit the data; as the sample size grows, the later layers are needed more, and by then the earlier layers have largely stabilized from training on the earlier small batches, so the later layers take on the bigger role. Note that the parameters of these different layers still belong to the model parameters of a single moment, since they are learned jointly. (A minimal code sketch of this layer attention follows the list.)
2) Sustainability: this is where the incremental-learning notion proper comes in, namely changing model parameters that have already been trained. If we use only the new data to update the already-learned parameters, the parameters will gradually drift toward the new samples (unless the historical data is kept around and trained on jointly at every update). The remedy is to fold the previously learned parameters into the update of the current parameters, and this is where the Fisher information matrix comes in. Note that in the paper's Fisher-information equations, the previous-moment model parameter θ is a constant that was recorded beforehand. (An EWC-style sketch of this penalty appears below, after the layer-attention sketch.)
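Below is a minimal sketch of the layer-attention idea from point 1), written in PyTorch. Everything here is an illustrative assumption: the class name AdaptiveDepthMLP, the per-layer classifier heads, and the free softmax logits are one plausible reading of "attention over depths", not the paper's exact IADM architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDepthMLP(nn.Module):
    """Hypothetical depth-adaptive network: every hidden layer gets its
    own prediction head, and learned attention weights decide how much
    each depth contributes to the final output."""

    def __init__(self, in_dim, hidden_dim, num_classes, num_layers=4):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * num_layers
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)
        )
        # One classifier head per hidden layer, so shallow and deep
        # representations can each vote on the prediction.
        self.heads = nn.ModuleList(
            nn.Linear(hidden_dim, num_classes) for _ in range(num_layers)
        )
        # Learnable attention logits over depths.
        self.depth_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, x):
        alphas = F.softmax(self.depth_logits, dim=0)  # per-depth weights
        h, out = x, 0.0
        for layer, head, a in zip(self.layers, self.heads, alphas):
            h = F.relu(layer(h))
            out = out + a * head(h)  # attention-weighted combination
        return out, alphas
```

Logging `alphas` during training should reproduce the qualitative observation above: early on the probability mass sits on the shallow depths and shifts toward deeper layers as more data streams in.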
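And a sketch of the sustainability idea from point 2): a plain EWC-style diagonal Fisher penalty that treats the previously learned parameters as recorded constants. This is the standard regularizer, not the paper's attention-based Fisher variant; the function names and the lam weight are assumptions.

```python
import torch

def estimate_fisher(model, data_loader, loss_fn):
    """Diagonal empirical Fisher estimate: the mean squared gradient of
    the loss over (a sample of) the data seen so far."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    num_batches = 0
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        num_batches += 1
    return {n: f / max(num_batches, 1) for n, f in fisher.items()}

def fisher_penalty(model, old_params, fisher, lam=1.0):
    """Quadratic penalty anchoring the current parameters to the recorded
    old ones, weighted per-parameter by Fisher information. Both
    old_params and fisher are constants: no gradients flow into them."""
    loss = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            loss = loss + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return 0.5 * lam * loss
```

After each training round one would record `old_params = {n: p.detach().clone() for n, p in model.named_parameters()}` together with the Fisher estimate; later updates then minimize `task_loss + fisher_penalty(model, old_params, fisher)`, which is exactly how the previous-moment θ enters the equations as a recorded constant rather than a trainable quantity.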
Title: Adaptive Deep Models for Incremental Learning: Considering Capacity Scalability and Sustainability
Authors: Yang Yang; Hui Xiong
Year: 2019
Keywords: incremental learning; online learning; Fisher information matrix; modeling the previously learned parameters so as to preserve earlier information
Abstract: Recent years have witnessed growing interests in developing deep models for incremental learning. However, existing approaches often utilize the fixed structure and online backpropagation for deep model optimization, which is difficult to be implemented for incremental data scenarios. Indeed, for streaming data, there are two main challenges for building deep incremental models. First, there is a requirement to develop deep incremental models with Capacity Scalability. In other words, the entire training data are not available before learning the task. It is a challenge to make the deep model structure scaling with streaming data for flexible model evolution and faster convergence. Second, since the stream data distribution usually changes in nature (concept drift), there is a constraint for Capacity Sustainability. That is, how to update the model while preserving previous knowledge for overcoming the catastrophic forgetting. To this end, in this paper, we develop an incremental adaptive deep model (IADM) for dealing with the above two capacity challenges in real-world incremental data scenarios. Specifically, IADM provides an extra attention model for the hidden layers, which aims to learn deep models with adaptive depth from streaming data and enables capacity scalability. Also, we address capacity sustainability by exploiting the attention based fisher information matrix, which can prevent the forgetting in consequence. Finally, we conduct extensive experiments on real-world data and show that IADM outperforms the state-of-the-art methods with a substantial margin. Moreover, we show that IADM has better capacity scalability and sustainability in incremental learning scenarios.