As the title suggests, this method is non-local, in contrast to local operations: a traditional CNN convolution is a small k×k kernel, and an RNN likewise operates within a temporal window. Both are local, considering only the influence of a neighborhood. This paper proposes a non-local operation that instead considers the influence of all positions on the current one.

Concretely, the response at each position is defined in terms of its relation to all other positions, which involves two functions: a pairwise function that computes the relation between two positions, and a unary function that maps a position's feature from one space to another.
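The generic non-local operation in the paper is y_i = (1/C(x)) Σ_j f(x_i, x_j) g(x_j), where f gives the pairwise relation and g is the unary embedding. Below is a minimal NumPy sketch of one common instantiation (embedded-Gaussian f with softmax normalization, implemented as learned linear projections followed by a dot-product attention over all positions); the function and weight names are illustrative, not from the paper's code.

```python
import numpy as np

def nonlocal_op(x, W_theta, W_phi, W_g):
    """Non-local operation over N positions (embedded-Gaussian variant).

    x:        (N, C) feature at every position (pixels, frames, ...)
    W_theta,
    W_phi:    (C, d) projections defining the pairwise function f
    W_g:      (C, d) projection defining the unary mapping g
    returns:  (N, d) response, a weighted sum over ALL positions
    """
    theta = x @ W_theta            # (N, d) embedding of the query position
    phi = x @ W_phi                # (N, d) embedding of every other position
    g = x @ W_g                    # (N, d) unary mapping g(x_j)

    f = theta @ phi.T              # (N, N) pairwise similarity f(x_i, x_j)
    # softmax over j acts as the normalization factor C(x)
    w = np.exp(f - f.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)

    return w @ g                   # each output mixes features from all positions
```

In the paper this operation is wrapped into a block with a residual connection (z = W_z y + x), so it can be inserted into an existing architecture without disturbing its initial behavior.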
| Title | Non-local Neural Networks |
| --- | --- |
| Authors | Xiaolong Wang |
| Year | 2017 |
| Keywords | nonlocal |
| Abstract | Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method [4] in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code will be made available. |