Title | Visual Pre-training for Navigation: What Can We Learn from Noise? |
Publication Type | Workshop Paper |
Year of Publication | 2022 |
Authors | Wang, Y., C-Y. Ko, and P. Agrawal |
Conference Name | Thirty-sixth Annual Conference on Neural Information Processing Systems, New Orleans, LA |
Workshop Name | Synthetic Data for Empowering ML Research Workshop & Self-Supervised Learning Workshop |
Abstract | In visual navigation, one powerful paradigm is to predict actions from observations
directly. Training such an end-to-end system allows representations that are useful
for downstream tasks to emerge automatically. However, the lack of inductive
bias makes this system data-hungry. We hypothesize that a sufficient representation
of the current view and the goal view for a navigation policy can be learned by
predicting the location and size of a crop of the current view that corresponds
to the goal. We further show that training such random crop prediction in a
self-supervised fashion purely on synthetic noise images transfers well to natural
home images. The learned representation can then be bootstrapped to learn a
navigation policy efficiently with little interaction data. The code is available at
https://yanweiw.github.io/noise2ptz/ |
URL | https://arxiv.org/abs/2207.00052 |
DOI | 10.48550/arXiv.2207.00052 |
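
The pretext task described in the abstract, predicting the location and size of a crop within a view, can be illustrated with a small data-generation sketch. This is not the authors' implementation (see the project page above for that). It assumes i.i.d. uniform RGB noise as a stand-in for the synthetic noise images, square crops, and a three-value target (normalized center x, center y, and side length). The function name `sample_crop_example` and its parameters are hypothetical.

```python
import numpy as np


def sample_crop_example(rng, image_size=128, min_frac=0.3, max_frac=0.9):
    """Return (view, crop, target) for one self-supervised example.

    Hypothetical helper: the noise model, crop shape, and target
    parameterization are assumptions made for illustration only.
    """
    # Stand-in for a synthetic noise image: i.i.d. uniform RGB noise.
    view = rng.random((image_size, image_size, 3), dtype=np.float32)

    # Sample the crop's side length as a fraction of the full view.
    side = int(round(rng.uniform(min_frac, max_frac) * image_size))

    # Sample the top-left corner so the crop stays inside the view.
    top = int(rng.integers(0, image_size - side + 1))
    left = int(rng.integers(0, image_size - side + 1))
    crop = view[top:top + side, left:left + side]

    # Regression target: crop center (x, y) and side length,
    # each normalized by the view size so it lies in [0, 1].
    target = np.array(
        [
            (left + side / 2) / image_size,
            (top + side / 2) / image_size,
            side / image_size,
        ],
        dtype=np.float32,
    )
    return view, crop, target


rng = np.random.default_rng(0)
view, crop, target = sample_crop_example(rng)
print(view.shape, crop.shape, target.shape)
```

In a training loop, an encoder would take `view` and a resized `crop` as its two inputs and regress `target`. No labels or environment interaction are needed, because the target comes from the sampling step itself.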