Faculty of Information Technology, BUT

Publication Details

Deep Learning on Small Datasets using Online Image Search

KOLÁŘ Martin, HRADIŠ Michal and ZEMČÍK Pavel. Deep Learning on Small Datasets using Online Image Search. In: Proceedings of 32nd Spring Conference on Computer Graphics. Bratislava: Comenius University in Bratislava, 2016, pp. 1-7. ISBN 978-1-4503-3693-2. ISSN 1335-5694. Available from: http://dl.acm.org/citation.cfm?id=2948633
Czech title
Hluboké Učení na Malých Datasetech s použitím Online Obrazového Vyhledávání
conference paper
convolutional neural network, deep learning, image classification, reinforcement learning
The proposed method learns visual categories from fewer images than previous approaches. It modifies the pseudolabel method, which augments labelled training images with unlabelled images, so that it can handle both labelled training images and queried images, which are likely to belong to the desired class. The modification concerns the example weighting and selection processes.
The presented method adapts the pseudolabel approach to allow the use of web-scale datasets of millions of images. The results are demonstrated on a toy problem devised from the SUN 397 dataset, and on the full SUN 397 dataset expanded with images gathered from Google’s image search without human intervention.
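
The idea of selecting and weighting queried images can be illustrated with a minimal sketch. It assumes that the current model produces a class-probability vector for each image returned by a search for a given class. The threshold rule, the use of the confidence as a loss weight, and the name `select_queried_images` are hypothetical illustrations, not the selection and weighting mechanisms of the paper, which are not specified on this page.

```python
from typing import List, Tuple


def select_queried_images(
    probs: List[List[float]],
    queried_class: int,
    threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (image_index, weight) pairs for the queried images to keep.

    probs[i] is the current model's class-probability vector for image i,
    which was returned by an image search for `queried_class`.
    """
    selected = []
    for i, p in enumerate(probs):
        confidence = p[queried_class]
        # Hypothetical selection rule: keep the image only if the model
        # agrees with the query strongly enough.
        if confidence >= threshold:
            # Hypothetical weighting rule: use the confidence as the
            # per-example loss weight.
            selected.append((i, confidence))
    return selected


# Three images returned by a query for class 0 in a two-class toy problem.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
print(select_queried_images(probs, queried_class=0))
```

In this toy run the second image is dropped because the model assigns it to the other class, while the remaining two are kept with weights equal to their confidence.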
This paper tackles the important unsolved problem of training deep models with small amounts of annotated data. We propose a
semi-supervised, self-training bootstrap approach to deep learning that retrieves and uses additional images from internet image search.
We adapt the pseudolabel method proposed by Dong-Hyun Lee in 2013, previously used on the elementary MNIST handwritten
digit classification task. We show that by suitable modifications to its example weighting and selection mechanisms it can be adapted
to general image classification tasks supported by online image search.
The proposed approach does not require any human supervision, it is practical and efficient, and it actively avoids overtraining.
The usefulness of the proposed method is demonstrated on the SUN 397 dataset with only 50 training images per category. When
exploiting results of Google's Image Search, we achieve a significant improvement, with a classification accuracy of 51%, as
opposed to 39% without our method.
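
As background, the pseudolabel method of Lee (2013) weights the loss on pseudolabelled examples by a coefficient that is ramped up during training. The sketch below shows one such piecewise-linear schedule. The function name and the default values (`t1=100`, `t2=600`, `alpha_final=3.0`) are assumptions based on a reading of that earlier work; the modified weighting used in this paper is not given on this page and may differ.

```python
def pseudolabel_weight(
    t: int,
    t1: int = 100,
    t2: int = 600,
    alpha_final: float = 3.0,
) -> float:
    """Weight of the pseudolabel loss term at training step t (assumed schedule)."""
    if t < t1:
        # Early training: rely on labelled data only.
        return 0.0
    if t < t2:
        # Linear ramp from 0 up to alpha_final between t1 and t2.
        return alpha_final * (t - t1) / (t2 - t1)
    # Late training: pseudolabelled examples at full weight.
    return alpha_final


for step in (0, 350, 600):
    print(step, pseudolabel_weight(step))
```

Starting the weight at zero keeps early, unreliable predictions from dominating training; the total loss would then be the labelled loss plus this weight times the loss on pseudolabelled examples.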
Proceeding of Spring Conference on Computer Graphics, vol. 2016, no. 32, ISSN 1335-5694
Proceedings of 32nd Spring Conference on Computer Graphics
Spring Conference on Computer Graphics 2016, Smolenice, SK
Comenius University in Bratislava
Bratislava, SK
   author = "Martin Kol\'{a}\v{r} and Michal Hradi\v{s} and Pavel Zem\v{c}\'{i}k",
   title = "Deep Learning on Small Datasets using Online Image Search",
   pages = "1--7",
   booktitle = "Proceedings of 32nd Spring Conference on Computer Graphics",
   journal = "Proceeding of Spring Conference on Computer Graphics",
   volume = 2016,
   number = 32,
   year = 2016,
   location = "Bratislava, SK",
   publisher = "Comenius University in Bratislava",
   ISBN = "978-1-4503-3693-2",
   ISSN = "1335-5694",
   doi = "10.1145/2948628.2948633",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11143"