This page hosts datasets used to evaluate methods for geo-aware image classification.

  • source set (181 MB): A set of one million Flickr images associated with user tags and geo tags, used to extract variants of tag features for (unlabeled) images.
  • tag vocabulary: A vocabulary of 2k tags, each corresponding to a specific dimension in a tag feature vector.
  • ground-truth data: A geo-tagged subset of the NUS-WIDE dataset, split into three sets
    • test set (45 MB): A test set of 27,401 images, annotated with 81 visual concepts
    • train-dev set (47 MB): A development set of 28,821 images, used for training meta classifiers
    • train-val set (20 MB): A validation set of 12,352 images, used for tuning hyper-parameters
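
To illustrate how a tag vocabulary can define the dimensions of a tag feature vector, here is a minimal Python sketch. It uses a plain tag-count encoding, which is a simplifying assumption and not necessarily one of the tag feature variants extracted from the source set. The function names `build_index` and `tag_feature` and the three-tag vocabulary are hypothetical; the actual vocabulary has about 2k tags, and its file format is not described here.

```python
from collections import Counter


def build_index(vocabulary):
    """Map each tag to its dimension, i.e. its position in the vocabulary."""
    return {tag: dim for dim, tag in enumerate(vocabulary)}


def tag_feature(tags, index):
    """Return a count vector with one dimension per vocabulary tag."""
    vector = [0.0] * len(index)
    for tag, count in Counter(tags).items():
        dim = index.get(tag)
        if dim is not None:  # tags outside the vocabulary are ignored
            vector[dim] = float(count)
    return vector


# Hypothetical three-tag vocabulary, for illustration only.
vocabulary = ["beach", "sunset", "dog"]
index = build_index(vocabulary)
print(tag_feature(["sunset", "beach", "sunset", "cat"], index))  # [1.0, 2.0, 0.0]
```

Because the dimension of each tag is fixed by its position in the vocabulary, vectors computed for different images are directly comparable as long as the same vocabulary order is used.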


Shuai Liao, Xirong Li, Heng Tao Shen, Yang Yang, Xiaoyong Du: Tag Features for Geo-Aware Image Classification. In: IEEE Transactions on Multimedia (TMM), 17 (7), pp. 1058-1067, 2015.