Download Features
SIFT features

We currently provide densely sampled SIFT[1] features, both as raw SIFT descriptors and as quantized codewords. The spatial coordinates of each descriptor/codeword are also included. The quantized codewords are suitable for Bag of Words representations[2][3]. The features are packaged as Matlab files and can be freely downloaded (no sign-in is required). Details are as follows:

  • Each image is resized so that its longer side is at most 300 pixels. SIFT descriptors are computed on overlapping 20x20 patches with a spacing of 10 pixels. Images are then downsized (to 1/2 and then 1/4 of the side length) and more descriptors are computed at each size. We use the VLFeat[4] implementation of dense SIFT (version 0.9.4.1).
  • We perform k-means clustering on a random subset of 10 million SIFT descriptors to form a visual vocabulary of 1000 visual words. Each SIFT descriptor is quantized into a visual word using the nearest cluster center.
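The quantization step described above can be sketched as follows. This is a minimal illustration, not the code used to produce the released features: the function name `quantize`, the toy 2-D vocabulary, and the use of squared Euclidean distance are assumptions (the page only says "nearest cluster center"). Real descriptors are 128-dimensional and the real vocabulary has 1000 centers.

```python
import math


def quantize(descriptor, centers):
    """Return the index of the center closest to `descriptor`.

    Assumption: "nearest" means smallest squared Euclidean distance.
    """
    best_index, best_dist = -1, math.inf
    for index, center in enumerate(centers):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, center))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


# Toy vocabulary of 3 "visual words" in 2-D (hypothetical values).
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8)]

# Each descriptor is replaced by the index of its nearest center.
words = [quantize(d, centers) for d in descriptors]
print(words)
```

With the full vocabulary, the returned index plays the role of the word field in the downloadable codeword files.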
    References:
    1. David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004.
    2. L. Fei-Fei and P. Perona, A Bayesian Hierarchical Model for Learning Natural Scene Categories. IEEE Comp. Vis. Patt. Recog. 2005.
    3. Svetlana Lazebnik, Cordelia Schmid and Jean Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. IEEE Comp. Vis. Patt. Recog. 2006.
    4. A. Vedaldi and B. Fulkerson. VLFeat: An Open and Portable Library of Computer Vision Algorithms. 2008. http://www.vlfeat.org
    How to download?
    1. We have not yet released SIFT features for all synsets. To see which synsets have SIFT features available, use the API:
      • http://www.image-net.org/api/text/imagenet.sbow.obtain_synset_list

      You can click here to obtain the synset names.

    2. When you browse ImageNet from the Explore page, you can download the bag of visual words (sbow) feature of a synset if there is an icon "Download BoW Feature" below the image view panel.
    3. You can also download the raw SIFT descriptors using the following API:
      • http://www.image-net.org/api/download/imagenet.sift.synset?wnid=[wnid]

      The API will return a Matlab (.mat) file. In the Matlab file, each descriptor has 5 fields: x, y, norm, scale, desc. The scale field indicates the scale at which the descriptor is computed. It is either 0 (finest), 1 (1/2 downsized), or 2 (1/4 downsized). The desc field is a 128-dimensional, L2-normalized float vector.

    4. You can download the bag of visual words (sbow) feature for a given synset using the API:
      • http://www.image-net.org/api/download/imagenet.sbow.synset?wnid=[wnid]

      The API will return a Matlab (.mat) file. In the Matlab file, each descriptor has 5 fields: x, y, norm, scale, word. The word field is the index of the cluster center, i.e., an integer between 0 and 999.
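As an illustration of how the word field can feed a Bag of Words representation, the sketch below counts visual words into a 1000-bin histogram. It assumes the word values for one image have already been read from the .mat file into a plain Python list (the exact struct layout inside the file is not described here). The function name `bow_histogram` and the choice to normalize the counts to sum to 1 are assumptions, not part of the released data.

```python
def bow_histogram(words, vocabulary_size=1000):
    """Count visual-word indices and return a histogram that sums to 1.

    Assumption: `words` holds integers in [0, vocabulary_size), matching
    the 0..999 range of the word field.
    """
    counts = [0] * vocabulary_size
    for word in words:
        if not 0 <= word < vocabulary_size:
            raise ValueError(f"word index out of range: {word}")
        counts[word] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * vocabulary_size
    return [count / total for count in counts]


# Hypothetical word indices for the codewords of a single image.
example_words = [3, 3, 999, 0]
histogram = bow_histogram(example_words)
print(histogram[3], histogram[999], histogram[0])
```

The x, y, and scale fields are ignored here; a spatial pyramid[3] would instead build one such histogram per image region.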
    Code for computing the features
    To learn more about downloading using the HTTP protocol, please refer to the API documentation.
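The download URLs listed above differ only in the endpoint name and the wnid, so they can be assembled programmatically. The following is a minimal sketch: the helper name `feature_url`, the `ENDPOINTS` table, and the example wnid are illustrative assumptions. Only the URL patterns themselves come from this page.

```python
from urllib.parse import urlencode

# URL pieces taken from the download instructions above.
API_BASE = "http://www.image-net.org/api/download/"
ENDPOINTS = {
    "sift": "imagenet.sift.synset",  # raw SIFT descriptors
    "sbow": "imagenet.sbow.synset",  # bag of visual words codewords
}


def feature_url(kind, wnid):
    """Build the download URL for one synset; `kind` is 'sift' or 'sbow'."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown feature kind: {kind!r}")
    return API_BASE + ENDPOINTS[kind] + "?" + urlencode({"wnid": wnid})


# Example wnid chosen for illustration only.
url = feature_url("sbow", "n02084071")
print(url)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlretrieve(url, "features.mat")`, to save the returned Matlab file locally.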