RecordIO implements a file format for a sequence of records. We recommend storing images as records and packing them together. The benefits include:
We provide the im2rec tool so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how.
Download the data. You don't need to resize the images manually; im2rec can resize them automatically. For details, see “Extension: Using Multiple Labels for a Single Image,” later in this topic.
After you download the data, you need to make an image list file. The format is:
integer_image_index \t label_index \t path_to_image
Typically, a script takes the names of all of the images, shuffles them, and then splits them into two lists: a training filename list and a testing filename list. Write each list in the format shown above.
This is an example file:
95099	464	n04467665_17283.JPEG
10025081	412	ILSVRC2010_val_00025082.JPEG
74181	789	n01915811_2739.JPEG
10035553	859	ILSVRC2010_val_00035554.JPEG
10048727	929	ILSVRC2010_val_00048728.JPEG
94028	924	n01980166_4956.JPEG
1080682	650	n11807979_571.JPEG
972457	633	n07723039_1627.JPEG
7534	11	n01630670_4486.JPEG
1191261	249	n12407079_5106.JPEG
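The shuffle-and-split step described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own script: the function names (make_image_lists, format_list, write_list), the train_fraction and seed parameters, and the choice to assign each index before shuffling are all assumptions made for the example.

```python
import random


def make_image_lists(image_labels, train_fraction=0.9, seed=0):
    """Shuffle (path, label) pairs and split them into train and test rows.

    Each returned row is (integer_image_index, label_index, path_to_image).
    """
    # Assign each image its index before shuffling, so the index identifies
    # the image rather than its position in the shuffled list (an assumption
    # of this sketch).
    rows = [(i, label, path) for i, (path, label) in enumerate(image_labels)]
    random.Random(seed).shuffle(rows)
    split = int(len(rows) * train_fraction)
    return rows[:split], rows[split:]


def format_list(rows):
    """Render rows as tab-separated lines, one image per line."""
    return "".join(f"{index}\t{label}\t{path}\n" for index, label, path in rows)


def write_list(rows, list_path):
    """Write the rows to an image list file."""
    with open(list_path, "w") as f:
        f.write(format_list(rows))
```

Calling make_image_lists on your (path, label) pairs and then write_list once per returned list would produce one training list file and one testing list file.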
To generate a binary image, use im2rec in the tool folder. im2rec takes the path of the _image list file_ you generated, the _root path_ of the images, and the _output file path_ as input. This process usually takes several hours, so be patient.
Sample command:
./bin/im2rec image.lst image_root_dir output.bin resize=256
For more details, run ./bin/im2rec.
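If you drive the conversion from a script, the sample command can be assembled and launched from Python. The sketch below is only an illustration: build_im2rec_command and run_im2rec are hypothetical helper names, and the code assumes the binary lives at ./bin/im2rec and accepts key=value options exactly as in the sample command. Only the resize and label_width options that appear in this topic are covered.

```python
import subprocess


def build_im2rec_command(list_path, image_root, output_path,
                         resize=None, label_width=None,
                         im2rec="./bin/im2rec"):
    """Build the argument list for an im2rec call like the sample command."""
    cmd = [im2rec, list_path, image_root, output_path]
    # Options are passed as key=value tokens, mirroring the sample command.
    if resize is not None:
        cmd.append(f"resize={resize}")
    if label_width is not None:
        cmd.append(f"label_width={label_width}")
    return cmd


def run_im2rec(*args, **kwargs):
    """Run im2rec and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_im2rec_command(*args, **kwargs), check=True)
```

Keeping the command construction separate from the call makes it possible to log or inspect the exact arguments before starting a conversion that may run for hours.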
The im2rec tool and mx.io.ImageRecordIter have multi-label support for a single image. For example, if you have four labels for a single image, you can use the following procedure to use the RecordIO tools.
1. Write the image list file in the following format:
integer_image_index \t label_1 \t label_2 \t label_3 \t label_4 \t path_to_image
2. Run im2rec, adding label_width=4 to the command arguments, for example:
./bin/im2rec image.lst image_root_dir output.bin resize=256 label_width=4
3. When you create the iterator, set label_width=4 and path_imglist=<<The PATH TO YOUR image.lst>>, for example:
import mxnet as mx
dataiter = mx.io.ImageRecordIter(
    path_imgrec="data/cifar/train.rec",
    data_shape=(3, 28, 28),
    path_imglist="data/cifar/image.lst",
    label_width=4,
)
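The multi-label list format described above can be written and checked with a short Python sketch. The helper names (format_multilabel_line, parse_list_line) are hypothetical, and parsing labels as floats is an assumption of this example rather than something this topic specifies. The check on the field count is there to catch a list file whose rows do not match the label_width you pass to the tools.

```python
def format_multilabel_line(index, labels, path):
    """Render one list-file line: index, then each label, then the image path."""
    fields = [str(index)] + [str(label) for label in labels] + [path]
    return "\t".join(fields)


def parse_list_line(line, label_width=1):
    """Split one list-file line into (index, labels, path).

    Raises ValueError if the number of fields does not equal
    label_width + 2 (one index, label_width labels, one path).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != label_width + 2:
        raise ValueError(
            f"expected {label_width + 2} tab-separated fields, got {len(fields)}"
        )
    index = int(fields[0])
    # Labels are parsed as floats so non-integer labels also work
    # (an assumption of this sketch).
    labels = [float(value) for value in fields[1:1 + label_width]]
    return index, labels, fields[-1]
```

Running parse_list_line over every line of image.lst with label_width=4 before starting im2rec is a cheap way to find malformed rows early.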