This document summarizes supporting functions and iterators to read and process images, provided in:

.. autosummary::
    :nosignatures:

    mxnet.image
.. currentmodule:: mxnet

.. autosummary::
    :nosignatures:

    image.imdecode
    image.scale_down
    image.resize_short
    image.fixed_crop
    image.random_crop
    image.center_crop
    image.color_normalize
    image.random_size_crop
Iterators support loading images from binary RecordIO files and from raw image files.
.. autosummary::
    :nosignatures:

    image.ImageIter
>>> import mxnet as mx
>>> data_iter = mx.image.ImageIter(batch_size=4, data_shape=(3, 224, 224), label_width=1,
...                                path_imglist='data/custom.lst')
>>> data_iter.reset()
>>> for data in data_iter:
...     d = data.data[0]
...     print(d.shape)
>>> # we can apply lots of augmentations as well
>>> data_iter = mx.image.ImageIter(4, (3, 224, 224), path_imglist='data/custom.lst',
...                                rand_crop=True, rand_resize=True, rand_mirror=True, mean=True,
...                                brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1,
...                                pca_noise=0.1, rand_gray=0.05)
>>> data = data_iter.next()
>>> # specifying augmenters manually is also supported
>>> data_iter = mx.image.ImageIter(32, (3, 224, 224), path_rec='data/caltech.rec',
...                                path_imgidx='data/caltech.idx', shuffle=True,
...                                aug_list=[mx.image.HorizontalFlipAug(0.5),
...                                          mx.image.ColorJitterAug(0.1, 0.1, 0.1)])
We use a helper function to initialize augmenters:
.. currentmodule:: mxnet

.. autosummary::
    :nosignatures:

    image.CreateAugmenter
A list of supported augmenters:
.. autosummary::
    :nosignatures:

    image.Augmenter
    image.SequentialAug
    image.RandomOrderAug
    image.ResizeAug
    image.ForceResizeAug
    image.RandomCropAug
    image.RandomSizedCropAug
    image.CenterCropAug
    image.BrightnessJitterAug
    image.ContrastJitterAug
    image.SaturationJitterAug
    image.HueJitterAug
    image.ColorJitterAug
    image.LightingAug
    image.ColorNormalizeAug
    image.RandomGrayAug
    image.HorizontalFlipAug
    image.CastAug
Similar to ``ImageIter``, ``ImageDetIter`` is designed for object detection tasks.
.. autosummary::
    :nosignatures:

    image.ImageDetIter
>>> import mxnet as mx
>>> data_iter = mx.image.ImageDetIter(batch_size=4, data_shape=(3, 224, 224),
...                                   path_imglist='data/train.lst')
>>> data_iter.reset()
>>> for data in data_iter:
...     d = data.data[0]
...     l = data.label[0]
...     print(d.shape)
...     print(l.shape)
Unlike image classification, which uses a fixed ``label_width``, the number of objects in object detection may vary from image to image. Detection labels therefore use a special format. Usually the lst file generated by ``tools/im2rec.py`` is a list of entries of the form::

    index_0    label_0    image_path_0
    index_1    label_1    image_path_1
Here ``label_N`` is a number or a fixed-width vector. The label format used in object detection is instead a variable-length vector::

    A  B  [extra header]  [(object0), (object1), ... (objectN)]
Here ``A`` is the width of the header (2 + the length of the extra header) and ``B`` is the width of each object. The extra header is optional and is used to insert helper information such as ``(width, height)``. Each object is usually 5 or 6 numbers describing the object properties, for example ``[id, xmin, ymin, xmax, ymax, difficulty]``. Putting it all together, we have a lst file for object detection::

    0  4  5  640  480  1  0.1  0.2  0.8  0.9  2  0.5  0.3  0.6  0.8  data/xxx.jpg
    1  4  5  480  640  3  0.05  0.16  0.75  0.9  data/yyy.jpg
    2  4  5  500  600  2  0.6  0.1  0.7  0.5  0  0.1  0.3  0.2  0.4  3  0.25  0.25  0.3  0.3  data/zzz.jpg
    ...
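The label layout described above can be illustrated with a small pure-Python sketch that packs and unpacks one lst entry. The function names ``format_det_line`` and ``parse_det_line`` are hypothetical helpers written for this page and are not part of ``mxnet.image``. The sketch assumes that fields are separated by whitespace, that the image path contains no whitespace, and that every object has the same width.

```python
def format_det_line(index, extra_header, objects, path):
    """Build one detection lst entry: index, A, B, extra header, objects, path.

    Hypothetical helper for illustration; not an mxnet API.
    """
    if not objects:
        raise ValueError("at least one object is needed to determine B")
    object_width = len(objects[0])                      # B
    if any(len(obj) != object_width for obj in objects):
        raise ValueError("all objects must have the same width")
    header_width = 2 + len(extra_header)                # A
    label = [header_width, object_width] + list(extra_header)
    for obj in objects:
        label.extend(obj)
    # Tab-separated fields; numbers are written in compact form.
    return "\t".join([str(index)] + [f"{v:g}" for v in label] + [path])


def parse_det_line(line):
    """Split one detection lst entry into (index, extra_header, objects, path)."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least: index A B path")
    index = int(fields[0])
    path = fields[-1]
    label = [float(x) for x in fields[1:-1]]
    header_width = int(label[0])                        # A
    object_width = int(label[1])                        # B
    if header_width < 2 or object_width < 1:
        raise ValueError("invalid header: A must be >= 2 and B >= 1")
    extra_header = label[2:header_width]
    body = label[header_width:]
    if len(body) % object_width != 0:
        raise ValueError("label length is not a multiple of the object width")
    # Cut the flat tail into chunks of B numbers, one chunk per object.
    objects = [body[i:i + object_width] for i in range(0, len(body), object_width)]
    return index, extra_header, objects, path


if __name__ == "__main__":
    entry = "0 4 5 640 480 1 0.1 0.2 0.8 0.9 2 0.5 0.3 0.6 0.8 data/xxx.jpg"
    index, extra_header, objects, path = parse_det_line(entry)
    print(index, extra_header, len(objects), path)
```

Storing ``A`` and ``B`` at the front of the vector is what lets a reader recover the object boundaries without knowing the object count in advance.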
A helper function initializes augmenters for object detection tasks:
.. autosummary::
    :nosignatures:

    image.CreateDetAugmenter
Because detection is sensitive to object localization, any modification to the image that shifts object locations requires a matching correction to the label. A list of augmenters specific to object detection is provided:
.. autosummary::
    :nosignatures:

    image.DetBorrowAug
    image.DetRandomSelectAug
    image.DetHorizontalFlipAug
    image.DetRandomCropAug
    image.DetRandomPadAug
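To make the need for label correction concrete, the following pure-Python sketch mirrors bounding boxes to match a horizontally flipped image. It is an illustration of the idea only, not the implementation used by ``DetHorizontalFlipAug``. The function name ``flip_boxes_horizontally`` is hypothetical, and the sketch assumes each object is laid out as ``[id, xmin, ymin, xmax, ymax, ...]`` with coordinates normalized to the range 0 to 1.

```python
def flip_boxes_horizontally(objects):
    """Return new object rows whose boxes match a left-right flipped image.

    Hypothetical helper for illustration; assumes normalized coordinates
    and the layout [id, xmin, ymin, xmax, ymax, ...extra fields].
    """
    flipped = []
    for obj in objects:
        if len(obj) < 5:
            raise ValueError("each object needs at least id and four coordinates")
        cls_id, xmin, ymin, xmax, ymax = obj[:5]
        extra = list(obj[5:])  # e.g. a difficulty flag, copied unchanged
        # The old right edge becomes the new left edge and vice versa,
        # so xmin <= xmax still holds after the flip.
        flipped.append([cls_id, 1.0 - xmax, ymin, 1.0 - xmin, ymax] + extra)
    return flipped


if __name__ == "__main__":
    boxes = [[1, 0.1, 0.2, 0.8, 0.9, 0]]
    print(flip_boxes_horizontally(boxes))
```

Flipping the image without applying this correction would leave every box pointing at the mirrored position of its object, which is why detection augmenters update the image and the label together.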
.. automodule:: mxnet.image

.. autoclass:: mxnet.image.ImageIter
    :members:

.. automethod:: mxnet.image.imdecode
.. automethod:: mxnet.image.scale_down
.. automethod:: mxnet.image.resize_short
.. automethod:: mxnet.image.fixed_crop
.. automethod:: mxnet.image.random_crop
.. automethod:: mxnet.image.center_crop
.. automethod:: mxnet.image.color_normalize
.. automethod:: mxnet.image.random_size_crop

.. autoclass:: mxnet.image.Augmenter
    :members:

.. autoclass:: mxnet.image.ResizeAug
.. autoclass:: mxnet.image.ForceResizeAug
.. autoclass:: mxnet.image.RandomCropAug
.. autoclass:: mxnet.image.RandomSizedCropAug
.. autoclass:: mxnet.image.CenterCropAug
.. autoclass:: mxnet.image.SequentialAug
.. autoclass:: mxnet.image.RandomOrderAug
.. autoclass:: mxnet.image.BrightnessJitterAug
.. autoclass:: mxnet.image.ContrastJitterAug
.. autoclass:: mxnet.image.SaturationJitterAug
.. autoclass:: mxnet.image.HueJitterAug
.. autoclass:: mxnet.image.ColorJitterAug
.. autoclass:: mxnet.image.LightingAug
.. autoclass:: mxnet.image.ColorNormalizeAug
.. autoclass:: mxnet.image.RandomGrayAug
.. autoclass:: mxnet.image.HorizontalFlipAug
.. autoclass:: mxnet.image.CastAug

.. automethod:: mxnet.image.CreateAugmenter

.. autoclass:: mxnet.image.ImageDetIter
    :members:

.. autoclass:: mxnet.image.DetAugmenter
    :members:

.. autoclass:: mxnet.image.DetBorrowAug
.. autoclass:: mxnet.image.DetRandomSelectAug
.. autoclass:: mxnet.image.DetHorizontalFlipAug
.. autoclass:: mxnet.image.DetRandomCropAug
.. autoclass:: mxnet.image.DetRandomPadAug

.. automethod:: mxnet.image.CreateDetAugmenter