# Image API
## Overview
This document summarizes the supporting functions and iterators for reading and
processing images that are provided in
```eval_rst
.. autosummary::
:nosignatures:
mxnet.image
```
## Image processing functions
```eval_rst
.. currentmodule:: mxnet
.. autosummary::
:nosignatures:
image.imdecode
image.scale_down
image.resize_short
image.fixed_crop
image.random_crop
image.center_crop
image.color_normalize
image.random_size_crop
```
## Image iterators
Iterators support loading images from binary `Record IO` files and from raw image files.
```eval_rst
.. autosummary::
:nosignatures:
image.ImageIter
```
```python
>>> import mxnet as mx
>>> data_iter = mx.image.ImageIter(batch_size=4, data_shape=(3, 224, 224), label_width=1,
...                                path_imglist='data/custom.lst')
>>> data_iter.reset()
>>> for data in data_iter:
...     d = data.data[0]
...     print(d.shape)
>>> # many augmentations can be applied as well
>>> data_iter = mx.image.ImageIter(4, (3, 224, 224), path_imglist='data/custom.lst',
...                                rand_crop=True, rand_resize=True, rand_mirror=True, mean=True,
...                                brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1,
...                                pca_noise=0.1, rand_gray=0.05)
>>> data = data_iter.next()
>>> # specifying augmenters manually is also supported
>>> data_iter = mx.image.ImageIter(32, (3, 224, 224), path_rec='data/caltech.rec',
...                                path_imgidx='data/caltech.idx', shuffle=True,
...                                aug_list=[mx.image.HorizontalFlipAug(0.5),
...                                          mx.image.ColorJitterAug(0.1, 0.1, 0.1)])
```
A helper function is used to initialize augmenters:
```eval_rst
.. currentmodule:: mxnet
.. autosummary::
:nosignatures:
image.CreateAugmenter
```
The following augmenters are supported:
```eval_rst
.. autosummary::
:nosignatures:
image.Augmenter
image.SequentialAug
image.RandomOrderAug
image.ResizeAug
image.ForceResizeAug
image.RandomCropAug
image.RandomSizedCropAug
image.CenterCropAug
image.BrightnessJitterAug
image.ContrastJitterAug
image.SaturationJitterAug
image.HueJitterAug
image.ColorJitterAug
image.LightingAug
image.ColorNormalizeAug
image.RandomGrayAug
image.HorizontalFlipAug
image.CastAug
```
Similar to `ImageIter`, `ImageDetIter` is designed for `Object Detection` tasks.
```eval_rst
.. autosummary::
:nosignatures:
image.ImageDetIter
```
```python
>>> data_iter = mx.image.ImageDetIter(batch_size=4, data_shape=(3, 224, 224),
...                                   path_imglist='data/train.lst')
>>> data_iter.reset()
>>> for data in data_iter:
...     d = data.data[0]
...     l = data.label[0]
...     print(d.shape)
...     print(l.shape)
```
Unlike object classification, which uses a fixed `label_width`, the number of
objects may vary from image to image, so object detection labels use a special format.
Usually, the `lst` file generated by `tools/im2rec.py` is a list of entries of the form
```
index_0 label_0 image_path_0
index_1 label_1 image_path_1
```
where `label_N` is a number or a fixed-width vector.
The label format used in object detection is a variable-length vector:
```
A B [extra header] [(object0), (object1), ... (objectN)]
```
where `A` is the width of the header (2 + the length of the extra header) and `B` is the width of each object.
The extra header is optional and is used for inserting helper information such as (width, height).
Each object is usually described by 5 or 6 numbers giving its properties, for example
`[id, xmin, ymin, xmax, ymax, difficulty]`.
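The layout described above can be sketched in plain Python. This is only an illustration of the `A B [extra header] [objects]` format, not part of `mxnet.image`; the function name `build_det_label` and its parameters are hypothetical.

```python
def build_det_label(objects, extra_header=()):
    """Flatten per-object rows into the `A B [extra header] [objects...]` layout.

    `build_det_label` is a hypothetical helper used for illustration only.
    """
    if not objects:
        raise ValueError("at least one object is required")
    obj_width = len(objects[0])                      # B: width of each object
    if any(len(obj) != obj_width for obj in objects):
        raise ValueError("all objects must have the same width")
    header_width = 2 + len(extra_header)             # A: 2 + length of extra header
    label = [header_width, obj_width]
    label.extend(extra_header)                       # optional helper info, e.g. (width, height)
    for obj in objects:
        label.extend(obj)                            # objects are stored back to back
    return label


label = build_det_label(
    objects=[[1, 0.1, 0.2, 0.8, 0.9], [2, 0.5, 0.3, 0.6, 0.8]],
    extra_header=(640, 480),
)
print(label)
```

With two 5-number objects and a `(640, 480)` extra header, the resulting vector starts with `4 5 640 480`, because the header is four entries wide and each object is five entries wide.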
Putting it all together, we have a `lst` file for object detection:
```
0 4 5 640 480 1 0.1 0.2 0.8 0.9 2 0.5 0.3 0.6 0.8 data/xxx.jpg
1 4 5 480 640 3 0.05 0.16 0.75 0.9 data/yyy.jpg
2 4 5 500 600 2 0.6 0.1 0.7 0.5 0 0.1 0.3 0.2 0.4 3 0.25 0.25 0.3 0.3 data/zzz.jpg
...
```
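To make the format concrete, the following sketch splits one line of such a file back into its parts. It is a minimal illustration rather than the parser MXNet itself uses: the name `parse_det_lst_line` is hypothetical, and it assumes whitespace-separated fields and an image path without spaces.

```python
def parse_det_lst_line(line):
    """Split one detection `lst` line into index, extra header, objects, and path.

    Hypothetical helper; assumes whitespace-separated fields and a path
    that contains no spaces.
    """
    fields = line.split()
    index = int(fields[0])
    path = fields[-1]
    label = [float(x) for x in fields[1:-1]]
    header_width = int(label[0])            # A: includes A and B themselves
    obj_width = int(label[1])               # B: numbers per object
    extra_header = label[2:header_width]    # e.g. (width, height)
    body = label[header_width:]
    if obj_width <= 0 or len(body) % obj_width != 0:
        raise ValueError("label length does not match the object width")
    # Cut the remaining numbers into fixed-width object rows.
    objects = [body[i:i + obj_width] for i in range(0, len(body), obj_width)]
    return index, extra_header, objects, path


index, extra_header, objects, path = parse_det_lst_line(
    "1 4 5 480 640 3 0.05 0.16 0.75 0.9 data/yyy.jpg"
)
print(index, extra_header, len(objects), path)
```

The number of objects is never stored explicitly; it follows from the length of the vector after the header divided by `B`.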
A helper function initializes augmenters for the `Object detection` task:
```eval_rst
.. autosummary::
:nosignatures:
image.CreateDetAugmenter
```
Since the `Detection` task is sensitive to object localization, any modification
to the image that introduces a localization shift requires a corresponding correction
to the label. A list of augmenters specific to `Object detection` is provided:
```eval_rst
.. autosummary::
:nosignatures:
image.DetBorrowAug
image.DetRandomSelectAug
image.DetHorizontalFlipAug
image.DetRandomCropAug
image.DetRandomPadAug
```
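As an illustration of why labels need correcting, the sketch below mirrors bounding boxes to match a horizontally flipped image. It is not the implementation of `DetHorizontalFlipAug`; the function name `flip_boxes_horizontally` is hypothetical, and it assumes each object is laid out as `[id, xmin, ymin, xmax, ymax, ...]` with coordinates normalized to the range 0 to 1.

```python
def flip_boxes_horizontally(objects):
    """Mirror normalized boxes about the vertical axis of the image.

    Hypothetical helper; assumes rows of [id, xmin, ymin, xmax, ymax, ...]
    with x coordinates normalized to [0, 1].
    """
    flipped = []
    for obj in objects:
        obj = list(obj)                  # copy so the input is left untouched
        xmin, xmax = obj[1], obj[3]
        obj[1] = 1.0 - xmax              # the old right edge becomes the new left edge
        obj[3] = 1.0 - xmin              # the old left edge becomes the new right edge
        flipped.append(obj)
    return flipped


boxes = [[1, 0.1, 0.2, 0.8, 0.9]]
flipped = flip_boxes_horizontally(boxes)
print(flipped)
```

The y coordinates and any trailing fields are unchanged, and swapping the roles of `xmin` and `xmax` keeps `xmin <= xmax` after the flip.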
## API Reference
<script type="text/javascript" src='../../_static/js/auto_module_index.js'></script>
```eval_rst
.. automodule:: mxnet.image
.. autoclass:: mxnet.image.ImageIter
:members:
.. automethod:: mxnet.image.imdecode
.. automethod:: mxnet.image.scale_down
.. automethod:: mxnet.image.resize_short
.. automethod:: mxnet.image.fixed_crop
.. automethod:: mxnet.image.random_crop
.. automethod:: mxnet.image.center_crop
.. automethod:: mxnet.image.color_normalize
.. automethod:: mxnet.image.random_size_crop
.. autoclass:: mxnet.image.Augmenter
:members:
.. autoclass:: mxnet.image.ResizeAug
.. autoclass:: mxnet.image.ForceResizeAug
.. autoclass:: mxnet.image.RandomCropAug
.. autoclass:: mxnet.image.RandomSizedCropAug
.. autoclass:: mxnet.image.CenterCropAug
.. autoclass:: mxnet.image.RandomOrderAug
.. autoclass:: mxnet.image.BrightnessJitterAug
.. autoclass:: mxnet.image.ContrastJitterAug
.. autoclass:: mxnet.image.SaturationJitterAug
.. autoclass:: mxnet.image.HueJitterAug
.. autoclass:: mxnet.image.ColorJitterAug
.. autoclass:: mxnet.image.LightingAug
.. autoclass:: mxnet.image.ColorNormalizeAug
.. autoclass:: mxnet.image.RandomGrayAug
.. autoclass:: mxnet.image.HorizontalFlipAug
.. autoclass:: mxnet.image.CastAug
.. automethod:: mxnet.image.CreateAugmenter
.. autoclass:: mxnet.image.ImageDetIter
:members:
.. autoclass:: mxnet.image.DetAugmenter
:members:
.. autoclass:: mxnet.image.DetBorrowAug
.. autoclass:: mxnet.image.DetRandomSelectAug
.. autoclass:: mxnet.image.DetHorizontalFlipAug
.. autoclass:: mxnet.image.DetRandomCropAug
.. autoclass:: mxnet.image.DetRandomPadAug
.. automethod:: mxnet.image.CreateDetAugmenter
```
<script>auto_index("api-reference");</script>