blob: 3310f26aba180c2341fbe2b19010194280deb4d5 [file] [log] [blame] [view]
# Data Loading API
## Overview
A data iterator reads data batch by batch.
```perl
pdl> $data = mx->nd->ones([100,10])
pdl> $nd_iter = mx->io->NDArrayIter($data, batch_size=>25)
pdl> for my $batch (@{ $nd_iter }) { print $batch->data->[0],"\n" }
<AI::MXNet::NDArray 25x10 @cpu(0)>
<AI::MXNet::NDArray 25x10 @cpu(0)>
<AI::MXNet::NDArray 25x10 @cpu(0)>
<AI::MXNet::NDArray 25x10 @cpu(0)>
```
If `$nd_iter->reset()` is called, then reads the data again from beginning.
In addition, an iterator provides information about the batch, including the
shapes and name.
```perl
pdl> $nd_iter = mx->io->NDArrayIter(data=>{data => mx->nd->ones([100,10])}, label=>{softmax_label => mx->nd->ones([100])}, batch_size=>25)
pdl> print($nd_iter->provide_data->[0],"\n")
DataDesc[data,25x10,float32,NCHW]
pdl> print($nd_iter->provide_label->[0],"\n")
DataDesc[softmax_label,25,float32,NCHW]
```
So this iterator can be used to train a symbol whose input data variable has
name `data` and input label variable has name `softmax_label`.
```perl
pdl> $data = mx->sym->Variable('data')
pdl> $label = mx->sym->Variable('softmax_label')
pdl> $fullc = mx->sym->FullyConnected(data=>$data, num_hidden=>1)
pdl> $loss = mx->sym->SoftmaxOutput(data=>$data, label=>$label)
pdl> $mod = mx->mod->Module($loss)
pdl> print($mod->data_names->[0])
data
pdl> print($mod->label_names->[0])
softmax_label
pdl> $mod->bind(data_shapes=>$nd_iter->provide_data, label_shapes=>$nd_iter->provide_label)
```
Then we can call `$mod->fit($nd_iter, num_epoch=>2)` to train `loss` by 2 epochs.
## Predefined Data iterators
```perl
mx->io->NDArrayIter
mx->io->CSVIter
mx->io->ImageRecordIter
mx->io->ImageRecordUInt8Iter
mx->io->MNISTIter
mx->recordio->MXRecordIO
mx->recordio->MXIndexedRecordIO
mx->image->ImageIter
```
## Helper classes and functions
Data structures and other iterators provided in the `AI::MXNet::IO` package.
```perl
AI::MXNet::DataDesc
AI::MXNet::DataBatch
AI::MXNet::DataIter
AI::MXNet::ResizeIter
AI::MXNet::MXDataIter
```
A list of image modification functions provided by `AI::MXNet::Image`.
```perl
mx->image->imdecode
mx->image->scale_down
mx->image->resize_short
mx->image->fixed_crop
mx->image->random_crop
mx->image->center_crop
mx->image->color_normalize
mx->image->random_size_crop
mx->image->ResizeAug
mx->image->RandomCropAug
mx->image->RandomSizedCropAug
mx->image->CenterCropAug
mx->image->RandomOrderAug
mx->image->ColorJitterAug
mx->image->LightingAug
mx->image->ColorNormalizeAug
mx->image->HorizontalFlipAug
mx->image->CastAug
mx->image->CreateAugmenter
```
Functions to read and write RecordIO files.
```perl
mx->recordio->pack
mx->recordio->unpack
mx->recordio->unpack_img
```
## Develop a new iterator
Writing a new data iterator in Perl is straightforward. Most MXNet
training/inference programs accept an object with ``provide_data``
and ``provide_label`` properties.
Please refer to AI-MXNet/examples for the examples of custom iterators.