{"nbformat": 4, "cells": [{"source": "# Image IO - Loading and pre-processing images\n\nThis tutorial explains how to prepare, load and train with image data in\nMXNet. All IO in MXNet is handled via `mx.io.DataIter` and its subclasses. In\nthis tutorial we focus on how to use pre-built data iterators as well as custom\niterators to process image data.\n\nThere are three main ways of loading image data in MXNet:\n\n- [NEW] `mx.img.ImageIter`: implemented in Python, easily customizable, can load\n  from both .rec files and raw image files.\n- [OLD] `mx.io.ImageRecordIter`: implemented in the backend (C++), less customizable\n  but can be used in all language bindings, loads from .rec files.\n- A custom iterator that inherits from `mx.io.DataIter`.\n\nFirst, we explain the record io file format used by MXNet:\n\n## RecordIO\n\nRecord IO is the main file format used by MXNet for data IO. It supports reading\nand writing on various file systems including distributed file systems like\nHadoop HDFS and AWS S3. First, we download the Caltech 101 dataset, which\ncontains 101 classes of objects, and convert it into record io format:", "cell_type": "markdown", "metadata": {}}, {"source": "%matplotlib inline\nimport os\nimport subprocess\nimport mxnet as mx\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n# change this to your mxnet location\nMXNET_HOME = '/scratch/mxnet'", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "Download and unzip:", "cell_type": "markdown", "metadata": {}}, {"source": "os.system('wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz -P data/')\nos.chdir('data')\nos.system('tar -xf 101_ObjectCategories.tar.gz')\nos.chdir('../')", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "Let's take a look at the data. 
As you can see, under the\n[root folder](./data/101_ObjectCategories) every category has its own\n[subfolder](./data/101_ObjectCategories/yin_yang).\n\nNow let's convert the images into record io format. First, we need to make a list that\ncontains all the image files and their categories:", "cell_type": "markdown", "metadata": {}}, {"source": "os.system('python %s/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories'%MXNET_HOME)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "The resulting [list file](./data/caltech_train.lst) is in the format\n`index\\t(one or more labels)\\tpath`. In this case there is only one label for\neach image, but you can modify the list to add more for multi-label training.\n\nThen we can use this list to create our record io file:", "cell_type": "markdown", "metadata": {}}, {"source": "os.system(\"python %s/tools/im2rec.py --num-thread=4 --pass-through=1 data/caltech data/101_ObjectCategories\"%MXNET_HOME)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "The record io files are now saved [here](./data).\n\n## ImageRecordIter\n\n`mx.io.ImageRecordIter` can be used for loading image data saved in record io\nformat. It is available in all frontend languages, but because it is implemented in\nC++, it is less flexible.\n\nTo use `ImageRecordIter`, simply create an instance by loading your record file:", "cell_type": "markdown", "metadata": {}}, {"source": "data_iter = mx.io.ImageRecordIter(\n    path_imgrec=\"./data/caltech_train.rec\", # the target record file\n    data_shape=(3, 227, 227), # output data shape. A 227x227 region will be cropped from the original image.\n    batch_size=4, # number of samples per batch\n    resize=256 # resize the shorter edge to 256 before cropping\n    # ... you can add more augmentation options here. 
use help(mx.io.ImageRecordIter) to see all possible choices\n    )\ndata_iter.reset()\nbatch = data_iter.next()\ndata = batch.data[0]\nfor i in range(4):\n    plt.subplot(1,4,i+1)\n    plt.imshow(data[i].asnumpy().astype(np.uint8).transpose((1,2,0)))\nplt.show()", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "\n<!-- INSERT SOURCE DOWNLOAD BUTTONS -->\n\n", "cell_type": "markdown", "metadata": {}}], "metadata": {"display_name": "", "name": "", "language": "python"}, "nbformat_minor": 2}