{"nbformat": 4, "cells": [{"source": "<!--- Licensed to the Apache Software Foundation (ASF) under one -->\n<!--- or more contributor license agreements. See the NOTICE file -->\n<!--- distributed with this work for additional information -->\n<!--- regarding copyright ownership. The ASF licenses this file -->\n<!--- to you under the Apache License, Version 2.0 (the -->\n<!--- \"License\"); you may not use this file except in compliance -->\n<!--- with the License. You may obtain a copy of the License at -->\n\n<!--- http://www.apache.org/licenses/LICENSE-2.0 -->\n\n<!--- Unless required by applicable law or agreed to in writing, -->\n<!--- software distributed under the License is distributed on an -->\n<!--- \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -->\n<!--- KIND, either express or implied. See the License for the -->\n<!--- specific language governing permissions and limitations -->\n<!--- under the License. -->\n\n# Methods of applying data augmentation (Gluon API)\n\nData Augmentation is a regularization technique that's used to avoid overfitting when training Machine Learning models. Although the technique can be applied in a variety of domains, it's very common in Computer Vision. Adjustments are made to the original images in the training dataset before being used in training. Some example adjustments include translating, cropping, scaling, rotating, changing brightness and contrast. We do this to reduce the dependence of the model on spurious characteristics; e.g. 
training data may only contain faces that fill 1/4 of the image, so the model trained without data augmentation might unhelpfully learn that faces can only be of this size.\n\nIn this tutorial we demonstrate a method of applying data augmentation with Gluon [`mxnet.gluon.data.Dataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html#mxnet.gluon.data.Dataset)s, specifically the [`ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html#mxnet.gluon.data.vision.datasets.ImageFolderDataset).", "cell_type": "markdown", "metadata": {}}, {"source": "%matplotlib inline\nimport mxnet as mx # used version '1.0.0' at time of writing\nimport numpy as np\nfrom matplotlib.pyplot import imshow\nimport multiprocessing\nimport os\n\nmx.random.seed(42) # set seed for repeatability", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "We define a utility function below, that will be used for visualising the augmentations in the tutorial.", "cell_type": "markdown", "metadata": {}}, {"source": "def plot_mx_array(array):\n \"\"\"\n Array expected to be height x width x 3 (channels), and values are floats between 0 and 255.\n \"\"\"\n assert array.shape[2] == 3, \"RGB Channel should be last\"\n imshow((array.clip(0, 255)/255).asnumpy())", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "image_folder = os.path.join('data','images')\nmx.test_utils.download('https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/data_aug/inputs/0.jpg', dirname=image_folder)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "example_image = mx.image.imread(os.path.join(image_folder, \"0.jpg\")).astype(\"float32\")\nplot_mx_array(example_image)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "## Quick start with 
[`ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html#mxnet.gluon.data.vision.datasets.ImageFolderDataset)\n\nUsing Gluon, it's simple to add data augmentation to your training pipeline. When creating either [`ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html#mxnet.gluon.data.vision.datasets.ImageFolderDataset) or [`ImageRecordDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html#mxnet.gluon.data.vision.datasets.ImageRecordDataset), you can pass a `transform` function that will be applied to each image in the dataset, every time it's loaded from disk. Augmentations are intended to be random, so you'll pass a slightly different version of the image to the network on each epoch.\n\nWe define `aug_transform` below to perform a selection of augmentation steps and pass it to our dataset. It's worth noting that augmentations should only be applied to the training data (and not the test data), so you don't want to pass this augmentation transform function to the testing dataset.\n\n[`mxnet.image.CreateAugmenter`](https://mxnet.incubator.apache.org/api/python/image/image.html?highlight=createaugmenter#mxnet.image.CreateAugmenter) is a useful function for creating a diverse set of augmentations at once. Despite the singular `CreateAugmenter`, this function actually returns a list of Augmenters. We can then loop through this list and apply each type of augmentation one after another. 
Although the parameters of `CreateAugmenter` are fixed, the random augmentations (such as `rand_mirror` and `brightness`) will be different each time `aug_transform` is called.", "cell_type": "markdown", "metadata": {}}, {"source": "def aug_transform(data, label):\n    # scale pixel values from [0, 255] to [0, 1] before augmenting\n    data = data.astype('float32')/255\n    # rand_crop is a boolean flag: True enables random cropping to data_shape\n    augs = mx.image.CreateAugmenter(data_shape=(3, 300, 300),\n                                    rand_crop=True, rand_mirror=True, inter_method=10,\n                                    brightness=0.125, contrast=0.125, saturation=0.125,\n                                    pca_noise=0.02)\n    # apply each augmenter in turn\n    for aug in augs:\n        data = aug(data)\n    return data, label\n\n\ntraining_dataset = mx.gluon.data.vision.ImageFolderDataset('data', transform=aug_transform)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "We can quickly inspect the augmentations by indexing the dataset (which calls the `__getitem__` method of the dataset). When this method is called with an index, the corresponding image is read from disk and the `transform` is applied. We can see the result of the augmentations by comparing the image below with the original image above.", "cell_type": "markdown", "metadata": {}}, {"source": "sample = training_dataset[0]\nsample_data = sample[0]\nplot_mx_array(sample_data*255)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "In practice you should load images from a dataset with a [`mxnet.gluon.data.DataLoader`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=dataloader#mxnet.gluon.data.DataLoader) to take advantage of automatic batching and shuffling. Under the hood the `DataLoader` calls `__getitem__`, but you shouldn't need to call it directly for anything other than debugging. Some practitioners pre-augment their datasets by applying a fixed number of augmentations to each image and saving the outputs to disk, with the aim of increasing throughput. 
With the `num_workers` parameter of `DataLoader` you can use all CPU cores to apply the augmentations, which often removes the need for pre-augmentation, reducing complexity and saving disk space.", "cell_type": "markdown", "metadata": {}}, {"source": "batch_size = 1\ntraining_data_loader = mx.gluon.data.DataLoader(training_dataset, batch_size=batch_size, shuffle=True)\n\n# inspect the first batch only\nfor data_batch, label_batch in training_data_loader:\n    plot_mx_array(data_batch[0]*255)\n    assert data_batch.shape == (batch_size, 300, 300, 3)\n    assert label_batch.shape == (batch_size,)\n    break", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "\n\n\n<!-- INSERT SOURCE DOWNLOAD BUTTONS -->", "cell_type": "markdown", "metadata": {}}], "metadata": {"display_name": "", "name": "", "language": "python"}, "nbformat_minor": 2}