{"nbformat": 4, "cells": [{"source": "# NDArray Indexing - Array indexing features\n\nMXNet's advanced indexing features are modeled after [NumPy's implementation and documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing). You will see direct adaptations of many NumPy indexing features and examples which are close, if not identical, so we borrow much from their documentation.\n\n`NDArray`s can be indexed using the standard Python `x[obj]` syntax, where _x_ is the array and _obj_ the selection.\n\nThere are two kinds of indexing available:\n\n1. basic slicing\n1. advanced indexing\n\nIn MXNet, we support both basic and advanced indexing following the convention of indexing NumPy's `ndarray`.\n\n\n## Basic Slicing and Indexing\n\nBasic slicing extends Python's basic concept of slicing to N dimensions. For a quick review:\n\n```\na[start:end] # items start through end-1\na[start:] # items start through the rest of the array\na[:end] # items from the beginning through end-1\na[:] # a copy of the whole array\n```", "cell_type": "markdown", "metadata": {}}, {"source": "from mxnet import nd", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "For some working examples of basic slicing we'll start simple.", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int32')\nx[5:]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [5 6 7 8 9]\n <NDArray 5 @cpu(0)>", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([0, 1, 2, 3])\nprint('1D complete array, x=', x)\ns = x[1:3]\nprint('slicing the 2nd and 3rd elements, s=', s)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " 1D complete array, x=\n [ 0. 1. 2. 3.]\n <NDArray 4 @cpu(0)>\n slicing the 2nd and 3rd elements, s=\n [ 1. 
2.]\n <NDArray 2 @cpu(0)>\n\n\nNow let's try the same slice, `x[1:3]`, on a multi-dimensional array, where it selects the 2nd and 3rd rows.", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\nprint('multi-D complete array, x=', x)\ns = x[1:3]\nprint('slicing the 2nd and 3rd elements, s=', s)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " multi-D complete array, x=\n [[ 1. 2. 3. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 10. 11. 12.]]\n <NDArray 3x4 @cpu(0)>\n slicing the 2nd and 3rd elements, s=\n [[ 5. 6. 7. 8.]\n [ 9. 10. 11. 12.]]\n <NDArray 2x4 @cpu(0)>\n\n\nNow let's try writing through an index. Because `x` is two-dimensional, `x[2] = 9.0` assigns `9.0` to every element of the row at index `2`.", "cell_type": "markdown", "metadata": {}}, {"source": "print('original x, x=', x)\nx[2] = 9.0\nprint('replaced entire row with x[2] = 9.0, x=', x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " original x, x=\n [[ 1. 2. 3. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 10. 11. 12.]]\n <NDArray 3x4 @cpu(0)>\n replaced entire row with x[2] = 9.0, x=\n [[ 1. 2. 3. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 9. 9. 9.]]\n <NDArray 3x4 @cpu(0)>\n\n\nWe can target specific elements too. Let's replace the number `3` in the first row with the number `9` using `x[0, 2] = 9.0`.", "cell_type": "markdown", "metadata": {}}, {"source": "print('original x, x=', x)\nx[0, 2] = 9.0\nprint('replaced specific element with x[0, 2] = 9.0, x=', x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " original x, x=\n [[ 1. 2. 3. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 9. 9. 9.]]\n <NDArray 3x4 @cpu(0)>\n replaced specific element with x[0, 2] = 9.0, x=\n [[ 1. 2. 9. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 9. 9. 9.]]\n <NDArray 3x4 @cpu(0)>\n\n\nNow let's go further and update several elements at the same time. 
We'll replace the `6` and the `7` with `x[1:2, 1:3] = 5.0`.", "cell_type": "markdown", "metadata": {}}, {"source": "print('original x, x=', x)\nx[1:2, 1:3] = 5.0\nprint('replaced range of elements with x[1:2, 1:3] = 5.0, x=', x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " original x, x=\n [[ 1. 2. 9. 4.]\n [ 5. 6. 7. 8.]\n [ 9. 9. 9. 9.]]\n <NDArray 3x4 @cpu(0)>\n replaced range of elements with x[1:2, 1:3] = 5.0, x=\n [[ 1. 2. 9. 4.]\n [ 5. 5. 5. 8.]\n [ 9. 9. 9. 9.]]\n <NDArray 3x4 @cpu(0)>\n\n\n## New Indexing Features in v1.0\n\n### Step\n\nThe basic slice syntax is `i:j:k` where _i_ is the starting index, _j_ is the stopping index, and _k_ is the step (_k_ must be nonzero).\n\n**Note**: Previously, MXNet supported basic slicing and indexing only with `step=1`. From release 1.0, arbitrary values of `step` are supported.", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int32')\n# Select elements 1 through 7, and use a step of 2\nx[1:7:2]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [1 3 5]\n <NDArray 3 @cpu(0)>\n\n\n\n## Negative Indices\nNegative _i_ and _j_ are interpreted as _n + i_ and _n + j_ where _n_ is the number of elements in the corresponding dimension. 
Negative _k_ makes stepping go towards smaller indices.", "cell_type": "markdown", "metadata": {}}, {"source": "x[-2:10]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [8 9]\n <NDArray 2 @cpu(0)>\n\n\n\nIf the number of objects in the selection tuple is less than _N_, the number of dimensions of the array, then `:` is assumed for any subsequent dimensions.", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([[[1],[2],[3]],\n [[4],[5],[6]]], dtype='int32')\nx[1:2]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[[4]\n [5]\n [6]]]\n <NDArray 1x3x1 @cpu(0)>\n\n\n\nYou may use slicing to set values in the array, but (unlike lists) you can never grow the array. The shape of the value in `x[obj] = value` must be broadcastable to the shape of `x[obj]`.", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.arange(16, dtype='int32').reshape((4, 4))\nprint(x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[ 0 1 2 3]\n [ 4 5 6 7]\n [ 8 9 10 11]\n [12 13 14 15]]\n <NDArray 4x4 @cpu(0)>", "cell_type": "markdown", "metadata": {}}, {"source": "print(x[1:4:2, 3:0:-1])", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[ 7 6 5]\n [15 14 13]]\n <NDArray 2x3 @cpu(0)>", "cell_type": "markdown", "metadata": {}}, {"source": "x[1:4:2, 3:0:-1] = [[16], [17]]\nprint(x)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[ 0 1 2 3]\n [ 4 16 16 16]\n [ 8 9 10 11]\n [12 17 17 17]]\n <NDArray 4x4 @cpu(0)>\n\n\n## New Advanced Indexing Features in v1.0\n\nAdvanced indexing is triggered when the selection object, _obj_, is a non-tuple sequence object (e.g. 
a Python list), a NumPy `ndarray` (of data type integer), an MXNet `NDArray`, or a tuple with at least one sequence object.\n\nAdvanced indexing always returns a __copy__ of the data.\n\n**Note**:\n- When the selection object is a Python list, it must be a list of integers. MXNet does not support the selection object being a nested list. That is, `x[[1, 2]]` is supported, while `x[[[1], [2]]]` is not.\n- When the selection object is a NumPy `ndarray` or an MXNet `NDArray`, there are no restrictions on the number of dimensions of the object.\n- When the selection object is a tuple containing Python list(s), both integer lists and nested lists are supported. That is, both `x[1:4, [1, 2]]` and `x[1:4, [[1], [2]]]` are supported.\n\n### Purely Integer Array Indexing\nWhen the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straightforward, but different from slicing.\n\nAdvanced indices are always [broadcast](https://docs.scipy.org/doc/numpy-1.13.0/reference/ufuncs.html#ufuncs-broadcasting) and iterated as one:\n```\nresult[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],\n ..., ind_N[i_1, ..., i_M]]\n```\nNote that the result shape is identical to the (broadcast) indexing array shapes `ind_1, ..., ind_N`.\n\n**Example:**\nFrom each row, a specific element should be selected. The row index is just `[0, 1, 2]` and the column index specifies the element to choose for the corresponding row, here `[0, 1, 0]`. Using both together, the task can be solved with advanced indexing:", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([[1, 2],\n [3, 4],\n [5, 6]], dtype='int32')\nx[[0, 1, 2], [0, 1, 0]]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [1 4 5]\n <NDArray 3 @cpu(0)>\n\n\n\nTo achieve a behavior similar to the basic slicing above, broadcasting can be used. 
This is best understood with an example.\n\n**Example:**\nFrom a 4x3 array the corner elements should be selected using advanced indexing. Thus all elements for which the column is one of `[0, 2]` and the row is one of `[0, 3]` need to be selected. To use advanced indexing one needs to select all elements explicitly. Using the method explained previously one could write:", "cell_type": "markdown", "metadata": {}}, {"source": "x = nd.array([[ 0, 1, 2],\n [ 3, 4, 5],\n [ 6, 7, 8],\n [ 9, 10, 11]], dtype='int32')\nx[[[0, 0], [3, 3]],\n [[0, 2], [0, 2]]]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[ 0 2]\n [ 9 11]]\n <NDArray 2x2 @cpu(0)>\n\n\n\nHowever, since the indexing arrays above just repeat themselves, broadcasting can be used.", "cell_type": "markdown", "metadata": {}}, {"source": "x[[[0], [3]],\n [[0, 2]]]", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " [[ 0 2]\n [ 9 11]]\n <NDArray 2x2 @cpu(0)>\n\n\n\n### Combining Advanced and Basic Indexing\nThere are three situations we need to consider when mixing advanced and basic indices in a single selection object. Let's look at examples to understand each one's behavior.\n\n- There is only one advanced index in the selection object. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind]` has one advanced index `ind` with `shape=(2, 3, 4)` on the third axis. The `result` will have `shape=(10, 20, 2, 3, 4, 40, 50)` because the subspace of `x` in the third dimension is replaced by the subspace of `shape=(2, 3, 4)`. 
If we let _i_, _j_, _k_ loop over the `(2, 3, 4)`-shaped subspace, it is equivalent to `result[:, :, i, j, k, :, :] = x[:, :, ind[i, j, k], :, :]`.", "cell_type": "markdown", "metadata": {}}, {"source": "import numpy as np\nshape = (10, 20, 30, 40, 50)\nx = nd.arange(np.prod(shape), dtype='int32').reshape(shape)\nind = nd.arange(24).reshape((2, 3, 4))\nprint(x[:, :, ind].shape)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " (10, 20, 2, 3, 4, 40, 50)\n\n\n- There are at least two advanced indices in the selection object, and all the advanced indices are adjacent to each other. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind1, ind2, :]` has two advanced indices with shapes that are broadcastable to `shape=(2, 3, 4)`. Then the `result` has `shape=(10, 20, 2, 3, 4, 50)` because the `(30, 40)`-shaped subspace has been replaced with the `(2, 3, 4)`-shaped subspace from the indices.", "cell_type": "markdown", "metadata": {}}, {"source": "ind1 = [0, 1, 2, 3]\nind2 = [[[0], [1], [2]], [[3], [4], [5]]]\nprint(x[:, :, ind1, ind2, :].shape)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": " (10, 20, 2, 3, 4, 50)\n\n\n- There are at least two advanced indices in the selection object, and there is at least one advanced index separated from the others by basic indices. For example, `x` is an `NDArray` with `shape=(10, 20, 30, 40, 50)` and `result=x[:, :, ind1, :, ind2]` has two advanced indices with shapes that are broadcastable to `shape=(2, 3, 4)`. 
Then the `result` has `shape=(2, 3, 4, 10, 20, 40)` because there is no unambiguous place to place the indexing subspace, hence it is prepended to the beginning.", "cell_type": "markdown", "metadata": {}}, {"source": "print(x[:, :, ind1, :, ind2].shape)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "\n (2, 3, 4, 10, 20, 40)\n\n## References\n\n[NumPy documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing)\n\n<!-- INSERT SOURCE DOWNLOAD BUTTONS -->\n\n", "cell_type": "markdown", "metadata": {}}], "metadata": {"display_name": "", "name": "", "language": "python"}, "nbformat_minor": 2}