# Symbolic and Automatic Differentiation
NDArray is the basic computation unit in MXNet. MXNet also provides a
symbolic interface, named Symbol, to simplify constructing neural networks. Symbol combines flexibility and efficiency: it is similar to
the network configuration in [Caffe](http://caffe.berkeleyvision.org/) and
[CXXNet](https://github.com/dmlc/cxxnet), while the symbols also define
the computation graph, as in [Theano](http://deeplearning.net/software/theano/).
## Basic Composition of Symbols
The following code creates a two-layer perceptron network:
```python
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
>>> net = mx.symbol.Activation(data=net, name='relu1', act_type="relu")
>>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
>>> net = mx.symbol.SoftmaxOutput(data=net, name='out')
>>> type(net)
<class 'mxnet.symbol.Symbol'>
```
Each symbol takes a (unique) string name. *Variable* often defines the inputs,
or free variables. Other symbols take a symbol as their input (*data*),
and might accept other hyperparameters, such as the number of hidden neurons (*num_hidden*)
or the activation type (*act_type*).
A symbol can simply be seen as a function taking several arguments, whose
names are generated automatically and can be listed as follows:
```python
>>> net.list_arguments()
['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias', 'out_label']
```
These arguments are the parameters needed by each symbol:
- *data*: Input data needed by the variable *data*
- *fc1_weight* and *fc1_bias*: The weight and bias for the first fully connected layer *fc1*
- *fc2_weight* and *fc2_bias*: The weight and bias for the second fully connected layer *fc2*
- *out_label*: The label needed by the loss, created automatically by the *SoftmaxOutput* layer
We can also specify the automatically generated names explicitly:
```python
>>> net = mx.symbol.Variable('data')
>>> w = mx.symbol.Variable('myweight')
>>> net = mx.symbol.FullyConnected(data=net, weight=w, name='fc1', num_hidden=128)
>>> net.list_arguments()
['data', 'myweight', 'fc1_bias']
```
## More Complicated Composition
MXNet provides well-optimized symbols for layers
commonly used in deep learning (see
[src/operator](https://github.com/dmlc/mxnet/tree/master/src/operator)). We can also easily define new operators
in Python. The following example first performs an element-wise add between two
symbols, then feeds the result to the fully connected operator:
```python
>>> lhs = mx.symbol.Variable('data1')
>>> rhs = mx.symbol.Variable('data2')
>>> net = mx.symbol.FullyConnected(data=lhs + rhs, name='fc1', num_hidden=128)
>>> net.list_arguments()
['data1', 'data2', 'fc1_weight', 'fc1_bias']
```
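As mentioned above, new operators can also be defined directly in Python. Below is a minimal sketch using the ```mx.operator.CustomOp``` interface, implementing an element-wise square operator; the names *Square*, *SquareProp*, and the registered type *square* are chosen purely for illustration, and the exact interface may differ between MXNet versions:
```python
>>> class Square(mx.operator.CustomOp):
...     def forward(self, is_train, req, in_data, out_data, aux):
...         # y = x * x, written into the pre-allocated output buffer
...         self.assign(out_data[0], req[0], in_data[0] * in_data[0])
...     def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
...         # dy/dx = 2 * x, scaled by the incoming head gradient
...         self.assign(in_grad[0], req[0], 2 * in_data[0] * out_grad[0])
>>> @mx.operator.register("square")
... class SquareProp(mx.operator.CustomOpProp):
...     def __init__(self):
...         super(SquareProp, self).__init__(need_top_grad=True)
...     def list_arguments(self):
...         return ['data']
...     def list_outputs(self):
...         return ['output']
...     def infer_shape(self, in_shape):
...         # the output has the same shape as the input; no auxiliary states
...         return in_shape, [in_shape[0]], []
...     def create_operator(self, ctx, shapes, dtypes):
...         return Square()
>>> net = mx.symbol.Custom(data=mx.symbol.Variable('data'), op_type='square')
>>> net.list_arguments()
['data']
```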
We can also construct a symbol in a more flexible way than the single
chain of forward compositions shown in the preceding examples:
```python
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
>>> net2 = mx.symbol.Variable('data2')
>>> net2 = mx.symbol.FullyConnected(data=net2, name='net2', num_hidden=128)
>>> composed_net = net(data=net2, name='compose')
>>> composed_net.list_arguments()
['data2', 'net2_weight', 'net2_bias', 'fc1_weight', 'fc1_bias']
```
In the preceding example, *net* is used as a function to apply to the existing symbol
*net2*, and the resulting *composed_net* replaces the original argument *data* with
the output of *net2*.
## Argument Shape Inference
Now we know how to define a symbol. Next, we can infer the shapes of
all of the arguments it needs given the shape of its input data:
```python
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10)
>>> arg_shape, out_shape, aux_shape = net.infer_shape(data=(100, 100))
>>> dict(zip(net.list_arguments(), arg_shape))
{'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)}
>>> out_shape
[(100, 10)]
```
We can use this shape inference as an early debugging mechanism to detect
shape inconsistency.
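For instance, asking for shapes that cannot be consistent makes ```infer_shape``` fail instead of silently returning wrong shapes (a minimal sketch; the exact exception type and message depend on the MXNet version):
```python
>>> # fc1 expects a (10, 100) weight for a (100, 100) input, so an
>>> # incompatible weight shape is rejected immediately
>>> net.infer_shape(data=(100, 100), fc1_weight=(10, 50))  # raises an error
```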
## Bind the Symbols and Run
Now we can bind the free variables of the symbol and perform forward and backward operations.
The ```bind``` function creates an ```Executor``` that can be used to carry out the real computations:
```python
>>> # define the computation graph
>>> A = mx.symbol.Variable('A')
>>> B = mx.symbol.Variable('B')
>>> C = A * B
>>> a = mx.nd.ones(3) * 4
>>> b = mx.nd.ones(3) * 2
>>> # bind the symbol with real arguments
>>> c_exec = C.bind(ctx=mx.cpu(), args={'A' : a, 'B': b})
>>> # do forward pass calculation.
>>> c_exec.forward()
>>> c_exec.outputs[0].asnumpy()
[ 8. 8. 8.]
```
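The same executor pattern also computes gradients when gradient buffers are passed to ```bind``` through *args_grad*. The following sketch backpropagates a head gradient of ones through the product ```C = A * B```; the gradient with respect to ```A``` is ```B```, and vice versa:
```python
>>> ga = mx.nd.zeros(3)
>>> gb = mx.nd.zeros(3)
>>> # bind with gradient buffers so backward has somewhere to write
>>> c_exec = C.bind(ctx=mx.cpu(), args={'A': a, 'B': b},
...                 args_grad={'A': ga, 'B': gb})
>>> c_exec.forward(is_train=True)
>>> c_exec.backward(out_grads=mx.nd.ones(3))
>>> ga.asnumpy()  # dC/dA = B
[ 2.  2.  2.]
>>> gb.asnumpy()  # dC/dB = A
[ 4.  4.  4.]
```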
For neural nets, a more commonly used pattern is ```simple_bind```, which creates all of the argument arrays for you. You can then call ```forward```, and ```backward``` if gradients are needed:
```python
>>> # define the computation graph
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10)
>>> net = mx.symbol.SoftmaxOutput(data=net, name='out')
>>> texec = net.simple_bind(ctx=mx.cpu(), data=(100, 100))
>>> texec.forward(is_train=True)
>>> texec.backward()
```
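After ```backward```, the computed gradients stay inside the executor. One way to inspect them is the executor's ```grad_arrays``` list, which holds one NDArray per argument in the order given by ```list_arguments``` (a sketch; attribute availability can vary slightly across versions):
```python
>>> for name, grad in zip(net.list_arguments(), texec.grad_arrays):
...     print(name, grad.shape)
```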
The [model API](model.md) is a thin wrapper around the symbolic executors to support neural net training.
We strongly encourage you to read [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md),
which provides a detailed explanation of the concepts in pictures.
## How Efficient Is the Symbolic API?
In short, it is designed to be very efficient in both memory and runtime.
The major reason for introducing the Symbolic API is to bring the efficient C++
operations in powerful toolkits, such as CXXNet and Caffe, together with the
flexible dynamic NDArray operations. To maximize runtime performance and memory
utilization, all of the memory and computation resources are
allocated statically during the bind operation.
The coarse-grained operators are equivalent to CXXNet layers, which are
extremely efficient. We also provide fine-grained operators for more flexible
composition. Because MXNet also performs more in-place memory optimization, it can
be more memory efficient than CXXNet while achieving the same runtime performance, with
greater flexibility.
## Next Steps
* [KVStore](kvstore.md)