| # Symbolic and Automatic Differentiation |
| |
| NDArray is the basic computation unit in MXNet. MXNet also provides a |
symbolic interface, named Symbol, that simplifies the construction of neural
networks and combines flexibility with efficiency. Symbol resembles the network
configurations in [Caffe](http://caffe.berkeleyvision.org/) and
[CXXNet](https://github.com/dmlc/cxxnet), while the symbols define a
computation graph as in [Theano](http://deeplearning.net/software/theano/).
| |
| ## Basic Composition of Symbols |
| |
| The following code creates a two-layer perceptron network: |
| |
| ```python |
| >>> import mxnet as mx |
| >>> net = mx.symbol.Variable('data') |
| >>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128) |
| >>> net = mx.symbol.Activation(data=net, name='relu1', act_type="relu") |
| >>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64) |
| >>> net = mx.symbol.SoftmaxOutput(data=net, name='out') |
| >>> type(net) |
| <class 'mxnet.symbol.Symbol'> |
| ``` |
| |
Each symbol takes a (unique) string name. *Variable* typically defines the inputs,
or free variables. Other symbols take a symbol as their input (*data*),
and may accept other hyperparameters, such as the number of hidden neurons (*num_hidden*)
or the activation type (*act_type*).
| |
A symbol can be viewed simply as a function that takes several arguments. Their
names are generated automatically and can be listed as follows:
| |
| ```python |
| >>> net.list_arguments() |
| ['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias', 'out_label'] |
| ``` |
| |
These arguments are the inputs and parameters needed by the network:
| |
| - *data*: Input data needed by the variable *data* |
| - *fc1_weight* and *fc1_bias*: The weight and bias for the first fully connected layer *fc1* |
| - *fc2_weight* and *fc2_bias*: The weight and bias for the second fully connected layer *fc2* |
| - *out_label*: The label needed by the loss |
| |
We can also name arguments explicitly rather than relying on the automatically generated names:
| |
| ```python |
| >>> net = mx.symbol.Variable('data') |
| >>> w = mx.symbol.Variable('myweight') |
| >>> net = mx.symbol.FullyConnected(data=net, weight=w, name='fc1', num_hidden=128) |
| >>> net.list_arguments() |
| ['data', 'myweight', 'fc1_bias'] |
| ``` |
| |
| ## More Complicated Composition |
| |
| MXNet provides well-optimized symbols for layers |
| commonly used in deep learning (see |
| [src/operator](https://github.com/dmlc/mxnet/tree/master/src/operator)). We can also easily define new operators |
in Python. The following example first performs an element-wise add between two
symbols, then feeds the result to the fully connected operator:
| |
| ```python |
| >>> lhs = mx.symbol.Variable('data1') |
| >>> rhs = mx.symbol.Variable('data2') |
| >>> net = mx.symbol.FullyConnected(data=lhs + rhs, name='fc1', num_hidden=128) |
| >>> net.list_arguments() |
| ['data1', 'data2', 'fc1_weight', 'fc1_bias'] |
| ``` |
| |
We can also construct a symbol more flexibly than by the single forward
composition used in the preceding example:
| |
| ```python |
| >>> net = mx.symbol.Variable('data') |
| >>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128) |
| >>> net2 = mx.symbol.Variable('data2') |
| >>> net2 = mx.symbol.FullyConnected(data=net2, name='net2', num_hidden=128) |
| >>> composed_net = net(data=net2, name='compose') |
| >>> composed_net.list_arguments() |
| ['data2', 'net2_weight', 'net2_bias', 'fc1_weight', 'fc1_bias'] |
| ``` |
| |
In the preceding example, *net* is used as a function applied to the existing symbol
*net2*, and in the resulting *composed_net* the original argument *data* is replaced
by the output of *net2*.
| |
| ## Argument Shape Inference |
| |
| Now we know how to define a symbol. Next, we can infer the shapes of |
| all of the arguments it needs given the shape of its input data: |
| |
| ```python |
| >>> net = mx.symbol.Variable('data') |
| >>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10) |
| >>> arg_shape, out_shape, aux_shape = net.infer_shape(data=(100, 100)) |
| >>> dict(zip(net.list_arguments(), arg_shape)) |
| {'data': (100, 100), 'fc1_weight': (10, 100), 'fc1_bias': (10,)} |
| >>> out_shape |
| [(100, 10)] |
| ``` |
| |
We can use this shape inference as an early debugging mechanism to detect
shape inconsistencies.
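
For instance, here is a minimal sketch of how an inconsistency surfaces early
(reusing the *net* defined above; the exact error type and message vary across
MXNet versions). Given a (100, 100) input, *fc1_weight* must have shape
(10, 100), so supplying a conflicting shape makes *infer_shape* fail
immediately, before any computation runs:

```python
>>> # a deliberately wrong weight shape is rejected at inference time
>>> try:
...     net.infer_shape(data=(100, 100), fc1_weight=(10, 50))
... except Exception:
...     print('shape inconsistency caught before running anything')
shape inconsistency caught before running anything
```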
| |
| ## Bind the Symbols and Run |
| |
Now we can bind the free variables of the symbol and perform forward and backward operations.
The ```bind``` function creates an ```Executor``` that can be used to carry out the real computations:
| |
| ```python |
| >>> # define computation graphs |
| >>> A = mx.symbol.Variable('A') |
| >>> B = mx.symbol.Variable('B') |
| >>> C = A * B |
| >>> a = mx.nd.ones(3) * 4 |
| >>> b = mx.nd.ones(3) * 2 |
| >>> # bind the symbol with real arguments |
| >>> c_exec = C.bind(ctx=mx.cpu(), args={'A' : a, 'B': b}) |
| >>> # do forward pass calculation. |
| >>> c_exec.forward() |
| >>> c_exec.outputs[0].asnumpy() |
| [ 8. 8. 8.] |
| ``` |
For neural networks, a more commonly used pattern is ```simple_bind```, which creates all of the argument arrays for you. You can then call ```forward```, and ```backward``` if gradients are needed:
| |
```python
>>> # define a computation graph; any symbol works, so we build a small net
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=10)
>>> net = mx.symbol.SoftmaxOutput(data=net, name='out')
>>> texec = net.simple_bind(ctx=mx.cpu(), data=(100, 100))
>>> texec.forward(is_train=True)
>>> texec.backward()
```
| The [model API](model.md) is a thin wrapper around the symbolic executors to support neural net training. |
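
As a rough, hedged sketch of that wrapper (using the ```FeedForward``` class
from the MXNet 0.x model API; `train_iter` is a placeholder for any data
iterator):

```python
>>> # hypothetical training run built on the executor machinery above
>>> model = mx.model.FeedForward(ctx=mx.cpu(), symbol=net,
...                              num_epoch=10, learning_rate=0.1)
>>> model.fit(X=train_iter)
```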
| |
We strongly encourage you to read [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md),
which provides a detailed explanation of these concepts in pictures.
| |
| ## How Efficient Is the Symbolic API? |
| |
| In short, it is designed to be very efficient in both memory and runtime. |
| |
| The major reason for introducing the Symbolic API is to bring the efficient C++ |
| operations in powerful toolkits, such as CXXNet and Caffe, together with the |
| flexible dynamic NDArray operations. To maximize runtime performance and memory |
| utilization, all of the memory and computation resources are |
| allocated statically during the bind operation. |
| |
The coarse-grained operators are equivalent to CXXNet layers, which are
extremely efficient. We also provide fine-grained operators for more flexible
composition. Because MXNet additionally performs aggressive in-place memory
reuse, it can be more memory efficient than CXXNet, and it achieves the same
runtime with greater flexibility.
| |
| ## Next Steps |
| * [KVStore](kvstore.md) |