merge updates from chris
diff --git a/docs-site/docs/assets/singav3.1-sw.png b/docs-site/docs/assets/singav3.1-sw.png
new file mode 100644
index 0000000..2bfd2c0
--- /dev/null
+++ b/docs-site/docs/assets/singav3.1-sw.png
Binary files differ
diff --git a/docs-site/docs/autograd.md b/docs-site/docs/autograd.md
index 9bde49d..20cfaa5 100644
--- a/docs-site/docs/autograd.md
+++ b/docs-site/docs/autograd.md
@@ -192,15 +192,15 @@
 
 ### Using the Model API
 
-The following
-[example](https://github.com/apache/singa/blob/master/examples/autograd/cnn_module.py)
-implements a CNN model using the Model provided by the model.
+The following
+[example](https://github.com/apache/singa/blob/master/examples/cnn/model/cnn.py)
+implements a CNN model using the [Model API](./graph).
 
 #### Define the subclass of Model
 
-Define the model class, it should be the subclass of the Model. In this way,
-all operations used during traing phase will form a calculation graph and will
-be analyzed. The operations in the graph will be scheduled and executed
+Define the model class, which should be a subclass of Model. In this way, all
+operations used during the training phase will form a computational graph and
+will be analyzed. The operations in the graph will be scheduled and executed
 efficiently. Layers can also be included in the model class.
 
 ```python
@@ -283,5 +283,5 @@
 ### Python API
 
 Refer
-[here](https://singa.readthedocs.io/en/latest/docs/autograd.html#module-singa.autograd)
+[here](https://singa.readthedocs.io/en/latest/autograd.html#module-singa.autograd)
 for more details of Python API.
diff --git a/docs-site/docs/dist-train.md b/docs-site/docs/dist-train.md
index b6cc0da..f242ffb 100644
--- a/docs-site/docs/dist-train.md
+++ b/docs-site/docs/dist-train.md
@@ -347,8 +347,8 @@
 example code on top of this page. We could just copy the code for the options
 and use it in other models.
 
-With the defined options, we can put the arguments `dist_option` and `spars` when we 
-start the training with `model(tx, ty, dist_option, spars)`
+With the defined options, we can pass the arguments `dist_option` and `spars`
+when we start the training with `model(tx, ty, dist_option, spars)`.
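+
+As a minimal sketch (the option values below are illustrative assumptions; see
+the rest of this page for the options SINGA actually supports), such a training
+step could look like:
+
+```python
+# assumed setup: model, tx and ty have been prepared as in the example code above
+dist_option = 'plain'  # hypothetical choice: no gradient compression or sparsification
+spars = None           # no sparsification parameter needed for this option
+out, loss = model(tx, ty, dist_option, spars)
+```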
 
 ### No Optimizations
 
diff --git a/docs-site/docs/examples.md b/docs-site/docs/examples.md
index c143563..53255f6 100644
--- a/docs-site/docs/examples.md
+++ b/docs-site/docs/examples.md
@@ -52,15 +52,15 @@
 
 ## Text Classification
 
-| Model       | Dataset     | Links      |
-| ----------- | ----------- | ---------- |
-| Simple LSTM | IMDB        | [python]() |
+| Model       | Dataset | Links      |
+| ----------- | ------- | ---------- |
+| Simple LSTM | IMDB    | [python]() |
 
 ## Text Ranking
-| Model       | Dataset     | Links      |
-| ----------- | ----------- | ---------- |
-| BiLSTM      | InsuranceQA | [python]() |
 
+| Model  | Dataset     | Links      |
+| ------ | ----------- | ---------- |
+| BiLSTM | InsuranceQA | [python]() |
 
 ## Misc.
 
diff --git a/docs-site/docs/graph.md b/docs-site/docs/graph.md
index e4dc7ba..bf6bac0 100644
--- a/docs-site/docs/graph.md
+++ b/docs-site/docs/graph.md
@@ -1,6 +1,6 @@
 ---
 id: graph
-title: Computational Graph
+title: Model
 ---
 
 <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License. -->
@@ -13,8 +13,17 @@
 speed and memory optimization can be conducted by scheduling the execution of
 the operations and memory allocation/release intelligently. In SINGA, users only
 need to define the neural network model using the
-[Model](https://github.com/apache/singa/blob/master/python/singa/model.py)
-API. The graph is constructed and optimized at the C++ backend automatically.
+[Model](https://github.com/apache/singa/blob/master/python/singa/model.py) API.
+The graph is constructed and optimized at the C++ backend automatically.
+
+In this way, users implement a network using the
+[Model](./graph) API following the imperative programming style, like PyTorch.
+Unlike PyTorch, which recreates the operations in every iteration, SINGA
+buffers the operations to create a computational graph implicitly (when this
+feature is enabled) after the first iteration. Therefore, SINGA has a
+computational graph similar to the one created by libraries using declarative
+programming, e.g., TensorFlow, and consequently it can enjoy the optimizations
+done over the graph.
 
 ## Example
 
@@ -127,9 +136,8 @@
         ...
 ```
 
-The `Linear` layer is composed of the `mutmul` operator.
-`autograd` implements the `matmul` operator by calling the function `Mult`
-exposed from CPP via SWIG.
+The `Linear` layer is composed of the `matmul` operator. `autograd` implements
+the `matmul` operator by calling the function `Mult` exposed from CPP via SWIG.
 
 ```python
 # implementation of matmul()
@@ -419,7 +427,7 @@
 ### Multi processes
 
 - Experiment settings
-  - Model
+  - API
     - using Layer: ResNet50 in
       [resnet_dist.py](https://github.com/apache/singa/blob/master/examples/cnn/autograd/resnet_dist.py)
     - using Model: ResNet50 in
diff --git a/docs-site/docs/installation.md b/docs-site/docs/installation.md
index b7a21cd..72b7d3f 100644
--- a/docs-site/docs/installation.md
+++ b/docs-site/docs/installation.md
@@ -5,7 +5,7 @@
 
 <!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
 
-## From Conda
+## Using Conda
 
 Conda is a package manager for Python, CPP and other packages.
 
@@ -15,14 +15,14 @@
 SINGA.
 
 1. CPU only
-   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Ntkhi-Z6XTR8WYPXiLwujHd2dOm0772V)
+   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Ntkhi-Z6XTR8WYPXiLwujHd2dOm0772V?usp=sharing)
 
 ```shell
 $ conda install -c nusdbsystem -c conda-forge singa-cpu
 ```
 
 2. GPU with CUDA and cuDNN (CUDA driver >=384.81 is required)
-   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1do_TLJe18IthLOnBOsHCEe-FFPGk1sPJ)
+   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1do_TLJe18IthLOnBOsHCEe-FFPGk1sPJ?usp=sharing)
 
 ```shell
 $ conda install -c nusdbsystem -c conda-forge singa-gpu
@@ -56,6 +56,45 @@
 
 then SINGA is installed successfully.
 
+## Using Pip
+
+1. CPU only
+   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/17RA056Brwk0vBQTFaZ-l9EbqwADO0NA9?usp=sharing)
+
+```bash
+pip install singa -f http://singa.apache.org/docs/next/wheel-cpu.html --trusted-host singa.apache.org
+```
+
+You can install a specific version of SINGA via `singa==<version>`, where
+`<version>` should be replaced with the desired version, e.g., `3.1.0`. The
+available SINGA versions are listed at the link.
+
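+For example, a command like the following (using the `3.1.0` version mentioned
+above as an illustration) would pin the SINGA version:
+
+```bash
+pip install singa==3.1.0 -f http://singa.apache.org/docs/next/wheel-cpu.html --trusted-host singa.apache.org
+```
+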
+To install the latest develop version, replace the link with
+http://singa.apache.org/docs/next/wheel-cpu-dev.html
+
+2. GPU with CUDA and cuDNN
+   [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1W30IPCqj5fG8ADAQsFqclaCLyIclVcJL?usp=sharing)
+
+```bash
+pip install singa -f http://singa.apache.org/docs/next/wheel-cuda.html --trusted-host singa.apache.org
+```
+
+You can also specify both the SINGA version and the CUDA version, e.g.,
+`singa==3.1.0+cuda10.2`. The available combinations of SINGA and CUDA versions
+are listed at the link.
+
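+For example, assuming the `3.1.0+cuda10.2` combination mentioned above is one
+of the listed combinations for your setup:
+
+```bash
+pip install singa==3.1.0+cuda10.2 -f http://singa.apache.org/docs/next/wheel-cuda.html --trusted-host singa.apache.org
+```
+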
+To install the latest develop version, replace the link with
+http://singa.apache.org/docs/next/wheel-cuda-dev.html
+
+Note: the Python version of your local Python environment will be used to find
+the corresponding wheel package. For example, if your local Python is 3.6, then
+the wheel package compiled on Python 3.6 will be selected by pip and installed.
+In fact, the wheel file's name includes the SINGA, CUDA and Python versions.
+Therefore, `pip` knows which wheel file to download and install.
+
+Refer to the comments at the top of the `setup.py` file for how to build the
+wheel packages.
+
 ## Using Docker
 
 Install Docker on your local host machine following the
diff --git a/docs-site/docs/optimizer.md b/docs-site/docs/optimizer.md
index 1347e84..4949471 100644
--- a/docs-site/docs/optimizer.md
+++ b/docs-site/docs/optimizer.md
@@ -5,7 +5,11 @@
 
 <!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
 
-SINGA supports various popular optimizers including stochastic gradient descent with momentum, Adam, RMSProp, and AdaGrad, etc. For each of the optimizer, it supports to use a decay schedular to schedule the learning rate to be applied in different epochs. The optimizers and the decay schedulers are included in `singa/opt.py`.
+SINGA supports various popular optimizers including stochastic gradient descent
+with momentum, Adam, RMSProp, and AdaGrad. For each optimizer, a decay scheduler
+can be used to schedule the learning rate to be applied in different epochs. The
+optimizers and the decay schedulers are included in `singa/opt.py`.
 
 ## Create an optimizer
 
@@ -16,7 +20,7 @@
 lr = 0.001
 # define hyperparameter momentum
 momentum = 0.9
-# define hyperparameter weight decay 
+# define hyperparameter weight decay
 weight_decay = 0.0001
 
 from singa import opt
@@ -30,9 +34,9 @@
 lr = 0.001
 # define hyperparameter rho
 rho = 0.9
-# define hyperparameter epsilon 
+# define hyperparameter epsilon
 epsilon = 1e-8
-# define hyperparameter weight decay 
+# define hyperparameter weight decay
 weight_decay = 0.0001
 
 from singa import opt
@@ -44,9 +48,9 @@
 ```python
 # define hyperparameter learning rate
 lr = 0.001
-# define hyperparameter epsilon 
+# define hyperparameter epsilon
 epsilon = 1e-8
-# define hyperparameter weight decay 
+# define hyperparameter weight decay
 weight_decay = 0.0001
 
 from singa import opt
@@ -58,13 +62,13 @@
 ```python
 # define hyperparameter learning rate
 lr = 0.001
-# define hyperparameter beta 1 
+# define hyperparameter beta 1
 beta_1= 0.9
-# define hyperparameter beta 2 
+# define hyperparameter beta 2
 beta_2= 0.999
-# define hyperparameter epsilon 
+# define hyperparameter epsilon
 epsilon = 1e-8
-# define hyperparameter weight decay 
+# define hyperparameter weight decay
 weight_decay = 0.0001
 
 from singa import opt
@@ -104,7 +108,8 @@
 model.set_optimizer(sgd)
 ```
 
-Then, when we call the model, it runs the `train_one_batch` method that utilizes the optimizer.
+Then, when we call the model, it runs the `train_one_batch` method that utilizes
+the optimizer.
 
 Hence, an example of an iterative loop to optimize the model is:
 
@@ -119,4 +124,4 @@
 
     # Training with one batch
     out, loss = model(tx, ty)
-```
\ No newline at end of file
+```
diff --git a/docs-site/docs/software-stack.md b/docs-site/docs/software-stack.md
index c4244f0..05d8208 100644
--- a/docs-site/docs/software-stack.md
+++ b/docs-site/docs/software-stack.md
@@ -12,9 +12,11 @@
 and communication components for distributed training. The Python interface
 wraps some CPP data structures and provides additional high-level classes for
 neural network training, which makes it convenient to implement complex neural
-network models. Next, we introduce the software stack in a bottom-up manner.
+network models.
 
-![SINGA V3 software stack](assets/singav3-sw.png) <br/> **Figure 1 - SINGA V3
+Next, we introduce the software stack in a bottom-up manner.
+
+![SINGA V3 software stack](assets/singav3.1-sw.png) <br/> **Figure 1 - SINGA V3
 software stack.**
 
 ## Low-level Backend
@@ -126,12 +128,20 @@
 buffered by the `Scheduler` to create a [computational graph](./graph) for
 efficiency and memory optimization.
 
-### Module
+### Model
 
-`Module` provides an easy interface to implement new network models. You just
-need to inherit `Module` and define the forward propagation of the model by
-creating and calling the layers or operators. `Module` will do autograd and
+[Model](./graph) provides an easy interface to implement new network models. You
+just need to inherit `Model` and define the forward propagation of the model by
+creating and calling the layers or operators. `Model` will do autograd and
 update the parameters via `Opt` automatically when training data is fed into it.
+With the `Model` API, SINGA enjoys the advantages of imperative programming and
+declarative programming. Users implement a network using the [Model](./graph)
+API following the imperative programming style, like PyTorch. Unlike PyTorch,
+which recreates the operations in every iteration, SINGA buffers the
+operations to create a computational graph implicitly (when this feature is
+enabled) after the first iteration. The graph is similar to that created by
+libraries using declarative programming, e.g., TensorFlow. Therefore, SINGA can
+apply the memory and speed optimization techniques over the computational graph.
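+
+As an illustration, a minimal sketch of such a model (the layer choices and the
+`self.optimizer(loss)` call below are assumptions for this example, not the
+only way to write `train_one_batch`) might look like:
+
+```python
+from singa import layer, model
+
+
+class MLP(model.Model):
+    def __init__(self, num_classes=10):
+        super(MLP, self).__init__()
+        # create layers in the constructor
+        self.linear = layer.Linear(num_classes)
+        self.softmax_cross_entropy = layer.SoftMaxCrossEntropy()
+
+    def forward(self, x):
+        # define the forward propagation by calling the layers
+        return self.linear(x)
+
+    def train_one_batch(self, x, y):
+        out = self.forward(x)
+        loss = self.softmax_cross_entropy(out, y)
+        # autograd + parameter update via Opt (assumed to be attached beforehand
+        # with set_optimizer)
+        self.optimizer(loss)
+        return out, loss
+```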
 
 ### ONNX
 
diff --git a/docs-site/docs/team-list.md b/docs-site/docs/team-list.md
index 680cb5b..41bd786 100644
--- a/docs-site/docs/team-list.md
+++ b/docs-site/docs/team-list.md
@@ -46,14 +46,14 @@
 
 ## Contributors
 
-| Name               | Email                        | Organization                                  |
-| ------------------ | ---------------------------- | --------------------------------------------- |
-| Haibo Chen         | hzchenhaibo@corp.netease.com | NetEase                                       |
-| Shicheng Chen      | chengsc@comp.nus.edu.sg      | National University of Singapore              |
-| Xin Ji             | vincent.j.xin@gmail.com      | Visenze, Singapore                            |
-| Anthony K. H. Tung | atung@comp.nus.edu.sg        | National University of Singapore              |
-| Ji Wang            | wangji@mzhtechnologies.com   | Hangzhou MZH Technologies                     |
-| Yuan Wang          | wangyuan@corp.netease.com    | NetEase                                       |
-| Wenfeng Wu         | dcswuw@gmail.com             | Freelancer, China                             |
-| Kaiyuan Yang       | yangky@comp.nus.edu.sg       | National University of Singapore              |
-| Chang Yao          | yaochang2009@gmail.com       | Hangzhou MZH Technologies                     |
+| Name               | Email                        | Organization                     |
+| ------------------ | ---------------------------- | -------------------------------- |
+| Haibo Chen         | hzchenhaibo@corp.netease.com | NetEase                          |
+| Shicheng Chen      | chengsc@comp.nus.edu.sg      | National University of Singapore |
+| Xin Ji             | vincent.j.xin@gmail.com      | Visenze, Singapore               |
+| Anthony K. H. Tung | atung@comp.nus.edu.sg        | National University of Singapore |
+| Ji Wang            | wangji@mzhtechnologies.com   | Hangzhou MZH Technologies        |
+| Yuan Wang          | wangyuan@corp.netease.com    | NetEase                          |
+| Wenfeng Wu         | dcswuw@gmail.com             | Freelancer, China                |
+| Kaiyuan Yang       | yangky@comp.nus.edu.sg       | National University of Singapore |
+| Chang Yao          | yaochang2009@gmail.com       | Hangzhou MZH Technologies        |
diff --git a/docs-site/docs/tensor.md b/docs-site/docs/tensor.md
index 6e983b5..aad5366 100644
--- a/docs-site/docs/tensor.md
+++ b/docs-site/docs/tensor.md
@@ -46,7 +46,8 @@
 ```
 
 `tensor` transformation up to 6 dims
-``` python
+
+```python
 >>> a = tensor.random((2,3,4,5,6,7))
 >>> a.shape
 (2, 3, 4, 5, 6, 7)
@@ -68,7 +69,8 @@
 ```
 
 `tensor` broadcasting arithmetic:
-``` python
+
+```python
 >>> a
 [[1. 2. 3.]
  [4. 5. 6.]]
@@ -90,7 +92,8 @@
 ```
 
 `tensor` broadcasting on matrix multiplication (GEMM)
-``` python
+
+```python
 >>> from singa import tensor
 >>> a = tensor.random((2,2,2,3))
 >>> b = tensor.random((2,3,4))
@@ -126,13 +129,12 @@
 >>> x.to_device(device.get_default_device())
 ```
 
-
 ### use Tensor to train MLP
 
 ```python
 
 """
-  code snipet from examples/mlp/module.py 
+  code snippet from examples/mlp/module.py
 """
 
 label = get_label()
@@ -162,8 +164,8 @@
 
 Output:
 
-``` bash
-$ python3 examples/mlp/module.py 
+```bash
+$ python3 examples/mlp/module.py
 training loss =  0.6158037
 training loss =  0.52852553
 training loss =  0.4571422
diff --git a/docs-site/docs/time-profiling.md b/docs-site/docs/time-profiling.md
index 8f38da0..8f9fda1 100644
--- a/docs-site/docs/time-profiling.md
+++ b/docs-site/docs/time-profiling.md
@@ -5,11 +5,19 @@
 
 <!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
 
-SINGA supports the time profiling of each of the operators buffered in the graph. To utilize the time profiling function, we first call the ```device.SetVerbosity``` method to set the verbosity of the time profilier, and then call the ```device.PrintTimeProfiling``` to print out the results of time profiling.
+SINGA supports the time profiling of each of the operators buffered in the
+graph. To utilize the time profiling function, we first call the
+`device.SetVerbosity` method to set the verbosity of the time profiler, and
+then call `device.PrintTimeProfiling` to print out the results of the time
+profiling.
 
 ## Setup the Time Profiling Verbosity
 
-To use the time profiling function, we need to set the verbosity. There are three levels of verbosity. With the default value ```verbosity == 0```, it will not do any time profiling. When we set ```verbosity == 1```, it will profile the forward and backward propagation time. When ```verbosity == 2 ```, it will profile the time spent on every buffered operation in the graph.
+To use the time profiling function, we need to set the verbosity. There are
+three levels of verbosity. With the default value `verbosity == 0`, it will not
+do any time profiling. When we set `verbosity == 1`, it will profile the forward
+and backward propagation time. When `verbosity == 2`, it will profile the time
+spent on every buffered operation in the graph.
 
 The following is the example code to setup the time profiling function:
 
@@ -18,13 +26,15 @@
 from singa import device
 dev = device.create_cuda_gpu()
 # set the verbosity
-verbosity = 2 
+verbosity = 2
 dev.SetVerbosity(verbosity)
 # optional: skip the first 5 iterations when profiling the time
 dev.SetSkipIteration(5)
 ```
 
-Then, after we have completed the training at the end of the program, we can print the time profiling result by calling the ```device.PrintTimeProfiling``` method:
+Then, after we have completed the training at the end of the program, we can
+print the time profiling result by calling the `device.PrintTimeProfiling`
+method:
 
 ```python
 dev.PrintTimeProfiling()
@@ -32,9 +42,11 @@
 
 ## Example Outputs for Different Verbosity
 
-We can run the ResNet [example](https://github.com/apache/singa/blob/master/examples/cnn/benchmark.py) to see the output with different setting of verbosity:
+We can run the ResNet
+[example](https://github.com/apache/singa/blob/master/examples/cnn/benchmark.py)
+to see the output with different settings of verbosity:
 
-1. ```verbosity == 1```
+1. `verbosity == 1`
 
 ```
 Time Profiling:
@@ -42,7 +54,7 @@
 Backward Propagation Time : 0.114813 sec
 ```
 
-2. ```verbosity == 2```
+2. `verbosity == 2`
 
 ```
 Time Profiling:
@@ -150,4 +162,4 @@
 .
 .
 .
-```
\ No newline at end of file
+```
diff --git a/docs-site/docs/wheel-cpu-dev.md b/docs-site/docs/wheel-cpu-dev.md
new file mode 100644
index 0000000..9f8f7af
--- /dev/null
+++ b/docs-site/docs/wheel-cpu-dev.md
@@ -0,0 +1,12 @@
+---
+id: wheel-cpu-dev
+title: CPU only Wheel Packages (develop version)
+---
+
+<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
+
+## 3.0.0.dev200720
+
+- [Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720-cp36-cp36m-manylinux2014_x86_64.whl)
+- [Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720-cp37-cp37m-manylinux2014_x86_64.whl)
+- [Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720-cp38-cp38-manylinux2014_x86_64.whl)
diff --git a/docs-site/docs/wheel-cpu.md b/docs-site/docs/wheel-cpu.md
new file mode 100644
index 0000000..4f0b921
--- /dev/null
+++ b/docs-site/docs/wheel-cpu.md
@@ -0,0 +1,18 @@
+---
+id: wheel-cpu
+title: CPU only Wheel Packages
+---
+
+<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
+
+## 3.1.0.RC1
+
+- [Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1-cp36-cp36m-manylinux2014_x86_64.whl)
+- [Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1-cp37-cp37m-manylinux2014_x86_64.whl)
+- [Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1-cp38-cp38-manylinux2014_x86_64.whl)
+
+## 3.0.0
+
+- [Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0-cp36-cp36m-manylinux2014_x86_64.whl)
+- [Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0-cp37-cp37m-manylinux2014_x86_64.whl)
+- [Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0-cp38-cp38-manylinux2014_x86_64.whl)
diff --git a/docs-site/docs/wheel-gpu-dev.md b/docs-site/docs/wheel-gpu-dev.md
new file mode 100644
index 0000000..3c061e9
--- /dev/null
+++ b/docs-site/docs/wheel-gpu-dev.md
@@ -0,0 +1,12 @@
+---
+id: wheel-gpu-dev
+title: Wheel Packages with CUDA enabled (develop version)
+---
+
+<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
+
+## 3.0.0.dev200720
+
+- [CUDA10.2, cuDNN 7.6.5, Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720%2Bcuda10.2-cp36-cp36m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720%2Bcuda10.2-cp37-cp37m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0.dev200720%2Bcuda10.2-cp38-cp38-manylinux2014_x86_64.whl)
diff --git a/docs-site/docs/wheel-gpu.md b/docs-site/docs/wheel-gpu.md
new file mode 100644
index 0000000..c64e4f8
--- /dev/null
+++ b/docs-site/docs/wheel-gpu.md
@@ -0,0 +1,21 @@
+---
+id: wheel-gpu
+title: Wheel Packages with CUDA Enabled
+---
+
+<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements.  See the NOTICE file distributed with this work for additional information regarding copyright ownership.  The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.  You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.  -->
+
+## 3.1.0.RC1
+
+- [CUDA10.2, cuDNN 7.6.5, Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1%2Bcuda10.2-cp36-cp36m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1%2Bcuda10.2-cp37-cp37m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.1.0rc1%2Bcuda10.2-cp38-cp38-manylinux2014_x86_64.whl)
+
+## 3.0.0
+
+- [CUDA10.2, cuDNN 7.6.5, Python 3.6](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0%2Bcuda10.2-cp36-cp36m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.7](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0%2Bcuda10.2-cp37-cp37m-manylinux2014_x86_64.whl)
+- [CUDA10.2, cuDNN 7.6.5, Python 3.8](https://singa-wheel.s3-ap-southeast-1.amazonaws.com/singa-3.0.0%2Bcuda10.2-cp38-cp38-manylinux2014_x86_64.whl)
diff --git a/docs-site/website/pages/en/index.js b/docs-site/website/pages/en/index.js
index b26dfef..7aad38e 100644
--- a/docs-site/website/pages/en/index.js
+++ b/docs-site/website/pages/en/index.js
@@ -143,7 +143,7 @@
                 {
                   content: `SINGA has a simple [software stack and Python interface](./docs/software-stack) to improve usability.`,
                   imageAlign: "left",
-                  image: `${siteConfig.baseUrl}img/singav3-sw.png`,
+                  image: `${siteConfig.baseUrl}img/singav3.1-sw.png`,
                   imageAlt: "Usability",
                   title: "Usability",
                 },
diff --git a/docs-site/website/static/img/singav3.1-sw.png b/docs-site/website/static/img/singav3.1-sw.png
new file mode 100644
index 0000000..2bfd2c0
--- /dev/null
+++ b/docs-site/website/static/img/singav3.1-sw.png
Binary files differ