Update onnx.md
diff --git a/docs-site/docs/onnx.md b/docs-site/docs/onnx.md
index 877bf9a..25ffe0b 100644
--- a/docs-site/docs/onnx.md
+++ b/docs-site/docs/onnx.md
@@ -199,7 +199,7 @@
| <b>[MobileNet](https://github.com/onnx/models/tree/master/vision/classification/mobilenet)</b> | [Sandler et al.](https://arxiv.org/abs/1801.04381) | Light-weight deep neural network best suited for mobile and embedded vision applications. <br>Top-5 error from paper - ~10% | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1HsixqJMIpKyEPhkbB8jy7NwNEFEAUWAf) |
| <b>[ResNet18](https://github.com/onnx/models/tree/master/vision/classification/resnet)</b> | [He et al.](https://arxiv.org/abs/1512.03385) | A CNN model (up to 152 layers). Uses shortcut connections to achieve higher accuracy when classifying images. <br> Top-5 error from paper - ~3.6% | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1u1RYefSsVbiP4I-5wiBKHjsT9L0FxLm9) |
| <b>[VGG16](https://github.com/onnx/models/tree/master/vision/classification/vgg)</b> | [Simonyan et al.](https://arxiv.org/abs/1409.1556) | Deep CNN model(up to 19 layers). Similar to AlexNet but uses multiple smaller kernel-sized filters that provides more accuracy when classifying images. <br>Top-5 error from paper - ~8% | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/14kxgRKtbjPCKKsDJVNi3AvTev81Gp_Ds) |
-| <b>[ShuffleNet_V2](https://github.com/onnx/models/tree/master/vision/classification/shufflenet)</b> | [Simonyan et al.](https://arxiv.org/pdf/1707.01083.pdf) | Extremely computation efficient CNN model that is designed specifically for mobile devices. This network architecture design considers direct metric such as speed, instead of indirect metric like FLOP. Top-1 error from paper - ~30.6% | [![Open In Colab](https://colab.research.google.com/drive/19HfRu3YHP_H2z3BcZujVFRp23_J5XsuA?usp=sharing) |
+| <b>[ShuffleNet_V2](https://github.com/onnx/models/tree/master/vision/classification/shufflenet)</b> | [Simonyan et al.](https://arxiv.org/pdf/1707.01083.pdf) | Extremely computation efficient CNN model that is designed specifically for mobile devices. This network architecture design considers direct metric such as speed, instead of indirect metric like FLOP. Top-1 error from paper - ~30.6% | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/19HfRu3YHP_H2z3BcZujVFRp23_J5XsuA?usp=sharing) |
We also give some re-training examples by using VGG and ResNet, please check
`examples/onnx/training`.
@@ -232,7 +232,7 @@
| ----------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <b>[BERT-Squad](https://github.com/onnx/models/tree/master/text/machine_comprehension/bert-squad)</b> | [Devlin et al.](https://arxiv.org/pdf/1810.04805.pdf) | This model answers questions based on the context of the given input paragraph. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kud-lUPjS_u-TkDAzihBTw0Vqr0FjCE-) |
| <b>[RoBERTa](https://github.com/onnx/models/tree/master/text/machine_comprehension/roberta)</b> | [Devlin et al.](https://arxiv.org/pdf/1907.11692.pdf) | A large transformer-based model that predicts sentiment based on given input text. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1F-c4LJSx3Cb2jW6tP7f8nAZDigyLH6iN?usp=sharing) |
-| <b>[GPT-2](https://github.com/onnx/models/tree/master/text/machine_comprehension/gpt-2)</b> | [Devlin et al.](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) | A large transformer-based language model that given a sequence of words within some text, predicts the next word. | [![Open In Colab](https://colab.research.google.com/drive/1ZlXLSIMppPch6HgzKRillJiUcWn3PiK7?usp=sharing) |
+| <b>[GPT-2](https://github.com/onnx/models/tree/master/text/machine_comprehension/gpt-2)</b> | [Devlin et al.](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) | A large transformer-based language model that given a sequence of words within some text, predicts the next word. | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ZlXLSIMppPch6HgzKRillJiUcWn3PiK7) |
## Supported operators