commit    5b1350f2f73e5981b658a6b56cc8e5bde1192e91
author    Siyuan Feng <Hzfengsy@vip.qq.com>  Thu Oct 24 12:04:37 2019 -0700
committer Leyuan Wang <laurawly@gmail.com>   Thu Oct 24 12:04:37 2019 -0700
tree      759203631808baf11b4ed8d463c3c0c8dc91a8ea
parent    6e0dbeed3209a43ab621fdb9eea0687e3b44611d
TensorCore Support using Intrinsic (#4136)

* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic