commit | effa5d79930b1103c36d8cc53618a6dce1ba3760 | |
---|---|---|
author | Ivan Sidorenko <98739392+ibsidorenko@users.noreply.github.com> | Fri May 03 23:32:15 2024 +0300 |
committer | GitHub <noreply@github.com> | Sat May 04 05:32:15 2024 +0900 |
tree | 04fda2da4cda3a1a4d57b7b9af44cef97842b74b | |
parent | 20d769617fa6ab561d7ed2b7cd61ed2b6b4710ba | |
[CUBLAS] Enable offloading of R.matmul + R.dequantize (#16896)

This commit enables offloading of R.matmul + R.dequantize to the cuBLAS codegen. The dequantization scale is passed to the runtime function and used as the alpha parameter. If there is no dequantization, alpha == 1.0.
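The idea behind folding the scale into alpha can be illustrated with a minimal pure-Python sketch. It is not the TVM or cuBLAS implementation: `gemm` and `dequantize` are hypothetical helpers, and the sketch assumes a single scalar scale and a zero point of 0, so that dequantization reduces to multiplying by the scale.

```python
def gemm(a, b, alpha=1.0):
    """Return alpha * (a @ b) for matrices stored as lists of rows.

    Hypothetical stand-in for a GEMM call that takes an alpha scaling factor.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def dequantize(x, scale):
    """Multiply every element by scale (assumes a zero point of 0)."""
    return [[scale * v for v in row] for row in x]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
scale = 0.5

# Unfused: matmul with alpha == 1.0, then a separate dequantize step.
unfused = dequantize(gemm(a, b), scale)

# Fused: the dequantization scale is passed as alpha to a single GEMM call.
fused = gemm(a, b, alpha=scale)

print(fused == unfused)
```

Under these assumptions both paths compute the same values, which is why a matmul followed by a dequantize can be served by one GEMM call.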
Documentation | Contributors | Community | Release Notes
Apache TVM is a compiler stack for deep learning systems. It is designed to close the gap between productivity-focused deep learning frameworks and performance- and efficiency-focused hardware backends. TVM works with deep learning frameworks to provide end-to-end compilation to different backends.
TVM is licensed under the Apache-2.0 license.
Check out the TVM Documentation site for installation instructions, tutorials, examples, and more. The Getting Started with TVM tutorial is a great place to start.
TVM adopts the Apache committer model. We aim to create an open source project that is maintained and owned by the community. Check out the Contributor Guide.
We learned a lot from the following projects when building TVM.