Log - HEAD - tvm - Git at Google

fffd168 [Unity][BYOC] Use arith.Analyzer to check batch equality of matmul in cublas (#16982) by Rick Zhou · 17 hours ago (main)
4c1ebcf [Relax] Implement relax.op.view (#16955) by Eric Lunderberg · 22 hours ago
c0a47ed [CUBLAS][FP8] Enable R.matmul + R.multiply offloading (#16974) by Ivan Sidorenko · 2 days ago (nightly)
02c4c55 [SVE] Add codegen support for `vscale_range()` function attribute (#16962) by Andrei Hutu · 2 days ago
819b002 [Relax] Support nested ModuleList in nn.Module (#16971) by Wuwei Lin · 3 days ago
28d32b5 [TIR] Support narrow dtype for let binding (#16947) by Siyuan Feng · 4 days ago
876f528 [LLVM] Stringref API deprecation fixes (#16968) by Anirudh Sundar Subramaniam · 4 days ago
9cfebca [TVMScript] Fix error reporting inside Macro func (#16967) by Siyuan Feng · 5 days ago
59ef0ee [Bugfix][ONNX] Improve broadcast and batch_matmul conversion (#16961) by XinhuaHamiMelon · 5 days ago
944d180 [SVE] Add get_active_lane_mask builtin (#16965) by Luke Hutton · 6 days ago
effa5d7 [CUBLAS] Enable offloading of R.matmul + R.dequantize (#16896) by Ivan Sidorenko · 6 days ago
20d7696 [Relax] Express dynamic arguments of strided_slice as arguments (#16826) by Eric Lunderberg · 9 days ago
a320b63 [Unity][Cutlass] Fix C source generation of dense operation (#16476) by Jinbae Park · 9 days ago
6252fa5 [TIR] Enhance CLZ intrinsic support (#16952) by Siyuan Feng · 10 days ago
bc8742b [Misc] Add script for testing release package (#16956) by ysh329 · 10 days ago
c8deb7f Overriding the StructuralEqual() for easy usage (#16908) by sdalvi-quic · 10 days ago
114ad70 [TOPI] Revert unification of conv2d NHWC hybrid scheduling for `arm_cpu` targets (#16951) by Andrei Hutu · 11 days ago
b4a69de Enable gemv schedule for adreno (#16932) by krishnaraj36 · 11 days ago
c0385c7 [Runtime] Allow offset to be specified in NDArray::CreateView (#16938) by Eric Lunderberg · 11 days ago
dd09c85 [CI] Update image tag to 20240428-060115-0b09ed018 (#16948) by Yong Wu · 11 days ago
2d7663c [CI] Use LLVM17 for tests on `ci_cpu` (#16931) by Luke Hutton · 11 days ago
e10cdc5 [tir][Compute-at] Make compute-ated block simple when the predicate could be merged (#16945) by wrongtest · 11 days ago
b00fc55 [CI] Enable Conda setup v3 (#16942) by Tianqi Chen · 11 days ago
081c23b [Relax] Allow PrimValue as index in relax.op.take (#16940) by Eric Lunderberg · 12 days ago
b54f57a [TFLite] Add support for GELU conversion (#16936) by Luke Hutton · 12 days ago
0b09ed0 [3rdparty] Bump FlashInfer for sampling functions (#16935) by Ruihang Lai · 12 days ago
63e0a0f [Thrust] Increase static workspace size (#16937) by Ruihang Lai · 12 days ago
3ff3daa [CI] Upgrade CUDA to 12.4 (#16939) by Yong Wu · 12 days ago
1453893 [CLML] Fix in clml pattern check condition (#16933) by krishnaraj36 · 13 days ago
97ff7cc [VM][OPENCL] Take advantage of OpenCL host ptr for improved copy (#16929) by Siva · 13 days ago
278a6af [Relax][TIR] Introduce new `cumsum` op for gpu (#16934) by Siyuan Feng · 14 days ago
5bd1047 [SCRIPT][ADRENO] Fix in build config for adreno (#16927) by krishnaraj36 · 14 days ago
51cfb70 [Fix][Dlight] Fix GeneralReduction for log-sum-exp (#16923) by Ruihang Lai · 2 weeks ago
39f2482 [Fix] Fix SSA conversion for SizeVar retention (#16924) by Ruihang Lai · 2 weeks ago
4f8c03f [TVMScript] Support `T.launch_thread` with i64 dtype (#16916) by Siyuan Feng · 2 weeks ago
5cf4ca6 [Marvell BYOC]: Marvell AI Accelerator Integration - Phase 2 (#16915) by Krishna Bindumadhavan · 2 weeks ago
2f395f1 [SVE][TOPI] Add conv2d NHWC hybrid SVE schedule for `arm_cpu` (#16899) by Andrei Hutu · 2 weeks ago
11f2253 Restore "pytest.mark.gpu" for RELAX tests (#16741) by apeskov · 2 weeks ago
342f472 [Disco] Improve error message for CallPacked (#16919) by Wuwei Lin · 2 weeks ago
b0143d1 [CMAKE] Make LOG_BEFORE_THROW explicit (#16914) by Tianqi Chen · 3 weeks ago
29534b7 [SVE] Check for SVE target in VectorizeLoop (#16893) by Elen Kalda · 3 weeks ago
57316da [Web] Support string[] in setPackedFunc() and exceptionally long arrays (#16910) by Charlie Ruan · 3 weeks ago
6b77cba [Misc] Enhance Release Note Script and Remove Useless File (#16913) by ysh329 · 3 weeks ago
a2511cc [QoL][Relax] Use SeqExpr in IR types when SeqExpr is required (#16859) by Eric Lunderberg · 3 weeks ago
2978427 [Relax] Prevent to generate duplicate func in dispatch_sort_scan (#16904) by Siyuan Feng · 3 weeks ago
6afbc12 [Bugfix][Relax] Raise exception for OOM allocation (#16905) by Eric Lunderberg · 3 weeks ago
36efa36 [Upd] Fixed lld search in rocm (#16907) by Shrey Gupta · 3 weeks ago
622bd15 [Relax] Handle binary operations between Tensor and PrimValue (#16827) by Eric Lunderberg · 3 weeks ago
fe52709 [CMAKE] Misc improvment of Util (#16900) by Tianqi Chen · 3 weeks ago
59376ee [Relax] Allow specifying entry_funcs for BYOC (#16902) by Wuwei Lin · 3 weeks ago
7dc0472 [Bugfix] CudaDeviceAPI::GetAttr may check kExist when GPUs absent (#16903) by Eric Lunderberg · 3 weeks ago
de91c5c [Bugfix] rocm shared memory issue on MI250 (#16901) by Lesheng Jin · 3 weeks ago
da56c89 [Dlight] Enhance vectorization for gpu matmul (#16894) by Wuwei Lin · 3 weeks ago
b3ffd97 [BYOC] Add layout check and update shape check for cublas FP8 BYOC (#16895) by Wuwei Lin · 3 weeks ago
857fe61 [Target] Don't register AArch64 target tags without LLVM compiler support (#16897) by Luke Hutton · 3 weeks ago
d030ce2 [TVMScript] Optionally use `ruff format` instead of `black` (#16876) by Eric Lunderberg · 3 weeks ago
460f6f1 [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860) by Eric Lunderberg · 3 weeks ago
94a44d7 [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861) by Eric Lunderberg · 3 weeks ago
4cb4605 [TVMScript][Bug] Add test case for missing symbolic bounds (#16877) by Eric Lunderberg · 3 weeks ago
08965f0 [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892) by Ivan Sidorenko · 3 weeks ago
3680a0d [RUNTIME][VULKAN] Support total_global_memory (#16890) by Tianqi Chen · 3 weeks ago
d1ac73c [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888) by Ivan Sidorenko · 3 weeks ago
95d6778 [dlight] Add check for matmul dtype and fix reduction rule (#16884) by Wuwei Lin · 3 weeks ago
e738f1d [Relax][Frontend] Fix sort, argsort and topk in nn module (#16886) by Siyuan Feng · 3 weeks ago
cdfdd0e [Contrib] Enable fp16 for thrust sort (#16887) by Siyuan Feng · 3 weeks ago
d4056ca [SVE] Support splitting by vscale in `tir::split` and `te::split` (#16862) by Luke Hutton · 4 weeks ago
f267691 [Relax] Stabilize relax pass mutation order (#16883) by Siyuan Feng · 4 weeks ago
a64d1f1 [TIR] Make T.reinterpret nop when dtype is the same (#16879) by Wuwei Lin · 4 weeks ago
64911ab [Runtime] Implemented Datatype.itemsize() (#16880) by Wuwei Lin · 4 weeks ago
d0cbb02 [release] Update version to 0.17.dev0 on main branch by Star Yuan · 4 weeks ago (v0.17.dev0)
6496903 [release] Update version to 0.16.0 on main branch by Star Yuan · 4 weeks ago (v0.16.0, v0.16.0.rc0)
5c80691 [Dlight] Enhance vectorization loading weight for gemv (#16878) by Wuwei Lin · 4 weeks ago
0a3fe22 [Relax] Enhance symbolic expr estimation in memory planning (#16872) by Ruihang Lai · 4 weeks ago
3f09e7f [Thrust] Fix thrust workspace allocation (#16873) by Wuwei Lin · 4 weeks ago
88a1c65 [3rdparty] Bump flashinfer (#16868) by Wuwei Lin · 4 weeks ago
0aae97d [PageKV] allow PopN to pop all the tokens in last block (#16871) by ZCHNO · 4 weeks ago
4b90655 [OpenCL] Add OpenCL device for automatic target detection (#16854) by Mengshiun Yu · 4 weeks ago
c67a055 [BugFix][Target] Added null check to fix segfault at ->defined() in cpu.cc DetectSystemTriple() (#16766) by Otto Rasmussen · 4 weeks ago
f9e36fc [3rdparty] Bump FlashInfer (#16866) by Ruihang Lai · 4 weeks ago
4617efa [Relax] Dispatch sort/scan for non-cuda gpu backends (#16867) by Wuwei Lin · 4 weeks ago
6748215 [Codegen, CUDA] Add handling of fp8 broadcast / const (#16865) by Wuwei Lin · 4 weeks ago
2829b59 [TVMScript] Add parser and printer support for e4m3/e5m2 fp8 (#16864) by Wuwei Lin · 4 weeks ago
a482b4c [Picojson] Let the key of objects in json be ordered by default (#16863) by Yixin Dong · 4 weeks ago
95cb0de [VULKAN] Fix CLZ support for Vulkan (#16858) by Siyuan Feng · 4 weeks ago
4d4f050 [SVE] Support scalable vectors in LoopVectorizer (#16782) by Elen Kalda · 4 weeks ago
a309b6b [Thrust] Use pointer to tls pool to prevent creating new pool (#16856) by Wuwei Lin · 4 weeks ago
0594994 [ONNX] Fix interpreting auto_pad parameters in ConvTranspose operator (#16001) by padreofthegame · 4 weeks ago
d1e24ca [Web] Support web indexDB cache for larger model storage (#16733) by Hangrui Cao · 5 weeks ago
81a8506 [TIR] Use constructor for new PrimFunc in TransformLayout (#16832) by Eric Lunderberg · 5 weeks ago
97d7a35 Fixing probability comment (#16850) by Thais Camacho · 5 weeks ago
a7be540 [KVCache] Initialize one extra page than specified (#16849) by Ruihang Lai · 5 weeks ago
a156181 [Relax] Fix EliminiateCommonSubexpr removing alloc tensor (#16852) by Wuwei Lin · 5 weeks ago
3e802d1 [Relax,Topi] Allow passing workspace to thrust to avoid allocations (#16851) by Wuwei Lin · 5 weeks ago
9b5a7a4 [IR] Provide well-formed intermediate in ApplyPassToFunction (#16843) by Eric Lunderberg · 5 weeks ago
ee3f7bc [MSC][M5.3] Support torch.dynamo for dynamic models (#16772) by Archermmt · 5 weeks ago
b91d4e5 [TVMScript] Produce empty DictAttrs when R.func_attrs is absent (#16844) by Eric Lunderberg · 5 weeks ago
b01de08 [DLight] Fix a corner case for reduction rule (#16848) by Siyuan Feng · 5 weeks ago
ab94ca3 [CI] Disable flaky unit test (#16837) by Eric Lunderberg · 5 weeks ago
c93f0ba [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846) by Egor Churaev · 5 weeks ago
cd08356 [TIR] Fix segfaults from ordering of Let/Assert in MakePackedAPI (#16543) by Eric Lunderberg · 5 weeks ago