.. DO NOT EDIT. THIS FILE WAS AUTOMATICALLY GENERATED BY
.. TVM'S MONKEY-PATCHED VERSION OF SPHINX-GALLERY. TO MAKE
.. CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "how_to/work_with_microtvm/micro_autotune.py"
.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        This tutorial can be used interactively with Google Colab! You can also click
        :ref:`here <sphx_glr_download_how_to_work_with_microtvm_micro_autotune.py>` to run the Jupyter notebook locally.

        .. image:: https://raw.githubusercontent.com/tlc-pack/web-data/main/images/utilities/colab_button.svg
            :align: center
            :target: https://colab.research.google.com/github/apache/tvm-site/blob/asf-site/docs/_downloads/f83ba3df2d52f9b54cf141114359481a/micro_autotune.ipynb
            :width: 300px
.. rst-class:: sphx-glr-example-title
.. _sphx_glr_how_to_work_with_microtvm_micro_autotune.py:
.. _tutorial-micro-autotune:
6. Model Tuning with microTVM
=============================
**Authors**:
`Andrew Reusch <https://github.com/areusch>`_,
`Mehrdad Hessar <https://github.com/mehrdadh>`_
This tutorial explains how to autotune a model using the C runtime.
.. GENERATED FROM PYTHON SOURCE LINES 31-33
.. include:: ../../../../gallery/how_to/work_with_microtvm/install_dependencies.rst
.. GENERATED FROM PYTHON SOURCE LINES 34-42
.. code-block:: default

    # You can skip the next section (installing Zephyr) if this flag is False.
    # Installing Zephyr takes ~20 minutes.
    import os

    use_physical_hw = bool(os.getenv("TVM_MICRO_USE_HW"))
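If you do want the physical-hardware path, set the environment variable before
launching the tutorial; a minimal sketch from Python (this must run before the
flag check above, and exporting ``TVM_MICRO_USE_HW=1`` in your shell has the
same effect):

.. code-block:: default

    # Optional: any non-empty value makes bool(os.getenv("TVM_MICRO_USE_HW"))
    # evaluate to True, enabling the physical-hardware code paths below.
    os.environ["TVM_MICRO_USE_HW"] = "1"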
.. GENERATED FROM PYTHON SOURCE LINES 43-45
.. include:: ../../../../gallery/how_to/work_with_microtvm/install_zephyr.rst
.. GENERATED FROM PYTHON SOURCE LINES 49-52
Import Python dependencies
-------------------------------
.. GENERATED FROM PYTHON SOURCE LINES 52-60
.. code-block:: default

    import json
    import pathlib

    import numpy as np

    import tvm
    import tvm.micro.testing
    from tvm.relay.backend import Runtime
.. GENERATED FROM PYTHON SOURCE LINES 61-67
Defining the model
###################
To begin, define a model in Relay to be executed on-device. Then create an IRModule from the
Relay function and fill its parameters with random numbers.
.. GENERATED FROM PYTHON SOURCE LINES 67-92
.. code-block:: default

    data_shape = (1, 3, 10, 10)
    weight_shape = (6, 3, 5, 5)

    data = tvm.relay.var("data", tvm.relay.TensorType(data_shape, "float32"))
    weight = tvm.relay.var("weight", tvm.relay.TensorType(weight_shape, "float32"))

    y = tvm.relay.nn.conv2d(
        data,
        weight,
        padding=(2, 2),
        kernel_size=(5, 5),
        kernel_layout="OIHW",
        out_dtype="float32",
    )
    f = tvm.relay.Function([data, weight], y)

    relay_mod = tvm.IRModule.from_expr(f)
    relay_mod = tvm.relay.transform.InferType()(relay_mod)

    weight_sample = np.random.rand(
        weight_shape[0], weight_shape[1], weight_shape[2], weight_shape[3]
    ).astype("float32")
    params = {"weight": weight_sample}
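To sanity-check the definition, it can help to print the type-annotated module;
this optional step shows the inferred shapes and dtypes:

.. code-block:: default

    # Inspect the Relay module after type inference.
    print(relay_mod)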
.. GENERATED FROM PYTHON SOURCE LINES 93-104
Defining the target
######################
Now we define the TVM target that describes the execution environment. This looks very similar
to target definitions from other microTVM tutorials. Alongside this, we pick the C runtime to
code-generate our model against.

When running on physical hardware, choose a target and a board that describe the hardware.
There are multiple hardware targets that can be selected from the PLATFORM list in this
tutorial. You can choose the platform by passing the --platform argument when running this
tutorial.
.. GENERATED FROM PYTHON SOURCE LINES 104-118
.. code-block:: default

    RUNTIME = Runtime("crt", {"system-lib": True})
    TARGET = tvm.micro.testing.get_target("crt")

    # Compiling for physical hardware
    # --------------------------------------------------------------------------
    # When running on physical hardware, choose a TARGET and a BOARD that describe
    # the hardware. The STM32L4R5ZI Nucleo board is chosen in the example below.
    if use_physical_hw:
        BOARD = os.getenv("TVM_MICRO_BOARD", default="nucleo_l4r5zi")
        SERIAL = os.getenv("TVM_MICRO_SERIAL", default=None)
        TARGET = tvm.micro.testing.get_target("zephyr", BOARD)
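The target encodes the CPU and code-generation options for the chosen platform;
printing it is a quick, optional way to confirm what you will compile for:

.. code-block:: default

    # Confirm which target (host CRT or Zephyr board) was selected.
    print(TARGET)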
.. GENERATED FROM PYTHON SOURCE LINES 119-128
Extracting tuning tasks
########################
Not all operators in the Relay program printed above can be tuned. Some are so trivial that only
a single implementation is defined; others don't make sense as tuning tasks. Using
`extract_from_program`, you can produce a list of tunable tasks.
Because task extraction involves running the compiler, we first configure the compiler's
transformation passes; we'll apply the same configuration later on during autotuning.
.. GENERATED FROM PYTHON SOURCE LINES 128-134
.. code-block:: default

    pass_context = tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True})
    with pass_context:
        tasks = tvm.autotvm.task.extract_from_program(relay_mod["main"], {}, TARGET)
    assert len(tasks) > 0
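Each task carries its own schedule search space. As an optional sketch, you can
list the task names and the number of candidate configurations before
committing to a tuning run:

.. code-block:: default

    # Show each tunable task and the size of its configuration space.
    for task in tasks:
        print(task.name, len(task.config_space))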
.. GENERATED FROM PYTHON SOURCE LINES 135-145
Configuring microTVM
#####################
Before autotuning, we need to define a module loader and pass it to a
`tvm.autotvm.LocalBuilder`. Then we create a `tvm.autotvm.LocalRunner` and use both the builder
and the runner to generate multiple measurements for the autotuner.

In this tutorial we have the option to use the x86 host as an example or to use different
targets from Zephyr RTOS. Passing `--platform=host` to this tutorial uses x86; you can choose
other options from the `PLATFORM` list.
.. GENERATED FROM PYTHON SOURCE LINES 145-183
.. code-block:: default

    module_loader = tvm.micro.AutoTvmModuleLoader(
        template_project_dir=pathlib.Path(tvm.micro.get_microtvm_template_projects("crt")),
        project_options={"verbose": False},
    )
    builder = tvm.autotvm.LocalBuilder(
        n_parallel=1,
        build_kwargs={"build_option": {"tir.disable_vectorize": True}},
        do_fork=True,
        build_func=tvm.micro.autotvm_build_func,
        runtime=RUNTIME,
    )
    runner = tvm.autotvm.LocalRunner(number=1, repeat=1, timeout=100, module_loader=module_loader)

    measure_option = tvm.autotvm.measure_option(builder=builder, runner=runner)

    # Compiling for physical hardware
    if use_physical_hw:
        module_loader = tvm.micro.AutoTvmModuleLoader(
            template_project_dir=pathlib.Path(tvm.micro.get_microtvm_template_projects("zephyr")),
            project_options={
                "board": BOARD,
                "verbose": False,
                "project_type": "host_driven",
                "serial_number": SERIAL,
            },
        )
        builder = tvm.autotvm.LocalBuilder(
            n_parallel=1,
            build_kwargs={"build_option": {"tir.disable_vectorize": True}},
            do_fork=False,
            build_func=tvm.micro.autotvm_build_func,
            runtime=RUNTIME,
        )
        runner = tvm.autotvm.LocalRunner(number=1, repeat=1, timeout=100, module_loader=module_loader)

        measure_option = tvm.autotvm.measure_option(builder=builder, runner=runner)
.. GENERATED FROM PYTHON SOURCE LINES 184-188
Run Autotuning
#########################
Now we can run autotuning separately on each extracted task on the microTVM device.
.. GENERATED FROM PYTHON SOURCE LINES 188-206
.. code-block:: default

    autotune_log_file = pathlib.Path("microtvm_autotune.log.txt")
    if os.path.exists(autotune_log_file):
        os.remove(autotune_log_file)

    num_trials = 10
    for task in tasks:
        tuner = tvm.autotvm.tuner.GATuner(task)
        tuner.tune(
            n_trial=num_trials,
            measure_option=measure_option,
            callbacks=[
                tvm.autotvm.callback.log_to_file(str(autotune_log_file)),
                tvm.autotvm.callback.progress_bar(num_trials, si_prefix="M"),
            ],
            si_prefix="M",
        )
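The log file now holds one JSON record per measurement. A small sketch using
``tvm.autotvm.record.load_from_file`` (which yields measure input/result pairs)
finds the best successful measurement:

.. code-block:: default

    # Scan the tuning log for the fastest successful measurement.
    best_cost = None
    for inp, res in tvm.autotvm.record.load_from_file(str(autotune_log_file)):
        if res.error_no == 0:  # error_no == 0 marks a successful run
            cost = np.mean(res.costs)
            if best_cost is None or cost < best_cost:
                best_cost = cost
    print("Best measured time (s):", best_cost)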
.. GENERATED FROM PYTHON SOURCE LINES 207-213
Timing the untuned program
###########################
For comparison, let's compile and run the graph without imposing any autotuning schedules. TVM
will select a randomly-tuned implementation for each operator, which should not perform as well
as the tuned version.
.. GENERATED FROM PYTHON SOURCE LINES 213-252
.. code-block:: default

    with pass_context:
        lowered = tvm.relay.build(relay_mod, target=TARGET, runtime=RUNTIME, params=params)

    temp_dir = tvm.contrib.utils.tempdir()
    project = tvm.micro.generate_project(
        str(tvm.micro.get_microtvm_template_projects("crt")),
        lowered,
        temp_dir / "project",
        {"verbose": False},
    )

    # Compiling for physical hardware
    if use_physical_hw:
        temp_dir = tvm.contrib.utils.tempdir()
        project = tvm.micro.generate_project(
            str(tvm.micro.get_microtvm_template_projects("zephyr")),
            lowered,
            temp_dir / "project",
            {
                "board": BOARD,
                "verbose": False,
                "project_type": "host_driven",
                "serial_number": SERIAL,
                "config_main_stack_size": 4096,
            },
        )

    project.build()
    project.flash()

    with tvm.micro.Session(project.transport()) as session:
        debug_module = tvm.micro.create_local_debug_executor(
            lowered.get_graph_json(), session.get_system_lib(), session.device
        )
        debug_module.set_input(**lowered.get_params())
        print("########## Build without Autotuning ##########")
        debug_module.run()
        del debug_module
.. rst-class:: sphx-glr-script-out
.. code-block:: none

    ########## Build without Autotuning ##########
    Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
    ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  302.9     98.717   (1, 2, 10, 10, 3)  2       1        [302.9]
    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       2.983     0.972    (1, 6, 10, 10)     1       1        [2.983]
    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.954     0.311    (1, 1, 10, 10, 3)  1       1        [0.954]
    Total_time                                    -                                             306.836   -        -                  -       -        -
.. GENERATED FROM PYTHON SOURCE LINES 253-256
Timing the tuned program
#########################
Once autotuning completes, you can time execution of the entire program using the Debug Runtime:
.. GENERATED FROM PYTHON SOURCE LINES 256-295
.. code-block:: default

    with tvm.autotvm.apply_history_best(str(autotune_log_file)):
        with pass_context:
            lowered_tuned = tvm.relay.build(relay_mod, target=TARGET, runtime=RUNTIME, params=params)

    temp_dir = tvm.contrib.utils.tempdir()
    project = tvm.micro.generate_project(
        str(tvm.micro.get_microtvm_template_projects("crt")),
        lowered_tuned,
        temp_dir / "project",
        {"verbose": False},
    )

    # Compiling for physical hardware
    if use_physical_hw:
        temp_dir = tvm.contrib.utils.tempdir()
        project = tvm.micro.generate_project(
            str(tvm.micro.get_microtvm_template_projects("zephyr")),
            lowered_tuned,
            temp_dir / "project",
            {
                "board": BOARD,
                "verbose": False,
                "project_type": "host_driven",
                "serial_number": SERIAL,
                "config_main_stack_size": 4096,
            },
        )

    project.build()
    project.flash()

    with tvm.micro.Session(project.transport()) as session:
        debug_module = tvm.micro.create_local_debug_executor(
            lowered_tuned.get_graph_json(), session.get_system_lib(), session.device
        )
        debug_module.set_input(**lowered_tuned.get_params())
        print("########## Build with Autotuning ##########")
        debug_module.run()
        del debug_module
.. rst-class:: sphx-glr-script-out
.. code-block:: none

    ########## Build with Autotuning ##########
    Node Name                                     Ops                                           Time(us)  Time(%)  Shape              Inputs  Outputs  Measurements(us)
    ---------                                     ---                                           --------  -------  -----              ------  -------  ----------------
    tvmgen_default_fused_nn_contrib_conv2d_NCHWc  tvmgen_default_fused_nn_contrib_conv2d_NCHWc  100.5     97.31    (1, 6, 10, 10, 1)  2       1        [100.5]
    tvmgen_default_fused_layout_transform_1       tvmgen_default_fused_layout_transform_1       1.795     1.738    (1, 6, 10, 10)     1       1        [1.795]
    tvmgen_default_fused_layout_transform         tvmgen_default_fused_layout_transform         0.984     0.953    (1, 1, 10, 10, 3)  1       1        [0.984]
    Total_time                                    -                                             103.278   -        -                  -       -        -
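Comparing the two ``Total_time`` rows, the tuned build is roughly 3x faster on
this small convolution. The arithmetic, using the numbers printed above:

.. code-block:: default

    # Speedup computed from the Total_time rows above (microseconds).
    untuned_us = 306.836
    tuned_us = 103.278
    print(f"Speedup: {untuned_us / tuned_us:.2f}x")  # ~2.97x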
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 1 minute 31.783 seconds)
.. _sphx_glr_download_how_to_work_with_microtvm_micro_autotune.py:
.. only:: html

    .. container:: sphx-glr-footer sphx-glr-footer-example

        .. container:: sphx-glr-download sphx-glr-download-python

            :download:`Download Python source code: micro_autotune.py <micro_autotune.py>`

        .. container:: sphx-glr-download sphx-glr-download-jupyter

            :download:`Download Jupyter notebook: micro_autotune.ipynb <micro_autotune.ipynb>`
.. only:: html

    .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_