docs/concepts/decorators-overview.rst - hamilton - Git at Google

 ======================
 Overview of Decorators
 ======================


 While the one to one mapping of node -> function implementation is powerful, it can sometimes lead to repeated code. In
 order to minimize this, Hamilton employs decorators to promote reuse of business logic. In this section we will go over a
 few different classes of decorators with some examples. For reference on specific decorators, see
 :doc:`../../reference/decorators/index`.

 In Hamilton, decorators allow the author of Hamilton dataflows to modify functions as well as their role in the DAG.
 Decorators do one (or multiple) of the following:

 Determine if a function should be included
 ------------------------------------------

 ``@config.when`` is the primary example of this. It allows the user to determine nodes that are only present in certain
 DAG configurations. For instance, say that your company has different marketing spend source in different geographic
 regions. You might write code that looks like this:

 .. code-block:: python

     from hamilton.function_modifiers import config

     @config.when(region='US')
     def marketing_spend(tv_spend: pd.Series, billboard_spend: pd.Series):
         return tv_spend + billboard_spend

     @config.when(region='CA')
     def marketing_spend(tv_spend: pd.Series, radio_spend: pd.Series):
         return tv_spend + radio_spend

 With the code above, the implementation of ``marketing_spend`` will depend on whether the DAG of transforms is compiled
 with the region set to the US or Canada.

 Specify the implementation of a function
 ----------------------------------------

 The function doesn't have to specify everything about the transform. In fact, with the ``@does`` decorator, it can be
 left empty! The above can be reduced even further...

 .. code-block:: python

     import functools
     import operators

     from hamilton.function_modifiers import config, does

     def _sum_series(**series: pd.Series) -> pd.Series:
         return functools.reduce(operators.add, series.values())

     @config.when(region='US')
     @does(_sum_series)
     def marketing_spend(tv_spend: pd.Series, billboard_spend: pd.Series):
         pass

     @config.when(region='CA')
     @does(_sum_series)
     def marketing_spend(tv_spend: pd.Series, radio_spend: pd.Series):
         pass

 Note that we have no implementation for the functions! They're automagically replaced. There are other decorators that
 similar things.

 Split the function into multiple nodes
 --------------------------------------

 In order to reduce duplicated code, Hamilton has decorators that allow a single function to output multiple nodes. These
 can be incredibly powerful. Take, for example, the extract\_columns decorator. In this case, let's say that we have all
 the source spend in a big dataframe, and want to split it out into multiple columns, so the above can read it. One might
 write the following:

 .. code-block:: python

     import functools
     import operators

     from hamilton.function_modifiers import extract_columns

     @extract_columns(['tv_spend', 'radio_spend', 'billboard_spend'])
     def marketing_spend_df() -> pd.DataFrame:
         return load_all_marketing_spend_from_external_source()

 The ``marketing_spend_df`` function becomes three separate available datums from the pipeline. Note it actually provides
 a fourth, as the original node itself is added so the computation is not rerun. ``@extract_columns`` works specifically
 only over dataframes. In fact, it will break if you try it on a function annotated with the wrong type. The other most
 widely used transform is ``@parametrized``.

 While there are decorators that don't quite fit into the above, these should give you a sense of how and why decorators
 are used. For more information on available decorators, see: :doc:`../reference/decorators/index`.
	======================
	Overview of Decorators
	======================


	While the one to one mapping of node -> function implementation is powerful, it can sometimes lead to repeated code. In
	order to minimize this, Hamilton employs decorators to promote reuse of business logic. In this section we will go over a
	few different classes of decorators with some examples. For reference on specific decorators, see
	:doc:`../../reference/decorators/index`.

	In Hamilton, decorators allow the author of Hamilton dataflows to modify functions as well as their role in the DAG.
	Decorators do one (or multiple) of the following:

	Determine if a function should be included
	------------------------------------------

	``@config.when`` is the primary example of this. It allows the user to determine nodes that are only present in certain
	DAG configurations. For instance, say that your company has different marketing spend source in different geographic
	regions. You might write code that looks like this:

	.. code-block:: python

	from hamilton.function_modifiers import config

	@config.when(region='US')
	def marketing_spend(tv_spend: pd.Series, billboard_spend: pd.Series):
	return tv_spend + billboard_spend

	@config.when(region='CA')
	def marketing_spend(tv_spend: pd.Series, radio_spend: pd.Series):
	return tv_spend + radio_spend

	With the code above, the implementation of ``marketing_spend`` will depend on whether the DAG of transforms is compiled
	with the region set to the US or Canada.

	Specify the implementation of a function
	----------------------------------------

	The function doesn't have to specify everything about the transform. In fact, with the ``@does`` decorator, it can be
	left empty! The above can be reduced even further...

	.. code-block:: python

	import functools
	import operators

	from hamilton.function_modifiers import config, does

	def _sum_series(**series: pd.Series) -> pd.Series:
	return functools.reduce(operators.add, series.values())

	@config.when(region='US')
	@does(_sum_series)
	def marketing_spend(tv_spend: pd.Series, billboard_spend: pd.Series):
	pass

	@config.when(region='CA')
	@does(_sum_series)
	def marketing_spend(tv_spend: pd.Series, radio_spend: pd.Series):
	pass

	Note that we have no implementation for the functions! They're automagically replaced. There are other decorators that
	similar things.

	Split the function into multiple nodes
	--------------------------------------

	In order to reduce duplicated code, Hamilton has decorators that allow a single function to output multiple nodes. These
	can be incredibly powerful. Take, for example, the extract\_columns decorator. In this case, let's say that we have all
	the source spend in a big dataframe, and want to split it out into multiple columns, so the above can read it. One might
	write the following:

	.. code-block:: python

	import functools
	import operators

	from hamilton.function_modifiers import extract_columns

	@extract_columns(['tv_spend', 'radio_spend', 'billboard_spend'])
	def marketing_spend_df() -> pd.DataFrame:
	return load_all_marketing_spend_from_external_source()

	The ``marketing_spend_df`` function becomes three separate available datums from the pipeline. Note it actually provides
	a fourth, as the original node itself is added so the computation is not rerun. ``@extract_columns`` works specifically
	only over dataframes. In fact, it will break if you try it on a function annotated with the wrong type. The other most
	widely used transform is ``@parametrized``.

	While there are decorators that don't quite fit into the above, these should give you a sense of how and why decorators
	are used. For more information on available decorators, see: :doc:`../reference/decorators/index`.