# Mutate
This example suggests ways to mutate the outputs of functions in a distributed manner with `@mutate`.
The notebook builds up from the most straightforward applications to more complex logic that shows how much flexibility the decorator gives you.
`@mutate` lets you apply the same transformation to the output of each of several functions in the DAG. It can be particularly useful in the following scenarios:
1. Loading data and applying a pre-cleaning step.
2. Feature engineering via joining, filtering, sorting, applying adjustment factors on a per-column basis, etc.
3. Experimenting with different transformations across nodes by selectively turning transformations on/off.
In doing so, it effectively:
1. Replaces having to give every step a unique name and then change the wiring whenever you want to add, remove, or replace something.
2. Enables more verb-like names on functions.
3. Potentially simplifies "reuse" of transform functions across DAG paths.
# Modules
The same modules can be viewed and executed in `notebook.ipynb`.
We have six modules:
1. procedural.py: basic example without using Hamilton
2. pipe_output.py: how the above would be implemented using `pipe_output` from Hamilton
3. mutate.py: how the above would be implemented using `mutate`
4. pipe_output_on_output.py: how to apply `pipe_output` to user-selected nodes (comes in handy with `extract_columns`/`extract_fields`)
5. mutate_on_output.py: same as above but implemented using `mutate`
6. mutate_twice_the_same: how you would apply the same transformation to a node twice using `mutate`
Together, these modules demonstrate the same behavior achieved without Hamilton, with `pipe_output`, or with `mutate`, and should give you some idea of potential applications.
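
For orientation, a procedural baseline in the spirit of `procedural.py` might look like the sketch below. It is not the contents of that module: the function names (`load_data_1`, `load_data_2`, `drop_missing`, `scale`, `run`) and the data are hypothetical, and it uses plain Python lists of dicts instead of DataFrames so that it runs without any dependencies.

```python
# Hypothetical procedural baseline; names and data are illustrative only
# and are not taken from procedural.py.
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def load_data_1() -> List[Row]:
    """Stand-in for a data loader; one row has a missing value."""
    return [{"col_1": 1.0}, {"col_1": None}, {"col_1": 3.0}]


def load_data_2() -> List[Row]:
    """Second stand-in loader with a different column."""
    return [{"col_2": None}, {"col_2": 20.0}]


def drop_missing(rows: List[Row]) -> List[Row]:
    """Pre-cleaning step: keep only rows where every value is present."""
    return [row for row in rows if all(v is not None for v in row.values())]


def scale(rows: List[Row], factor: float) -> List[Row]:
    """Adjustment step: multiply every value by a constant factor.

    Assumes the rows were already cleaned, so no value is None.
    """
    return [{key: value * factor for key, value in row.items()} for row in rows]


def run() -> Dict[str, List[Row]]:
    # Every transformation is wired by hand for every output. Adding,
    # removing, or reordering a step means editing these call chains.
    data_1 = drop_missing(load_data_1())
    data_2 = scale(drop_missing(load_data_2()), factor=2.0)
    return {"data_1": data_1, "data_2": data_2}


if __name__ == "__main__":
    print(run())
```

This hand-written wiring is what `pipe_output` and `mutate` aim to take over: the same cleaning step is applied to both outputs, and the call chains in `run` must be edited each time a step changes.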
![image info](./DAG.png)