| # Mutate |
| |
| We give some application suggestions for mutating the outputs of functions in a distributed manner with `@mutate.` |
| When you scroll through the notebook we build examples from most straight forward applications to more complex logic that showcases the amount of flexibility you get with this decorator. |
| |
| Mutate gives the ability to apply the same transformation to the each output of multiple functions in the DAG. It can be particularly useful in the following scenarios: |
| |
| 1. Loading data and applying pre-cleaning step. |
| 2. Feature engineering via joining, filtering, sorting, applying adjustment factors on a per column basis etc. |
| 3. Experimenting with different transformations across nodes by selectively turning transformations on / off. |
| |
| |
| and effectively replaces: |
| 1. Having to have unique names and then changing wiring if you want to add/remove/replace something. |
| 2. Enabling more verb like names on functions. |
| 3. Potentially simpler "reuse" of transform functions across DAG paths... |
| |
| # Modules |
| The same modules can be viewed and executed in `notebook.ipynb`. |
| |
| We have six modules: |
| 1. procedural.py: basic example without using Hamilton |
| 2. pipe_output.py: how the above would be implemented using `pipe_output` from Hamilton |
| 3. mutate.py: how the above would be implemented using `mutate` |
| 4. pipe_output_on_output.py: functionality that allows to apply `pipe_output` to user selected nodes (comes in handy with `extract_columns`/`extract_fields`) |
| 5. mutate_on_output.py: same as above but implemented using `mutate` |
| 6. mutate_twice_the_same: how you would apply the same transformation on a node twice usign `mutate` |
| |
| that demonstrate the same behavior achieved either without Hamilton, using `pipe_output` or `mutate` and that should give you some idea of potential application |
| |
|  |