 ---
 title: Tips and Tricks
 layout: website-normal
 ---

 ## Filtering Lists

 There are many `transform` filters available, but none will filter lists.
 Removing elements from a list is easily done by using `foreach` with a `condition`,
 returning elements that meet the condition.

 For example, to filter out empty strings from a list:

 ```
 - let list_to_filter = [ "word1", "", "word2" ]
 - step: foreach item in ${list_to_filter}
   condition:
     size: { greater-than: 0 }
   steps:
   - return ${item}
 ```

 The output from the `foreach` step is the list of results,
 so this results in the filtered list `["word1", "word2"]`.

 (This condition uses `size` to check the length of the string.
 The condition `{ not: { equals: "" } }` would work just as well or,
 because `equals` can be made implicit, simply
 `condition: { not: "" }`.)
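
As a sketch of the implicit `equals` form, the first example could be rewritten as follows. This only restates that example with the alternative condition and has not been run against a Brooklyn server:

```
- let list_to_filter = [ "word1", "", "word2" ]
- step: foreach item in ${list_to_filter}
  condition:
    not: ""            # implicit `equals`: keep items that are not the empty string
  steps:
  - return ${item}
```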

 Here is a more complicated example which filters comma-separated words
 to return known fruits and proper names:

 ```
 - let requested_foods = "apple, banana, Nutella, invalid"
 - transform value ${requested_foods} | split regex \s*,\s* | trim
 - let list known_fruits = [ "apple", "orange", "banana" ]
 - step: foreach item in ${output}
   condition:
     any:
     - regex: "[A-Z].*"             # match proper name
     - target: ${known_fruits}      # match known fruits
       contains: ${item}
   steps:
   - return ${item}
 ```

 The above will return the list of `apple`, `banana`, and `Nutella`, dropping `invalid`.


 ## Optimizing for Workflows

 Workflows can generate a huge amount of data which can impact memory usage, persistence, and the UI.
 The REST API and UI do some filtering (e.g. in the body of the `internal` sensors used by workflow),
 but when working with large `ssh` `output` and `http` `content` payloads, and with `update-children`,
 performance can be dramatically improved by following these tips:

 * Optimize external calls to return the minimal amount of information needed
   * Use `jq` to filter when using `ssh` or `container` steps
   * Pass filter arguments to `http` endpoints that accept them
   * Loop over small page sizes (e.g. 20 records per cycle from `http`) using `retry from` steps

 * Optimize the data which is stored
   * Override the `output` on `ssh` and `http` steps to remove unnecessary objects;
     for example `http` returns several `content*` fields, and often just one is needed.
     Simply setting `output: { content: ${content} }` will achieve this.
   * Set `retention: 1` or `retention: 0` on workflows that use a large amount of information
     and can simply be replayed from the start; additionally `retention: disabled` can be used
     to prevent any persistence (even for ongoing workflows), but only for workflows that do
     not acquire any `lock`
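
The storage tips above might be combined as in the following sketch. The endpoint URL and its `fields` query parameter are hypothetical, the `http <url>` shorthand is assumed, and placing `retention` next to `steps` in the workflow definition is an assumption about where that setting belongs, so check each of these against your Brooklyn version:

```
retention: 1                     # assumed placement; suits workflows that can be replayed from the start
steps:
- step: http https://example.com/api/records?fields=id,name   # hypothetical endpoint and filter argument
  output:
    content: ${content}          # keep only `content`, dropping the other `content*` fields
```

The block form of `output` is used here instead of the inline `{ content: ${content} }` purely for readability; both are intended to store the same single field.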