<!--
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
-->
# Scaling Up an OpenWhisk Deployment on a Custom-Built Kubernetes Cluster
## Overview
The default configuration of an OpenWhisk deployment supports a low concurrency limit that is only suitable for testing purposes. This document outlines how this concurrency limit can be increased to scale up an OpenWhisk deployment for more practical use on a custom-built Kubernetes cluster. It also provides information about some issues one might encounter while scaling up.
## Scale-up
### Small Scale
By default, the OpenWhisk deployment is configured to provide a bare-minimum working platform for testing and exploration. For your specialized workloads, you can scale up your OpenWhisk deployment by defining your deployment configuration in your `mycluster.yaml`, which overrides the defaults in `helm/openwhisk/values.yaml`. Some important parameters to consider (for other parameters, check `helm/openwhisk/values.yaml` and [configurationChoices](./docs/configurationChoices.md)):
* `actionsInvokesPerminute`: limits the maximum number of invocations per minute.
* `actionsInvokesConcurrent`: limits the maximum concurrent invocations.
* `containerPool.userMemory`: total memory available per `invoker` instance. The invoker uses this memory to create containers for user actions. The concurrency limit (actions running in parallel) will depend upon the total memory configured for `containerPool` and the memory allocated per action (`default:` 256MB per container).
* `triggersFiresPerminute`: limits the maximum triggers invoked per minute.
By modifying the above-mentioned parameters, one can easily increase the concurrency limit (`default:` 8) to `100` or `200` without affecting runtime performance (this may vary based on the running functions). To increase the concurrency limit further, see the `Large Scale` section below.
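As an illustration, a minimal sketch of such overrides in `mycluster.yaml` (the key nesting is assumed to match `helm/openwhisk/values.yaml`; verify it against your chart version, and treat the numbers as placeholders to be tuned for your workload):
```
whisk:
  limits:
    actionsInvokesPerminute: 600     # maximum invocations per minute
    actionsInvokesConcurrent: 100    # maximum concurrent invocations
    triggersFiresPerminute: 600      # maximum trigger fires per minute
  containerPool:
    userMemory: "25600m"             # room for ~100 concurrent 256MB action containers per invoker
```
The overrides take effect when the chart is (re)deployed with `-f mycluster.yaml`, for example via `helm upgrade`.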
### Large Scale
To scale up further beyond `Small Scale`, one needs to modify the following additional configurations appropriately (on top of those mentioned above); a sketch of such overrides follows the list:
* `invoker:jvmHeapMB`: JVM heap memory available to each invoker instance. It may or may not need to be increased, depending on the running functions. For more information, check the `Troubleshooting` section below.
* `invoker:containerFactory:_:replicaCount`: number of invoker instances that will be used to handle the incoming workload. By default, there is only one invoker instance, which can become overwhelmed if the workload goes beyond a certain threshold.
* `controller:replicaCount`: number of controller instances that will be used to handle the incoming workload. The same considerations as for invoker instances apply.
* `invoker:options`: Log processing at the invoker can become a bottleneck for the KubernetesContainerFactory. One can disable invoker log processing by setting this option to `-Dwhisk.spi.LogStoreProvider=org.apache.openwhisk.core.containerpool.logging.LogDriverLogStoreProvider`. In general, one needs to offload log processing from the invoker to a node-level log store provider when trying to push a large load through the system.
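A hedged sketch of the additional `mycluster.yaml` overrides for a larger deployment, assuming the `KubernetesContainerFactory` is in use (so the `_` in the parameter path above becomes `kubernetes`); verify the exact key names and defaults against `helm/openwhisk/values.yaml` for your chart version:
```
controller:
  replicaCount: 2                    # illustrative: multiple controllers share the incoming load
invoker:
  jvmHeapMB: "1024"                  # illustrative: see Troubleshooting below
  options: "-Dwhisk.spi.LogStoreProvider=org.apache.openwhisk.core.containerpool.logging.LogDriverLogStoreProvider"
  containerFactory:
    kubernetes:
      replicaCount: 3                # illustrative: multiple invokers to keep up with container creation
```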
## Troubleshooting
### Client-side
On the client side, the most frequently received error is:
```
"error": "The server is currently unavailable (because it is overloaded or down for maintenance).
```
The above error occurs when the controller is unable to find any healthy invoker instance to serve the incoming requests. To resolve this issue, one needs to debug on the deployment side (see below) to figure out the cause of the unhealthy invoker instances.
### Deployment-side
For debugging, one needs to identify the `invoker` and `controller` pods and check their logs for further details.
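A minimal sketch with `kubectl`, assuming the chart was deployed into the `openwhisk` namespace with release name `owdev` (adjust both to your installation):
```
kubectl get pods -n openwhisk                   # locate the controller and invoker pods
kubectl logs -n openwhisk owdev-controller-0    # controller logs
kubectl logs -n openwhisk owdev-invoker-0       # invoker logs
```
A few known errors: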
```
class io.fabric8.kubernetes.client.KubernetesClientTimeoutException - Timed out waiting for [0] milliseconds for [Pod] with name
```
The above error occurs when one has configured a very large `containerPool` to match the incoming workload, without scaling up the invoker instance(s) to keep up with the serving rate.
```
java.lang.OutOfMemoryError: Java heap space
```
The above error occurs when the configured `invoker:jvmHeapMB` memory is insufficient for the workload being handled.
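A sketch of a corresponding `mycluster.yaml` fix (the value is illustrative and workload-dependent; check `helm/openwhisk/values.yaml` for the default):
```
invoker:
  jvmHeapMB: "1024"                  # increase until the OutOfMemoryError disappears
```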
#### Error: only a single invoker instance is being used to handle all the workload
OpenWhisk treats [blackbox (docker) actions](https://github.com/apache/openwhisk/blob/master/docs/actions-docker.md) differently from regular actions. By default, the OpenWhisk load balancer is configured to use only `10%` of the invoker instances (only 1 invoker instance if there are fewer than 10 in total) for `blackbox` actions. This behavior can be configured by modifying `whisk.loadbalancer.blackbox-fraction` in `helm/openwhisk/values.yaml`.
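For example, if most of your workload consists of blackbox actions, a sketch of a `mycluster.yaml` override might look like the following (this assumes the chart exposes the fraction as `whisk.loadbalancer.blackboxFraction`; verify the exact key name in `helm/openwhisk/values.yaml`):
```
whisk:
  loadbalancer:
    blackboxFraction: "30%"          # illustrative; the default is 10%
```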