User Defined Action Level Concurrency Limits Within Confines of Global Namespace Limit
Currently, OpenWhisk has a single concurrency limit for managing autoscaling within a namespace. System administrators rightly manage this limit for each namespace to maintain a good balance between the system's namespaces and its total resources.
However, this does not let users control how their applications scale within the namespace in which they operate. There is no fairness across functions within a namespace. The semantics of a namespace can vary heavily depending on how OpenWhisk is being used. A namespace could represent an organization in a public cloud, a group within an organization, an application made up of functions, or a logical grouping of applications (for example, putting all of your interactions with Slack in one namespace).
The problem is that a single function can run away and end up using all of the namespace's resources. Providing this fairness shouldn't fall to system administrators, because it depends on the application and on what the user wants. Users may want to keep the existing behavior, where any action can scale up to the namespace's total resources. They may want to restrict a lower-priority function to a smaller threshold so that it can't consume the entire namespace's resources, while still allowing high-priority functions access to all of them. Or they may want to set limits on all of their actions that add up to the namespace limit, which guarantees that each action can scale up to its defined action concurrency limit, similar to other FaaS providers' concept of reserved concurrency for actions.
With the major revision to how OpenWhisk processes activations in the new scheduler, such a feature becomes easy to implement by adding a single new limit that users can configure on their action document.
Add an optional maxContainerConcurrency limit field to the limits section of action documents. The scheduler will use this limit when deciding whether there is capacity for the action to scale up more containers. Previously, the scheduler did not distinguish between the functions within a namespace when provisioning more containers. If this limit is defined, the scheduler will provision containers for the action only up to the defined action limit (which must be less than or equal to the namespace limit).
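The capacity decision described above can be sketched roughly as follows. This is an illustrative sketch, not the scheduler's actual code: the function name `can_provision_container` and its parameters are hypothetical, and it assumes both limits are counted in containers and that the namespace limit continues to apply alongside the action limit.

```python
from typing import Optional


def can_provision_container(
    action_containers: int,
    namespace_containers: int,
    namespace_limit: int,
    max_container_concurrency: Optional[int] = None,
) -> bool:
    """Return True if one more container may be started for the action.

    Hypothetical helper: the names and the container-count model are
    assumptions made for illustration only.
    """
    # The namespace-wide cap always applies, with or without an action limit.
    if namespace_containers >= namespace_limit:
        return False
    # No action limit defined: only the namespace cap constrains scaling.
    if max_container_concurrency is None:
        return True
    # Action limit defined: the action may not grow past its own cap.
    return action_containers < max_container_concurrency
```

With a namespace limit of 10 and an action limit of 3, the action stops scaling once it holds 3 containers, even if the namespace still has spare capacity for other actions.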
A working PR for this POEM already exists, where the implementation details can be reviewed, but I will describe the implementation considerations here. Once the POEM is approved, I will incorporate any feedback from the POEM and add tests and documentation.
maxContainerConcurrency on the action document is an optional field. If the field does not exist, the system uses the namespace limit as the action limit, which makes this an opt-in feature. Because the field is optional, the feature is fully backwards compatible with existing action documents: an action without the limit keeps the existing behavior and can scale up to the namespace concurrency limit. If the old scheduler is in use and the limit is defined on the action document, the limit has no effect until the deployment is migrated to the new scheduler.
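The fallback behavior can be illustrated with a small sketch that reads the limit from an action document represented as a plain dictionary. The helper name `resolve_action_limit` is hypothetical, and rejecting a value above the namespace limit with an exception is an assumption about how the "less than or equal to the namespace limit" rule might be enforced; the working PR may handle it differently.

```python
from typing import Any, Dict


def resolve_action_limit(action_doc: Dict[str, Any], namespace_limit: int) -> int:
    """Return the concurrency limit that applies to one action.

    Hypothetical helper for illustration; the action document is modeled
    as a dict with an optional "limits" section.
    """
    limits = action_doc.get("limits", {})
    action_limit = limits.get("maxContainerConcurrency")
    # Field absent: existing behavior, the action may use the whole namespace.
    if action_limit is None:
        return namespace_limit
    # Assumed enforcement of the rule that the action limit cannot exceed
    # the namespace limit.
    if action_limit > namespace_limit:
        raise ValueError("maxContainerConcurrency exceeds the namespace limit")
    return action_limit


# An existing document without the new field keeps the namespace limit.
legacy_doc = {"limits": {"memory": 256}}
# A document that opts in is capped at its own value.
capped_doc = {"limits": {"memory": 256, "maxContainerConcurrency": 5}}
```

Calling `resolve_action_limit(legacy_doc, 30)` yields the namespace limit of 30, while `resolve_action_limit(capped_doc, 30)` yields 5.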