A few simple but efficient test suites for determining the maximum throughput and end-user latency of the Apache OpenWhisk system.
overlay driver and HTTP API enabled via a UNIX port. The load is driven by the blazingly fast wrk.
The machine provided by Travis has ~2 CPU cores (likely shared through virtualization) and 7.5GB memory.
Determines the end-to-end latency a user experiences when doing a blocking invocation. The action used is a no-op, so the numbers returned are the plain overhead of the OpenWhisk system. The requests are sent directly to the controller.
Note: The throughput number has a 100% correlation with the latency in this case. This test does not serve to determine the maximum throughput of the system.
Determines the maximum throughput a user can get out of the system while using a single action. The action used is a no-op, so the numbers are plain OpenWhisk overhead. Note that the throughput does not directly correlate with end-to-end latency here, as the system does more processing in the background than it shows to the user in a blocking invocation. The requests are sent directly to the controller.
All you have to do is use the corresponding script located in the /*_tests folder, remembering that the parameters are defined inline.
You can specify two thresholds for the simulations, because Gatling can treat each assertion as a JUnit test. In a CI/CD pipeline (e.g. Jenkins) you can then set a threshold on the number of failed test cases to mark the build as stable, unstable or failed.
This simulation calls api/v1. You can specify the endpoint, the number of connections against the backend, and the duration of this burst.
The test makes as many requests as possible for the given amount of time (SECONDS). Afterwards it checks whether the test reached the intended throughput (REQUESTS_PER_SEC, MIN_REQUESTS_PER_SEC).
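As a rough illustration of how the two throughput thresholds could map to build states, here is a minimal sketch. The function name `classify_throughput` and the mapping to stable, unstable and failed are assumptions made for illustration; the real outcome depends on how your CI counts failed Gatling assertions.

```shell
# Hypothetical helper: not part of the test suite.
classify_throughput() {
  measured="$1"        # measured requests per second (integer)
  target="$2"          # REQUESTS_PER_SEC
  minimum="${3:-$2}"   # MIN_REQUESTS_PER_SEC defaults to REQUESTS_PER_SEC
  if [ "$measured" -ge "$target" ]; then
    echo "stable"      # both thresholds met
  elif [ "$measured" -ge "$minimum" ]; then
    echo "unstable"    # only the minimum threshold met
  else
    echo "failed"      # neither threshold met
  fi
}

classify_throughput 45 50 40   # prints "unstable"
```

When MIN_REQUESTS_PER_SEC is left unset, both thresholds coincide and the sketch can only return stable or failed.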
Available environment variables:
OPENWHISK_HOST (required)
CONNECTIONS (required)
SECONDS (default: 10)
REQUESTS_PER_SEC (required)
MIN_REQUESTS_PER_SEC (default: REQUESTS_PER_SEC)
MAX_ERRORS_ALLOWED (default: 0)
MAX_ERRORS_ALLOWED_PERCENTAGE (default: 0)
You can run the simulation with (in OPENWHISK_HOME)
OPENWHISK_HOST="openwhisk.mydomain.com" CONNECTIONS="10" REQUESTS_PER_SEC="50" ./gradlew gatlingRun-org.apache.openwhisk.ApiV1Simulation
This simulation creates actions of the following four kinds: nodejs:default, swift:default, java:default and python:default. After creation, the action is invoked once. This is the cold start and is not part of the thresholds. Next, the action is invoked 100 times, blocking and one after another, with a pause of PAUSE_BETWEEN_INVOKES milliseconds between invocations. The last step is to delete the action.
Once one kind is finished, the next kind is taken. They do not run in parallel. There is never more than one activation in the system, as we only want to measure the latency of warm activations. As all actions are invoked blocking and only one action is in the system, it doesn't matter how many controllers and invokers are deployed. If several controllers or invokers are deployed, all controllers always send the activation to the same invoker.
The thresholds are compared against the mean response times of the warm activations.
Available environment variables:
OPENWHISK_HOST (required)
API_KEY (required, format: UUID:KEY)
PAUSE_BETWEEN_INVOKES (default: 0)
MEAN_RESPONSE_TIME (required)
MAX_MEAN_RESPONSE_TIME (default: MEAN_RESPONSE_TIME)
EXCLUDED_KINDS (default: "", format: "python:default,java:default,swift:default")
MAX_ERRORS_ALLOWED (default: 0)
MAX_ERRORS_ALLOWED_PERCENTAGE (default: 0)
It is possible to override MEAN_RESPONSE_TIME, MAX_MEAN_RESPONSE_TIME, MAX_ERRORS_ALLOWED and MAX_ERRORS_ALLOWED_PERCENTAGE for each kind by adding the kind as a prefix in upper case, like JAVA_MEAN_RESPONSE_TIME.
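The per-kind override can be pictured as a lookup with a fallback. The sketch below is not the simulation's own code: `resolve_threshold` is a hypothetical helper, and it assumes the prefix is the part of the kind before the colon, in upper case (so java:default becomes JAVA).

```shell
# Hypothetical helper: looks up a per-kind override, then the global value.
resolve_threshold() {
  kind="$1"   # e.g. java:default
  name="$2"   # e.g. MEAN_RESPONSE_TIME
  # Assumption: the prefix is the part before the colon, in upper case.
  prefix=$(printf '%s' "${kind%%:*}" | tr '[:lower:]' '[:upper:]')
  # printenv only sees exported variables and fails if the name is unset.
  value=$(printenv "${prefix}_${name}") || value=$(printenv "$name") || value=""
  printf '%s\n' "$value"
}

export MEAN_RESPONSE_TIME=20
export JAVA_MEAN_RESPONSE_TIME=50
resolve_threshold java:default MEAN_RESPONSE_TIME     # prints 50
resolve_threshold python:default MEAN_RESPONSE_TIME   # prints 20 unless PYTHON_MEAN_RESPONSE_TIME is set
```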
You can run the simulation with (in OPENWHISK_HOME)
OPENWHISK_HOST="openwhisk.mydomain.com" MEAN_RESPONSE_TIME="20" API_KEY="UUID:KEY" ./gradlew gatlingRun-org.apache.openwhisk.LatencySimulation
This simulation executes the same action with the same user over and over again. The aim of this test is to measure the throughput of the system when all containers are always warm.
The invoked action writes one log line and returns a small JSON object.
The simulation creates the action at the beginning, invokes it as often as possible for 5 seconds to warm up all containers, and then invokes it for the given amount of time. The warm-up phase is not part of the assertions.
To run the test, you can specify the number of concurrent requests. Keep in mind that the actions are invoked blocking and the system is limited to AMOUNT_OF_INVOKERS * SLOTS_PER_INVOKER * NON_BLACKBOX_INVOKER_RATIO concurrent actions/requests.
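As a worked example of that limit, the sketch below multiplies the three factors. The deployment values are hypothetical and chosen only to make the arithmetic easy to follow; `max_concurrent` is an illustrative helper, not part of the test suite.

```shell
# Hypothetical deployment values, for illustration only.
AMOUNT_OF_INVOKERS=2
SLOTS_PER_INVOKER=16
NON_BLACKBOX_INVOKER_RATIO=0.5

# Multiply the three factors; awk is used because the ratio is fractional.
max_concurrent() {
  awk -v n="$1" -v s="$2" -v r="$3" 'BEGIN { printf "%g\n", n * s * r }'
}

max_concurrent "$AMOUNT_OF_INVOKERS" "$SLOTS_PER_INVOKER" "$NON_BLACKBOX_INVOKER_RATIO"   # prints 16
```

With these values, setting CONNECTIONS above 16 would exceed what the system can process concurrently.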
The test makes as many requests as possible for the given amount of time (SECONDS). Afterwards it checks whether the test reached the intended throughput (REQUESTS_PER_SEC, MIN_REQUESTS_PER_SEC).
Available environment variables:
OPENWHISK_HOST (required)
API_KEY (required, format: UUID:KEY)
CONNECTIONS (required)
SECONDS (default: 10)
REQUESTS_PER_SEC (required)
MIN_REQUESTS_PER_SEC (default: REQUESTS_PER_SEC)
MAX_ERRORS_ALLOWED (default: 0)
MAX_ERRORS_ALLOWED_PERCENTAGE (default: 0)
You can run the simulation with
OPENWHISK_HOST="openwhisk.mydomain.com" CONNECTIONS="10" REQUESTS_PER_SEC="50" API_KEY="UUID:KEY" ./gradlew gatlingRun-org.apache.openwhisk.BlockingInvokeOneActionSimulation
This simulation makes as many cold invocations as possible. Therefore, you have to specify how many users should be used. These users execute actions in parallel. I recommend using as many users as you have Node.js action slots in your invokers.
The users are loaded from the file gatling_tests/src/gatling/resources/data/users.csv. If you want to increase the number of parallel users, that file has to contain at least that many valid users.
Each user creates n actions (default is 5). Afterwards all users execute their actions in parallel, but each user rotates through its actions. That is how the cold starts are enforced.
The aim of the test is to measure the throughput of the system when all containers are always cold.
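The rotation can be sketched as a simple round-robin over a user's actions, so that consecutive requests from one user never target the same action. This is only an illustration: the helper `rotated_action`, the `action-<index>` naming and the strict round-robin order are assumptions, not taken from the simulation.

```shell
# Hypothetical helper: which of a user's n actions the i-th request invokes.
rotated_action() {
  request_index="$1"
  actions_per_user="$2"
  echo "action-$(( request_index % actions_per_user ))"
}

# First seven requests of one user with the default of 5 actions.
i=0
while [ "$i" -lt 7 ]; do
  rotated_action "$i" 5
  i=$(( i + 1 ))
done
```

The loop prints action-0 through action-4 and then starts again at action-0.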
The invoked action writes one log line and returns a small JSON object.
The test makes as many requests as possible for the given amount of time (SECONDS). Afterwards it checks whether the test reached the intended throughput (REQUESTS_PER_SEC, MIN_REQUESTS_PER_SEC).
Available environment variables:
OPENWHISK_HOST (required)
USERS (required)
SECONDS (default: 10)
REQUESTS_PER_SEC (required)
MIN_REQUESTS_PER_SEC (default: REQUESTS_PER_SEC)
MAX_ERRORS_ALLOWED (default: 0)
MAX_ERRORS_ALLOWED_PERCENTAGE (default: 0)
You can run the simulation with
OPENWHISK_HOST="openwhisk.mydomain.com" USERS="10" REQUESTS_PER_SEC="50" ./gradlew gatlingRun-org.apache.openwhisk.ColdBlockingInvokeSimulation