Traffic Router Ultimate Test Harness

Problem Description

As the entrypoint of an Apache Traffic Control CDN, Traffic Router is the most exposed, most critical component, and it must be able to route traffic quickly and at a very high rate. Although Traffic Router has met this requirement in the past, if Traffic Router undergoes no performance tests as it evolves, there is no guarantee that its performance will not decline with time.

Important times to run performance tests:

  • When traffic volume to your CDN changes significantly
  • When making hardware changes to the server hosting the Traffic Router instance
  • When upgrading Traffic Router to a new Apache Traffic Control version
  • When developing or maintaining a Traffic Router feature that could have an impact on any aspect of Traffic Router performance, including before, during, and after pull request review
  • When a commit that modifies Traffic Router is pushed to a GitHub branch, for all Apache Traffic Control branches

A load test for Delivery Services exists in the project at /test/router, but it has several shortcomings:

  • It has not been maintained over time and currently does not work
  • It tests only HTTP Delivery Services
  • It does not support testing Coverage Zone Maps
  • It does not use the TO Client Library for requests to Traffic Ops
  • It prompts only for the number of requests to make to Delivery Services, not for a length of time to run the test
  • It does not fail if some minimum threshold of requests per second is not met
  • It is not configurable in other ways, such as the length of paths generated for HTTP requests to Delivery Services or which client IP address to use

Proposed Change

The Traffic Router Ultimate Test Harness will include an end-to-end performance test suite to verify that the features of Traffic Router meet expected performance thresholds, as well as additional end-to-end tests of other Traffic Router features.

The TR Ultimate Test Harness may extend /test/router where possible, but it should not limit itself for that secondary goal.

Traffic Portal Impact

No Traffic Portal impact is anticipated.

Traffic Ops Impact

No Traffic Ops impact is anticipated.

REST API Impact

No Traffic Ops REST API impact is anticipated.

Client Impact

Clients importing the github.com/apache/trafficcontrol/lib/go-tc package will optionally be able to import a constant for X-MM-Client-IP, a request header that tells Traffic Router which IP address to use when geolocating that client:
https://github.com/apache/trafficcontrol/blob/1ed2964d16618aeebef142b01a538336a44d07dd/traffic_router/core/src/main/java/org/apache/traffic_control/traffic_router/core/request/HTTPRequest.java#L29

Additionally, a struct used to unmarshal a Coverage Zone File could be placed in lib/go-tc.

Data Model / Database Impact

No Data Model impact is anticipated.

Cache Config Impact

No Cache Config impact is anticipated.

Traffic Monitor Impact

No Traffic Monitor impact is anticipated.

Traffic Router Impact

The addition of the TR Ultimate Test Harness itself will not change Traffic Router functionality in any way. For visibility, however, the TR Ultimate Test Harness should reside in a directory within the traffic_router directory. This will be the first time since 545929f7cc that Golang sources will exist in the traffic_router directory, so any assumption that all sources within the traffic_router directory directly impact Traffic Router's ability to compile should be abandoned.

The TR Ultimate Test Harness should not be included in the Traffic Router RPM, as it is meant to be run on a host separate from Traffic Routers.

Traffic Stats Impact

No Traffic Stats impact is anticipated.

Traffic Vault Impact

No Traffic Vault impact is anticipated.

Documentation Impact

Instructions for using the Traffic Router Ultimate Test Harness should be added to the documentation. This should include:

  • A brief rationale for including the TR Ultimate Test Harness
  • Setup instructions
    • The permissions that a user running the TR Ultimate Test Harness should have:
      • CDN snapshots
      • CDN information
      • Information about Traffic Router-type Servers in those CDNs
      • Type information
      • Delivery Service information
    • Documentation of each option
    • Example commands

Testing Impact

Load Tests

The TR Ultimate Test Harness should include a load test for HTTP-routed Delivery Services and for DNS-routed Delivery Services.

Load Test Options
| Option | Description | Delivery Service Type | Default |
|---|---|---|---|
| IPv4 TR addresses only | Test IPv4 Traffic Router addresses only | HTTP, DNS | False |
| IPv6 TR addresses only | Test IPv6 Traffic Router addresses only | HTTP, DNS | False |
| CDN name | The name of a CDN to search for Delivery Services | HTTP, DNS | all |
| Delivery Service name | The name (XMLID) of a Delivery Service to use for tests | HTTP, DNS | None |
| Traffic Router name | Instead of iterating through Traffic Routers, test only a specific Traffic Router, identified by hostname | HTTP, DNS | all |
| Client IP address | If provided, Traffic Router will use the value of the X-MM-Client-IP request header as the IP address that Traffic Router's geolocation considers. This option lets you specify such an IP address. | HTTP, DNS | None |
| Use coverage zone map | Whether to use an IP address from the Traffic Router's Coverage Zone File | HTTP, DNS | False |
| Coverage zone location | The coverage zone location to use (implies Use coverage zone map) | HTTP, DNS | None |
| Requests per second threshold | The minimum number of requests per second a Traffic Router must successfully respond to | HTTP, DNS | 8000 for HTTP, 7200 for DNS |
| Benchmark time | The duration of each load test, in seconds | HTTP, DNS | 300 |
| Thread count | The number of threads to spawn for each test | HTTP, DNS | 12 |
| Path count | The number of paths to generate for use in requests to Delivery Services | HTTP | 10000 |
| Maximum path length | The maximum string length for each generated path | HTTP | 100 |
| Use location header | Whether the HTTP Delivery Service should redirect the user or serve the routing information as a JSON response | HTTP | True |

These options will be structured in a config file:

{
    "all": {
        "cdn_name": "Kabletown"
    },
    "http": {
        "ipv4_only": true,
        /* more options */
        "path_count": 5000
    },
    "dns": {
        "delivery_service_name": "static"
    }
}

Options that should apply to a specific type of test should go under a key named that test type, while options applying to all tests can go under the "all" key.

Other Tests

Additionally, the TR Ultimate Test Harness should provide the ability to verify that a DNS-routed Delivery Service assigned to a Federation resolves to that Federation's CNAME, rather than a Cache's IP address, depending on the IP address of the client querying Traffic Router.
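
A check like this could be sketched with Go's standard resolver pointed directly at a Traffic Router. The helper names resolverFor, sameName, and checkFederation are hypothetical, and the sketch assumes the test host's own IP address is one that the Federation's resolver mapping covers, since Traffic Router sees the querying host as the client.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// resolverFor returns a resolver that sends every query to server
// (host:port), such as a Traffic Router, instead of the system resolver.
func resolverFor(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, server)
		},
	}
}

// sameName compares two DNS names, ignoring case and a trailing dot.
func sameName(a, b string) bool {
	norm := func(s string) string {
		return strings.ToLower(strings.TrimSuffix(s, "."))
	}
	return norm(a) == norm(b)
}

// checkFederation returns an error unless host resolves to wantCNAME.
func checkFederation(ctx context.Context, r *net.Resolver, host, wantCNAME string) error {
	got, err := r.LookupCNAME(ctx, host)
	if err != nil {
		return fmt.Errorf("looking up %s: %w", host, err)
	}
	if !sameName(got, wantCNAME) {
		return fmt.Errorf("%s resolved to %s, expected Federation CNAME %s", host, got, wantCNAME)
	}
	return nil
}

func main() {
	// Only the name comparison runs here, because a real lookup needs a
	// reachable Traffic Router.
	fmt.Println(sameName("CDN.Example.NET.", "cdn.example.net"))
}
```

When a name has no CNAME record but does resolve to addresses, LookupCNAME returns the queried name itself, so a Delivery Service answering with a Cache's IP address would fail the comparison as intended.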

Automation Impact

A GitHub Action to run the tests in the TR Ultimate Test Harness should be added, but only if Traffic Router consistently meets a constant requests per second threshold there. Traffic Router will perform better on some GitHub Actions runners than others, so this should be tested after writing the GitHub Action.

If a meaningful requests per second threshold cannot be found for GitHub Actions runners, we may consider trying again in the future, in case consistency of Traffic Router performance on GitHub Actions runners improves.

Performance Impact

With increased attention to Traffic Router's performance, that performance should not decrease, and it may increase.

Security Impact

If Traffic Router is vulnerable to denial-of-service attacks relating to HTTP requests or DNS queries, there is potential for the TR Ultimate Test Harness to uncover such vulnerabilities.

Upgrade Impact

An Apache Traffic Control administrator may choose to use the TR Ultimate Test Harness to verify that Traffic Router performs at least as well in a new Apache Traffic Control version as it does in the version they are upgrading from. If, according to their testing, performance decreases in the newer Traffic Router, the administrator may choose to delay upgrading until they are able to attain the same level of performance, either by changing Traffic Router configuration or waiting for an even newer Traffic Router version to be released.

Operations Impact

Needless to say, the Traffic Router Ultimate Test Harness should not be run against a Traffic Router that is simultaneously being used to route production traffic. To avoid this, the Traffic Router Ultimate Test Harness should only be run in non-production environments.

Developer Impact

When developing Traffic Router features, especially ones found or anticipated to affect Traffic Router performance, a developer may choose to measure the effect of the new feature on Traffic Router's performance using the Traffic Router Ultimate Test Harness.

Alternatives

wrk is an alternative for HTTP load testing.

flamethrower is an alternative for DNS load testing.

Dependencies

No additional dependencies are anticipated. If additional Go dependencies are required, those dependencies will be added to the Apache Traffic Control go.mod file.

References