Delivery services of type CLIENT_STEERING are composed of one or more delivery service targets, and each target typically has its own origin. This gives clients redundancy against origin failure, because a client can retry a different target that is served by a different origin. Redundancy against edge cache failure is also desirable, but CLIENT_STEERING does not currently account for it. Depending on the delivery-service-to-server assignments and initial dispersion settings of the CLIENT_STEERING service's targets, the result of a CLIENT_STEERING request can include locations that all point to the same edge cache. If that edge cache fails, the client may have no other edge cache in the result to retry, so it will retry the same failed edge cache multiple times. Ideally, there should be a way to configure the CDN so that a CLIENT_STEERING result contains as many unique edge caches as possible, giving clients a diverse set of edge caches to retry in case of failure.
Add a new TR_PROFILE parameter (client.steering.forced.diversity) that will end up in the CRConfig for Traffic Router to use. If true, Traffic Router will diversify CLIENT_STEERING results by including more unique edge caches. If false or unset, Traffic Router will keep the old behavior as the default.
n/a
n/a
n/a
n/a
n/a
n/a
n/a
Traffic Router will consume a new TR_PROFILE parameter named client.steering.forced.diversity. Like other existing Traffic Router config parameters, it will end up in the top-level "config" section of the CRConfig.json when it is generated by Traffic Ops.
If the parameter value is set to "true", Traffic Router will change its current behavior in order to diversify CLIENT_STEERING results. Otherwise, Traffic Router will continue to process CLIENT_STEERING requests using the old (non-diverse) behavior as the default.
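The following is a minimal sketch of how a consumer might read the flag, not Traffic Router's actual implementation. Only the parameter name and its placement under the top-level "config" key come from the text above. The excerpt contents, the function name `forced_diversity_enabled`, and the case-insensitive comparison are assumptions made for illustration.

```python
import json

PARAM_NAME = "client.steering.forced.diversity"

# Hypothetical CRConfig excerpt: a real CRConfig.json contains many more keys.
CRCONFIG_EXCERPT = """
{
  "config": {
    "client.steering.forced.diversity": "true"
  }
}
"""


def forced_diversity_enabled(crconfig: dict) -> bool:
    """Return True only when the parameter is present and set to "true".

    An unset parameter or any other value keeps the old (non-diverse)
    behavior. Treating "TRUE" the same as "true" is an assumption.
    """
    value = crconfig.get("config", {}).get(PARAM_NAME)
    return isinstance(value, str) and value.strip().lower() == "true"


crconfig = json.loads(CRCONFIG_EXCERPT)
print(forced_diversity_enabled(crconfig))        # parameter set to "true"
print(forced_diversity_enabled({"config": {}}))  # parameter unset
```

Defaulting to the old behavior whenever the value is missing or unrecognized mirrors the intent that the feature is strictly opt-in.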
Today, Traffic Router processes and routes each target of a CLIENT_STEERING request separately, and the route result of one target does not affect the potential route results of the other targets. With client.steering.forced.diversity set to "true", Traffic Router will track the set of edge caches being returned in the CLIENT_STEERING result as it processes and routes each target. Once an edge cache has been chosen for a target, that same edge cache should not be chosen for other targets in the same CLIENT_STEERING result. As a result, the final CLIENT_STEERING result will contain as many unique edge caches in the cachegroup as possible.
In the case of “deep” cachegroups, which might have fewer caches than the number of targets to route to, Traffic Router will start choosing edge caches from the “regular” cachegroup once all edges from the “deep” cachegroup have been chosen. If there are more targets than available edge caches in the cachegroup, Traffic Router will start including duplicate edge caches via the old, default behavior until edge caches have been chosen for all targets.
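The selection order described above can be sketched as follows. This is an illustrative simplification under stated assumptions, not Traffic Router's code: the names `route_diverse`, `first_unused`, and `default_pick` are hypothetical, caches are represented by plain host name strings, and the CRC32-based `default_pick` is only a stand-in for the existing per-target selection that is used once every cache has already been chosen.

```python
import zlib
from typing import Dict, Optional, Sequence, Set


def first_unused(caches: Sequence[str], used: Set[str]) -> Optional[str]:
    """Return the first cache not yet in the result, or None if all are used."""
    for cache in caches:
        if cache not in used:
            return cache
    return None


def default_pick(target: str, caches: Sequence[str]) -> str:
    """Stand-in for the old per-target selection (duplicates allowed)."""
    return caches[zlib.crc32(target.encode("utf-8")) % len(caches)]


def route_diverse(
    targets: Sequence[str],
    deep_caches: Sequence[str],
    regular_caches: Sequence[str],
) -> Dict[str, str]:
    """Pick one edge cache per target, avoiding repeats while possible."""
    if not regular_caches:
        raise ValueError("this sketch assumes a non-empty regular cachegroup")
    used: Set[str] = set()
    result: Dict[str, str] = {}
    for target in targets:
        # 1. Prefer an unused cache from the deep cachegroup (may be empty).
        cache = first_unused(deep_caches, used)
        # 2. Then an unused cache from the regular cachegroup.
        if cache is None:
            cache = first_unused(regular_caches, used)
        # 3. All caches used: fall back to the old behavior within the
        #    same regular cachegroup, which may repeat a cache.
        if cache is None:
            cache = default_pick(target, regular_caches)
        used.add(cache)
        result[target] = cache
    return result


result = route_diverse(
    ["target-a", "target-b", "target-c", "target-d"],
    deep_caches=["deep-1"],
    regular_caches=["edge-1", "edge-2"],
)
print(result)
```

With one deep cache and two regular caches, the first three targets receive three distinct caches, and the fourth target necessarily repeats one of the regular caches.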
n/a
n/a
The new client.steering.forced.diversity parameter will be documented in Traffic Router's profile parameter configuration section.
This could be tested via Traffic Router's integration test framework. Given a CLIENT_STEERING delivery service that would normally return only duplicate edges in the result, enable the new parameter and verify that the result contains unique edge caches as expected.
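The check described above could look roughly like the sketch below. It is not part of Traffic Router's integration test framework. The helper names and the example URLs are hypothetical, and the sketch assumes that the first label of each location's host name identifies the edge cache.

```python
from typing import List
from urllib.parse import urlsplit


def edge_hosts(locations: List[str]) -> List[str]:
    """Extract the assumed edge cache name from each location URL."""
    return [urlsplit(url).hostname.split(".")[0] for url in locations]


def is_as_diverse_as_possible(locations: List[str], available_caches: int) -> bool:
    """True when the result uses as many unique caches as it could.

    The best achievable count is limited by both the number of targets
    (one location each) and the number of caches available.
    """
    hosts = edge_hosts(locations)
    return len(set(hosts)) == min(len(hosts), available_caches)


# Hypothetical results for a two-target service with two caches available.
old_result = [
    "http://edge-01.target-a.cdn.example.test/video.m3u8",
    "http://edge-01.target-b.cdn.example.test/video.m3u8",
]
diverse_result = [
    "http://edge-01.target-a.cdn.example.test/video.m3u8",
    "http://edge-02.target-b.cdn.example.test/video.m3u8",
]

print(is_as_diverse_as_possible(old_result, available_caches=2))
print(is_as_diverse_as_possible(diverse_result, available_caches=2))
```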
No impact on Traffic Router performance is expected with this feature enabled.
Because CLIENT_STEERING results become more diverse, clients may experience some “first request” latency from “cold” caches that had not previously been serving the same requests.
n/a
This change only impacts Traffic Router, and by default (with the feature disabled or not configured) Traffic Router will continue following the existing non-diverse CLIENT_STEERING behavior. Only once the feature is enabled via the profile parameter will TR change its behavior to make CLIENT_STEERING results more diverse.
The recommended upgrade procedure would be to enable this feature via the profile parameter only after all TRs have been upgraded, so that all TRs can switch over to the new behavior at the same time.
Operators would need to know about the new TR_PROFILE parameter to enable this feature, but this feature can also be safely ignored if the default (non-diverse) CLIENT_STEERING behavior is sufficient.
This change will make the logic around finding available caches for a CLIENT_STEERING request slightly more complex, because the code will have to check for the new behavior flag and process the request accordingly. If this behavior later becomes the new default, we could remove the check for the flag and process all requests as diverse-enabled.
One alternative would be to make this a per-delivery-service setting by adding a new column to the deliveryservice table. We did not think that level of granularity was necessary, so we settled on per-CDN granularity by allowing the feature to be enabled via a TR_PROFILE parameter.
There was also the possibility of changing the default behavior of TR altogether instead of enabling it via a TR_PROFILE parameter, but we thought it would be desirable to be able to upgrade/deploy TRs without changing the behavior at the same time.
Another design choice to note is how to handle the case where there are more CLIENT_STEERING targets than available caches to choose from. If all the caches in a deep cachegroup have already been chosen for the same request, caches from the best regular cachegroup will be selected for the request until all targets have a selected cache. If all the caches in a regular cachegroup have already been chosen for the same request, TR will continue to select caches from that same cachegroup (as opposed to the next closest cachegroup or fallback) until all targets have a selected cache.
n/a
n/a