FEATURE INTRODUCTION
====================
Feature Name:
-------------
Congestion Control
Synopsis of Feature:
--------------------
The main purpose of this congestion control feature is to keep track
of which hosts are congested so that TS will not forward requests to
those congested hosts; instead, TS will send back to clients a
Retry-After response to tell them to retry congested hosts at a later
time.
CASE 1: Connection Failures
---------------------------
(1) For each request to a live (non-congested) server, TS will try at most m
times to connect to the server, and the timeout is n seconds for each try.
If TS does not succeed with m tries, then one connection failure is counted
towards the server.
Note that if a client aborts a request before a timeout occurs, it does not
count as a connection failure.
(2) A server is marked congested if there are more than M connection failures
within N seconds.
(3) If a server is marked congested, then TS will not forward requests to it
until the Proxy Retry After Time (PRAT), which is the time of marking + t.
(4) For a request to a congested server before the server's PRAT, TS sends
a Retry-After response to tell the client to retry the request after
Client Retry After Time (CRAT) (= PRAT - current time + T + a random
integer from 0 to alpha).
(5) For a request to a congested server after the server's PRAT time, TS will
try at most m' times to connect to the server, and the timeout is n'
seconds for each try.
(6) A congested server stays congested if TS cannot make a successful
connection; otherwise, the server becomes live again.
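As a worked illustration of step (4), the following C sketch computes the
Retry-After value sent to the client. The function and parameter names are
hypothetical and are not taken from the TS source. The random term is passed
in by the caller so that the arithmetic is easy to check.

```c
#include <assert.h>
#include <stdlib.h>

/* Seconds a client is told to wait: CRAT = PRAT - now + T + jitter,
 * where jitter is expected to lie in [0, alpha]. Hypothetical helper. */
long client_retry_after(long prat, long now, long client_wait, long jitter)
{
    assert(prat >= now);   /* only meaningful before the server's PRAT */
    assert(jitter >= 0);
    return (prat - now) + client_wait + jitter;
}

/* Draws the jitter term with rand(); assumes alpha >= 0. */
long random_jitter(long alpha)
{
    assert(alpha >= 0);
    return (long)(rand() % (alpha + 1));
}
```

With the suggested defaults (t=10, T=300, alpha=30), a request that arrives
4 seconds after the server was marked congested would be told to retry after
between 306 and 336 seconds.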
CASE 2: Maximum Number of Connections
-------------------------------------
TS will temporarily mark a server as congested if the number of connections
to the server reaches "max_connection". If a new client request comes in and
needs a new connection to the server, the client will get a 503 Retry-After
response back.
There is no PRAT for servers that have reached "max_connection".
Here a server can be identified by IP address (per_ip) or by host name
(per_host). For example, www.inktomi.com has two IP addresses,
209.131.63.206 and 209.131.63.207. If per_ip is used, then each IP address
has its own number of connection failures, and each IP address will be marked
congested or not by itself. That is, if 206 is marked congested (but 207 is
not), requests can still be forwarded to 207. On the other hand, if per_host is
used, then a connection failure to either 206 or 207 is counted toward the
number of connection failures of host www.inktomi.com. If the host
www.inktomi.com is marked congested, then essentially both 206 and 207 are
marked congested.
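The difference between the two schemes can be pictured as a choice of counter
key. The C sketch below is only an illustration; the enum and the function
name are hypothetical, and the real TS lookup key may be built differently.

```c
#include <assert.h>
#include <string.h>

typedef enum { PER_IP, PER_HOST } CongestionScheme;

/* Returns the string that identifies the failure counter for a request.
 * Hypothetical helper: per_ip keys on the selected address, per_host on
 * the host name, so all addresses of one host share a counter. */
const char *counter_key(CongestionScheme scheme, const char *host,
                        const char *ip)
{
    assert(host != NULL && ip != NULL);
    return scheme == PER_IP ? ip : host;
}
```

Under per_ip, 209.131.63.206 and 209.131.63.207 yield different keys; under
per_host, both yield www.inktomi.com.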
A prefix can also be used as a secondary specifier to narrow the scope of
congestion control to a part of a host (a service). For example,
dest_host=www.inktomi.com prefix=/cgi/search.exe
Such a rule can detect an outage of the cgi program or of the database it
depends on.
Each rule has an independent counter. Errors on requests to
www.inktomi.com/index.html do not count toward the counter of this rule.
A rule with prefix=/cgi/ means that all requests for objects under /cgi/
share one common counter with the specified parameters. It does not mean
that each URI under the directory has its own counter.
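A minimal C sketch of the prefix test, assuming a plain string-prefix
comparison on the URL path (the function name is hypothetical, and the actual
matching in TS may differ):

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when url_path falls under the rule's prefix, else 0. */
int prefix_matches(const char *url_path, const char *prefix)
{
    assert(url_path != NULL && prefix != NULL);
    return strncmp(url_path, prefix, strlen(prefix)) == 0;
}
```

Every path for which prefix_matches(path, "/cgi/") returns 1 would share the
single counter of the prefix=/cgi/ rule.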
The TS administrator can specify a customizable error_page. The error_page
can be customized to return the reason for the congestion (for example: "The
site is under maintenance") with the '503' response. The error_page can also
include the URL of the congested page and the Retry-After time.
ENGINEERING DESCRIPTION:
========================
Risk points of feature:
-----------------------
The set of "origin server connect attempts" configuration variables in
records.config will be affected by this feature. See the
"Management Implications" section below and the "Synopsis of Feature"
section above for more information.
Also, there are some known problematic/unexpected behaviors of this feature.
See the following "Problematic behavior" section.
Effect on SDK/API:
------------------
None.
Management Implications:
-----------------------
<Record Changes>
There are two new config variables: the first enables, disables, or tests
congestion control, and the second names the config file.
proxy.config.http.congestion_control.enabled INT 0|1|2
proxy.config.http.congestion_control.filename STRING congestion.config
<Statistics Changes>
1. Number of congestions because of connection failures
stat name: proxy.process.congestion.congested_on_conn_failures
2. Number of congestions because of max_connection reached
stat name: proxy.process.congestion.congested_on_max_connection
<Config File>
A new .config file "congestion.config" is used to specify the parameters for
different servers.
Each rule has one primary key to identify the servers. The primary key
can be one of:
dest_host=
dest_domain=
dest_ip=
host_regex=
Each rule can also have secondary keys. The secondary keys are:
prefix= // for different directory / service
port= // for different server ports
The tag=value pairs are used to specify the rules:
max_connection_failures=<integer> // M
fail_window=<integer> // N
proxy_retry_interval=<integer> // t
client_wait_interval=<integer> // T
wait_interval_alpha=<integer> // alpha
live_os_conn_timeout=<integer> // n
live_os_conn_retries=<integer> // m
dead_os_conn_timeout=<integer> // n'
dead_os_conn_retries=<integer> // m'
max_connection=<integer> // -1 means unlimited
error_page=<page uri>
congestion_scheme=per_ip|per_host
The suggested default values are as follows:
max_connection_failures=5
fail_window=120
proxy_retry_interval=10
client_wait_interval=300
wait_interval_alpha=30
live_os_conn_timeout=60
live_os_conn_retries=2
dead_os_conn_timeout=15
dead_os_conn_retries=1
max_connection=-1
error_page="congestion#retryAfter"
congestion_scheme="per_ip"
The above values are used as defaults for any tag that is not specified
in a rule.
The default values can be overridden by setting the records.config variables:
CONFIG proxy.config.http.congestion_control.default.<tag> <INT|STRING> <value>
The following "origin server connect attempts" configuration variables may
be affected by this congestion control feature:
proxy.config.http.connect_attempts_max_retries
proxy.config.http.connect_attempts_max_retries_dead_server
proxy.config.http.connect_attempts_rr_retries
proxy.config.http.connect_attempts_timeout
proxy.config.http.down_server.cache_time
proxy.config.http.down_server.abort_threshold
For a request to a server that does not have an applicable rule in
congestion.config, TS uses the values of these "origin server connect
attempts" variables. Otherwise, the corresponding values specified
in congestion.config override them.
<Alarm Changes>
Add two new alarm types to Traffic Manager:
1) MGMT_SIGNAL_HTTP_CONGESTED_SERVER
used to indicate a congested server
2) MGMT_SIGNAL_HTTP_ALLEVIATED_SERVER
used to indicate a congested server is no longer congested
These alarms are not processed like the other Traffic Manager alarms.
Whenever these alarms are signalled (even if they are repeat alarms),
*only* an SNMP trap is sent. Note that users can therefore be
flooded with SNMP traps if a congested server keeps signalling an alarm.
<Web UI Enhancement>
For configuration purposes, we will add a new "Congestion Control" tab to the
"Configure -> Networking -> Connection Management" section of the web UI.
Within this tab users can:
1. enable/disable the congestion control feature
2. edit the congestion.config file (which will be displayed in an HTML text box)
<Command-Line Interface Enhancement>
Use the traffic_line command-line interface to retrieve the congestion
statistics and monitoring information.
1. "traffic_line -r <statistic_name>"
Returns the value of the statistic specified
2. "traffic_line -q"
Returns a list of currently congested sites (one site per line);
for each congested site, displays the information in the following format:
'<time>|<rule #>|<hostname>|<ip_address>|<scheme>|<prefix>|<congestion reason>|<F#>|<M#>'
- time : congestion detected time
in seconds since 00:00:00 UTC, January 1, 1970.
- rule #
- hostname
- ip address
- scheme: per_ip or per_host
- prefix (if none, leave blank)
- congestion reason: M or F
M - congestion caused by exceeding max connections,
F - congestion caused by OS response timeout/failure
- F# number of congested requests because of F
- M# number of congested requests because of M
NOTE: To use "traffic_line -q", raf must be enabled and a raf port must be
specified. The default values are:
CONFIG proxy.config.raf.enabled INT 1
CONFIG proxy.config.raf.port INT 9000
If the raf port conflicts with another port, then change it by:
traffic_line -s "proxy.config.raf.port" <new-port>
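A script that consumes the short "traffic_line -q" format has to cope with a
blank prefix field. The C sketch below splits one output line on '|' while
keeping empty fields. The function name is hypothetical, the caller is
assumed to have stripped any trailing newline, and any sample line fed to it
is made up for illustration rather than taken from real output.

```c
#include <assert.h>
#include <stddef.h>

#define CONGEST_FIELDS 9  /* time, rule #, hostname, ip, scheme, prefix,
                             reason, F#, M# */

/* Splits line in place on '|' and stores up to max pointers in fields.
 * Empty fields (such as a blank prefix) are kept as empty strings,
 * which strtok() would silently skip. Returns the number of fields found. */
int split_congest_line(char *line, char *fields[], int max)
{
    int n = 0;
    char *start = line;
    assert(line != NULL && max > 0);
    for (char *p = line; ; p++) {
        if (*p == '|' || *p == '\0') {
            int at_end = (*p == '\0');
            *p = '\0';              /* terminate the current field */
            if (n < max)
                fields[n] = start;
            n++;
            if (at_end)
                break;
            start = p + 1;
        }
    }
    return n;
}
```

A return value other than CONGEST_FIELDS would indicate a line that is not in
the short format, for example a line in the long format.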
Engineering description of feature:
-----------------------------------
Data structure and algorithm:
Congestion Control Database (in memory and disk):
Using Multicache Implementation
struct CongestEntry {
unsigned int ip;
int hostname_offset;
int prefix_offset;
int last_failure; // time of the last connection failure
unsigned short fail_history[17]; // 17 entries, 16 bits each
unsigned int congestion_scheme; // per_ip | per_host
unsigned int congested; // 0 | 1
short max_connection;
short num_connection;
short max_connection_failures;
unsigned long num_congested; // reserved for per-server stat
};
For each server, TS uses an array of 17 entries to record the number of
connection failures. Each entry is 16 bits long and records the number of
connection failures for 1/16 of the fail_window. For example, for a
fail_window of 240 seconds, the granularity of recording is 240/16 = 15
seconds. That is, the first entry records the number of connection failures
from time t+0 to t+15, the second entry records the number of connection
failures from time t+15 to t+30, and so on. TS will mark the server as
congested if the sum of the 17 entries is greater than the specified
max_connection_failures.
Note that this algorithm counts the failures of the past 240 to 255 seconds,
depending on how far the current entry has progressed. For higher accuracy,
the number of entries would need to be increased.
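The bookkeeping described above can be sketched in C as a ring of 17
counters. This is a minimal sketch under stated assumptions, not the TS
implementation: the type and function names are hypothetical, times are
plain seconds, and each counter saturates rather than wraps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FAIL_ENTRIES 17

typedef struct {
    uint16_t counts[FAIL_ENTRIES]; /* failures per slot of the window */
    long slot_width;               /* seconds per slot: fail_window / 16 */
    long last_slot;                /* absolute number of the newest slot */
} FailHistory;

void fh_init(FailHistory *h, long fail_window, long now)
{
    assert(fail_window >= FAIL_ENTRIES - 1 && now >= 0);
    memset(h->counts, 0, sizeof h->counts);
    h->slot_width = fail_window / (FAIL_ENTRIES - 1);
    h->last_slot = now / h->slot_width;
}

/* Zeroes every slot that has aged out since the last call. */
static void fh_advance(FailHistory *h, long now)
{
    long steps = now / h->slot_width - h->last_slot;
    if (steps <= 0)
        return;
    long to_clear = steps > FAIL_ENTRIES ? FAIL_ENTRIES : steps;
    for (long i = 1; i <= to_clear; i++)
        h->counts[(h->last_slot + i) % FAIL_ENTRIES] = 0;
    h->last_slot += steps;
}

/* Sum of all 17 entries: failures in the last 16 to 17 slot widths. */
long fh_total(FailHistory *h, long now)
{
    long sum = 0;
    fh_advance(h, now);
    for (int i = 0; i < FAIL_ENTRIES; i++)
        sum += h->counts[i];
    return sum;
}

/* Records one failure and reports whether the server is now congested. */
int fh_record_failure(FailHistory *h, long now, long max_connection_failures)
{
    fh_advance(h, now);
    uint16_t *c = &h->counts[h->last_slot % FAIL_ENTRIES];
    if (*c < UINT16_MAX)
        (*c)++; /* saturate instead of wrapping at 65535 */
    return fh_total(h, now) > max_connection_failures;
}
```

With fail_window=240, a failure recorded at second 0 is still counted at
second 254 and is dropped at second 255, which matches the 240-to-255-second
window noted above.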
Operation:
----------
The following is an overview of the operation of this congestion control
feature in TS. After parsing a valid request, TS calls its HostDB module
to get the HostDBInfo record for the host name. If the host name has more
than one IP address, then TS selects one of them as usual.
Then, TS uses the host name, the selected IP address, and the request URL to
look up the first matching rule in congestion.config.
TS then looks up the CongestionDB record and handles one of four cases:
case 1: "congested" is true and
(current time <= "last_failure" + "proxy_retry_interval"):
TS sends to the client a Retry-After response.
case 2: "congested" is true and
(current time > "last_failure" + "proxy_retry_interval"):
TS makes a connection to this congested server.
case 3: "congested" is false and
(current connections >= "max_connection"):
TS sends to the client a Retry-After response.
case 4: "congested" is false and
(current connections < "max_connection"):
TS makes a connection to this non-congested server.
Various timeouts and max_retry numbers are set up according to the matched
rule in congestion.config.
If a connection failure is detected, TS updates the CongestionDB record. In
case 2, if the connection succeeds, TS marks the server live again.
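The four cases can be written as a single decision function. The C sketch
below is illustrative only: the enum and the function name are hypothetical,
the retry interval is the rule's proxy_retry_interval (t), and a negative
max_connection is treated as unlimited, following the "-1 means unlimited"
convention of congestion.config.

```c
#include <assert.h>

typedef enum {
    SEND_RETRY_AFTER,   /* cases 1 and 3 */
    CONNECT_CONGESTED,  /* case 2: try a congested server again */
    CONNECT_LIVE        /* case 4 */
} CongestionAction;

CongestionAction congestion_decide(int congested, long now, long last_failure,
                                   long proxy_retry_interval,
                                   int num_connection, int max_connection)
{
    assert(proxy_retry_interval >= 0);
    if (congested) {
        if (now <= last_failure + proxy_retry_interval)
            return SEND_RETRY_AFTER;      /* case 1 */
        return CONNECT_CONGESTED;         /* case 2 */
    }
    if (max_connection >= 0 && num_connection >= max_connection)
        return SEND_RETRY_AFTER;          /* case 3 */
    return CONNECT_LIVE;                  /* case 4 */
}
```

In this sketch the caller would apply the dead_os_conn_* settings of the
matched rule for CONNECT_CONGESTED and the live_os_conn_* settings for
CONNECT_LIVE.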
Problematic Behavior:
---------------------
For a host with multiple nicknames (aliases), TS might miscalculate the
number of failures.
For example,
www.berkeley.edu is a nickname for amber.berkeley.edu
amber.berkeley.edu has address 128.32.25.12.
In this simple case, if only www.berkeley.edu is specified in the rule and
the per_host scheme is used, failures of requests that use
amber.berkeley.edu as the host name are not counted.
Another problem is the granularity of connection failure recording. In
some cases, TS will mark a server congested that is not actually congested
according to the rules in congestion.config.
TS cannot distinguish a busy origin server from TS itself being busy.
Implementation Limits:
---------------------
1. granularity of connection failure records (17 entries).
2. maximum number of failures that can be recorded per entry
(16-bit counter, (1<<16) - 1 = 65535).
3. potential performance hit because of updating congestion info
(need to take locks to update the info)
Modules that need to be touched:
--------------------------------
ControlMatcher
one new primary field is added ---- host_regex
HttpSM / HttpTransact
for obvious reasons
KNOWN PROBLEM
=============
(1) The config filename for congestion control must be congestion.config.
This is a known bug in TS.
(2) The test mode,
proxy.config.http.congestion_control.enabled INT 2
is not implemented because of limited coding time.
TEST DESCRIPTIONS
=================
Test description:
-----------------
(1) Enable congestion control and specify rules for a few servers in
congestion.config. Then run (existing) tests to verify the
"origin server connect attempts" configuration variables are still
working for servers that are not specified in congestion.config.
(2) Enable congestion control and in congestion.config, specify a rule
for a server that can be controlled (up/down) and has only one IP
address. Verify that TS follows the rule for the server by sending
requests for the server through TS and controlling whether the server is
up or down.
(3) Specify a prefix rule and a dest_host rule on the same host, and
control the service specified by the prefix. Check that the prefix rule is
in effect.
(4) Repeat test (2) with a server with more than one IP address. Both
congestion schemes (per_ip and per_host) should be tested.
(5) Test all possible combinations of rules. Check the error_page and
error logs.
(6) Kill one of the servers connected to TS so that the server becomes
"congested". Ensure the congested alarm is signaled (check the
WebUI) and an SNMP trap is sent. Restart the dead
server so that it is alive again. Ensure the alleviated alarm is
signaled and an SNMP trap is sent.
Test tool:
----------
syntest could be a good candidate for functional tests.
For load tests, try jtest combined with syntest.
Test configurations:
--------------------
Change Log:
===========
Removed Feature(s):
*** Saving congestion control information to the disk.
New/Modified Feature(s):
*** traffic_line -q output format (short):
short format
<time>|<rule #>|<hostname>|<ip_address>|<scheme>|<prefix>|<congestion reason>|<F#>|<M#>
long format
<time>|<rule #>|<hostname>|<ip_address>|<scheme>|<prefix>|<congestion reason>|<F#>|<M#>|<local/GMT time>|<key>|<last_failure>|<num_fail_events>|<internal_ref count>|<num_connections>
- time : congestion detected time
in seconds since 00:00:00 UTC, January 1, 1970.
- rule #
- hostname
- ip address
- scheme: per_ip or per_host
- prefix (if none, leave blank)
- congestion reason: M or F
M - congestion caused by exceeding max connections,
F - congestion caused by OS response timeout/failure
- F# number of congested requests because of F
- M# number of congested requests because of M
- key: the internal key in congestion control table
- local/GMT time: YYYY/MM/DD hh:mm:ss
CONFIG proxy.config.http.congestion_control.localtime INT 1 //localtime format
CONFIG proxy.config.http.congestion_control.localtime INT 0 //GMT format
*** telnet localhost <Raf port>
0 congest list
-- list congested servers at the moment (short format)
0 congest list long [0-4]
-- list congested servers at the moment (long format)
0 query deadhosts
-- list congested servers at the moment
0 congest remove key=XXXXXXXXXXXXXX {key=XXXXXXXXXXXXXX}
-- remove the entries whose keys are listed
-- manually reactivate the congested server
0 congest remove host=<hostname>[/prefix]
0 congest remove ip=<xxx.xxx.xxx.xxx>[/prefix]
0 congest remove all
-- remove all entries in the congestion control internal table