mod_proxy_beacon
Self-registering reverse-proxy balancer membership: backend servers announce
themselves to a front-end proxy over UDP datagrams, and the proxy adds, enables,
and evicts them as live balancer members. See
docs/manual/mod/mod_proxy_beacon.xml for the user-facing directive reference;
this file describes the architecture.
The transport is plain APR UDP sockets -- small, periodic, fire-and-forget
announcements with no third-party dependency.
Dependencies:
Requires mod_watchdog and mod_proxy_balancer. No external library: the UDP
transport uses APR sockets and authentication uses SipHash from APR-util (no
OpenSSL dependency).
Transport (unicast UDP, not multicast):
Each backend sends a point-to-point UDP datagram (apr_socket_sendto) to the
proxy's configured address. This is deliberately UNICAST -- unlike
mod_heartbeat, which multicasts to a group. Multicast is commonly filtered
between network segments and is not routed across the public Internet, so a
cross-host control plane must be unicast. The socket-handling pattern is adapted from
mod_heartmonitor's unicast receive path.
Datagrams are fire-and-forget. The design tolerates UDP loss and reordering
without a reconnect or framing layer: a lost announcement just delays an update
by one interval (announcements are periodic and idempotent), and an out-of-order
datagram is rejected by the per-url monotonic timestamp (see Authentication).
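As an illustration only (the module itself is C and sends with
apr_socket_sendto), the fire-and-forget unicast pattern can be sketched with
Python's standard socket module. The function names, the loopback default, and
the sample payload used with them are assumptions made for this sketch, not
part of the module.

```python
import socket


def open_receiver(host="127.0.0.1", port=0):
    """Bind a UDP socket; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(2.0)  # a lost datagram must not block the caller forever
    return sock


def send_announcement(dest, message):
    """Send one point-to-point datagram and return without awaiting a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # One sendto() call is one datagram: no connection, no framing layer.
        return sock.sendto(message.encode("ascii"), dest)
```

The sender never learns whether the datagram arrived; the next periodic
announcement is the only retry.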
Roles:
  Backend = sender (ProxyBeaconAddress): periodically sends announcements to
    the proxy, advertising its own routable URL (ProxyBeaconAdvertise).
  Proxy = receiver (ProxyBeaconListen): binds a stable address, receives
    announcements, and manages balancer://<name> (ProxyBeaconBalancer).
The proxy is the fixed unicast rendezvous; each backend is configured with its
address. One proxy receiver serves many backend senders. A leading scheme in
an address (e.g. tcp://) is accepted and ignored.
ProxyBeaconListen's host and port are both optional: an omitted host/port (or
no argument at all) is inherited from the server's own address/port. Since UDP
and TCP are independent port spaces, the beacon socket can share the server's
service port without colliding with its TCP listener -- so a backend can beacon
to the proxy at its real service endpoint. (The listener binds in an
unprivileged watchdog child, so privileged ports like 80/443 cannot be shared
this way.) The sender's ProxyBeaconAddress targets the proxy and must be given
in full.
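The defaulting rules for ProxyBeaconListen can be sketched as follows. This is
a hypothetical helper, not the module's parser: the function name, the
"host:port" / ":port" / "host" argument shapes, and the lack of bracketed IPv6
handling are all assumptions made to keep the example short.

```python
def resolve_listen(arg, server_host, server_port):
    """Fill in whatever the argument omits from the server's own address."""
    if not arg:
        # No argument at all: inherit both host and port.
        return server_host, server_port
    # A leading scheme such as "tcp://" is accepted and ignored.
    _scheme, sep, rest = arg.partition("://")
    addr = rest if sep else arg
    host, sep, port = addr.rpartition(":")
    if not sep:
        # No colon: treat the whole token as a host (assumption).
        host, port = addr, ""
    return (host or server_host), (int(port) if port else server_port)
```

For example, with a server at 192.0.2.10:8080, ":9000" resolves to
("192.0.2.10", 9000) and "tcp://192.0.2.20:9000" to ("192.0.2.20", 9000).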
Where it runs:
All background work runs in a single mod_watchdog SINGLETON callback -- exactly
one child process owns the socket (so only one process binds the receive port
and performs membership changes). The watchdog fires every ~100ms:
  STARTING  open the UDP socket; receiver binds, sender resolves its
            destination.
  RUNNING   sender: send a throttled announcement (ProxyBeaconInterval).
            receiver: drain the socket, then run a throttled (~1s) eviction
            sweep.
  STOPPING  close the socket.
Because this relies on a singleton watchdog child, it is inactive under the
prefork MPM.
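The callback's dispatch and throttling can be modelled in a few lines. This is
a simplified simulation, not the mod_watchdog API: the class name, the string
state names, the integer-millisecond clock, and the action log are assumptions
made for the sketch.

```python
class BeaconTask:
    """Toy model of the singleton watchdog callback for one role."""

    def __init__(self, role, interval_ms=1000, sweep_ms=1000):
        self.role = role                # "sender" or "receiver"
        self.interval_ms = interval_ms  # stands in for ProxyBeaconInterval
        self.sweep_ms = sweep_ms        # the ~1s eviction-sweep throttle
        self.socket_open = False
        self.last_run_ms = None
        self.log = []

    def _due(self, now_ms, period_ms):
        # Throttle: the callback fires often, the real work runs rarely.
        if self.last_run_ms is not None and now_ms - self.last_run_ms < period_ms:
            return False
        self.last_run_ms = now_ms
        return True

    def callback(self, state, now_ms):
        if state == "STARTING":
            self.socket_open = True
            self.log.append("bind" if self.role == "receiver" else "resolve")
        elif state == "RUNNING" and self.socket_open:
            if self.role == "sender":
                if self._due(now_ms, self.interval_ms):
                    self.log.append("announce")
            else:
                self.log.append("drain")  # every tick, so datagrams never pile up
                if self._due(now_ms, self.sweep_ms):
                    self.log.append("sweep")
        elif state == "STOPPING":
            self.socket_open = False
            self.log.append("close")
```

Driving a sender with a tick every 100 ms and a 1000 ms interval yields one
announcement per second, however often the callback itself fires.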
Adding/removing members (reuses the balancer-manager path):
Runtime worker *removal* does not exist in httpd, so a worker is added once and
never deleted; eviction and re-admission are status-flag changes. All three
operations go through mod_proxy_balancer's exported balancer_manage() optional
function -- the same code the balancer-manager web UI calls. The receiver
synthesizes a minimal request_rec (no real client request exists in the
watchdog thread; see beacon_make_fake_request) and calls:
  add     b=<name> b_nwrkr=<url> b_wyes=1    (worker created, disabled)
  enable  b=<name> w=<url> w_status_D=0      (cleared -> serves traffic)
  evict   b=<name> w=<url> w_status_D=1      (disabled -> out of rotation)
Adds bump the balancer's shm "updated" timestamp, so every other child picks
up the new worker through the normal ap_proxy_sync_balancer() path -- the
change made in the singleton propagates fleet-wide. The target balancer must
have spare slots (ProxySet growth / BalancerGrowth).
Per-backend state (last-seen time, added/evicted flags) is kept in a hash in
the singleton's own pool -- no shared memory is needed, since the singleton is
the only process that adds or evicts. A re-announcement after eviction
re-enables the worker; the flags make these transitions idempotent.
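The per-backend bookkeeping can be sketched as a small state table that turns
announcements and sweeps into parameter sets for balancer_manage(). The class
and method names, the plain dicts standing in for parameter tables, and the
timeout argument are hypothetical; the text above does not name the eviction
timeout, so it is an assumption here.

```python
class Membership:
    """Tracks last-seen time and added/evicted flags for each backend URL."""

    def __init__(self, balancer, timeout):
        self.balancer = balancer
        self.timeout = timeout  # hypothetical: silence tolerated before eviction
        self.backends = {}      # url -> {"last_seen", "added", "evicted"}

    def on_announce(self, url, now):
        """Return the parameter sets to pass on for this announcement."""
        calls = []
        state = self.backends.setdefault(
            url, {"last_seen": now, "added": False, "evicted": False})
        if not state["added"]:
            # First sighting: create the worker (disabled), then enable it.
            calls.append({"b": self.balancer, "b_nwrkr": url, "b_wyes": "1"})
            calls.append({"b": self.balancer, "w": url, "w_status_D": "0"})
            state["added"] = True
        elif state["evicted"]:
            # Re-announcement after eviction: clear the disabled flag again.
            calls.append({"b": self.balancer, "w": url, "w_status_D": "0"})
            state["evicted"] = False
        state["last_seen"] = now
        return calls

    def sweep(self, now):
        """Disable every live backend that has been silent for too long."""
        calls = []
        for url, state in self.backends.items():
            if not state["evicted"] and now - state["last_seen"] > self.timeout:
                calls.append({"b": self.balancer, "w": url, "w_status_D": "1"})
                state["evicted"] = True
        return calls
```

Because the flags record what has already been done, a repeated announcement
or a repeated sweep produces no further calls.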
Message format:
Plain ASCII, space-separated key=value tokens, one datagram per announcement:
  BEACON url=http://host:port host=<h> pid=<n> seq=<n> ts=<usec> mac=<hex>
url= is the routable backend origin the proxy adds as a BalancerMember. ts=
(microseconds since the epoch) and mac= are present only when a shared secret
is configured (see below). host=/pid=/seq= are informational.
A UDP datagram is delivered whole, so the receiver reads each message into a
fixed stack buffer and NUL-terminates it (buf[len] = '\0') before parsing.
The sender transmits strlen(msg) bytes (no trailing NUL on the wire).
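A minimal sketch of building and parsing the unauthenticated form of the
message follows. The function names are hypothetical, and the handling of
malformed input (returning None for a wrong tag, a token without "=", or a
missing url=) is an assumption; the text above does not specify how the
receiver treats such datagrams.

```python
def build_announcement(url, host, pid, seq):
    """Format one announcement; the caller sends it without a trailing NUL."""
    return f"BEACON url={url} host={host} pid={pid} seq={seq}"


def parse_announcement(datagram):
    """Parse one whole datagram into a dict of its key=value tokens."""
    text = datagram.decode("ascii", errors="replace")
    tokens = text.split()
    if not tokens or tokens[0] != "BEACON":
        return None
    fields = {}
    for token in tokens[1:]:
        # Split on the first "=" only, so a URL may itself contain "=".
        key, sep, value = token.partition("=")
        if not sep or not key:
            return None
        fields[key] = value
    # url= is the only field the receiver acts on; the rest is informational.
    return fields if "url" in fields else None
```

Values come back as strings; pid= and seq= are informational, so the sketch
does not convert them.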
Authentication (ProxyBeaconSecret, optional but recommended):
Without a secret the channel is unauthenticated: anyone who can reach the
receiver port could announce a URL and hijack client traffic (the proxy logs a
warning at startup in this case). A UDP source address is trivially spoofable,
so authentication matters at least as much here as it would over a
connection-oriented transport such as TCP.
With ProxyBeaconSecret set identically on the proxy and all backends, each
announcement is signed with a SipHash-2-4 MAC over the message prefix (the key
is derived from the passphrase via apr_md5, as in mod_session_crypto). The
receiver recomputes the MAC (constant-time compare) and rejects anything that
does not match -- before the URL is parsed or acted on. Replay protection has
two parts, both keyed on the signed ts= (microseconds): (1) a freshness window
(ProxyBeaconMaxSkew) rejects messages whose timestamp is too far from now,
which assumes proxy and backend clocks are roughly synchronized (NTP); (2) a
per-url high-water mark rejects any ts= that does not strictly exceed the last
one accepted for that url, so a byte-identical replay (same ts, same MAC) --
e.g. replaying a dead backend's last announcement to keep it from being evicted
-- is dropped even within the freshness window. Microsecond granularity lets
genuine announcements sent less than a second apart still advance the mark.
seq= is NOT used for replay protection because it resets when a backend
restarts; the wall-clock ts= does not.
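The sign, verify, and replay checks can be sketched end to end. This is not
the module's code: it is a pure-Python SipHash-2-4 plus hypothetical helpers
(derive_key, sign, verify, ReplayGuard, receive). The exact span covered by
the MAC, the 16-digit lowercase hex encoding, and the passphrase encoding are
assumptions; only the MD5-derived 16-byte key, the constant-time compare, the
freshness window, and the per-url high-water mark come from the text above.

```python
import hashlib
import hmac
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def _rotl(x, bits):
    return ((x << bits) | (x >> (64 - bits))) & MASK64


def siphash24(key, data):
    """SipHash-2-4 of data under a 16-byte key, as a 64-bit integer."""
    k0, k1 = struct.unpack("<QQ", key)
    v = [k0 ^ 0x736F6D6570736575, k1 ^ 0x646F72616E646F6D,
         k0 ^ 0x6C7967656E657261, k1 ^ 0x7465646279746573]

    def sipround():
        v[0] = (v[0] + v[1]) & MASK64
        v[1] = _rotl(v[1], 13) ^ v[0]
        v[0] = _rotl(v[0], 32)
        v[2] = (v[2] + v[3]) & MASK64
        v[3] = _rotl(v[3], 16) ^ v[2]
        v[0] = (v[0] + v[3]) & MASK64
        v[3] = _rotl(v[3], 21) ^ v[0]
        v[2] = (v[2] + v[1]) & MASK64
        v[1] = _rotl(v[1], 17) ^ v[2]
        v[2] = _rotl(v[2], 32)

    # Zero-pad to 8-byte words; the last byte carries the length mod 256.
    padded = data + b"\x00" * (7 - len(data) % 8) + bytes([len(data) & 0xFF])
    for (m,) in struct.iter_unpack("<Q", padded):
        v[3] ^= m
        sipround()
        sipround()
        v[0] ^= m
    v[2] ^= 0xFF
    for _ in range(4):
        sipround()
    return v[0] ^ v[1] ^ v[2] ^ v[3]


def derive_key(passphrase):
    # MD5 yields exactly the 16 bytes a SipHash key needs.
    return hashlib.md5(passphrase.encode("utf-8")).digest()


def sign(prefix, ts_usec, key):
    body = f"{prefix} ts={ts_usec}"
    return f"{body} mac={siphash24(key, body.encode('utf-8')):016x}"


def verify(message, key):
    """Return the signed body if the MAC matches, else None."""
    body, sep, mac_hex = message.rpartition(" mac=")
    if not sep:
        return None
    expected = f"{siphash24(key, body.encode('utf-8')):016x}"
    ok = hmac.compare_digest(expected.encode(), mac_hex.encode())  # constant time
    return body if ok else None


class ReplayGuard:
    def __init__(self, max_skew_usec):
        self.max_skew_usec = max_skew_usec  # stands in for ProxyBeaconMaxSkew
        self.high_water = {}                # url -> last accepted ts

    def accept(self, url, ts_usec, now_usec):
        if abs(now_usec - ts_usec) > self.max_skew_usec:
            return False  # outside the freshness window
        if ts_usec <= self.high_water.get(url, -1):
            return False  # replayed or reordered
        self.high_water[url] = ts_usec
        return True


def receive(message, key, guard, now_usec):
    """Return the announced URL, or None if any check fails."""
    body = verify(message, key)  # MAC first, before the URL is looked at
    if body is None:
        return None
    fields = dict(tok.partition("=")[::2] for tok in body.split()[1:])
    try:
        ts_usec = int(fields["ts"])
    except (KeyError, ValueError):
        return None
    url = fields.get("url")
    if not url or not guard.accept(url, ts_usec, now_usec):
        return None
    return url
```

The high-water mark is updated only after the MAC and the freshness window
have both passed, so a forged or stale datagram cannot advance it and lock out
the genuine backend.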
Announcements are authenticated, not encrypted -- the payload is operational
metadata, not secret. For transport confidentiality, DTLS would be a separate,
orthogonal future layer.