tree 5f3cce7b4d89cdc537b5518dd326ddaf03e0a054
parent 708a6dec8905d7671d3bf70d9124472e52d30cdc
author Pawas Chhokra <pchhokra@linkedin.com> 1608837654 -0800
committer GitHub <noreply@github.com> 1608837654 -0800
gpgsig -----BEGIN PGP SIGNATURE-----
 
 wsBcBAABCAAQBQJf5OoWCRBK7hj4Ov3rIwAAdHIIAI5nOmJerGgn4LvY8CzUbNEY
 14P5WmSAkwmS3PfgXdd1Bwb9XjJa4tnajJkBm8gSrro8HvF9sEX+tsmB6gVSjpYZ
 y9YUKnAXkHCWPtrxox3JVT/i4XCDqOvgEdqA1G06EtMYzhH/dldxx8cklIibOq4T
 OBgOdH7YefnIepD860yDAVzUJjqdpnnBnqwlljBcFhmaBkn5GKH8WmUX3pWppouO
 MGeU4Wi0LAiMbRYbMLOigz52LgW2oWfXZyJvf2myp0l3vzdjMpOU8qVW2FpJH8mS
 rjFKAZuAlgeFjYT2uG0Cg1n3iZG9DYTMiaqBRhxDXnqSxZN3RXydbsQFAKYD6hI=
 =0BRP
 -----END PGP SIGNATURE-----
 

SAMZA-2605: Make Standby Container Requests Rack Aware (#1446)

Feature: This feature makes all standby container requests rack aware, so that each active container and its corresponding standby containers are always placed on different racks. This reduces application downtime during rack failures.

One requirement of this feature is that the value of job.standbytasks.replication.factor is at most 2; otherwise rack awareness is not honored.

Changes: This PR uses the FaultDomainManager interface for Yarn to request rack-aware nodes when making standby container requests.

Usage Instructions: For a job with host affinity and standby containers, set the config cluster-manager.fault-domain-aware.standby.enabled to true to enable this feature.
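
Example: A minimal sketch of a job config that satisfies the conditions above. The values are illustrative assumptions. job.host-affinity.enabled is assumed here to be the key that turns on host affinity; it is not named in this commit message.

```properties
# Assumed key for enabling host affinity (not named in this commit message)
job.host-affinity.enabled=true
# Must be at most 2 for rack awareness to be honored
job.standbytasks.replication.factor=2
# Enables rack-aware standby container requests
cluster-manager.fault-domain-aware.standby.enabled=true
```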