blob: d1c3c9f20c1a623ce12449adf23a2ef9148df7fe [file] [log] [blame]
Hadoop YARN Change Log
Release 2.6.5 - 2016-10-08
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
YARN-4690. Skip object allocation in FSAppAttempt#getResourceUsage when
possible (Ming Ma via sjlee)
YARN-5262. Optimize sending RMNodeFinishedContainersPulledByAMEvent for
every AM heartbeat (Rohith Sharma K S via jlowe)
YARN-5483. Optimize RMAppAttempt#pullJustFinishedContainers (sandflee via
jlowe)
YARN-5009. NMLeveldbStateStoreService database can grow substantially
leading to longer recovery times (jlowe)
BUG FIXES
YARN-2046. Out of band heartbeats are sent only on container kill and
possibly too early (Ming Ma via jlowe)
YARN-4722. AsyncDispatcher logs redundant event queue sizes (Jason Lowe via
sjlee)
YARN-4761. NMs reconnecting with changed capabilities can lead to wrong cluster
resource calculations on fair scheduler. (Sangjin Lee via zxu)
YARN-4785. inconsistent value type of the "type" field for LeafQueueInfo in
response of RM REST API. (Varun Vasudev via junping-du)
YARN-4773. Log aggregation performs extraneous filesystem operations when
rolling log aggregation is disabled (Jun Gong via jlowe)
YARN-4459. container-executor should only kill process groups (Jun Gong
via jlowe)
YARN-5206. RegistrySecurity includes id:pass in exception text if
considered invalid (Steve Loughran via jlowe)
YARN-5197. RM leaks containers if running container disappears from
node update. Contributed by Jason Lowe.
YARN-4393. Fix intermittent test failure for TestResourceLocalizationService
#testFailedDirsResourceRelease (Varun Saxana via rohithsharmaks)
YARN-5353. ResourceManager can leak delegation tokens when they are shared across
apps. (Jason Lowe via Varun Saxena)
YARN-5462. TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown
fails intermittently (Eric Badger via jlowe)
YARN-4573. Fix test failure in TestRMAppTransitions#testAppRunningKill and
testAppKilledKilled. (Takashi Ohnishi via rohithsharmaks)
YARN-4794. Deadlock in NMClientImpl (Jian He via Rohith Sharma K S)
Release 2.6.4 - 2016-02-11
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
YARN-2975. FSLeafQueue app lists are accessed without required locks. (kasha)
YARN-3857. Memory leak in ResourceManager with SIMPLE mode. (mujunchao via zxu)
YARN-3535. Scheduler must re-request container resources when RMContainer transitions
from ALLOCATED to KILLED. (rohithsharma and peng.zhang via asuresh)
YARN-4452. NPE when submit Unmanaged application. (Naganarasimha G R
via junping_du)
YARN-3893. Both RM in active state when Admin#transitionToActive failure
from refeshAll() (Bibin A Chundatt via rohithsharmaks)
YARN-3697. FairScheduler: ContinuousSchedulingThread can fail to shutdown.
(Zhihai Xu via kasha)
YARN-4546. ResourceManager crash due to scheduling opportunity overflow.
(Jason Lowe via junping_du)
YARN-2902. Killing a container that is localizing can orphan resources in
the DOWNLOADING state (Varun Saxena via jlowe)
YARN-4354. Public resource localization fails with NPE. (Jason Lowe via
junping_du)
YARN-4380. TestResourceLocalizationService.testDownloadingResourcesOnContainerKill
fails intermittently. (Varun Saxena via ozawa)
YARN-4180. AMLauncher does not retry on failures when talking to NM.
(adhoot)
YARN-3849. Too much of preemption activity causing continuos killing of
containers across queues. (Sunil G via wangda)
YARN-3842. NMProxy should retry on NMNotYetReadyException.
(Robert Kanter via kasha)
YARN-3695. ServerProxy (NMProxy, etc.) shouldn't retry forever for non
network exception. (Raju Bairishetti via jianhe)
YARN-4414. Nodemanager connection errors are retried at multiple levels
(Chang Li via jlowe)
YARN-4581. AHS writer thread leak makes RM crash while RM is recovering.
(sandflee via junping_du)
YARN-4598. Invalid event: RESOURCE_FAILED at
CONTAINER_CLEANEDUP_AFTER_KILL (tangshangwen via jlowe)
Release 2.6.3 - 2015-12-17
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
YARN-2859. ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster.
(Vinod Kumar Vavilapalli via xgong)
YARN-4312. TestSubmitApplicationWithRMHA fails on branch-2.7 and branch-2.6
as some of the test cases time out. (Varun Saxena via ozawa)
YARN-4320. TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer
binds to default port 8188. (Varun Saxena via ozawa)
YARN-4326. Fix TestDistributedShell timeout as AHS in MiniYarnCluster no longer
binds to default port 8188. (Meng Ding via wangda)
YARN-4241. Fix typo of property name in yarn-default.xml.
(Anthony Rojas via aajisaka)
YARN-4344. NMs reconnecting with changed capabilities can lead to wrong
cluster resource calculations (Varun Vasudev via jlowe)
YARN-3925. ContainerLogsUtils#getContainerLogFile fails to read container
log files from full disks. (zhihai xu via jlowe)
YARN-3878. AsyncDispatcher can hang while stopping if it is configured for
draining events on stop. (Varun Saxena via kasha)
YARN-4365. FileSystemNodeLabelStore should check for root dir existence on
startup (Kuhu Shukla via jlowe)
YARN-4348. ZKRMStateStore.syncInternal shouldn't wait for sync completion for
avoiding blocking ZK's event thread. (ozawa)
YARN-4424. Fix deadlock in RMAppImpl. (Jian he via wangda)
YARN-4434. NodeManager Disk Checker parameter documentation is not correct.
(Weiwei Yang via aajisaka)
Release 2.6.2 - 2015-10-28
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
YARN-3727. For better error recovery, check if the directory exists before
using it for localization. (Zhihai Xu via jlowe)
YARN-4092. Fixed UI redirection to print useful messages when both RMs are
in standby mode. (Xuan Gong via jianhe)
YARN-4101. RM should print alert messages if Zookeeper and Resourcemanager
gets connection issue. (Xuan Gong via jianhe)
OPTIMIZATIONS
BUG FIXES
YARN-2019. Retrospect on decision of making RM crashed if any exception throw
in ZKRMStateStore. (Jian He via junping_du)
YARN-4087. Followup fixes after YARN-2019 regarding RM behavior when
state-store error occurs. (Jian He via xgong)
YARN-3554. Default value for maximum nodemanager connect wait time is too
high (Naganarasimha G R via jlowe)
YARN-4005. Completed container whose app is finished is possibly not
removed from NMStateStore. (Jun Gong via jianhe)
YARN-3780. Should use equals when compare Resource in RMNodeImpl#ReconnectNodeTransition.
(zhihai xu via devaraj)
YARN-3802. Two RMNodes for the same NodeId are used in RM sometimes
after NM is reconnected. (zhihai xu via xgong)
YARN-3194. RM should handle NMContainerStatuses sent by NM while
registering if NM is Reconnected node (Rohith via jlowe)
YARN-3896. RMNode transitioned from RUNNING to REBOOTED because its response id
has not been reset synchronously. (Jun Gong via rohithsharmaks)
YARN-3798. ZKRMStateStore shouldn't create new session without occurrance of
SESSIONEXPIED. (ozawa and Varun Saxena)
Release 2.6.1 - 2015-09-23
INCOMPATIBLE CHANGES
NEW FEATURES
YARN-3249. Add a 'kill application' button to Resource Manager's Web UI.
(Ryu Kobayashi via ozawa)
IMPROVEMENTS
YARN-3230. Clarify application states on the web UI. (Jian He via wangda)
YARN-1809. Synchronize RM and TimeLineServer Web-UIs. (Zhijie Shen and
Xuan Gong via jianhe)
YARN-3092. Created a common ResourceUsage class to track labeled resource
usages in Capacity Scheduler. (Wangda Tan via jianhe)
YARN-3098. Created common QueueCapacities class in Capacity Scheduler to
track capacities-by-labels of queues. (Wangda Tan via jianhe)
YARN-2301. Improved yarn container command. (Naganarasimha G R via jianhe)
YARN-3978. Configurably turn off the saving of container info in Generic AHS
(Eric Payne via jeagles)
YARN-3248. Display count of nodes blacklisted by apps in the web UI.
(Varun Vasudev via xgong)
OPTIMIZATIONS
BUG FIXES
YARN-2856. Fixed RMAppImpl to handle ATTEMPT_KILLED event at ACCEPTED state
on app recovery. (Rohith Sharmaks via jianhe)
YARN-2816. NM fail to start with NPE during container recovery (Zhihai Xu
via jlowe)
YARN-2414. RM web UI: app page will crash if app is failed before any
attempt has been created (Wangda Tan via jlowe)
YARN-2865. Fixed RM to always create a new RMContext when transtions from
StandBy to Active. (Rohith Sharmaks via jianhe)
YARN-2906. CapacitySchedulerPage shows HTML tags for a queue's Active Users.
(Jason Lowe via jianhe)
YARN-2905. AggregatedLogsBlock page can infinitely loop if the aggregated
log file is corrupted (Varun Saxena via jlowe)
YARN-2894. Fixed a bug regarding application view acl when RM fails over.
(Rohith Sharmaks via jianhe)
YARN-2874. Dead lock in "DelegationTokenRenewer" which blocks RM to execute
any further apps. (Naganarasimha G R via kasha)
YARN-2910. FSLeafQueue can throw ConcurrentModificationException.
(Wilfred Spiegelenburg via kasha)
YARN-2917. Fixed potential deadlock when system.exit is called in AsyncDispatcher
(Rohith Sharmaks via jianhe)
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie)
(Jian He via jlowe)
YARN-1984. LeveldbTimelineStore does not handle db exceptions properly
(Varun Saxena via jlowe)
YARN-2952. Fixed incorrect version check in StateStore. (Rohith Sharmaks
via jianhe)
YARN-2340. Fixed NPE when queue is stopped during RM restart.
(Rohith Sharmaks via jianhe)
YARN-2992. ZKRMStateStore crashes due to session expiry. (Karthik Kambatla
via jianhe)
YARN-2922. ConcurrentModificationException in CapacityScheduler's LeafQueue.
(Rohith Sharmaks via ozawa)
YARN-2997. Fixed NodeStatusUpdater to not send alreay-sent completed
container statuses on heartbeat. (Chengbing Liu via jianhe)
YARN-3011. Possible IllegalArgumentException in ResourceLocalizationService
might lead NM to crash. (Varun Saxena via jianhe)
YARN-3103. AMRMClientImpl does not update AMRM token properly. (Jason Lowe
via jianhe)
YARN-3094. Reset timer for liveness monitors after RM recovery. (Jun Gong
via jianhe)
YARN-2246. Made the proxy tracking URL always be
http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections. (Devaraj
K via zjshen)
YARN-3207. Secondary filter matches entites which do not have the key being
filtered for. (Zhijie Shen via xgong)
YARN-3238. Connection timeouts to nodemanagers are retried at
multiple levels (Jason Lowe via xgong)
YARN-3239. WebAppProxy does not support a final tracking url which has
query fragments and params (Jian He via jlowe)
YARN-3222. Fixed RMNode to send scheduler events in sequential order when a
node reconnects. (Rohith Sharma K S via jianhe)
YARN-3231. FairScheduler: Changing queueMaxRunningApps interferes with pending
jobs. (Siqi Li via kasha)
YARN-3242. Asynchrony in ZK-close can lead to ZKRMStateStore watcher receiving
events for old client. (Zhihai Xu via kasha)
YARN-3227. Timeline renew delegation token fails when RM user's TGT is expired
(Zhijie Shen via xgong)
YARN-3287. Made TimelineClient put methods do as the correct login context.
(Daryn Sharp and Jonathan Eagles via zjshen)
YARN-3267. Timelineserver applies the ACL rules after applying the limit on
the number of records (Chang Li via jeagles)
YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die.
(Brahma Reddy Battula via wangda)
YARN-3393. Getting application(s) goes wrong when app finishes before
starting the attempt. (Zhijie Shen via xgong)
YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token
renewal of applications part of a bigger workflow. (Daryn Sharp via vinodkv)
YARN-3493. RM fails to come up with error "Failed to load/recover state"
when mem settings are changed. (Jian He via wangda)
YARN-3487. CapacityScheduler scheduler lock obtained unnecessarily when
calling getQueue (Jason Lowe via wangda)
YARN-3024. LocalizerRunner should give DIE action when all resources are
localized. (Chengbing Liu via xgong)
YARN-3464. Race condition in LocalizerRunner kills localizer before
localizing all resources. (Zhihai Xu via kasha)
YARN-3641. NodeManager: stopRecoveryStore() shouldn't be skipped when
exceptions happen in stopping NM's sub-services. (Junping Du via jlowe)
YARN-3526. ApplicationMaster tracking URL is incorrectly redirected
on a QJM cluster. (Weiwei Yang via xgong)
YARN-2766. Made ApplicationHistoryManager return a sorted list of apps,
attempts and containers. (Robert Kanter via zjshen)
YARN-3700. Made generic history service load a number of latest applications
according to the parameter or the configuration. (Xuan Gong via zjshen)
YARN-2900. Application (Attempt and Container) Not Found in AHS results
in InternalServer Error (500). (Zhijie Shen and Mit Desai via xgong)
YARN-3725. App submission via REST API is broken in secure mode due to
Timeline DT service address is empty. (Zhijie Shen via wangda)
YARN-3585. NodeManager cannot exit on SHUTDOWN event triggered and NM
recovery is enabled (Rohith Sharmaks via jlowe)
YARN-3832. Resource Localization fails on a cluster due to existing cache
directories (Brahma Reddy Battula via jlowe)
YARN-3850. NM fails to read files from full disks which can lead to
container logs being lost and other issues (Varun Saxena via jlowe)
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when
Node is connected/disconnected (Bibin A Chundatt via jlowe)
YARN-2637. Fixed max-am-resource-percent calculation in CapacityScheduler
when activating applications. (Craig Welch via jianhe)
YARN-3733. Fix DominantRC#compare() does not work as expected if
cluster resource is empty. (Rohith Sharmaks via wangda)
YARN-2920. Changed CapacityScheduler to kill containers on nodes where
node labels are changed. (Wangda Tan via jianhe)
YARN-2978. Fixed potential NPE while getting queue info. (Varun Saxena via
jianhe)
YARN-3099. Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage
to track used-resources-by-label.(Wangda Tan via jianhe)
YARN-2694. Ensure only single node label specified in ResourceRequest.
(Wangda Tan via jianhe)
YARN-3124. Fixed CS LeafQueue/ParentQueue to use QueueCapacities to track
capacities-by-label. (Wangda Tan via jianhe)
YARN-2918. RM should not fail on startup if queue's configured labels do
not exist in cluster-node-labels. (Wangda Tan via jianhe)
YARN-3999. RM hangs on draing events. (Jian He via xgong)
YARN-4047. ClientRMService getApplications has high scheduler lock contention.
(Jason Lowe via jianhe)
YARN-1884. Added nodeHttpAddress into ContainerReport and fixed the link to NM
web page. (Xuan Gong via zjshen)
YARN-3171. Sort by Application id, AppAttempt and ContainerID doesn't work
in ATS / RM web ui. (Naganarasimha G R via xgong)
YARN-3740. Fixed the typo in the configuration name:
APPLICATION_HISTORY_PREFIX_MAX_APPS. (Xuan Gong via zjshen)
YARN-3544. Got back AM logs link on the RM web UI for a completed app.
(Xuan Gong via zjshen)
YARN-2890. MiniYarnCluster should turn on timeline service if
configured to do so. (Mit Desai via hitesh)
YARN-3154. Added additional APIs in LogAggregationContext to avoid aggregating
running logs of application when rolling is enabled. (Xuan Gong via vinodkv)
Release 2.6.0 - 2014-11-18
INCOMPATIBLE CHANGES
NEW FEATURES
YARN-1964. Create Docker analog of the LinuxContainerExecutor in YARN. (Abin
Shahab via raviprak)
YARN-2131. Add a way to format the RMStateStore. (Robert Kanter via kasha)
YARN-1367. Changed NM to not kill containers on NM resync if RM work-preserving
restart is enabled. (Anubhav Dhoot via jianhe)
YARN-1366. Changed AMRMClient to re-register with RM and send outstanding requests
back to RM on work-preserving RM restart. (Rohith via jianhe)
YARN-2181. Added preemption info to logs and RM web UI. (Wangda Tan via
jianhe)
YARN-1354. Recover applications upon nodemanager restart. (Jason Lowe via
junping_du)
YARN-1337. Recover containers upon nodemanager restart. (Jason Lowe via
junping_du)
YARN-2277. Added cross-origin support for the timeline server web services.
(Jonathan Eagles via zjshen)
YARN-2378. Added support for moving applications across queues in
CapacityScheduler. (Subramaniam Venkatraman Krishnan via jianhe)
YARN-2411. Support simple user and group mappings to queues. (Ram Venkatesh
via jianhe)
YARN-2174. Enable HTTPs for the writer REST API of TimelineServer.
(Zhijie Shen via jianhe)
YARN-2393. FairScheduler: Add the notion of steady fair share.
(Wei Yan via kasha)
YARN-2395. FairScheduler: Preemption timeout should be configurable per
queue. (Wei Yan via kasha)
YARN-2394. FairScheduler: Configure fairSharePreemptionThreshold per queue.
(Wei Yan via kasha)
YARN-415. Capture aggregate memory allocation at the app-level for chargeback.
(Eric Payne & Andrey Klochkov via jianhe)
YARN-2440. Enabled Nodemanagers to limit the aggregate cpu usage across all
containers to a preconfigured limit. (Varun Vasudev via vinodkv)
YARN-2033. Merging generic-history into the Timeline Store
(Zhijie Shen via junping_du)
YARN-611. Added an API to let apps specify an interval beyond which AM
failures should be ignored towards counting max-attempts. (Xuan Gong via
vinodkv)
YARN-2531. Added a configuration for admins to be able to override app-configs
and enforce/not-enforce strict control of per-container cpu usage. (Varun
Vasudev via vinodkv)
YARN-1250. Generic history service should support application-acls. (Zhijie Shen
via junping_du)
YARN-2569. Added the log handling APIs for the long running services. (Xuan
Gong via zjshen)
YARN-2102. Added the concept of a Timeline Domain to handle read/write ACLs
on Timeline service event data. (Zhijie Shen via vinodkv)
YARN-2581. Passed LogAggregationContext to NM via ContainerTokenIdentifier.
(Xuan Gong via zjshen)
YARN-2613. Support retry in NMClient for rolling-upgrades. (Jian He via
junping_du)
YARN-1063. Augmented Hadoop common winutils to have the ability to create
containers as domain users. (Remus Rusanu via vinodkv)
YARN-2446. Augmented Timeline service APIs to start taking in domains as a
parameter while posting entities and events. (Zhijie Shen via vinodkv)
YARN-2468. Enhanced NodeManager to support log handling APIs (YARN-2569) for
use by long running services. (Xuan Gong via vinodkv)
YARN-1051. Add a system for creating reservations of cluster capacity.
(see breakdown below)
YARN-913. Add a way to register long-lived services in a YARN cluster.
(stevel)
YARN-2493. Added user-APIs for using node-labels. (Wangda Tan via vinodkv)
YARN-2544. Added admin-API objects for using node-labels. (Wangda Tan via
vinodkv)
YARN-2494. Added NodeLabels Manager internal API and implementation. (Wangda
Tan via vinodkv)
YARN-2501. Enhanced AMRMClient library to support requests against node
labels. (Wangda Tan via vinodkv)
YARN-2656. Made RM web services authentication filter support proxy user.
(Varun Vasudev and Zhijie Shen via zjshen)
YARN-2496. Enhanced Capacity Scheduler to have basic support for allocating
resources based on node-labels. (Wangda Tan via vinodkv)
YARN-2500. Enhaced ResourceManager to support schedulers allocating resources
based on node-labels. (Wangda Tan via vinodkv)
YARN-2504. Enhanced RM Admin CLI to support management of node-labels.
(Wangda Tan via vinodkv)
YARN-1972. Added a secure container-executor for Windows. (Remus Rusanu via
vinodkv)
YARN-2198. Remove the need to run NodeManager as privileged account for
Windows Secure Container Executor. (Remus Rusanu via jianhe)
YARN-2647. Added a queue CLI for getting queue information. (Sunil Govind via
vinodkv)
YARN-2632. Document NM Restart feature. (Junping Du and Vinod Kumar
Vavilapalli via jlowe)
YARN-2505. Supported get/add/remove/change labels in RM REST API. (Craig Welch
via zjshen)
IMPROVEMENTS
YARN-2242. Improve exception information on AM launch crashes. (Li Lu
via junping_du)
YARN-2274. FairScheduler: Add debug information about cluster capacity,
availability and reservations. (kasha)
YARN-2228. Augmented TimelineServer to load pseudo authentication filter when
authentication = simple. (Zhijie Shen via vinodkv)
YARN-1341. Recover NMTokens upon nodemanager restart. (Jason Lowe via
junping_du)
YARN-2208. AMRMTokenManager need to have a way to roll over AMRMToken. (xgong)
YARN-2323. FairShareComparator creates too many Resource objects (Hong Zhiguo
via Sandy Ryza)
YARN-2045. Data persisted in NM should be versioned (Junping Du via jlowe)
YARN-2013. The diagnostics is always the ExitCodeException stack when the container
crashes. (Tsuyoshi OZAWA via junping_du)
YARN-2295. Refactored DistributedShell to use public APIs of protocol records.
(Li Lu via jianhe)
YARN-1342. Recover container tokens upon nodemanager restart. (Jason Lowe via
devaraj)
YARN-2214. FairScheduler: preemptContainerPreCheck() in FSParentQueue delays
convergence towards fairness. (Ashwin Shankar via kasha)
YARN-2211. Persist AMRMToken master key in RMStateStore for RM recovery.
(Xuan Gong via jianhe)
YARN-2328. FairScheduler: Verify update and continuous scheduling threads are
stopped when the scheduler is stopped. (kasha)
YARN-2347. Consolidated RMStateVersion and NMDBSchemaVersion into Version in
yarn-server-common. (Junping Du via zjshen)
YARN-1994. Expose YARN/MR endpoints on multiple interfaces. (Craig Welch,
Milan Potocnik, Arpit Agarwal via xgong)
YARN-2343. Improve NMToken expire exception message. (Li Lu via jianhe)
YARN-2370. Fix comment in o.a.h.y.server.resourcemanager.schedulerAppSchedulingInfo
(Wenwu Peng via junping_du)
YARN-2298. Move TimelineClient to yarn-common project (Zhijie Shen via
junping_du)
YARN-2288. Made persisted data in LevelDB timeline store be versioned. (Junping Du
via zjshen)
YARN-2352. FairScheduler: Collect metrics on duration of critical methods that
affect performance. (kasha)
YARN-2212. ApplicationMaster needs to find a way to update the AMRMToken
periodically. (xgong)
YARN-2026. Fair scheduler: Consider only active queues for computing fairshare.
(Ashwin Shankar via kasha)
YARN-1954. Added waitFor to AMRMClient(Async). (Tsuyoshi Ozawa via zjshen)
YARN-2302. Refactor TimelineWebServices. (Zhijie Shen via junping_du)
YARN-2337. ResourceManager sets ClientRMService in RMContext multiple times.
(Zhihai Xu via kasha)
YARN-2138. Cleaned up notifyDone* APIs in RMStateStore. (Varun Saxena via
jianhe)
YARN-2373. Changed WebAppUtils to use Configuration#getPassword for
accessing SSL passwords. (Larry McCay via jianhe)
YARN-2317. Updated the document about how to write YARN applications. (Li Lu via
zjshen)
YARN-2399. FairScheduler: Merge AppSchedulable and FSSchedulerApp into
FSAppAttempt. (kasha)
YARN-1370. Fair scheduler to re-populate container allocation state.
(Anubhav Dhoot via kasha)
YARN-2197. Add a link to YARN CHANGES.txt in the left side of doc
(Akira AJISAKA via aw)
YARN-1918. Typo in description and error message for
'yarn.resourcemanager.cluster-id' (Anandha L Ranganathan via aw)
YARN-2389. Added functionality for schedulers to kill all applications in a
queue. (Subramaniam Venkatraman Krishnan via jianhe)
YARN-1326. RM should log using RMStore at startup time.
(Tsuyoshi Ozawa via kasha)
YARN-2182. Updated ContainerId#toString() to append RM Epoch number.
(Tsuyoshi OZAWA via jianhe)
YARN-2406. Move RM recovery related proto to
yarn_server_resourcemanager_recovery.proto. (Tsuyoshi Ozawa via jianhe)
YARN-1506. Changed RMNode/SchedulerNode to update resource with event
notification. (Junping Du via jianhe)
YARN-2509. Enable Cross Origin Filter for timeline server only and not all
Yarn servers (Mit Desai via jeagles)
YARN-2511. Allowed all origins by default when CrossOriginFilter is
enabled. (Jonathan Eagles via zjshen)
YARN-2508. Cross Origin configuration parameters prefix are not honored
(Mit Desai via jeagles)
YARN-2512. Allowed pattern matching for origins in CrossOriginFilter.
(Jonathan Eagles via zjshen)
YARN-2507. Documented CrossOriginFilter configurations for the timeline
server. (Jonathan Eagles via zjshen)
YARN-2515. Updated ConverterUtils#toContainerId to parse epoch.
(Tsuyoshi OZAWA via jianhe)
YARN-2448. Changed ApplicationMasterProtocol to expose RM-recognized resource
types to the AMs. (Varun Vasudev via vinodkv)
YARN-2538. Added logs when RM sends roll-overed AMRMToken to AM. (Xuan Gong
via zjshen)
YARN-2229. Changed the integer field of ContainerId to be long type.
(Tsuyoshi OZAWA via jianhe)
YARN-2547. Cross Origin Filter throws UnsupportedOperationException upon
destroy (Mit Desai via jeagles)
YARN-2557. Add a parameter "attempt_Failures_Validity_Interval" into
DistributedShell. (xgong)
YARN-2001. Added a time threshold for RM to wait before starting container
allocations after restart/failover. (Jian He via vinodkv)
YARN-1372. Ensure all completed containers are reported to the AMs across
RM restart. (Anubhav Dhoot via jianhe)
YARN-2539. FairScheduler: Set the default value for maxAMShare to 0.5.
(Wei Yan via kasha)
YARN-1959. Fix headroom calculation in FairScheduler.
(Anubhav Dhoot via kasha)
YARN-2577. Clarify ACL delimiter and how to configure ACL groups only
(Miklos Christine via aw)
YARN-2372. There are Chinese Characters in the FairScheduler's document
(Fengdong Yu via aw)
YARN-668. Changed NMTokenIdentifier/AMRMTokenIdentifier/ContainerTokenIdentifier
to use protobuf object as the payload. (Junping Du via jianhe)
YARN-1769. CapacityScheduler: Improve reservations (Thomas Graves via
jlowe)
YARN-2627. Added the info logs of attemptFailuresValidityInterval and number
of previous failed attempts. (Xuan Gong via zjshen)
YARN-2562. Changed ContainerId#toString() to be more readable. (Tsuyoshi
OZAWA via jianhe)
YARN-2615. Changed ClientToAMTokenIdentifier/RM(Timeline)DelegationTokenIdentifier
to use protobuf as payload. (Junping Du via jianhe)
YARN-2629. Made the distributed shell use the domain-based timeline ACLs.
(zjshen)
YARN-2583. Modified AggregatedLogDeletionService to be able to delete rolling
aggregated logs. (Xuan Gong via zjshen)
YARN-2312. Deprecated old ContainerId#getId API and updated MapReduce to
use ContainerId#getContainerId instead. (Tsuyoshi OZAWA via jianhe)
YARN-2621. Simplify the output when the user doesn't have the access for
getDomain(s). (Zhijie Shen via jianhe)
YARN-1879. Marked Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
for RM fail over. (Tsuyoshi OZAWA via jianhe)
YARN-2676. Enhanced Timeline auth-filter to support proxy users. (Zhijie Shen
via vinodkv)
YARN-2673. Made timeline client put APIs retry if ConnectException happens.
(Li Lu via zjshen)
YARN-2582. Fixed Log CLI and Web UI for showing aggregated logs of LRS. (Xuan
Gong via zjshen)
YARN-90. NodeManager should identify failed disks becoming good again
(Varun Vasudev via jlowe)
YARN-2709. Made timeline client getDelegationToken API retry if ConnectException
happens. (Li Lu via zjshen)
YARN-2682. Updated WindowsSecureContainerExecutor to not use
DefaultContainerExecutor#getFirstApplicationDir and use getWorkingDir()
instead. (Zhihai Xu via jianhe)
YARN-2209. Replaced AM resync/shutdown command with corresponding exceptions and
made related MR changes. (Jian He via zjshen)
YARN-2703. Added logUploadedTime into LogValue for better display. (Xuan Gong
via zjshen)
YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the
sake of localization and log-aggregation for long-running services. (Jian He
via vinodkv)
YARN-2502. Changed DistributedShell to support node labels. (Wangda Tan via
jianhe)
YARN-2760. Remove 'experimental' from FairScheduler docs. (Harsh J via kasha)
YARN-2503. Added node lablels in web UI. (Wangda Tan via jianhe)
YARN-2779. Fixed ResourceManager to not require delegation tokens for
communicating with Timeline Service. (Zhijie Shen via vinodkv)
YARN-2778. Moved node-lables' reports to the yarn nodes CLI from the admin
CLI. (Wangda Tan via vinodkv)
YARN-2770. Added functionality to renew/cancel TimeLineDelegationToken.
(Zhijie Shen via jianhe)
YARN-2818. Removed the now unnecessary user entity injection from Timeline
service given we now have domains. (Zhijie Shen via vinodkv)
YARN-2635. TestRM, TestRMRestart, TestClientToAMTokens should run
with both CS and FS. (Wei Yan and kasha via kasha)
YARN-2375. Allow enabling/disabling timeline server per framework.
(Mit Desai via jeagles)
OPTIMIZATIONS
BUG FIXES
YARN-2251. Avoid negative elapsed time in JHS/MRAM web UI and services.
(Zhijie Shen via junping_du)
YARN-2088. Fixed a bug in GetApplicationsRequestPBImpl#mergeLocalToBuilder.
(Binglin Chang via jianhe)
YARN-2260. Fixed ResourceManager's RMNode to correctly remember containers
when nodes resync during work-preserving RM restart. (Jian He via vinodkv)
YARN-2264. Fixed a race condition in DrainDispatcher which may cause random
test failures. (Li Lu via jianhe)
YARN-2219. Changed ResourceManager to avoid AMs and NMs getting exceptions
after RM recovery but before scheduler learns about apps and app-attempts.
(Jian He via vinodkv)
YARN-2244. FairScheduler missing handling of containers for unknown
application attempts. (Anubhav Dhoot via kasha)
YARN-2321. NodeManager web UI can incorrectly report Pmem enforcement
(Leitao Guo via jlowe)
YARN-2273. NPE in ContinuousScheduling thread when we lose a node.
(Wei Yan via kasha)
YARN-2313. Livelock can occur in FairScheduler when there are lots of
running apps (Tsuyoshi Ozawa via Sandy Ryza)
YARN-2147. client lacks delegation token exception details when
application submit fails (Chen He via jlowe)
YARN-1796. container-executor shouldn't require o-r permissions (atm)
YARN-2354. DistributedShell may allocate more containers than client
specified after AM restarts. (Li Lu via jianhe)
YARN-2051. Fix bug in PBimpls and add more unit tests with reflection.
(Binglin Chang via junping_du)
YARN-2374. Fixed TestDistributedShell#testDSShell failure due to hostname
dismatch. (Varun Vasudev via jianhe)
YARN-2359. Application hangs when it fails to launch AM container.
(Zhihai Xu via kasha)
YARN-2388. Fixed TestTimelineWebServices failure due to HADOOP-10791. (zjshen)
YARN-2008. Fixed CapacityScheduler to calculate headroom based on max available
capacity instead of configured max capacity. (Craig Welch via jianhe)
YARN-2400. Fixed TestAMRestart fails intermittently. (Jian He via xgong)
YARN-2361. RMAppAttempt state machine entries for KILLED state has duplicate
event entries. (Zhihai Xu via kasha)
YARN-2070. Made DistributedShell publish the short user name to the timeline
server. (Robert Kanter via zjshen)
YARN-2397. Avoided loading two authentication filters for RM and TS web
interfaces. (Varun Vasudev via zjshen)
YARN-2409. RM ActiveToStandBy transition missing stoping previous rmDispatcher.
(Rohith via jianhe)
YARN-2249. Avoided AM release requests being lost on work preserving RM
restart. (Jian He via zjshen)
YARN-2034. Description for yarn.nodemanager.localizer.cache.target-size-mb
is incorrect (Chen He via jlowe)
YARN-1919. Potential NPE in EmbeddedElectorService#stop.
(Tsuyoshi Ozawa via kasha)
YARN-2424. LCE should support non-cgroups, non-secure mode (Chris Douglas
via aw)
YARN-2434. RM should not recover containers from previously failed attempt
when AM restart is not enabled (Jian He via jlowe)
YARN-2035. FileSystemApplicationHistoryStore should not make working dir
when it already exists. (Jonathan Eagles via zjshen)
YARN-2405. NPE in FairSchedulerAppsBlock. (Tsuyoshi Ozawa via kasha)
YARN-2449. Fixed the bug that TimelineAuthenticationFilterInitializer
is not automatically added when hadoop.http.filter.initializers is not
configured. (Varun Vasudev via zjshen)
YARN-2450. Fix typos in log messages. (Ray Chiang via hitesh)
YARN-2447. RM web service app submission doesn't pass secrets correctly.
(Varun Vasudev via jianhe)
YARN-2462. TestNodeManagerResync#testBlockNewContainerRequestsOnStartAndResync
should have a test timeout (Eric Payne via jlowe)
YARN-2431. NM restart: cgroup is not removed for reacquired containers
(jlowe)
YARN-2519. Credential Provider related unit tests failed on Windows.
(Xiaoyu Yao via cnauroth)
YARN-2526. SLS can deadlock when all the threads are taken by AMSimulators.
(Wei Yan via kasha)
YARN-1458. FairScheduler: Zero weight can lead to livelock.
(Zhihai Xu via kasha)
YARN-2459. RM crashes if App gets rejected for any reason
and HA is enabled. (Jian He and Mayank Bansal via xgong)
YARN-2158. Fixed TestRMWebServicesAppsModification#testSingleAppKill test
failure. (Varun Vasudev via jianhe)
YARN-2534. FairScheduler: Potential integer overflow calculating totalMaxShare.
(Zhihai Xu via kasha)
YARN-2541. Fixed ResourceManagerRest.apt.vm table syntax error. (jianhe)
YARN-2484. FileSystemRMStateStore#readFile/writeFile should close
FSData(In|Out)putStream in final block (Tsuyoshi OZAWA via jlowe)
YARN-2456. Possible livelock in CapacityScheduler when RM is recovering apps.
(Jian He via xgong)
YARN-2542. Fixed NPE when retrieving ApplicationReport from TimeLineServer.
(Zhijie Shen via jianhe)
YARN-2528. Relaxed http response split vulnerability protection for the origins
header and made it accept multiple origins in CrossOriginFilter. (Jonathan
Eagles via zjshen)
YARN-2549. TestContainerLaunch fails due to classpath problem with hamcrest
classes. (cnauroth)
YARN-2529. Generic history service RPC interface doesn't work when service
authorization is enabled. (Zhijie Shen via jianhe)
YARN-2558. Updated ContainerTokenIdentifier#read/write to use
ContainerId#getContainerId. (Tsuyoshi OZAWA via jianhe)
YARN-2559. Fixed NPE in SystemMetricsPublisher when retrieving
FinalApplicationStatus. (Zhijie Shen via jianhe)
YARN-1779. Fixed AMRMClient to handle AMRMTokens correctly across
ResourceManager work-preserving-restart or failover. (Jian He via vinodkv)
YARN-2363. Submitted applications occasionally lack a tracking URL (jlowe)
YARN-2561. MR job client cannot reconnect to AM after NM restart. (Junping
Du via jlowe)
YARN-2563. Fixed YarnClient to call getTimeLineDelegationToken only if the
Token is not present. (Zhijie Shen via jianhe)
YARN-2568. Fixed the potential test failures due to race conditions when RM
work-preserving recovery is enabled. (Jian He via zjshen)
YARN-2565. Fixed RM to not use FileSystemApplicationHistoryStore unless
explicitly set. (Zhijie Shen via jianhe)
YARN-2460. Remove obsolete entries from yarn-default.xml (Ray Chiang via
aw)
YARN-2452. TestRMApplicationHistoryWriter fails with FairScheduler.
(Zhihai Xu via kasha)
YARN-2453. TestProportionalCapacityPreemptionPolicy fails with
FairScheduler. (Zhihai Xu via kasha)
YARN-2540. FairScheduler: Queue filters not working on scheduler page in
RM UI. (Ashwin Shankar via kasha)
YARN-2584. TestContainerManagerSecurity fails on trunk. (Jian He via
junping_du)
YARN-2252. Intermittent failure of
TestFairScheduler.testContinuousScheduling.
(Ratandeep Ratti and kasha via kasha)
YARN-2161. Fix build on macosx: YARN parts (Binglin Chang via aw)
YARN-2596. TestWorkPreservingRMRestart fails with FairScheduler. (kasha)
YARN-2546. Made REST API for application creation/submission use numeric and
boolean types instead of the string of them. (Varun Vasudev via zjshen)
YARN-2523. ResourceManager UI showing negative value for "Decommissioned
Nodes" field (Rohith via jlowe)
YARN-2608. FairScheduler: Potential deadlocks in loading alloc files and
clock access. (Wei Yan via kasha)
YARN-2606. Application History Server tries to access hdfs before doing
secure login (Mit Desai via jeagles)
YARN-2610. Hamlet should close table tags. (Ray Chiang via kasha)
YARN-2387. Resource Manager crashes with NPE due to lack of
synchronization (Mit Desai via jlowe)
YARN-2594. Potential deadlock in RM when querying
ApplicationResourceUsageReport. (Wangda Tan via kasha)
YARN-2602. Fixed possible NPE in ApplicationHistoryManagerOnTimelineStore.
(Zhijie Shen via jianhe)
YARN-2630. Prevented previous AM container status from being acquired by the
current restarted AM. (Jian He via zjshen)
YARN-2617. Fixed NM to not send duplicate container status whose app is not
running. (Jun Gong via jianhe)
YARN-2624. Resource Localization fails on a cluster due to existing cache
directories (Anubhav Dhoot via jlowe)
YARN-2527. Fixed the potential NPE in ApplicationACLsManager and added test
cases for it. (Benoy Antony via zjshen)
YARN-2628. Capacity scheduler with DominantResourceCalculator carries out
reservation even though slots are free. (Varun Vasudev via jianhe)
YARN-2685. Fixed a bug in CommonNodeLabelsManager that caused wrong resource
tracking per label when a host runs multiple node-managers. (Wangda Tan via
vinodkv)
YARN-2699. Fixed a bug in CommonNodeLabelsManager that caused tests to fail
when using ephemeral ports on NodeIDs. (Wangda Tan via vinodkv)
YARN-2705. Fixed bugs in ResourceManager node-label manager that were causing
test-failures: added a dummy in-memory labels-manager. (Wangda Tan via
vinodkv)
YARN-2715. Fixed ResourceManager to respect common configurations for proxy
users/groups beyond just the YARN level config. (Zhijie Shen via vinodkv)
YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken
identifiers to be tampered and thus causing app submission failures in
secure mode. (Jian He via vinodkv)
YARN-3251. Fixed a deadlock in CapacityScheduler when computing
absoluteMaxAvailableCapacity in LeafQueue (Craig Welch via wangda)
BREAKDOWN OF YARN-1051 SUBTASKS AND RELATED JIRAS
YARN-1707. Introduce APIs to add/remove/resize queues in the
CapacityScheduler. (Carlo Curino and Subru Krishnan via curino)
YARN-2475. Logic for responding to capacity drops for the
ReservationSystem. (Carlo Curino and Subru Krishnan via curino)
YARN-1708. Public YARN APIs for creating/updating/deleting
reservations. (Subru Krishnan and Carlo Curino via subru)
YARN-1709. In-memory data structures used to track resources over
time to enable reservations. (Subru Krishnan and Carlo Curino via
subru)
YARN-1710. Logic to find allocations within a Plan that satisfy
user ReservationRequest(s). (Carlo Curino and Subru Krishnan via
curino)
YARN-1711. Policy to enforce instantaneous and over-time quotas
on user reservations. (Carlo Curino and Subru Krishnan via curino)
YARN-1712. Plan follower that synchronizes the current state of reservation
subsystem with the scheduler. (Subru Krishnan and Carlo Curino via subru)
YARN-2080. Integrating reservation system with ResourceManager and
client-RM protocol. (Subru Krishnan and Carlo Curino via subru)
MAPREDUCE-6103. Adding reservation APIs to MR resource manager
delegate. (Subru Krishnan and Carlo Curino via subru)
YARN-2576. Fixing compilation, javadocs and audit issues to pass
test patch in branch. (Subru Krishnan and Carlo Curino via subru)
YARN-2611. Fixing jenkins findbugs warning and TestRMWebServicesCapacitySched
for branch YARN-1051. (Subru Krishnan and Carlo Curino via subru)
YARN-2644. Fixed CapacityScheduler to return up-to-date headroom when
AM allocates. (Craig Welch via jianhe)
YARN-1857. CapacityScheduler headroom doesn't account for other AM's running.
(Chen He and Craig Welch via jianhe)
YARN-2649. Fixed TestAMRMRPCNodeUpdates test failure. (Ming Ma via jianhe)
YARN-2662. TestCgroupsLCEResourcesHandler leaks file descriptors. (cnauroth)
BREAKDOWN OF YARN-913 SUBTASKS AND RELATED JIRAS
YARN-2652 Add hadoop-yarn-registry package under hadoop-yarn. (stevel)
YARN-2668 yarn-registry JAR won't link against ZK 3.4.5. (stevel)
YARN-2689 TestSecureRMRegistryOperations failing on windows:
secure ZK won't start (stevel)
YARN-2692 ktutil test hanging on some machines/ktutil versions (stevel)
YARN-2700 TestSecureRMRegistryOperations failing on windows: auth problems
(stevel)
YARN-2677 registry punycoding of usernames doesn't fix all usernames to be
DNS-valid (stevel)
YARN-2768 Improved Yarn Registry service record structure (stevel)
---
YARN-2598 GHS should show N/A instead of null for the inaccessible information
(Zhijie Shen via mayank)
YARN-2671. Fixed ApplicationSubmissionContext to still set resource for
backward compatibility. (Wangda Tan via zjshen)
YARN-2667. Fix the release audit warning caused by hadoop-yarn-registry
(Yi Liu via jlowe)
YARN-2651. Spun off LogRollingInterval from LogAggregationContext. (Xuan Gong
via zjshen)
YARN-2377. Localization exception stack traces are not passed as
diagnostic info (Gera Shegalov via jlowe)
YARN-2308. Changed CapacityScheduler to explicitly throw exception if the
queue to which the apps were submitted is changed across RM restart.
(Craig Welch & Chang Li via jianhe)
YARN-2566. DefaultContainerExecutor should pick a working directory randomly.
(Zhihai Xu via kasha)
YARN-2588. Standby RM fails to transitionToActive if previous
transitionToActive failed with ZK exception. (Rohith Sharmaks via jianhe)
YARN-2701. Potential race condition in startLocalizer when using
LinuxContainerExecutor. (Xuan Gong via jianhe)
YARN-2717. Avoided duplicate logging when container logs are not found. (Xuan
Gong via zjshen)
YARN-2720. Windows: Wildcard classpath variables not expanded against
resources contained in archives. (Craig Welch via cnauroth)
YARN-2721. Suppress NodeExist exception thrown by ZKRMStateStore when it
retries creating znode. (Jian He via zjshen)
YARN-2732. Fixed syntax error in SecureContainer.apt.vm. (Jian He via zjshen)
YARN-2724. Skipped uploading a local log file to HDFS if exception is raised
when opening it. (Xuan Gong via zjshen)
YARN-1915. Fixed a race condition that client could use the ClientToAMToken
to contact with AM before AM actually receives the ClientToAMTokenMasterKey.
(Jason Lowe via jianhe)
YARN-2314. Disable ContainerManagementProtocolProxy cache by default to
prevent creating thousands of threads in a large cluster. (Jason Lowe via
jianhe)
YARN-2723. Fix rmadmin -replaceLabelsOnNode does not correctly parse port.
(Naganarasimha G R via xgong)
YARN-2734. Skipped sub-folders in the local log dir when aggregating logs.
(Xuan Gong via zjshen)
YARN-2726. CapacityScheduler should explicitly log when an accessible
label has no capacity. (Wangda Tan via xgong)
YARN-2591. Fixed AHSWebServices to return FORBIDDEN(403) if the request user
doesn't have access to the history data. (Zhijie Shen via jianhe)
YARN-2279. Add UTs to cover timeline server authentication.
(Zhijie Shen via xgong)
YARN-2758. Update TestApplicationHistoryClientService to use the new generic
history store. (Zhijie Shen via xgong)
YARN-2741. Made NM web UI serve logs on the drive other than C: on Windows. (Craig
Welch via zjshen)
YARN-2747. Fixed the test failure of TestAggregatedLogFormat when native I/O is
enabled. (Xuan Gong via zjshen)
YARN-2769. Fixed the problem that timeline domain is not set in distributed shell
AM when using shell_command on Windows. (Varun Vasudev via zjshen)
YARN-2755. NM fails to clean up usercache_DEL_<timestamp> dirs after
YARN-661 (Siqi Li via jlowe)
YARN-2698. Moved some node label APIs to be correctly placed in client
protocol. (Wangda Tan via vinodkv)
YARN-2789. Re-instated the NodeReport.newInstance private unstable API
modified in YARN-2698 so that tests in YARN frameworks don't break. (Wangda
Tan via vinodkv)
YARN-2707. Potential null dereference in FSDownload (Gera Shegalov via
jlowe)
YARN-2711. Fixed TestDefaultContainerExecutor#testContainerLaunchError failure on
Windows. (Varun Vasudev via zjshen)
YARN-2790. Fixed a NodeManager bug that was causing log-aggregation to fail
beyond HFDS delegation-token expiry even when RM is a proxy-user (YARN-2704).
(Jian He via vinodkv)
YARN-2785. Fixed intermittent TestContainerResourceUsage failure. (Varun Vasudev
via zjshen)
YARN-2730. DefaultContainerExecutor runs only one localizer at a time
(Siqi Li via jlowe)
YARN-2798. Fixed YarnClient to populate the renewer correctly for Timeline
delegation tokens. (Zhijie Shen via vinodkv)
YARN-2788. Fixed backwards compatiblity issues with log-aggregation feature
that were caused when adding log-upload-time via YARN-2703. (Xuan Gong via
vinodkv)
YARN-2795. Fixed ResourceManager to not crash loading node-label data from
HDFS in secure mode. (Wangda Tan via vinodkv)
YARN-1922. Fixed NodeManager to kill process-trees correctly in the presence
of races between the launch and the stop-container call and when root
processes crash. (Billie Rinaldi via vinodkv)
YARN-2752. Made ContainerExecutor append "nice -n" arg only when priority
adjustment flag is set. (Xuan Gong via zjshen)
YARN-2010. Handle app-recovery failures gracefully.
(Jian He and Karthik Kambatla via kasha)
YARN-2804. Fixed Timeline service to not fill the logs with JAXB bindings
exceptions. (Zhijie Shen via vinodkv)
YARN-2767. Added a test case to verify that http static user cannot kill or submit
apps in the secure mode. (Varun Vasudev via zjshen)
YARN-2805. Fixed ResourceManager to load HA configs correctly before kerberos
login. (Wangda Tan via vinodkv)
YARN-2579. Fixed a deadlock issue when EmbeddedElectorService and
FatalEventDispatcher try to transition RM to StandBy at the same time.
(Rohith Sharmaks via jianhe)
YARN-2813. Fixed NPE from MemoryTimelineStore.getDomains. (Zhijie Shen via xgong)
YARN-2812. TestApplicationHistoryServer is likely to fail on less powerful machine.
(Zhijie Shen via xgong)
YARN-2744. Fixed CapacityScheduler to validate node-labels correctly against
queues. (Wangda Tan via vinodkv)
YARN-2823. Fixed ResourceManager app-attempt state machine to inform
schedulers about previous finished attempts of a running appliation to avoid
expectation mismatch w.r.t transferred containers. (Jian He via vinodkv)
YARN-2810. TestRMProxyUsersConf fails on Windows VMs. (Varun Vasudev via xgong)
YARN-2824. Fixed Capacity Scheduler to not crash when some node-labels are
not mapped to queues by making default capacities per label to be zero.
(Wangda Tan via vinodkv)
YARN-2827. Fixed bugs in "yarn queue" CLI. (Wangda Tan via vinodkv).
YARN-2803. MR distributed cache not working correctly on Windows after
NodeManager privileged account changes. (Craig Welch via cnauroth)
YARN-2753. Fixed a bunch of bugs in the NodeLabelsManager classes. (Zhihai xu
via vinodkv)
YARN-2825. Container leak on NM (Jian He via jlowe)
YARN-2819. NPE in ATS Timeline Domains when upgrading from 2.4 to 2.6.
(Zhijie Shen via xgong)
YARN-2826. Fixed user-groups mappings' refresh bug caused by YARN-2826.
(Wangda Tan via vinodkv)
YARN-2607. Fixed issues in TestDistributedShell. (Wangda Tan via vinodkv)
YARN-2830. Add backwords compatible ContainerId.newInstance constructor.
(jeagles via acmurthy)
YARN-2834. Fixed ResourceManager to ignore token-renewal failures on recovery
consistent with the (somewhat incorrect) behaviour in the non-recovery case.
(Jian He via vinodkv)
YARN-2841. RMProxy should retry EOFException. (Jian He via xgong)
YARN-2843. Fixed NodeLabelsManager to trim inputs for hosts and labels so
as to make them work correctly. (Wangda Tan via vinodkv)
YARN-2794. Fixed log messages about distributing system-credentials. (Jian He via
zjshen)
YARN-2846. Incorrect persist exit code for running containers in
reacquireContainer() that interrupted by NodeManager restart. (Junping Du
via jlowe)
YARN-2853. Fixed a bug in ResourceManager causing apps to hang when the user
kill request races with ApplicationMaster finish. (Jian He via vinodkv)
Release 2.5.2 - 2014-11-19
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
Release 2.5.1 - 2014-09-05
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
Release 2.5.0 - 2014-08-11
INCOMPATIBLE CHANGES
NEW FEATURES
YARN-1757. NM Recovery. Auxiliary service support. (Jason Lowe via kasha)
YARN-1864. Fair Scheduler Dynamic Hierarchical User Queues (Ashwin Shankar
via Sandy Ryza)
YARN-1362. Distinguish between nodemanager shutdown for decommission vs shutdown
for restart. (Jason Lowe via junping_du)
YARN-1338. Recover localized resource cache state upon nodemanager restart
(Jason Lowe via junping_du)
YARN-1368. Added core functionality of recovering container state into
schedulers after ResourceManager Restart so as to preserve running work in
the cluster. (Jian He via vinodkv)
YARN-1702. Added kill app functionality to RM web services. (Varun Vasudev
via vinodkv)
YARN-1339. Recover DeletionService state upon nodemanager restart. (Jason Lowe
via junping_du)
YARN-1365. Changed ApplicationMasterService to allow an app to re-register
after RM restart. (Anubhav Dhoot via jianhe)
YARN-2052. Embedded an epoch number in container id to ensure the uniqueness
of container id after RM restarts. (Tsuyoshi OZAWA via jianhe)
YARN-1713. Added get-new-app and submit-app functionality to RM web services.
(Varun Vasudev via vinodkv)
YARN-2233. Implemented ResourceManager web-services to create, renew and
cancel delegation tokens. (Varun Vasudev via vinodkv)
YARN-2247. Made RM web services authenticate users via kerberos and delegation
token. (Varun Vasudev via zjshen)
IMPROVEMENTS
YARN-1479. Invalid NaN values in Hadoop REST API JSON response (Chen He via
jeagles)
YARN-1736. FS: AppSchedulable.assignContainer's priority argument is
redundant. (Naren Koneru via kasha)
YARN-1678. Fair scheduler gabs incessantly about reservations (Sandy Ryza)
YARN-1561. Fix a generic type warning in FairScheduler. (Chen He via junping_du)
YARN-1429. *nix: Allow a way for users to augment classpath of YARN daemons.
(Jarek Jarcec Cecho via kasha)
YARN-1520. update capacity scheduler docs to include necessary parameters
(Chen He via jeagles)
YARN-1845. Elapsed time for failed tasks that never started is wrong
(Rushabh S Shah via jeagles)
YARN-1136. Replace junit.framework.Assert with org.junit.Assert (Chen He
via jeagles)
YARN-1889. In Fair Scheduler, avoid creating objects on each call to
AppSchedulable comparator (Hong Zhiguo via Sandy Ryza)
YARN-1923. Make Fair Scheduler resource ratio calculations terminate faster
(Anubhav Dhoot via Sandy Ryza)
YARN-1870. FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo.
(Fengdong Yu via junping_du)
YARN-1970. Prepare YARN codebase for JUnit 4.11. (cnauroth)
YARN-483. Improve documentation on log aggregation in yarn-default.xml
(Akira Ajisaka via Sandy Ryza)
YARN-2036. Document yarn.resourcemanager.hostname in ClusterSetup (Ray
Chiang via Sandy Ryza)
YARN-766. TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format
issue in trunk. (Contributed by Siddharth Seth)
YARN-1982. Renamed the daemon name to be TimelineServer instead of History
Server and deprecated the old usage. (Zhijie Shen via vinodkv)
YARN-1987. Wrapper for leveldb DBIterator to aid in handling database exceptions.
(Jason Lowe via kasha)
YARN-1751. Improve MiniYarnCluster for log aggregation testing (Ming Ma
via jlowe)
YARN-1981. Nodemanager version is not updated when a node reconnects (Jason
Lowe via jeagles)
YARN-1938. Added kerberos login for the Timeline Server. (Zhijie Shen via
vinodkv)
YARN-2017. Merged some of the common scheduler code. (Jian He via vinodkv)
YARN-2049. Added delegation-token support for the Timeline Server. (Zhijie
Shen via vinodkv)
YARN-1936. Added security support for the Timeline Client. (Zhijie Shen via
vinodkv)
YARN-1937. Added owner-only ACLs support for Timeline Client and server.
(Zhijie Shen via vinodkv)
YARN-2012. Fair Scheduler: allow default queue placement rule to take an
arbitrary queue (Ashwin Shankar via Sandy Ryza)
YARN-2059. Added admin ACLs support to Timeline Server. (Zhijie Shen via
vinodkv)
YARN-2073. Fair Scheduler: Add a utilization threshold to prevent preempting
resources when cluster is free (Karthik Kambatla via Sandy Ryza)
YARN-2071. Modified levelDB store permissions to be readable only by the
server user. (Zhijie Shen via vinodkv)
YARN-2107. Refactored timeline classes into o.a.h.y.s.timeline package. (Vinod
Kumar Vavilapalli via zjshen)
YARN-596. Use scheduling policies throughout the queue hierarchy to decide
which containers to preempt (Wei Yan via Sandy Ryza)
YARN-2054. Better defaults for YARN ZK configs for retries and retry-inteval
when HA is enabled. (kasha)
YARN-1877. Document yarn.resourcemanager.zk-auth and its scope.
(Robert Kanter via kasha)
YARN-2115. Replaced RegisterNodeManagerRequest's ContainerStatus with a new
NMContainerStatus which has more information that is needed for
work-preserving RM-restart. (Jian He via vinodkv)
YARN-1474. Make sechedulers services. (Tsuyoshi Ozawa via kasha)
YARN-1913. With Fair Scheduler, cluster can logjam when all resources are
consumed by AMs (Wei Yan via Sandy Ryza)
YARN-2061. Revisit logging levels in ZKRMStateStore. (Ray Chiang via kasha)
YARN-1977. Add tests on getApplicationRequest with filtering start time range. (junping_du)
YARN-2122. In AllocationFileLoaderService, the reloadThread should be created
in init() and started in start(). (Robert Kanter via kasha)
YARN-2132. ZKRMStateStore.ZKAction#runWithRetries doesn't log the exception
it encounters. (Vamsee Yarlagadda via kasha)
YARN-2030. Augmented RMStateStore with state machine.(Binglin Chang via jianhe)
YARN-1424. RMAppAttemptImpl should return the
DummyApplicationResourceUsageReport for all invalid accesses.
(Ray Chiang via kasha)
YARN-2091. Add more values to ContainerExitStatus and pass it from NM to
RM and then to app masters (Tsuyoshi OZAWA via bikas)
YARN-2125. Changed ProportionalCapacityPreemptionPolicy to log CSV in debug
level. (Wangda Tan via jianhe)
YARN-2159. Better logging in SchedulerNode#allocateContainer.
(Ray Chiang via kasha)
YARN-2191. Added a new test to ensure NM will clean up completed applications
in the case of RM restart. (Wangda Tan via jianhe)
YARN-2195. Clean a piece of code in ResourceRequest. (Wei Yan via devaraj)
YARN-2074. Changed ResourceManager to not count AM preemptions towards app
failures. (Jian He via vinodkv)
YARN-2192. TestRMHA fails when run with a mix of Schedulers.
(Anubhav Dhoot via kasha)
YARN-2109. Fix TestRM to work with both schedulers. (Anubhav Dhoot via kasha)
YARN-2072. RM/NM UIs and webservices are missing vcore information.
(Nathan Roberts via tgraves)
YARN-2152. Added missing information into ContainerTokenIdentifier so that
NodeManagers can report the same to RM when RM restarts. (Jian He via vinodkv)
YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when
AMs heartbeat in. (Jason Lowe via vinodkv)
YARN-614. Changed ResourceManager to not count disk failure, node loss and
RM restart towards app failures. (Xuan Gong via jianhe)
YARN-2224. Explicitly enable vmem check in
TestContainersMonitor#testContainerKillOnMemoryOverflow.
(Anubhav Dhoot via kasha)
YARN-2022. Preempting an Application Master container can be kept as least priority
when multiple applications are marked for preemption by
ProportionalCapacityPreemptionPolicy (Sunil G via mayank)
YARN-2241. ZKRMStateStore: On startup, show nicer messages if znodes already
exist. (Robert Kanter via kasha)
YARN-1408 Preemption caused Invalid State Event: ACQUIRED at KILLED and caused
a task timeout for 30mins. (Sunil G via mayank)
YARN-2300. Improved the documentation of the sample requests for RM REST API -
submitting an app. (Varun Vasudev via zjshen)
OPTIMIZATIONS
BUG FIXES
YARN-1718. Fix a couple isTerminals in Fair Scheduler queue placement rules
(Sandy Ryza)
YARN-1790. Fair Scheduler UI not showing apps table (bc Wong via Sandy Ryza)
YARN-1784. TestContainerAllocation assumes CapacityScheduler.
(Robert Kanter via kasha)
YARN-1940. deleteAsUser() terminates early without deleting more files on
error (Rushabh S Shah via jlowe)
YARN-1865. ShellScriptBuilder does not check for some error conditions.
(Remus Rusanu via ivanmi)
YARN-738. TestClientRMTokens is failing irregularly while running all yarn
tests (Ming Ma via jlowe)
YARN-2018. TestClientRMService.testTokenRenewalWrongUser fails after
HADOOP-10562 (Ming Ma via Arpit Agarwal)
YARN-2011. Fix typo and warning in TestLeafQueue (Chen He via junping_du)
YARN-2042. String shouldn't be compared using == in
QueuePlacementRule#NestedUserQueue#getQueueForApp (Chen He via Sandy Ryza)
YARN-2050. Fix LogCLIHelpers to create the correct FileContext (Ming Ma
via jlowe)
YARN-2089. FairScheduler: QueuePlacementPolicy and QueuePlacementRule
are missing audience annotations. (Zhihai Xu via kasha)
YARN-2096. Race in TestRMRestart#testQueueMetricsOnRMRestart.
(Anubhav Dhoot via kasha)
YARN-2105. Fix TestFairScheduler after YARN-2012. (Ashwin Shankar via
Sandy Ryza)
YARN-2112. Fixed yarn-common's pom.xml to include jackson dependencies so
that both Timeline Server and client can access them. (Zhijie Shen via
vinodkv)
YARN-1868. YARN status web ui does not show correctly in IE 11.
(Chuan Liu via cnauroth)
YARN-2103. Inconsistency between viaProto flag and initial value of
SerializedExceptionProto.Builder (Binglin Chang via junping_du)
YARN-1550. NPE in FairSchedulerAppsBlock#render. (Anubhav Dhoot via kasha)
YARN-2119. DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT.
(Anubhav Dhoot via kasha)
YARN-2118. Fixed the type mismatch in Map#containsKey check of
TimelineWebServices#injectOwnerInfo. (Ted Yu via zjshen)
YARN-2117. Fixed the issue that secret file reader is potentially not
closed in TimelineAuthenticationFilterInitializer. (Chen He via zjshen)
YARN-2121. Fixed NPE handling in Timeline Server's TimelineAuthenticator.
(Zhijie Shen via vinodkv)
YARN-2128. FairScheduler: Incorrect calculation of amResource usage.
(Wei Yan via kasha)
YARN-2124. Fixed NPE in ProportionalCapacityPreemptionPolicy. (Wangda Tan
via jianhe)
YARN-2148. TestNMClient failed due more exit code values added and passed
to AM (Wangda Tan via bikas)
YARN-2075. Fixed the test failure of TestRMAdminCLI. (Kenji Kikushima via
zjshen)
YARN-2155. FairScheduler: Incorrect threshold check for preemption.
(Wei Yan via kasha)
YARN-1885. Fixed a bug that RM may not send application-clean-up signal
to NMs where the completed applications previously ran in case of RM restart.
(Wangda Tan via jianhe)
YARN-2167. LeveldbIterator should get closed in
NMLeveldbStateStoreService#loadLocalizationState() within finally block
(Junping Du via jlowe)
YARN-2187. FairScheduler: Disable max-AM-share check by default.
(Robert Kanter via kasha)
YARN-2111. In FairScheduler.attemptScheduling, we don't count containers
as assigned if they have 0 memory but non-zero cores (Sandy Ryza)
YARN-2204. TestAMRestart#testAMRestartWithExistingContainers assumes
CapacityScheduler. (Robert Kanter via kasha)
YARN-2163. WebUI: Order of AppId in apps table should be consistent with
ApplicationId.compareTo(). (Wangda Tan via raviprak)
YARN-2104. Scheduler queue filter failed to work because index of queue
column changed. (Wangda Tan via jlowe)
YARN-2201. Made TestRMWebServicesAppsModification be independent of the
changes on yarn-default.xml. (Varun Vasudev via zjshen)
YARN-2216 TestRMApplicationHistoryWriter sometimes fails in trunk.
(Zhijie Shen via xgong)
YARN-2216 YARN-2065 AM cannot create new containers after restart
(Jian He via stevel)
YARN-2232. Fixed ResourceManager to allow DelegationToken owners to be able
to cancel their own tokens in secure mode. (Varun Vasudev via vinodkv)
YARN-2250. FairScheduler.findLowestCommonAncestorQueue returns null when
queues not identical (Krisztian Horvath via Sandy Ryza)
YARN-2158. Improved assertion messages of TestRMWebServicesAppsModification.
(Varun Vasudev via zjshen)
YARN-2269. Remove External links from YARN UI. (Craig Welch via xgong)
YARN-2270. Made TestFSDownload#testDownloadPublicWithStatCache be skipped
when there’s no ancestor permissions. (Akira Ajisaka via zjshen)
YARN-2319. Made the MiniKdc instance start/close before/after the class of
TestRMWebServicesDelegationTokens. (Wenwu Peng via zjshen)
YARN-2335. Annotate all hadoop-sls APIs as @Private. (Wei Yan via kasha)
YARN-1726. ResourceSchedulerWrapper broken due to AbstractYarnScheduler.
(Wei Yan via kasha)
Release 2.4.1 - 2014-06-23
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
YARN-1892. Improved some logs in the scheduler. (Jian He via zjshen)
YARN-1696. Added documentation for ResourceManager fail-over. (Karthik
Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA via vinodkv)
YARN-1701. Improved default paths of the timeline store and the generic
history store. (Tsuyoshi Ozawa via zjshen)
YARN-1962. Changed Timeline Service client configuration to be off by default
given the non-readiness of the feature yet. (Mohammad Kamrul Islam via
vinodkv)
OPTIMIZATIONS
BUG FIXES
YARN-1898. Made Standby RM links conf, stacks, logLevel, metrics, jmx, logs
and static not be redirected to Active RM. (Xuan Gong via zjshen)
YARN-1837. Fixed TestMoveApplication#testMoveRejectedByScheduler failure.
(Hong Zhiguo via jianhe)
YARN-1905. TestProcfsBasedProcessTree must only run on Linux. (cnauroth)
YARN-1883. TestRMAdminService fails due to inconsistent entries in
UserGroups (Mit Desai via jeagles)
YARN-1908. Fixed DistributedShell to not fail in secure clusters. (Vinod
Kumar Vavilapalli and Jian He via vinodkv)
YARN-1910. Fixed a race condition in TestAMRMTokens that causes the test to
fail more often on Windows. (Xuan Gong via vinodkv)
YARN-1920. Fixed TestFileSystemApplicationHistoryStore failure on windows.
(Vinod Kumar Vavilapalli via zjshen)
YARN-1914. Fixed resource-download on NodeManagers to skip permission
verification of public cache files in Windows+local file-system environment.
(Varun Vasudev via vinodkv)
YARN-1903. Set exit code and diagnostics when container is killed at
NEW/LOCALIZING state. (Zhijie Shen via jianhe)
YARN-1924. Made ZKRMStateStore updateApplication(Attempt)StateInternal work
when Application(Attempt) state hasn't been stored before. (Jian He via
zjshen)
YARN-1926. Changed DistributedShell to use appIDs as unique identifiers for
HDFS paths and thus avoid test failures on Windows. (Varun Vasudev via
vinodkv)
YARN-1833. TestRMAdminService Fails in trunk and branch-2 (Mit Desai via
jeagles)
YARN-1907. TestRMApplicationHistoryWriter#testRMWritingMassiveHistory
intermittently fails. (Mit Desai via kihwal)
YARN-1933. Fixed test issues with TestAMRestart and TestNodeHealthService.
(Jian He via vinodkv)
YARN-1928. Fixed a race condition in TestAMRMRPCNodeUpdates which caused it
to fail occassionally. (Zhijie Shen via vinodkv)
YARN-1934. Fixed a potential NPE in ZKRMStateStore caused by handling
Disconnected event from ZK. (Karthik Kambatla via jianhe)
YARN-1931. Private API change in YARN-1824 in 2.4 broke compatibility
with previous releases (Sandy Ryza via tgraves)
YARN-1750. TestNodeStatusUpdater#testNMRegistration is incorrect in test
case. (Wangda Tan via junping_du)
YARN-1947. TestRMDelegationTokens#testRMDTMasterKeyStateOnRollingMasterKey
is failing intermittently. (Jian He via junping_du)
YARN-1281. Fixed TestZKRMStateStoreZKClientConnections to not fail
intermittently due to ZK-client timeouts. (Tsuyoshi Ozawa via vinodkv)
YARN-1932. Javascript injection on the job status page (Mit Desai via
jlowe)
YARN-1975. Used resources shows escaped html in CapacityScheduler and
FairScheduler page (Mit Desai via jlowe)
YARN-1929. Fixed a deadlock in ResourceManager that occurs when failover
happens right at the time of shutdown. (Karthik Kambatla via vinodkv)
YARN-1201. TestAMAuthorization fails with local hostname cannot be resolved.
(Wangda Tan via junping_du)
YARN-1861. Fixed a bug in RM to reset leader-election on fencing that was
causing both RMs to be stuck in standby mode when automatic failover is
enabled. (Karthik Kambatla and Xuan Gong via vinodkv)
YARN-1957. Consider the max capacity of the queue when computing the ideal
capacity for preemption. (Carlo Curino via cdouglas)
YARN-1986. In Fifo Scheduler, node heartbeat in between creating app and
attempt causes NPE (Hong Zhiguo via Sandy Ryza)
YARN-1976. Fix yarn application CLI to print the scheme of the tracking url
of failed/killed applications. (Junping Du via jianhe)
YARN-2016. Fix a bug in GetApplicationsRequestPBImpl to add the missed fields
to proto. (Junping Du via jianhe)
YARN-2053. Fixed a bug in AMS to not add null NMToken into NMTokens list from
previous attempts for work-preserving AM restart. (Wangda Tan via jianhe)
YARN-2066. Wrong field is referenced in GetApplicationsRequestPBImpl#mergeLocalToBuilder()
(Hong Zhiguo via junping_du)
YARN-2081. Fixed TestDistributedShell failure after YARN-1962. (Zhiguo Hong
via zjshen)
Release 2.4.0 - 2014-04-07
INCOMPATIBLE CHANGES
NEW FEATURES
YARN-930. Bootstrapping ApplicationHistoryService module. (vinodkv)
YARN-947. Implementing the data objects to be used by the History reader
and writer interfaces. (Zhijie Shen via vinodkv)
YARN-934. Defined a Writer Interface for HistoryStorage. (Zhijie Shen via
vinodkv)
YARN-925. Defined a Reader Interface for HistoryStorage. (Mayank Bansal via
vinodkv)
YARN-978. Created ApplicationAttemptReport. (Mayank Bansal via vinodkv)
YARN-956. Added a testable in-memory HistoryStorage. (Mayank Bansal via
vinodkv)
YARN-975. Added a file-system implementation for HistoryStorage. (Zhijie Shen
via vinodkv)
YARN-1123. Added a new ContainerReport and its Protobuf implementation. (Mayank
Bansal via vinodkv)
YARN-979. Added more APIs for getting information about ApplicationAttempts
and Containers from ApplicationHistoryProtocol. (Mayank Bansal and Zhijie Shen
via vinodkv)
YARN-953. Changed ResourceManager to start writing history data. (Zhijie Shen
via vinodkv)
YARN-1266. Implemented PB service and client wrappers for
ApplicationHistoryProtocol. (Mayank Bansal via vinodkv)
YARN-955. Implemented ApplicationHistoryProtocol handler. (Mayank Bansal via
vinodkv)
YARN-1242. Changed yarn scripts to be able to start ApplicationHistoryServer
as an individual process. (Mayank Bansal via vinodkv)
YARN-954. Implemented web UI for the ApplicationHistoryServer and wired it into
the HistoryStorage. (Zhijie Shen via vinodkv)
YARN-967. Added the client and CLI interfaces for obtaining ApplicationHistory
data. (Mayank Bansal via vinodkv)
YARN-1023. Added Webservices REST APIs support for Application History. (Zhijie
Shen via vinodkv)
YARN-1413. Implemented serving of aggregated-logs in the ApplicationHistory
server. (Mayank Bansal via vinodkv)
YARN-1633. Defined user-facing entity, entity-info and event objects related
to Application Timeline feature. (Zhijie Shen via vinodkv)
YARN-1611. Introduced the concept of a configuration provider which can be
used by ResourceManager to read configuration locally or from remote systems
so as to help RM failover. (Xuan Gong via vinodkv)
YARN-1659. Defined the ApplicationTimelineStore store as an abstraction for
implementing different storage impls for storing timeline information.
(Billie Rinaldi via vinodkv)
YARN-1634. Added a testable in-memory implementation of
ApplicationTimelineStore. (Zhijie Shen via vinodkv)
YARN-1461. Added tags for YARN applications and changed RM to handle them.
(Karthik Kambatla via zjshen)
YARN-1636. Augmented Application-history server's web-services to also expose
new APIs for retrieving and storing timeline information. (Zhijie Shen via
vinodkv)
YARN-1490. Introduced the ability to make ResourceManager optionally not kill
all containers when an ApplicationMaster exits. (Jian He via vinodkv)
YARN-1041. Added the ApplicationMasterProtocol API for applications to use the
ability in ResourceManager to optionally not kill containers when the
ApplicationMaster exits. (Jian He via vinodkv)
YARN-1566. Changed Distributed Shell to retain containers across application
attempts. (Jian He via vinodkv)
YARN-1635. Implemented a Leveldb based ApplicationTimelineStore. (Billie
Rinaldi via zjshen)
YARN-1637. Implemented a client library for Java users to post timeline
entities and events. (zjshen)
YARN-1496. Protocol additions to allow moving apps between queues (Sandy
Ryza)
YARN-1498. Common scheduler changes for moving apps between queues (Sandy
Ryza)
YARN-1504. RM changes for moving apps between queues (Sandy Ryza)
YARN-1499. Fair Scheduler changes for moving apps between queues (Sandy
Ryza)
YARN-1497. Command line additions for moving apps between queues (Sandy
Ryza)
YARN-1588. Enhanced RM and the scheduling protocol to also send NMTokens of
transferred containers from previous app-attempts to new AMs after YARN-1490.
(Jian He via vinodkv)
YARN-1717. Enabled periodically discarding old data in LeveldbTimelineStore.
(Billie Rinaldi via zjshen)
YARN-1690. Made DistributedShell send timeline entities+events. (Mayank Bansal
via zjshen)
YARN-1775. Enhanced ProcfsBasedProcessTree to optionally add the ability to
use smaps for obtaining used memory information. (Rajesh Balamohan via
vinodkv)
YARN-1838. Enhanced timeline service getEntities API to get entities from a
given entity ID or insertion timestamp. (Billie Rinaldi via zjshen)
IMPROVEMENTS
YARN-1007. Enhance History Reader interface for Containers. (Mayank Bansal via
devaraj)
YARN-974. Added more information to RMContainer to be collected and recorded in
Application-History. (Zhijie Shen via vinodkv)
YARN-987. Added ApplicationHistoryManager responsible for exposing reports to
all clients. (Mayank Bansal via vinodkv)
YARN-1630. Introduce timeout for async polling operations in YarnClientImpl
(Aditya Acharya via Sandy Ryza)
YARN-1617. Remove ancient comment and surround LOG.debug in
AppSchedulingInfo.allocate (Sandy Ryza)
YARN-1639. Modified RM HA configuration handling to have a way of not
requiring separate configuration files for each RM. (Xuan Gong via vinodkv)
YARN-1668. Modified RM HA handling of admin-acls to be available across RM
failover by making using of a remote configuration-provider. (Xuan Gong via
vinodkv)
YARN-1667. Modified RM HA handling of super users (with proxying ability) to
be available across RM failover by making using of a remote
configuration-provider. (Xuan Gong via vinodkv)
YARN-1285. Changed the default value of yarn.acl.enable in yarn-default.xml
to be consistent with what exists (false) in the code and documentation.
(Kenji Kikushima via vinodkv)
YARN-1669. Modified RM HA handling of protocol level service-ACLS to
be available across RM failover by making using of a remote
configuration-provider. (Xuan Gong via vinodkv)
YARN-1665. Simplify the configuration of RM HA by having better default
values. (Xuan Gong via vinodkv)
YARN-1660. Simplified the RM HA configuration to accept and be able to simply
depend just on configuration properties of the form
yarn.resourcemanager.hostname.RMID and use the default ports for all service
addresses. (Xuan Gong via vinodkv)
YARN-1493. Changed ResourceManager and Scheduler interfacing to recognize
app-attempts separately from apps. (Jian He via vinodkv)
YARN-1459. Changed ResourceManager to depend its service initialization
on the configuration-provider mechanism during startup too. (Xuan Gong via
vinodkv)
YARN-1706. Created an utility method to dump timeline records to JSON
strings. (zjshen)
YARN-1641. ZK store should attempt a write periodically to ensure it is
still Active. (kasha)
YARN-1531. True up yarn command documentation (Akira Ajisaka via kasha)
YARN-1345. Remove FINAL_SAVING state from YarnApplicationAttemptState
(Zhijie Shen via jianhe)
YARN-1676. Modified RM HA handling of user-to-group mappings to
be available across RM failover by making using of a remote
configuration-provider. (Xuan Gong via vinodkv)
YARN-1666. Modified RM HA handling of include/exclude node-lists to be
available across RM failover by making using of a remote
configuration-provider. (Xuan Gong via vinodkv)
YARN-1171. Add default queue properties to Fair Scheduler documentation
(Naren Koneru via Sandy Ryza)
YARN-1470. Add audience annotations to MiniYARNCluster. (Anubhav Dhoot
via kasha)
YARN-1732. Changed types of related-entities and primary-filters in the
timeline-service to be sets instead of maps. (Billie Rinaldi via vinodkv)
YARN-1687. Renamed user-facing records for the timeline-service to be simply
named after 'timeline' instead of 'apptimeline'. (Zhijie Shen via vinodkv)
YARN-1749. Updated application-history related configs to reflect the latest
reality and to be consistently named. (Zhijie Shen via vinodkv)
YARN-1301. Added the INFO level log of the non-empty blacklist additions
and removals inside ApplicationMasterService. (Tsuyoshi Ozawa via zjshen)
YARN-1528. Allow setting auth for ZK connections. (kasha)
YARN-1704. Modified LICENSE and NOTICE files to reflect newly used levelDB
related libraries. (Billie Rinaldi via vinodkv)
YARN-1765. Added test cases to verify that killApplication API works across
ResourceManager failover. (Xuan Gong via vinodkv)
YARN-1730. Implemented simple write-locking in the LevelDB based timeline-
store. (Billie Rinaldi via vinodkv)
YARN-986. Changed client side to be able to figure out the right RM Delegation
token for the right ResourceManager when HA is enabled. (Karthik Kambatla via
vinodkv)
YARN-1761. Modified RMAdmin CLI to check whether HA is enabled or not before
it executes any of the HA admin related commands. (Xuan Gong via vinodkv)
YARN-1780. Improved logging in the Timeline client and server. (Zhijie Shen
via vinodkv)
YARN-1525. Web UI should redirect to active RM when HA is enabled. (Cindy Li
via kasha)
YARN-1781. Modified NodeManagers to allow admins to specify max disk
utilization for local disks so as to be able to offline full disks. (Varun
Vasudev via vinodkv)
YARN-1410. Added tests to validate that clients can fail-over to a new RM
after getting an application-ID but before submission and can still submit to
the newly active RM with no issues. (Xuan Gong via vinodkv)
YARN-1764. Modified YarnClient to correctly handle failover of ResourceManager
after the submitApplication call goes through. (Xuan Gong via vinodkv)
YARN-1389. Made ApplicationClientProtocol and ApplicationHistoryProtocol
expose analogous getApplication(s)/Attempt(s)/Container(s) APIs. (Mayank
Bansal via zjshen)
YARN-1658. Modified web-app framework to let standby RMs redirect
web-service calls to the active RM. (Cindy Li via vinodkv)
YARN-1824. Improved NodeManager and clients to be able to handle cross
platform application submissions. (Jian He via vinodkv)
YARN-1512. Enhanced CapacityScheduler to be able to decouple scheduling from
node-heartbeats. (Arun C Murthy via vinodkv)
YARN-1570. Fixed formatting of the lines in YarnCommands.apt.vm docs source.
(Akira Ajisaka via vinodkv)
YARN-1536. Cleanup: Get rid of ResourceManager#get*SecretManager() methods
and use the RMContext methods instead. (Anubhav Dhoot via kasha)
YARN-1850. Introduced the ability to optionally disable sending out timeline-
events in the TimelineClient. (Zhijie Shen via vinodkv)
YARN-1452. Added documentation about the configuration and usage of generic
application history and the timeline data service. (Zhijie Shen via vinodkv)
YARN-1891. Added documentation for NodeManager health-monitoring. (Varun
Vasudev via vinodkv)
YARN-1017. Added documentation for ResourceManager Restart.(jianhe)
OPTIMIZATIONS
YARN-1771. Reduce the number of NameNode operations during localization of
public resources using a cache. (Sangjin Lee via cdouglas)
BUG FIXES
YARN-935. Correcting pom.xml to build applicationhistoryserver module
successfully. (Zhijie Shen via vinodkv)
YARN-962. Fixed bug in application-history proto file and renamed it be just
a client proto file. (Zhijie Shen via vinodkv)
YARN-984. Renamed the incorrectly named applicationhistoryservice.records.pb.impl
package to be applicationhistoryservice.records.impl.pb. (Devaraj K via vinodkv)
YARN-1534. Fixed failure of test TestAHSWebApp. (Shinichi Yamashita via vinodkv)
YARN-1555. Fixed test failures in applicationhistoryservice.* (Vinod Kumar
Vavilapalli via mayank)
YARN-1594. Updated pom.xml of applicationhistoryservice sub-project according to
YARN-888. (Vinod Kumar Vavilapalli via zjshen)
YARN-1596. Fixed Javadoc warnings on branch YARN-321. (Vinod Kumar Vavilapalli
via zjshen)
YARN-1597. Fixed Findbugs warnings on branch YARN-321. (Vinod Kumar Vavilapalli
via zjshen)
YARN-1595. Made enabling history service configurable and fixed test failures on
branch YARN-321. (Vinod Kumar Vavilapalli via zjshen)
YARN-1605. Fixed formatting issues in the new module on branch YARN-321. (Vinod
Kumar Vavilapalli via zjshen)
YARN-1625. Fixed RAT warnings after YARN-321 merge. (Shinichi Yamashita via
vinodkv)
YARN-1613. Fixed the typo with the configuration name
YARN_HISTORY_SERVICE_ENABLED. (Akira Ajisaka via vinodkv)
YARN-1618. Fix invalid RMApp transition from NEW to FINAL_SAVING (kasha)
YARN-1600. RM does not startup when security is enabled without spnego
configured (Haohui Mai via jlowe)
YARN-1642. RMDTRenewer#getRMClient should use ClientRMProxy (kasha)
YARN-1632. TestApplicationMasterServices should be under
org.apache.hadoop.yarn.server.resourcemanager package (Chen He via jeagles)
YARN-1673. Fix option parsing in YARN's application CLI after it is broken
by YARN-967. (Mayank Bansal via vinodkv)
YARN-1684. Fixed history server heap size in yarn script. (Billie Rinaldi
via zjshen)
YARN-1166. Fixed app-specific and attempt-specific QueueMetrics to be
triggered by accordingly app event and attempt event.
YARN-1689. Made RMAppAttempt get killed when RMApp is at ACCEPTED. (Vinod
Kumar Vavilapalli via zjshen)
YARN-1661. Fixed DS ApplicationMaster to write the correct exit log. (Vinod
Kumar Vavilapalli via zjshen)
YARN-1672. YarnConfiguration is missing a default for
yarn.nodemanager.log.retain-seconds (Naren Koneru via kasha)
YARN-1698. Fixed default TimelineStore in code to match what is documented
in yarn-default.xml (Zhijie Shen via vinodkv)
YARN-1697. NodeManager reports negative running containers (Sandy Ryza)
YARN-1719. Fixed the root path related Jersey warnings produced in
ATSWebServices. (Billie Rinaldi via zjshen)
YARN-1692. ConcurrentModificationException in fair scheduler AppSchedulable
(Sangjin Lee via Sandy Ryza)
YARN-1578. Fixed reading incomplete application attempt and container data
in FileSystemApplicationHistoryStore. (Shinichi Yamashita via zjshen)
YARN-1417. Modified RM to generate container-tokens not at creation time, but
at allocation time so as to prevent RM from shelling out containers with
expired tokens. (Omkar Vinit Joshi and Jian He via vinodkv)
YARN-1553. Modified YARN and MR to stop using HttpConfig.isSecure() and
instead rely on the http policy framework. And also fix some bugs related
to https handling in YARN web-apps. (Haohui Mai via vinodkv)
YARN-1721. When moving app between queues in Fair Scheduler, grab lock on
FSSchedulerApp (Sandy Ryza)
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is
turned on (Sandy Ryza)
YARN-1590. Fixed ResourceManager, web-app proxy and MR JobHistoryServer to
expand _HOST properly in their kerberos principles. (Mohammad Kamrul Islam
va vinodkv)
YARN-1428. Fixed RM to write the final state of RMApp/RMAppAttempt to the
application history store in the transition to the final state. (Contributed
by Zhijie Shen)
YARN-713. Fixed ResourceManager to not crash while building tokens when DNS
issues happen transmittently. (Jian He via vinodkv)
YARN-1398. Fixed a deadlock in ResourceManager between users requesting
queue-acls and completing containers. (vinodkv)
YARN-1071. Enabled ResourceManager to recover cluster metrics
numDecommissionedNMs after restarting. (Jian He via zjshen)
YARN-1742. Fixed javadoc of configuration parameter
DEFAULT_NM_MIN_HEALTHY_DISKS_FRACTION. (Akira Ajisaka via vinodkv)
YARN-1686. Fixed NodeManager to properly handle any errors during
re-registration after a RESYNC and thus avoid hanging. (Rohith Sharma via
vinodkv)
YARN-1734. Fixed ResourceManager to update the configurations when it
transits from standby to active mode so as to assimilate any changes that
happened while it was in standby mode. (Xuan Gong via vinodkv)
YARN-1760. TestRMAdminService assumes CapacityScheduler. (kasha)
YARN-1758. Fixed ResourceManager to not mandate the presence of site specific
configuration files and thus fix failures in downstream tests. (Xuan Gong via
vinodkv)
YARN-1748. Excluded core-site.xml from hadoop-yarn-server-tests package's jar
and thus avoid breaking downstream tests. (Sravya Tirukkovalur via vinodkv)
YARN-1729. Made TimelineWebServices deserialize the string primary- and
secondary-filters param into the JSON-compatible object. (Billie Rinaldi via
zjshen)
YARN-1766. Fixed a bug in ResourceManager to use configuration loaded from the
configuration-provider when booting up. (Xuan Gong via vinodkv)
YARN-1768. Fixed error message being too verbose when killing a non-existent
application. (Tsuyoshi OZAWA via raviprak)
YARN-1774. FS: Submitting to non-leaf queue throws NPE. (Anubhav Dhoot and
Karthik Kambatla via kasha)
YARN-1783. Fixed a bug in NodeManager's status-updater that was losing
completed container statuses when NodeManager is forced to resync by the
ResourceManager. (Jian He via vinodkv)
YARN-1787. Fixed help messages for applicationattempt and container
sub-commands in bin/yarn. (Zhijie Shen via vinodkv)
YARN-1793. Fixed ClientRMService#forceKillApplication not killing unmanaged
application. (Karthik Kambatla via jianhe)
YARN-1788. Fixed a bug in ResourceManager to set the apps-completed and
apps-killed metrics correctly for killed applications. (Varun Vasudev via
vinodkv)
YARN-1821. NPE on registerNodeManager if the request has containers for
UnmanagedAMs. (kasha)
YARN-1800. Fixed NodeManager to gracefully handle RejectedExecutionException
in the public-localizer thread-pool. (Varun Vasudev via vinodkv)
YARN-1444. Fix CapacityScheduler to deal with cases where applications
specify host/rack requests without off-switch request. (Wangda Tan via
acmurthy)
YARN-1812. Fixed ResourceManager to synchrously renew tokens after recovery
and thus recover app itself synchronously and avoid races with resyncing
NodeManagers. (Jian He via vinodkv)
YARN-1816. Fixed ResourceManager to get RMApp correctly handle
ATTEMPT_FINISHED event at ACCEPTED state that can happen after RM restarts.
(Jian He via vinodkv)
YARN-1789. ApplicationSummary does not escape newlines in the app name
(Tsuyoshi OZAWA via jlowe)
YARN-1830. Fixed TestRMRestart#testQueueMetricsOnRMRestart failure due to
race condition when app is submitted. (Zhijie Shen via jianhe)
YARN-1685. Fixed few bugs related to handling of containers' log-URLs on
ResourceManager and history-service. (Zhijie Shen via vinodkv)
YARN-1206. Fixed AM container log to show on NM web page after application
finishes if log-aggregation is disabled. (Rohith Sharmaks via jianhe)
YARN-1591. Fixed AsyncDispatcher to handle interrupts on shutdown in a sane
manner and thus fix failure of TestResourceTrackerService. (Tsuyoshi Ozawa
via vinodkv)
YARN-1839. Fixed handling of NMTokens in ResourceManager such that containers
launched by AMs running on the same machine as the AM are correctly
propagated. (Jian He via vinodkv)
YARN-1640. Fixed manual failover of ResourceManagers to work correctly in
secure clusters. (Xuan Gong via vinodkv)
YARN-1855. Made Application-history server to be optional in MiniYARNCluster
and thus avoid the failure of TestRMFailover#testRMWebAppRedirect. (Zhijie
Shen via vinodkv)
YARN-1859. Fixed WebAppProxyServlet to correctly handle applications absent
on the ResourceManager. (Zhijie Shen via vinodkv)
YARN-1811. Fixed AMFilters in YARN to correctly accept requests from either
web-app proxy or the RMs when HA is enabled. (Robert Kanter via vinodkv)
YARN-1670. Fixed a bug in log-aggregation that can cause the writer to write
more log-data than the log-length that it records. (Mit Desai via vinodk)
YARN-1849. Fixed NPE in ResourceTrackerService#registerNodeManager for UAM
(Karthik Kambatla via jianhe )
YARN-1863. Fixed test failure in TestRMFailover after YARN-1859. (Xuan Gong
via vinodkv)
YARN-1854. Fixed test failure in TestRMHA#testStartAndTransitions. (Rohith
Sharma KS via vinodkv)
YARN-1776. Fixed DelegationToken renewal to survive RM failover. (Zhijie
Shen via jianhe)
YARN-1577. Made UnmanagedAMLauncher do launchAM after the attempt reaches
the LAUNCHED state. (Jian He via zjshen)
YARN-1785. FairScheduler treats app lookup failures as ERRORs.
(bc Wong via kasha)
YARN-1752. Fixed ApplicationMasterService to reject unregister request if
AM did not register before. (Rohith Sharma via jianhe)
YARN-1846. TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler.
(Robert Kanter via kasha)
YARN-1705. Reset cluster-metrics on transition to standby. (Rohith via kasha)
YARN-1852. Fixed RMAppAttempt to not resend AttemptFailed/AttemptKilled
events to already recovered Failed/Killed RMApps. (Rohith via jianhe)
YARN-1866. Fixed an issue with renewal of RM-delegation tokens on restart or
fail-over. (Jian He via vinodkv)
YARN-1867. Fixed a bug in ResourceManager that was causing invalid ACL checks
in the web-services after fail-over. (Vinod Kumar Vavilapalli)
YARN-1521. Mark Idempotent/AtMostOnce annotations to the APIs in
ApplicationClientProtcol, ResourceManagerAdministrationProtocol and
ResourceTrackerProtocol so that they work in HA scenario. (Xuan Gong
via jianhe)
YARN-1873. Fixed TestDistributedShell failure when the test cases are out of
order. (Mit Desai via zjshen)
YARN-1893. Mark AtMostOnce annotation to ApplicationMasterProtocol#allocate.
(Xuan Gong via jianhe)
Release 2.3.1 - UNRELEASED
INCOMPATIBLE CHANGES
NEW FEATURES
IMPROVEMENTS
OPTIMIZATIONS
BUG FIXES
Release 2.3.0 - 2014-02-18
INCOMPATIBLE CHANGES
NEW FEATURES
YARN-649. Added a new NM web-service to serve container logs in plain text
over HTTP. (Sandy Ryza via vinodkv)
YARN-1021. Yarn Scheduler Load Simulator. (ywskycn via tucu)
YARN-1010. FairScheduler: decouple container scheduling from nodemanager
heartbeats. (Wei Yan via Sandy Ryza)
YARN-1253. Changes to LinuxContainerExecutor to run containers as a single
dedicated user in non-secure mode. (rvs via tucu)
YARN-1027. Implement RMHAProtocolService (Karthik Kambatla via bikas)
YARN-1068. Add admin support for HA operations (Karthik Kambatla via
bikas)
YARN-311. RM/scheduler support for dynamic resource configuration.
(Junping Du via llu)
YARN-1392. Allow sophisticated app-to-queue placement policies in the Fair
Scheduler (Sandy Ryza)
YARN-1447. Common PB type definitions for container resizing. (Wangda Tan
via Sandy Ryza)
YARN-1448. AM-RM protocol changes to support container resizing (Wangda Tan
via Sandy Ryza)
YARN-312. Introduced ResourceManagerAdministrationProtocol changes to support
changing resources on node. (Junping Du via vinodkv)
YARN-1028. Added FailoverProxyProvider capability to ResourceManager to help
with RM failover. (Karthik Kambatla via vinodkv)
YARN-1029. Added embedded leader election in the ResourceManager. (Karthik
Kambatla via vinodkv)
YARN-1033. Expose RM active/standby state to Web UI and REST API (kasha)
IMPROVEMENTS
YARN-305. Fair scheduler logs too many "Node offered to app" messages.
(Lohit Vijayarenu via Sandy Ryza)
YARN-1258. Allow configuring the Fair Scheduler root queue (Sandy Ryza)
YARN-1288. Make Fair Scheduler ACLs more user friendly (Sandy Ryza)
YARN-1315. TestQueueACLs should also test FairScheduler (Sandy Ryza)
YARN-1335. Move duplicate code from FSSchedulerApp and FiCaSchedulerApp
into SchedulerApplication (Sandy Ryza)
YARN-1333. Support blacklisting in the Fair Scheduler (Tsuyoshi Ozawa via
Sandy Ryza)
YARN-1109. Demote NodeManager "Sending out status for container" logs to
debug (haosdent via Sandy Ryza)
YARN-1321. Changed NMTokenCache to support both singleton and an instance
usage. (Alejandro Abdelnur via vinodkv)
YARN-1388. Fair Scheduler page always displays blank fair share (Liyin Liang
via Sandy Ryza)
YARN-7. Support CPU resource for DistributedShell. (Junping Du via llu)
YARN-905. Add state filters to nodes CLI (Wei Yan via Sandy Ryza)
YARN-1098. Separate out RM services into Always On and Active (Karthik
Kambatla via bikas)
YARN-353. Add Zookeeper-based store implementation for RMStateStore.
(Bikas Saha, Jian He and Karthik Kambatla via hitesh)
YARN-819. ResourceManager and NodeManager should check for a minimum allowed
version (Robert Parker via jeagles)
YARN-425. coverage fix for yarn api (Aleksey Gorshkov via jeagles)
YARN-1199. Make NM/RM Versions Available (Mit Desai via jeagles)
YARN-1232. Configuration to support multiple RMs (Karthik Kambatla via
bikas)
YARN-465. fix coverage org.apache.hadoop.yarn.server.webproxy (Aleksey
Gorshkov and Andrey Klochkov via jlowe)
YARN-976. Document the meaning of a virtual core. (Sandy Ryza)
YARN-1182. MiniYARNCluster creates and inits the RM/NM only on start()
(Karthik Kambatla via Sandy Ryza)
HADOOP-9598. Improve code coverage of RMAdminCLI (Aleksey Gorshkov and
Andrey Klochkov via jeagles)
YARN-1306. Clean up hadoop-sls sample-conf according to YARN-1228 (Wei Yan
via Sandy Ryza)
YARN-891. Modified ResourceManager state-store to remember completed
applications so that clients can get information about them post RM-restart.
(Jian He via vinodkv)
YARN-1290. Let continuous scheduling achieve more balanced task assignment
(Wei Yan via Sandy Ryza)
YARN-786. Expose application resource usage in RM REST API (Sandy Ryza)
YARN-1323. Set HTTPS webapp address along with other RPC addresses in HAUtil
(Karthik Kambatla via Sandy Ryza)
YARN-1121. Changed ResourceManager's state-store to drain all events on
shut-down. (Jian He via vinodkv)
YARN-1387. RMWebServices should use ClientRMService for filtering
applications (Karthik Kambatla via Sandy Ryza)
YARN-1222. Make improvements in ZKRMStateStore for fencing (Karthik
Kambatla via bikas)
YARN-709. Added tests to verify validity of delegation tokens and logging of
appsummary after RM restart. (Jian He via vinodkv)
YARN-1210. Changed RM to start new app-attempts on RM restart only after
ensuring that previous AM exited or after expiry time. (Omkar Vinit Joshi via
vinodkv)
YARN-674. Fixed ResourceManager to renew DelegationTokens on submission
asynchronously to work around potential slowness in state-store. (Omkar Vinit
Joshi via vinodkv)
YARN-584. In scheduler web UIs, queues unexpand on refresh. (Harshit
Daga via Sandy Ryza)
YARN-1303. Fixed DistributedShell to not fail with multiple commands separated
by a semi-colon as shell-command. (Xuan Gong via vinodkv)
YARN-1423. Support queue placement by secondary group in the Fair Scheduler
(Ted Malaska via Sandy Ryza)
YARN-1314. Fixed DistributedShell to not fail with multiple arguments for a
shell command separated by spaces. (Xuan Gong via vinodkv)
YARN-1239. Modified ResourceManager state-store implementations to start
storing version numbers. (Jian He via vinodkv)
YARN-1241. In Fair Scheduler, maxRunningApps does not work for non-leaf
queues. (Sandy Ryza)
YARN-1318. Promoted AdminService to an Always-On service and merged it into
RMHAProtocolService. (Karthik Kambatla via vinodkv)
YARN-1332. In TestAMRMClient, replace assertTrue with assertEquals where
possible (Sebastian Wong via Sandy Ryza)
YARN-1403. Separate out configuration loading from QueueManager in the Fair
Scheduler (Sandy Ryza)
YARN-1181. Augment MiniYARNCluster to support HA mode (Karthik Kambatla)
YARN-546. Allow disabling the Fair Scheduler event log (Sandy Ryza)
YARN-807. When querying apps by queue, iterating over all apps is
inefficient and limiting (Sandy Ryza)
YARN-1378. Implemented a cleaner of old finished applications from the RM
state-store. (Jian He via vinodkv)
YARN-1481. Move internal services logic from AdminService to ResourceManager.
(vinodkv via kasha)
YARN-1491. Upgrade JUnit3 TestCase to JUnit 4 (Chen He via jeagles)
YARN-408. Change CapacityScheduler to not disable delay-scheduling by default.
(Mayank Bansal via vinodkv)
YARN-1325. Modified RM HA configuration validation to also ensure that
multiple RMs are configured. (Xuan Gong via vinodkv)
YARN-1311. Fixed app specific scheduler-events' names to be app-attempt
based. (vinodkv via jianhe)
YARN-1485. Modified RM HA configuration validation to also ensure that
service-address configuration are configured for every RM. (Xuan Gong via
vinodkv)
YARN-1435. Modified Distributed Shell to accept either the command or the
custom script. (Xuan Gong via zjshen)
YARN-1446. Changed client API to retry killing application till RM
acknowledges so as to account for RM crashes/failover. (Jian He via vinodkv)
YARN-1307. Redesign znode structure for Zookeeper based RM state-store for
better organization and scalability. (Tsuyoshi OZAWA via vinodkv)
YARN-1172. Convert SecretManagers in RM to services (Tsuyoshi OZAWA via kasha)
YARN-1523. Use StandbyException instead of RMNotYetReadyException (kasha)
YARN-1541. Changed ResourceManager to invalidate ApplicationMaster host/port
information once an AM crashes. (Jian He via vinodkv)
YARN-1482. Modified WebApplicationProxy to make it work across ResourceManager
fail-over. (Xuan Gong via vinodkv)
YARN-1568. Rename clusterid to clusterId in ActiveRMInfoProto (kasha)
YARN-1579. ActiveRMInfoProto fields should be optional (kasha)
YARN-888. Cleaned up POM files so that non-leaf modules don't include any
dependencies and thus compact the dependency list for leaf modules.
(Alejandro Abdelnur via vinodkv)
YARN-1567. In Fair Scheduler, allow empty queues to change between leaf and
parent on allocation file reload (Sandy Ryza)
YARN-1616. RMFatalEventDispatcher should log the cause of the event (kasha)
YARN-1624. QueuePlacementPolicy format is not easily readable via a JAXB
parser (Aditya Acharya via Sandy Ryza)
YARN-1623. Include queue name in RegisterApplicationMasterResponse (Sandy
Ryza)
YARN-1573. ZK store should use a private password for root-node-acls.
(kasha).
OPTIMIZATIONS
BUG FIXES
YARN-1284. LCE: Race condition leaves dangling cgroups entries for killed
containers. (Alejandro Abdelnur via Sandy Ryza)
YARN-1283. Fixed RM to give a fully-qualified proxy URL for an application
so that clients don't need to do scheme-mangling. (Omkar Vinit Joshi via
vinodkv)
YARN-879. Fixed tests w.r.t o.a.h.y.server.resourcemanager.Application.
(Junping Du via devaraj)
YARN-1265. Fair Scheduler chokes on unhealthy node reconnect (Sandy Ryza)
YARN-1044. used/min/max resources do not display info in the scheduler page
(Sangjin Lee via Sandy Ryza)
YARN-1259. In Fair Scheduler web UI, queue num pending and num active apps
switched. (Robert Kanter via Sandy Ryza)
YARN-1295. In UnixLocalWrapperScriptBuilder, using bash -c can cause Text
file busy errors (Sandy Ryza)
YARN-1185. Fixed FileSystemRMStateStore to not leave partial files that
prevent subsequent ResourceManager recovery. (Omkar Vinit Joshi via vinodkv)
YARN-1331. yarn.cmd exits with NoClassDefFoundError trying to run rmadmin or
logs. (cnauroth)
YARN-1330. Fair Scheduler: defaultQueueSchedulingPolicy does not take effect
(Sandy Ryza)
YARN-1022. Unnecessary INFO logs in AMRMClientAsync (haosdent via bikas)
YARN-1349. yarn.cmd does not support passthrough to any arbitrary class.
(cnauroth)
YARN-1357. TestContainerLaunch.testContainerEnvVariables fails on Windows.
(Chuan Liu via cnauroth)
YARN-1358. TestYarnCLI fails on Windows due to line endings. (Chuan Liu via
cnauroth)
YARN-1343. NodeManagers additions/restarts are not reported as node updates
in AllocateResponse responses to AMs. (tucu)
YARN-1381. Same relaxLocality appears twice in exception message of
AMRMClientImpl#checkLocalityRelaxationConflict() (Ted Yu via Sandy Ryza)
YARN-1407. RM Web UI and REST APIs should uniformly use
YarnApplicationState (Sandy Ryza)
YARN-1438. Ensure container diagnostics includes exception from container
launch. (stevel via acmurthy)
YARN-1138. yarn.application.classpath is set to point to $HADOOP_CONF_DIR
etc., which does not work on Windows. (Chuan Liu via cnauroth)
YARN-461. Fair scheduler should not accept apps with empty string queue name.
(ywskycn via tucu)
YARN-1060. Two tests in TestFairScheduler are missing @Test annotation
(Niranjan Singh via Sandy Ryza)
YARN-1188. The context of QueueMetrics becomes default when using
FairScheduler (Tsuyoshi Ozawa via Sandy Ryza)
YARN-1268. TestFairScheduler.testContinuousScheduling is flaky (Sandy Ryza)
YARN-1300. SLS tests fail because conf puts YARN properties in
fair-scheduler.xml (Ted Yu via Sandy Ryza)
YARN-1183. MiniYARNCluster shutdown takes several minutes intermittently
(Andrey Klochkov via jeagles)
YARN-1305. RMHAProtocolService#serviceInit should handle HAUtil's
IllegalArgumentException (Tsuyoshi Ozawa via bikas)
YARN-1374. Changed ResourceManager to start the preemption policy monitors
as active services. (Karthik Kambatla via vinodkv)
YARN-1395. Distributed shell application master launched with debug flag can
hang waiting for external ls process. (cnauroth)
YARN-1400. yarn.cmd uses HADOOP_RESOURCEMANAGER_OPTS. Should be
YARN_RESOURCEMANAGER_OPTS. (Raja Aluri via cnauroth)
YARN-1401. With zero sleep-delay-before-sigkill.ms, no signal is ever sent
(Gera Shegalov via Sandy Ryza)
YARN-1411. HA config shouldn't affect NodeManager RPC addresses (Karthik
Kambatla via bikas)
YARN-1419. TestFifoScheduler.testAppAttemptMetrics fails intermittently
under jdk7 (Jonathan Eagles via jlowe)
YARN-744. Race condition in ApplicationMasterService.allocate .. It might
process same allocate request twice resulting in additional containers
getting allocated. (Omkar Vinit Joshi via bikas)
YARN-1425. TestRMRestart fails because MockRM.waitForState(AttemptId) uses
current attempt instead of the attempt passed as argument (Omkar Vinit
Joshi via bikas)
YARN-1053. Diagnostic message from ContainerExitEvent is ignored in
ContainerImpl (Omkar Vinit Joshi via bikas)
YARN-1320. Fixed Distributed Shell application to respect custom log4j
properties file. (Xuan Gong via vinodkv)
YARN-1416. Fixed a few invalid transitions in RMApp, RMAppAttempt and in some
tests. (Jian He via vinodkv)
YARN-895. Changed RM state-store to not crash immediately if RM restarts while
the state-store is down. (Jian He via vinodkv)
YARN-1454. Fixed test failure issue with TestRMRestart. (Karthik Kambatla
via vinodkv)
YARN-1450. Fixed test failure in TestUnmanagedAMLauncher by removing its
dependency on distributed-shell. (Binglin Chang via vinodkv)
YARN-1405. Fixed ResourceManager to not hang when init/start fails with an
exception w.r.t state-store. (Jian He via vinodkv)
YARN-1505. Fixed Webapplication proxy server to not hardcode its bind
address. (Xuan Gong via vinodkv)
YARN-1145. Fixed a potential file-handle leak in the web interface for
displaying aggregated logs. (Rohith Sharma via vinodkv)
YARN-1451. TestResourceManager relies on the scheduler assigning multiple
containers in a single node update. (Sandy Ryza via kasha)
YARN-1527. Fix yarn rmadmin command to print the correct usage info.
(Akira AJISAKA via jianhe)
YARN-1522. Fixed a race condition in the test TestApplicationCleanup that was
causing it to randomly fail. (Liyin Liang via vinodkv)
YARN-1549. Fixed a bug in ResourceManager's ApplicationMasterService that
was causing unamanged AMs to not finish correctly. (haosdent via vinodkv)
YARN-1559. Race between ServerRMProxy and ClientRMProxy setting
RMProxy#INSTANCE. (kasha and vinodkv via kasha)
YARN-1560. Fixed TestYarnClient#testAMMRTokens failure with null AMRM token.
(Ted Yu via jianhe)
YARN-1409. NonAggregatingLogHandler can throw RejectedExecutionException
(Tsuyoshi OZAWA via jlowe)
YARN-1293. Fixed TestContainerLaunch#testInvalidEnvSyntaxDiagnostics failure
caused by non-English system locale. (Tsuyoshi OZAWA via jianhe)
YARN-1574. RMDispatcher should be reset on transition to standby. (Xuan Gong
via kasha)
YARN-1598. HA-related rmadmin commands don't work on a secure cluster (kasha)
YARN-1603. Remove two *.orig files which were unexpectedly committed.
(Zhijie Shen via junping_du)
YARN-1601. 3rd party JARs are missing from hadoop-dist output. (tucu)
YARN-1351. Invalid string format in Fair Scheduler log warn message
(Konstantin Weitz via Sandy Ryza)
YARN-1608. LinuxContainerExecutor has a few DEBUG messages at INFO level
(kasha)
YARN-1606. Fix the default value of yarn.resourcemanager.zk-timeout-ms
in yarn-default.xml (kasha)
YARN-1607. TestRM relies on the scheduler assigning multiple containers in
a single node update (Sandy Ryza)
YARN-1575. Public localizer crashes with "Localized unkown resource"
(jlowe)
YARN-1629. IndexOutOfBoundsException in MaxRunningAppsEnforcer (Sandy Ryza)
YARN-1628. Fixed the test failure in TestContainerManagerSecurity. (Vinod
Kumar Vavilapalli via zjshen)
Release 2.2.0 - 2013-10-13
INCOMPATIBLE CHANGES
YARN-1229. Define constraints on Auxiliary Service names. Change
ShuffleHandler service name from mapreduce.shuffle to
mapreduce_shuffle (Xuan Gong via sseth)
NEW FEATURES
IMPROVEMENTS
YARN-1246. Added application finish-status to ApplicationSummary for the sake
of testing given ApplicationHistoryServer is not yet ready. (Arpit Gupta via
vinodkv)
YARN-899. Added back queue level administrator-acls so that there is no
regression w.r.t 1.x. (Xuan Gong via vinodkv)
YARN-1228. Clean up Fair Scheduler configuration loading. (Sandy Ryza)
YARN-1213. Restore config to ban submitting to undeclared pools in the
Fair Scheduler. (Sandy Ryza)
YARN-1277. Added a policy based configuration for http/https in common
HttpServer and using the same in YARN - related to per project https config
support via HADOOP-10022. (Suresh Srinivas and Omkar Vinit Joshi via vinodkv)
OPTIMIZATIONS
BUG FIXES
YARN-1128. FifoPolicy.computeShares throws NPE on empty list of Schedulables
(Karthik Kambatla via Sandy Ryza)
YARN-1214. Register ClientToken MasterKey in SecretManager after it is
saved (Jian He via bikas)
YARN-49. Improve distributed shell application to work on a secure cluster.
(Vinod Kumar Vavilapalli via hitesh)
YARN-1157. Fixed ResourceManager UI to behave correctly when apps like
distributed-shell do not set tracking urls. (Xuan Gong via vinodkv)
YARN-1221. With Fair Scheduler, reserved MB reported in RM web UI increases
indefinitely (Siqi Li via Sandy Ryza)
YARN-1247. test-container-executor has gotten out of sync with the changes to
container-executor. (rvs via tucu)
YARN-1070. Fixed race conditions in NodeManager during container-kill.
(Zhijie Shen via vinodkv)
YARN-1215. Yarn URL should include userinfo. (Chuan Liu via cnauroth)
YARN-1262. TestApplicationCleanup relies on all containers assigned in a
single heartbeat (Karthik Kambatla via Sandy Ryza)
YARN-1260. Added webapp.http.address to yarn-default.xml so that default
install with https enabled doesn't have broken link on NM UI. (Omkar Vinit
Joshi via vinodkv)
YARN-1141. Updating resource requests should be decoupled with updating
blacklist (Zhijie Shen via bikas)
YARN-876. Node resource is added twice when node comes back from unhealthy
to healthy. (Peng Zhang via Sandy Ryza)
YARN-890. Ensure CapacityScheduler doesn't round-up metric for available
resources. (Xuan Gong & Hitesh Shah via acmurthy)
YARN-621. Changed YARN web app to not add paths that can cause duplicate
additions of authenticated filters there by causing kerberos replay errors.
(Omkar Vinit Joshi via vinodkv)
YARN-1236. FairScheduler setting queue name in RMApp is not working.
(Sandy Ryza)
YARN-1256. NM silently ignores non-existent service in
StartContainerRequest (Xuan Gong via bikas)
YARN-1149. NM throws InvalidStateTransitonException: Invalid event:
APPLICATION_LOG_HANDLING_FINISHED at RUNNING (Xuan Gong via hitesh)
YARN-1271. "Text file busy" errors launching containers again
(Sandy Ryza)
YARN-1131. $yarn logs command should return an appropriate error message if
YARN application is still running. (Siddharth Seth via hitesh)
YARN-1219. FSDownload changes file suffix making FileUtil.unTar() throw
exception. (Shanyu Zhao via cnauroth)
YARN-1251. TestDistributedShell#TestDSShell failed with timeout. (Xuan Gong
via hitesh)
YARN-1167. Fixed Distributed Shell to not incorrectly show empty hostname
on RM UI. (Xuan Gong via vinodkv)
YARN-1254. Fixed NodeManager to not pollute container's credentials. (Omkar
Vinit Joshi via vinodkv)
YARN-1273. Fixed Distributed-shell to account for containers that failed
to start. (Hitesh Shah via vinodkv)
YARN-1032. Fixed NPE in RackResolver. (Lohit Vijayarenu via acmurthy)
YARN-1090. Fixed CS UI to better reflect applications as non-schedulable
and not as pending. (Jian He via acmurthy)
YARN-1274. Fixed NodeManager's LinuxContainerExecutor to create user, app-dir
and log-dirs correctly even when there are no resources to localize for the
container. (Siddharth Seth via vinodkv)
YARN-1278. Fixed NodeManager to not delete local resources for apps on resync
command from RM - a bug caused by YARN-1149. (Hitesh Shah via vinodkv)
YARN-1463. Tests should avoid starting http-server where possible or creates
spnego keytab/principals (vinodkv via kasha)
Release 2.1.1-beta - 2013-09-23
INCOMPATIBLE CHANGES
YARN-707. Added user information also in the YARN ClientToken so that AMs
can implement authorization based on incoming users. (Jason Lowe via vinodkv)
YARN-1170. YARN & MapReduce proto definitions fixed to specify protobuf
package as hadoop.yarn and hadoop.mapreduce respectively. (Binglin Chang
via acmurthy)
NEW FEATURES
IMPROVEMENTS
YARN-589. Expose a REST API for monitoring the fair scheduler (Sandy Ryza).
YARN-1074. Cleaned up YARN CLI application list to only display running
applications by default. (Xuan Gong via vinodkv)
YARN-1093. Corrections to Fair Scheduler documentation (Wing Yew Poon via
Sandy Ryza)
YARN-942. In Fair Scheduler documentation, inconsistency on which
properties have prefix (Akira Ajisaka via Sandy Ryza)
YARN-1083. Changed ResourceManager to fail when the expiry interval is less
than the configured node-heartbeat interval. (Zhijie Shen via vinodkv)
YARN-1081. Made a trivial change to YARN node CLI header to avoid potential
confusion. (Akira AJISAKA via vinodkv)
YARN-1034. Remove "experimental" in the Fair Scheduler documentation.
(Karthik Kambatla via Sandy Ryza)
YARN-1080. Improved help message for "yarn logs" command. (Xuan Gong via
vinodkv)
YARN-771. AMRMClient support for resource blacklisting (Junping Du via
bikas)
YARN-1117. Improved help messages for "yarn application" and "yarn node"
commands. (Xuan Gong via vinodkv)
YARN-1120. Made ApplicationConstants.Environment.USER definition OS neutral
as the corresponding value is now set correctly end-to-end. (Chuan Liu via
vinodkv)
YARN-1124. Modified YARN CLI application list to display new and submitted
applications together with running apps by default, following up YARN-1074.
(Xuan Gong via vinodkv)
YARN-1065. NM should provide AuxillaryService data to the container (Xuan
Gong via bikas)
YARN-758. Augment MockNM to use multiple cores (Karthik Kambatla via
Sandy Ryza)
YARN-696. Changed RMWebservice apps call to take in multiple application
states. (Trevor Lorimer via vinodkv)
YARN-910. Augmented auxiliary services to listen for container starts and
completions in addition to application events. (Alejandro Abdelnur via
vinodkv)
YARN-1137. Add support whitelist for system users to Yarn
container-executor.c. (rvs via tucu)
YARN-1001. Added a web-service to get statistics about per application-type
per state for consumption by downstream projects. (Zhijie Shen via vinodkv)
YARN-1203. Changed YARN web-app proxy to handle http and https URLs from
AM registration and finish correctly. (Omkar Vinit Joshi via vinodkv)
YARN-1204. Added separate configuration properties for https for RM and NM
without which servers enabled with https will also start on http ports.
(Omkar Vinit Joshi via vinodkv)
OPTIMIZATIONS
BUG FIXES
YARN-948. Changed ResourceManager to validate the release container list
before actually releasing them. (Omkar Vinit Joshi via vinodkv)
YARN-966. Fixed ContainerLaunch to not fail quietly when there are no
localized resources due to some other failure. (Zhijie Shen via vinodkv)
YARN-502. Fixed a state machine issue with RMNode inside ResourceManager
which was crashing scheduler. (Mayank Bansal via vinodkv)
YARN-573. Shared data structures in Public Localizer and Private Localizer
are not Thread safe. (Omkar Vinit Joshi via jlowe)
YARN-903. Changed ContainerManager to suppress unnecessary warnings when
stopping already stopped containers. (Omkar Vinit Joshi via vinodkv)
YARN-906. Fixed a bug in NodeManager where cancelling ContainerLaunch at
KILLING state causes that the container to hang. (Zhijie Shen via vinodkv)
YARN-994. HeartBeat thread in AMRMClientAsync does not handle runtime
exception correctly (Xuan Gong via bikas)
YARN-337. RM handles killed application tracking URL poorly (jlowe)
YARN-107. Fixed ResourceManager and clients to better handle
forceKillApplication on non-running and finished applications. (Xuan Gong
via vinodkv)
YARN-643. Fixed ResourceManager to remove all tokens consistently on app
finish. (Xuan Gong via vinodkv)
YARN-1006. Fixed broken rendering in the Nodes list web page on the RM web
UI. (Xuan Gong via vinodkv)
YARN-881. Priority#compareTo method seems to be wrong. (Jian He via bikas)
YARN-1082. Create base directories on HDFS after RM login to ensure RM
recovery doesn't fail in secure mode. (vinodkv via acmurthy)
YARN-1085. Modified YARN and MR2 web-apps to do HTTP authentication in
secure setup with kerberos. (Omkar Vinit Joshi via vinodkv)
YARN-1094. Fixed a blocker with RM restart code because of which RM crashes
when try to recover an existing app. (vinodkv)
YARN-1008. MiniYARNCluster with multiple nodemanagers, all nodes have same
key for allocations. (tucu)
YARN-981. Fixed YARN webapp so that /logs servlet works like before. (Jian He
via vinodkv)
YARN-602. Fixed NodeManager to not let users override some mandatory
environmental variables. (Kenji Kikushima via vinodkv)
YARN-1101. Active nodes can be decremented below 0 (Robert Parker
via tgraves)
YARN-1077. Fixed TestContainerLaunch test failure on Windows. (Chuan Liu via
vinodkv)
YARN-957. Fixed a bug in CapacityScheduler because of which requests that
need more than a node's total capability were incorrectly allocated on that
node causing apps to hang. (Omkar Vinit Joshi via vinodkv)
YARN-1107. Fixed a bug in ResourceManager because of which RM in secure mode
fails to restart. (Omkar Vinit Joshi via vinodkv)
YARN-1049. ContainerExistStatus should define a status for preempted
containers. (tucu)
YARN-1144. Unmanaged AMs registering a tracking URI should not be
proxy-fied. (tucu)
YARN-1152. Fixed a bug in ResourceManager that was causing clients to get
invalid client token key errors when an appliation is about to finish.
(Jason Lowe via vinodkv)
YARN-292. Fixed FifoScheduler and FairScheduler to make their applications
data structures thread safe to avoid RM crashing with
ArrayIndexOutOfBoundsException. (Zhijie Shen via vinodkv)
YARN-1025. ResourceManager and NodeManager do not load native libraries on
Windows. (cnauroth)
YARN-1176. RM web services ClusterMetricsInfo total nodes doesn't include
unhealthy nodes (Jonathan Eagles via tgraves)
YARN-1078. TestNodeManagerResync, TestNodeManagerShutdown, and
TestNodeStatusUpdater fail on Windows. (Chuan Liu via cnauroth)
YARN-1194. TestContainerLogsPage fails with native builds (Roman Shaposhnik
via jlowe)
YARN-1116. Populate AMRMTokens back to AMRMTokenSecretManager after RM
restarts (Jian He via bikas)
YARN-1189. NMTokenSecretManagerInNM is not being told when applications
have finished (Omkar Vinit Joshi via jlowe)
YARN-540. Race condition causing RM to potentially relaunch already
unregistered AMs on RM restart (Jian He via bikas)
YARN-1184. ClassCastException during preemption enforcement. (cdouglas)
Release 2.1.0-beta - 2013-08-22
INCOMPATIBLE CHANGES
YARN-396. Rationalize AllocateResponse in RM Scheduler API. (Zhijie Shen
via hitesh)
YARN-439. Flatten NodeHeartbeatResponse. (Xuan Gong via sseth)
YARN-440. Flatten RegisterNodeManagerResponse. (Xuan Gong via sseth)
YARN-536. Removed the unused objects ContainerStatus and ContainerStatus from
Container which also don't belong to the container. (Xuan Gong via vinodkv)
YARN-486. Changed NM's startContainer API to accept Container record given by
RM as a direct parameter instead of as part of the ContainerLaunchContext
record. (Xuan Gong via vinodkv)
YARN-444. Moved special container exit codes from YarnConfiguration to API
where they belong. (Sandy Ryza via vinodkv)
YARN-441. Removed unused utility methods for collections from two API
records. (Xuan Gong via vinodkv)
YARN-561. Modified NodeManager to set key information into the environment
of every container that it launches. (Xuan Gong via vinodkv)
YARN-579. Stop setting the Application Token in the AppMaster env, in
favour of the copy present in the container token field.
(Vinod Kumar Vavilapalli via sseth)
YARN-629. Make YarnRemoteException not be rooted at IOException. (Xuan Gong
via vinodkv)
YARN-633. Changed RMAdminProtocol api to throw IOException and
YarnRemoteException. (Xuan Gong via vinodkv)
YARN-632. Changed ContainerManager api to throw IOException and
YarnRemoteException. (Xuan Gong via vinodkv)
YARN-631. Changed ClientRMProtocol api to throw IOException and
YarnRemoteException. (Xuan Gong via vinodkv)
YARN-630. Changed AMRMProtocol api to throw IOException and
YarnRemoteException. (Xuan Gong via vinodkv)
YARN-615. Rename ContainerLaunchContext.containerTokens to tokens.
(Vinod Kumar Vavilapalli via sseth)
YARN-571. Remove user from ContainerLaunchContext. (Omkar Vinit Joshi via
vinodkv)
YARN-716. Making ApplicationID immutable. (Siddharth Seth via vinodkv)
YARN-684. ContainerManager.startContainer should use
ContainerTokenIdentifier instead of the entire Container.
(Vinod Kumar Vavilapalli via sseth)
YARN-735. Make ApplicationAttemptId, ContaienrId and NodeId immutable.
(Jian He via sseth)
YARN-749. Rename ResourceRequest.(get,set)HostName to
ResourceRequest.(get,set)ResourceName. (acmurthy)
YARN-720. container-log4j.properties should not refer to mapreduce
property names. (Zhijie Shen via sseth)
YARN-748. Moved BuilderUtils from yarn-common to yarn-server-common for
eventual retirement. (Jian He via vinodkv)
YARN-635. Renamed YarnRemoteException to YarnException. (Siddharth Seth via
vinodkv)
YARN-755. Renamed AllocateResponse.reboot to AllocateResponse.resync. (Bikas
Saha via vinodkv)
YARN-753. Added individual factory methods for all api protocol records and
converted the records to be abstract classes. (Jian He via vinodkv)
YARN-724. Moved ProtoBase from api.records to api.records.impl.pb. (Jian He
via vinodkv)
YARN-759. Create Command enum in AllocateResponse (bikas)
YARN-777. Removed unreferenced objects from .proto files. (Jian He via
vinodkv)
YARN-642. Removed health parameter from ResourceManager /nodes web-service
and cleaned the behaviour of the status parameter. (Sandy Ryza vid vinodkv)
YARN-530. Defined Service model strictly, implemented AbstractService for
robust subclassing and migrated yarn-common services. (Steve Loughran via
vinodkv)
YARN-746. Renamed Service.register() and Service.unregister() to
registerServiceListener() & unregisterServiceListener() respectively.
(Steve Loughran via vinodkv)
YARN-792. Moved NodeHealthStatus from yarn.api.record to
yarn.server.api.record. (Jian He via vinodkv)
YARN-806. Moved ContainerExitStatus from yarn.api to yarn.api.records. (Jian
He via vinodkv)
YARN-821. Renamed setFinishApplicationStatus to setFinalApplicationStatus in
FinishApplicationMasterRequest for consistency. (Jian He via vinodkv)
YARN-787. Removed minimum resource from RegisterApplicationMasterResponse.
(tucu via acmurthy)
YARN-829. Renamed RMTokenSelector to be RMDelegationTokenSelector. (Zhijie
Shen via vinodkv)
YARN-828. Removed the unsed YarnVersionAnnotation. (Zhijie Shen via vinodkv)
YARN-823. Moved RMAdmin from yarn.client to yarn.client.cli and renamed it to
be RMAdminCLI. (Jian He via vinodkv)
YARN-387. Renamed YARN protocols for consistency.
ClientRMProtocol -> ApplicationClientProtocol
AMRMProtocol -> ApplicationMasterProtocol
ContainerManager -> ContainerManagementProtocol
(vinodkv via acmurthy)
YARN-831. Removed minimum resource from GetNewApplicationResponse as a
follow-up to YARN-787. (Jian He via acmurthy)
YARN-824. Added static factory methods to hadoop-yarn-client interfaces.
(Jian He via acmurthy)
YARN-826. Moved Clock and SystemClock into yarn.util package. (Zhijie Shen
via vinodkv)
YARN-837. Moved yarn.ClusterInfo into MapReduce project as it doesn't belong
to YARN. (Zhijie Shen via vinodkv)
YARN-822. Renamed ApplicationToken to be AMRMToken, and similarly the
corresponding TokenSelector and SecretManager. (Omkar Vinit Joshi via vinodkv)
YARN-610. ClientToken is no longer set in the environment of the Containers.
(Omkar Vinit Joshi via vinodkv)
YARN-834. Fixed annotations for yarn-client module, reorganized packages and
clearly differentiated *Async apis. (Arun C Murthy and Zhijie Shen via
vinodkv)
YARN-840. Moved ProtoUtils to yarn.api.records.pb.impl. (Jian He via
acmurthy)
YARN-841. Move Auxiliary service to yarn-api, annotate and document it.
(vinodkv)
YARN-850. Rename getClusterAvailableResources to getAvailableResources in
AMRMClients (Jian He via bikas)
YARN-694. Starting to use NMTokens to authenticate all communication with
NodeManagers. (Omkar Vinit Joshi via vinodkv)
YARN-553. Replaced YarnClient.getNewApplication with
YarnClient.createApplication which provides a directly usable
ApplicationSubmissionContext to simplify the api. (Karthik Kambatla via
acmurthy)
YARN-851. Share NMTokens using NMTokenCache (api-based) between AMRMClient
and NMClient instead of memory based approach which is used currently. (Omkar
Vinit Joshi via vinodkv)
YARN-869. Move ResourceManagerAdministrationProtocol out of main YARN api.
(vinodkv via acmurthy)
YARN-791. Changed RM APIs and web-services related to nodes to ensure that
both are consistent with each other. (Sandy Ryza via vinodkv)
YARN-727. ClientRMProtocol.getAllApplications should accept ApplicationType as
a parameter. (Xuan Gong via hitesh)
YARN-701. Use application tokens irrespective of secure or non-secure
mode. (vinodkv via acmurthy)
YARN-918. Remove ApplicationAttemptId from
RegisterApplicationMasterRequestProto. (vinodkv via acmurthy)
YARN-926. Modified ContainerManagerProtcol APIs to take in requests for
multiple containers. (Jian He via vinodkv)
NEW FEATURES
YARN-482. FS: Extend SchedulingMode to intermediate queues.
(kkambatl via tucu)
YARN-45. Add protocol for schedulers to request containers back from
ApplicationMasters. (Carlo Curino, cdouglas)
YARN-563. Add the concept of an application-type for each application.
(Mayank Bansal via vinodkv)
HADOOP-8562. Enhancements to support Hadoop on Windows Server and Windows
Azure environments. (See breakdown of tasks below for subtasks and
contributors)
YARN-422. Add a NM Client library to help application-writers. (Zhijie Shen
via vinodkv)
YARN-392. Make it possible to specify hard locality constraints in resource
requests. (sandyr via tucu)
YARN-326. Add multi-resource scheduling to the fair scheduler.
(sandyr via tucu)
YARN-398. Make it possible to specify hard locality constraints in resource
requests for CapacityScheduler. (acmurthy)
YARN-781. Exposing LOGDIR in all containers' environment which should be used
by containers for logging purposes. (Jian He via vinodkv)
IMPROVEMENTS
YARN-347. Node CLI should show CPU info besides memory in node status.
(Junping Du via llu)
YARN-365. Change NM heartbeat handling to not generate a scheduler event
on each heartbeat. (Xuan Gong via sseth)
YARN-380. Fix yarn node -status output to be better readable. (Omkar Vinit
Joshi via vinodkv)
YARN-410. Fixed RM UI so that the new lines diagnostics for a failed app on
the per-application page are translated to html line breaks. (Omkar Vinit
Joshi via vinodkv)
YARN-198. Added a link to RM pages from the NodeManager web app. (Jian He
via vinodkv)
YARN-237. Refreshing the RM page forgets how many rows I had in my
Datatables (jian he via bobby)
YARN-481. Add AM Host and RPC Port to ApplicationCLI Status Output
(Chris Riccomini via bikas)
YARN-297. Improve hashCode implementations for PB records. (Xuan Gong via
hitesh)
YARN-417. Create AMRMClient wrapper that provides asynchronous callbacks.
(Sandy Ryza via bikas)
YARN-497. Yarn unmanaged-am launcher jar does not define a main class in
its manifest (Hitesh Shah via bikas)
YARN-469. Make scheduling mode in FS pluggable. (kkambatl via tucu)
YARN-450. Define value for * in the scheduling protocol (Zhijie Shen via
bikas)
YARN-475. Remove a unused constant in the public API -
ApplicationConstants.AM_APP_ATTEMPT_ID_ENV. (Hitesh Shah via vinodkv)
YARN-309. Changed NodeManager to obtain heart-beat interval from the
ResourceManager. (Xuan Gong via vinodkv)
YARN-447. Move ApplicationComparator in CapacityScheduler to use comparator
in ApplicationId. (Nemon Lou via vinodkv)
YARN-381. Improve fair scheduler docs. (Sandy Ryza via tomwhite)
YARN-458. YARN daemon addresses must be placed in many different configs.
(sandyr via tucu)
YARN-193. Scheduler.normalizeRequest does not account for allocation
requests that exceed maximumAllocation limits (Zhijie Shen via bikas)
YARN-479. NM retry behavior for connection to RM should be similar for
lost heartbeats (Jian He via bikas)
YARN-495. Changed NM reboot behaviour to be a simple resync - kill all
containers and re-register with RM. (Jian He via vinodkv)
YARN-514. Delayed store operations should not result in RM unavailability
for app submission (Zhijie Shen via bikas)
YARN-586. Fixed a typo in ApplicationSubmissionContext#setApplicationId.
(Zhijie Shen via vinodkv)
YARN-542. Changed the default global AM max-attempts value to be not one.
(Zhijie Shen via vinodkv)
YARN-583. Moved application level local resources to be localized under the
filecache sub-directory under application directory. (Omkar Vinit Joshi via
vinodkv)
YARN-581. Added a test to verify that app delegation tokens are restored
after RM restart. (Jian He via vinodkv)
YARN-577. Add application-progress also to ApplicationReport. (Hitesh Shah
via vinodkv)
YARN-595. Refactor fair scheduler to use common Resources. (Sandy Ryza
via tomwhite)
YARN-562. Modified NM to reject any containers allocated by a previous
ResourceManager. (Jian He via vinodkv)
YARN-591. Moved RM recovery related records out of public API as they do not
belong there. (vinodkv)
YARN-599. Refactoring submitApplication in ClientRMService and RMAppManager
to separate out various validation checks depending on whether they rely on
RM configuration or not. (Zhijie Shen via vinodkv)
YARN-618. Modified RM_INVALID_IDENTIFIER to be -1 instead of zero. (Jian He
via vinodkv)
YARN-625. Move the utility method unwrapAndThrowException from
YarnRemoteExceptionPBImpl to RPCUtil. (Siddharth Seth via vinodkv)
YARN-645. Moved RMDelegationTokenSecretManager from yarn-server-common to
yarn-server-resourcemanager where it really belongs. (Jian He via vinodkv)
YARN-651. Changed PBClientImpls of ContainerManager and RMAdmin to throw
IOExceptions also. (Xuan Gong via vinodkv)
YARN-582. Changed ResourceManager to recover Application token and client
tokens for app attempt so that RM can be restarted while preserving current
applications. (Jian He via vinodkv)
YARN-568. Add support for work preserving preemption to the FairScheduler.
(Carlo Curino and Sandy Ryza via cdouglas)
YARN-598. Add virtual cores to queue metrics. (sandyr via tucu)
YARN-634. Modified YarnRemoteException to be not backed by PB and introduced
a separate SerializedException record. (Siddharth Seth via vinodkv)
YARN-663. Changed ResourceTracker API and LocalizationProtocol API to throw
YarnRemoteException and IOException. (Xuan Gong via vinodkv)
YARN-590. Added an optional mesage to be returned by ResourceMaanger when RM
asks an RM to shutdown/resync etc so that NMs can log this message locally
for better debuggability. (Mayank Bansal via vinodkv)
YARN-617. Made ContainerTokens to be used for validation at NodeManager
also in unsecure mode to prevent AMs from faking resource requirements in
unsecure mode. (Omkar Vinit Joshi via vinodkv)
YARN-708. Moved RecordFactory classes to hadoop-yarn-api, and put some
miscellaneous fixes to the interfaces. (Siddharth Seth via vinodkv)
YARN-711. Copied BuilderUtil methods in individual API records as
BuilderUtils is going to be dismantled. (Jian He via vinodkv)
YARN-714. Added NMTokens to be sent to AMs as part of heart-beat response.
(Omkar Vinit Joshi via vinodkv)
YARN-638. Modified ResourceManager to restore RMDelegationTokens after
restarting. (Jian He via vinodkv)
YARN-660. Improve AMRMClient with matching requests (bikas)
YARN-717. Put object creation factories for Token in the class itself and
remove useless derivations for specific tokens. (Jian He via vinodkv)
YARN-756. Move Preemption* records to yarn.api where they really belong.
(Jian He via vinodkv)
YARN-750. Allow for black-listing resources in YARN API and Impl in CS
(acmurthy via bikas)
YARN-877. Support resource blacklisting for FifoScheduler.
(Junping Du via llu)
YARN-686. Flatten NodeReport. (sandyr via tucu)
YARN-737. Throw some specific exceptions directly instead of wrapping them
in YarnException. (Jian He via sseth)
YARN-731. RPCUtil.unwrapAndThrowException should unwrap remote
RuntimeExceptions. (Zhijie Shen via sseth)
YARN-600. Hook up cgroups CPU settings to the number of virtual cores
allocated. (sandyr via tucu)
YARN-648. FS: Add documentation for pluggable policy. (kkambatl via tucu)
YARN-773. Moved YarnRuntimeException from package api.yarn to
api.yarn.exceptions. (Jian He via vinodkv)
YARN-692. Creating NMToken master key on RM and sharing it with NM as a part
of RM-NM heartbeat. (Omkar Vinit Joshi via vinodkv)
YARN-782. vcores-pcores ratio functions differently from vmem-pmem ratio in
misleading way. (sandyr via tucu)
YARN-803. factor out scheduler config validation from the ResourceManager
to each scheduler implementation. (tucu)
YARN-789. Enable zero capabilities resource requests in fair scheduler.
(tucu)
YARN-639. Modified Distributed Shell application to start using the new
NMClient library. (Zhijie Shen via vinodkv)
YARN-693. Modified RM to send NMTokens on allocate call so that AMs can then
use them for authentication with NMs. (Omkar Vinit Joshi via vinodkv)
YARN-752. In AMRMClient, automatically add corresponding rack requests for
requested nodes. (sandyr via tucu)
YARN-825. Fixed javadoc and annotations for yarn-common module. (vinodkv)
YARN-833. Moved Graph and VisualizeStateMachine into yarn.state package.
(Zhijie Shen via vinodkv)
YARN-805. Fix javadoc and annotations on classes in the yarn-api
package. (Jian He via sseth)
YARN-846. Move pb Impl classes from yarn-api to yarn-common. (Jian He via
vinodkv)
YARN-827. Need to make Resource arithmetic methods accessible (Jian He via
bikas)
YARN-866. Add test for class ResourceWeights. (ywskycn via tucu)
YARN-736. Add a multi-resource fair sharing metric. (sandyr via tucu)
YARN-883. Expose Fair Scheduler-specific queue metrics. (sandyr via tucu)
YARN-569. Add support for requesting and enforcing preemption requests via
a capacity monitor. (Carlo Curino, cdouglas)
YARN-521. Augment AM - RM client module to be able to request containers
only at specific locations (Sandy Ryza via bikas)
YARN-513. Create common proxy client for communicating with RM. (Xuan Gong
& Jian He via bikas)
YARN-927. Change ContainerRequest to not have more than 1 container count
and remove StoreContainerRequest (bikas)
YARN-922. Change FileSystemRMStateStore to use directories (Jian He via
bikas)
YARN-865. RM webservices can't query based on application Types. (Xuan Gong
via hitesh)
YARN-912. Move client facing exceptions to yarn-api module. (Mayank Bansal
via vinodkv)
YARN-84. Use Builder to build RPC server. (Brandon Li via szetszwo)
YARN-1046. Disable mem monitoring by default in MiniYARNCluster. (Karthik
Kambatla via Sandy Ryza)
YARN-1045. Improve toString implementation for PBImpls. (Jian He via sseth)
OPTIMIZATIONS
YARN-512. Log aggregation root directory check is more expensive than it
needs to be. (Maysam Yabandeh via jlowe)
YARN-719. Move RMIdentifier from Container to ContainerTokenIdentifier.
(Vinod Kumar Vavilapalli via sseth)
BUG FIXES
YARN-383. AMRMClientImpl should handle null rmClient in stop()
(Hitesh Shah via sseth)
YARN-385. Add missing fields - location and #containers to
ResourceRequestPBImpl's toString(). (Sandy Ryza via sseth)
YARN-377. Use the new StringUtils methods added by HADOOP-9252 and fix
TestContainersMonitor. (Chris Nauroth via szetszwo)
YARN-391. Formatting fixes for LCEResourceHandler classes.
(Steve Loughran via sseth)
YARN-390. ApplicationCLI and NodeCLI hard-coded platform-specific line
separator causes test failures on Windows. (Chris Nauroth via suresh)
YARN-406. Fix TestRackResolver to function in networks where "host1"
resolves to a valid host. (Hitesh Shah via sseth)
YARN-376. Fixes a bug which would prevent the NM knowing about completed
containers and applications. (Jason Lowe via sseth)
YARN-196. Nodemanager should be more robust in handling connection failure
to ResourceManager when a cluster is started (Xuan Gong via hitesh)
YARN-485. TestProcfsProcessTree#testProcessTree() doesn't wait long enough
for the process to die. (kkambatl via tucu)
YARN-71. Fix the NodeManager to clean up local-dirs on restart.
(Xuan Gong via sseth)
YARN-378. Fix RM to make the AM max attempts/retries to be configurable
per application by clients. (Zhijie Shen via vinodkv)
YARN-498. Unmanaged AM launcher does not set various constants in env for
an AM, also does not handle failed AMs properly. (Hitesh Shah via bikas)
YARN-496. Fair scheduler configs are refreshed inconsistently in
reinitialize. (Sandy Ryza via tomwhite)
YARN-474. Fix CapacityScheduler to trigger application-activation when
am-resource-percent configuration is refreshed. (Zhijie Shen via vinodkv)
YARN-209. Fix CapacityScheduler to trigger application-activation when
the cluster capacity changes. (Zhijie Shen via vinodkv)
YARN-24. Nodemanager fails to start if log aggregation enabled and
namenode unavailable. (sandyr via tucu)
YARN-515. Node Manager not getting the master key. (Robert Joseph Evans
via jlowe)
YARN-382. SchedulerUtils improve way normalizeRequest sets the resource
capabilities. (Zhijie Shen via bikas)
YARN-467. Modify public distributed cache to localize files such that no
local directory hits unix file count limits and thus prevent job failures.
(Omkar Vinit Joshi via vinodkv)
YARN-101. Fix NodeManager heartbeat processing to not lose track of completed
containers in case of dropped heartbeats. (Xuan Gong via vinodkv)
YARN-538. RM address DNS lookup can cause unnecessary slowness on every JHS
page load. (sandyr via tucu)
YARN-532. Change RMAdmin and Localization client protocol PB implementations
to implement closeable so that they can be stopped when needed via
RPC.stopProxy(). (Siddharth Seth via vinodkv)
YARN-99. Modify private distributed cache to localize files such that no
local directory hits unix file count limits and thus prevent job failures.
(Omkar Vinit Joshi via vinodkv)
YARN-112. Fixed a race condition during localization that fails containers.
(Omkar Vinit Joshi via vinodkv)
YARN-534. Change RM restart recovery to also account for AM max-attempts
configuration after the restart. (Jian He via vinodkv)
YARN-539. Addressed memory leak of LocalResource objects NM when a resource
localization fails. (Omkar Vinit Joshi via vinodkv)
YARN-319. Submitting a job to a fair scheduler queue for which the user
does not have permission causes the client to wait forever.
(shenhong via tomwhite)
YARN-412. Fixed FifoScheduler to check hostname of a NodeManager rather
than its host:port during scheduling which caused incorrect locality for
containers. (Roger Hoover via acmurthy)
YARN-500. Fixed YARN webapps to not roll-over ports when explicitly asked
to use non-ephemeral ports. (Kenji Kikushima via vinodkv)
YARN-518. Fair Scheduler's document link could be added to the hadoop 2.x
main doc page. (sandyr via tucu)
YARN-476. ProcfsBasedProcessTree info message confuses users.
(sandyr via tucu)
YARN-585. Fix failure in TestFairScheduler#testNotAllowSubmitApplication
caused by YARN-514. (Zhijie Shen via vinodkv)
YARN-547. Fixed race conditions in public and private resource localization
which used to cause duplicate downloads. (Omkar Vinit Joshi via vinodkv)
YARN-594. Update test and add comments in YARN-534 (Jian He via bikas)
YARN-549. YarnClient.submitApplication should wait for application to be
accepted by the RM (Zhijie Shen via bikas)
YARN-605. Fix failing unit test in TestNMWebServices when versionInfo has
parantheses like when running on a git checkout. (Hitesh Shah via vinodkv)
YARN-289. Fair scheduler allows reservations that won't fit on node.
(Sandy Ryza via tomwhite)
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisy
minimum resource requirements. (Kenji Kikushima via vinodkv)
YARN-646. Fix two typos in Fair Scheduler user guide. (Dapeng Sun via atm)
YARN-507. Add interface visibility and stability annotations to FS
interfaces/classes. (kkambatl via tucu)
YARN-637. FS: maxAssign is not honored. (kkambatl via tucu)
YARN-655. Fair scheduler metrics should subtract allocated memory from
available memory. (sandyr via tucu)
YARN-628. Fix the way YarnRemoteException is being unrolled to extract out
the underlying exception. (Siddharth Seth via vinodkv)
YARN-695. Remove masterContainer and status unused fields from
ApplicationReportProto and fix bugs in ApplicationReportPBImpl. (Zhijie Shen
via vinodkv)
YARN-706. Fixed race conditions in TestFSDownload. (Zhijie Shen via vinodkv).
YARN-715. Fixed unit test failures - TestDistributedShell and
TestUnmanagedAMLauncher. (Vinod Kumar Vavilapalli via sseth)
YARN-578. Fixed NM to use SecureIOUtils for reading and aggregating logs.
(Omkar Vinit Joshi via vinodkv)
YARN-733. Fixed TestNMClient from failing occasionally. (Zhijie Shen via
vinodkv)
YARN-730. Fix NMClientAsync to remove completed containers. (Zhijie Shen
via acmurthy)
YARN-726. Fix queue & finish time fields in web-ui for ResourceManager.