[REEF-1482] Driver does not exit even if all the tasks exit normally

Currently, even when all the tasks and all the evaluators have completed,
the driver sometimes doesn't shut down and hangs forever.
This happens intermittently. With many nodes, e.g. 500 nodes
in IMRU runs on a YARN cluster, the issue can occur once every 2 or 3 runs.

The investigation shows there is a potential deadlock in ResourceManagerStatus.
This PR resolves the issue by reducing the scope of the code that runs under locks.
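
To illustrate the general idea of reducing lock scope, here is a minimal Java sketch. It is not the actual ResourceManagerStatus code or the change made in this PR: the class `NarrowLockStatus`, the `IdleListener` interface, and all field names are hypothetical. The sketch updates state while holding the lock but invokes the external callback only after releasing it, so the callback can never block on, or be blocked by, a thread that needs the same lock.

```java
/**
 * Hypothetical sketch of narrowing lock scope.
 * Not the real ResourceManagerStatus; all names here are assumptions.
 */
public class NarrowLockStatus {

    /** Hypothetical listener notified once when no requests remain. */
    public interface IdleListener {
        void onIdle();
    }

    private final Object lock = new Object();
    private final IdleListener listener;
    private int outstandingRequests;
    private boolean idleReported;

    public NarrowLockStatus(IdleListener listener, int outstandingRequests) {
        this.listener = listener;
        this.outstandingRequests = outstandingRequests;
    }

    /** Records one completed request and reports idleness at most once. */
    public void onRequestCompleted() {
        boolean shouldNotify = false;

        // Only the state update runs under the lock.
        synchronized (lock) {
            if (outstandingRequests > 0) {
                outstandingRequests--;
            }
            if (outstandingRequests == 0 && !idleReported) {
                idleReported = true;
                shouldNotify = true;
            }
        }

        // The callback runs outside the lock. If it takes other locks or
        // calls back into this object, it cannot form a lock cycle here.
        if (shouldNotify) {
            listener.onIdle();
        }
    }

    /** Returns true when no requests are outstanding. */
    public boolean isIdle() {
        synchronized (lock) {
            return outstandingRequests == 0;
        }
    }
}
```

The decision to notify is made under the lock and carried out through a local flag, which keeps the "report idle exactly once" guarantee while leaving the critical section free of calls into other components.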

With the fix, I ran the job 10 times on a 500-node cluster and the issue no longer reproduces.

JIRA:
  [REEF-1482](https://issues.apache.org/jira/browse/REEF-1482)

Pull request:
  This closes #1162
2 files changed
tree: ef7476fe787cfb8e09508c2fdd20b0654b6498b9
  1. bin/
  2. dev/
  3. lang/
  4. website/
  5. .gitattributes
  6. .gitignore
  7. .travis.yml
  8. appveyor.yml
  9. Doxyfile
  10. HEADER
  11. LICENSE
  12. NOTICE
  13. pom.xml
  14. README.md
README.md

Apache REEF™

Apache REEF™ (Retainable Evaluator Execution Framework) is a library for developing portable applications for cluster resource managers such as Apache Hadoop YARN or Apache Mesos. For example, Microsoft Azure Stream Analytics is built on REEF and Hadoop.

Online Documentation

Detailed information on REEF can be found in the following places:

The developer mailing list is the best way to reach REEF's developers when the above aren't sufficient.

Build Status

| Component | OS | Status |
| --- | --- | --- |
| REEF Java | Ubuntu | Build Status |
| REEF.NET | Windows | Build status |

Building REEF

| | Java | .NET |
| --- | --- | --- |
| Build & run unit tests | java\BUILD.md | cs\BUILD.md |

Releases

downloads NuGet package

License

Apache 2.0