(TWILL-63) Speed up application launch time

The general approach is to manage jar files better and to cache and reuse jar files created through
class dependency tracing. The changes are broken down as follows:

1. Refactor jar generation
  - One jar containing the TwillLauncher (launcher.jar), created through dependency tracing.
    - This jar is the same for all applications.
  - One jar containing all twill classes (twill.jar), created through dependency tracing.
    - This jar is the same for all applications.
  - One jar containing the application class, created through dependency tracing.
    - This jar is generated based on the application being launched. It is reusable when launching the same app multiple times.
  - One jar containing user resources set up through TwillPreparer.
    - This jar is not reused between apps.
  - One jar containing the runtime config needed by Twill
    - logback.xml, JVM opts, environment, classpaths, etc.
2. Let YARN expand jars when localizing them to containers instead of expanding them programmatically
  - This saves jar expansion time when multiple containers are running on the same host
3. Introduce a new configuration "twill.location.cache.dir" to enable jar caching and reuse
  - Currently only launcher.jar, twill.jar and the application jar are cached and reused when possible
  - Cache cleanup logic is also in place to remove files in the cache directory that are no longer used by any application
4. The ApplicationBundler is improved to allow more flexible usage
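
The cache-and-reuse idea in item 3 can be illustrated with a minimal, standalone sketch. This is not Twill's implementation: the class name JarCacheSketch, the JarBuilder interface, and the scheme of naming a cached file after a hash of the traced class names are all assumptions made for illustration, and the sketch uses only the JDK.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; names and naming scheme are not taken from Twill.
final class JarCacheSketch {

  /** Callback that writes the jar content to the given path (assumed interface). */
  interface JarBuilder {
    void build(Path target) throws IOException;
  }

  /** Derives a stable file name from the set of traced class names, independent of order. */
  static String cacheName(String prefix, Set<String> classNames) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      for (String name : new TreeSet<>(classNames)) {   // sort so ordering does not matter
        md.update(name.getBytes(StandardCharsets.UTF_8));
        md.update((byte) 0);                             // separator between names
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : md.digest()) {
        hex.append(String.format("%02x", b));
      }
      return prefix + "." + hex.substring(0, 16) + ".jar";
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);                // SHA-256 is required on every JVM
    }
  }

  /** Returns the cached file if it exists; otherwise builds it once and moves it into place. */
  static Path getOrCreate(Path cacheDir, String name, JarBuilder builder) throws IOException {
    Files.createDirectories(cacheDir);
    Path target = cacheDir.resolve(name);
    if (Files.exists(target)) {
      return target;                                     // cache hit: skip the expensive build
    }
    Path tmp = Files.createTempFile(cacheDir, name, ".tmp");
    builder.build(tmp);
    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    return target;
  }

  /** Deletes every file in the cache directory whose name is not in the in-use set. */
  static void cleanup(Path cacheDir, Set<String> inUse) throws IOException {
    List<Path> entries;
    try (Stream<Path> files = Files.list(cacheDir)) {
      entries = files.collect(Collectors.toList());
    }
    for (Path p : entries) {
      if (!inUse.contains(p.getFileName().toString())) {
        Files.deleteIfExists(p);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path cacheDir = Files.createTempDirectory("jar-cache");
    int[] builds = {0};
    JarBuilder builder = target -> {
      builds[0]++;                                       // count how often the jar is rebuilt
      Files.write(target, new byte[] {0x50, 0x4b});      // placeholder bytes, not a real jar
    };
    String name = cacheName("twill", Set.of("org.example.A", "org.example.B"));
    getOrCreate(cacheDir, name, builder);
    getOrCreate(cacheDir, name, builder);                // second launch reuses the file
    System.out.println("builds=" + builds[0]);
    cleanup(cacheDir, Set.of());                         // nothing in use: cache is emptied
  }
}
```

In this sketch the second getOrCreate call finds the file from the first call and skips the build, which is the saving the cache directory is meant to provide on repeated launches.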

This closes #21 on GitHub.

Signed-off-by: Terence Yim <chtyim@apache.org>
30 files changed
tree: 4bb6c976f09f60bee227522c75b67b44169ebee6
  1. twill-api/
  2. twill-common/
  3. twill-core/
  4. twill-discovery-api/
  5. twill-discovery-core/
  6. twill-examples/
  7. twill-ext/
  8. twill-java8-test/
  9. twill-yarn/
  10. twill-zookeeper/
  11. .gitignore
  12. .reviewboardrc
  13. .travis.yml
  14. checkstyle.xml
  15. LICENSE
  16. NOTICE
  17. pom.xml
  18. README.md
README.md

What is Apache Twill?

Twill is an abstraction over Apache Hadoop® YARN that reduces the complexity of developing distributed applications, allowing developers to focus more on their business logic. Twill allows you to use YARN’s distributed capabilities with a programming model that is similar to running threads.

Getting Started

You can build and install Apache Twill with:

    git clone https://git-wip-us.apache.org/repos/asf/twill.git
    cd twill
    mvn install

After the Maven installation completes, you can include the artifact org.apache.twill:twill-yarn as a dependency in your other projects.

Export Control

This distribution includes cryptographic software. The country in which you currently reside may have restrictions on the import, possession, use, and/or re-export to another country, of encryption software. BEFORE using any encryption software, please check your country's laws, regulations and policies concerning the import, possession, or use, and re-export of encryption software, to see if this is permitted. See http://www.wassenaar.org/ for more information.

The U.S. Government Department of Commerce, Bureau of Industry and Security (BIS), has classified this software as Export Commodity Control Number (ECCN) 5D002.C.1, which includes information security software using or performing cryptographic functions with asymmetric algorithms. The form and manner of this Apache Software Foundation distribution makes it eligible for export under the License Exception ENC Technology Software Unrestricted (TSU) exception (see the BIS Export Administration Regulations, Section 740.13) for both object code and source code.

The following provides more details on the included cryptographic software:

Apache Twill uses the built-in Java cryptography libraries for unique ID generation. See http://www.oracle.com/us/products/export/export-regulations-345813.html for more details on Java's cryptography features.