commit    3f9a1a5222a3d0acd828e6d08997f01c1abfa42e
author    Wing Yew Poon <wypoon@cloudera.com>    Mon Mar 16 17:46:29 2020 +0100
committer Marco Gaido <mgaido@apache.org>       Mon Mar 16 17:46:29 2020 +0100
tree      b2dbfcb79433d70c0e27c65fff34d35cfe7b0326
parent    06a8d4f67c910d0cd8327b1547134d641d793acb
[LIVY-752][THRIFT] Fix implementation of limits on connections.

## What changes were proposed in this pull request?

`LivyThriftSessionManager` keeps a `ConcurrentHashMap[String, AtomicLong]` named `connectionsCount` to track the number of connections per user, etc. The `incrementConnectionsCount` and `decrementConnectionsCount` methods in `LivyThriftSessionManager` check that `connectionsCount` does *not* contain a key (instead of checking that it contains the key) before getting the value and incrementing or decrementing the count, leading to a `NullPointerException`. Even setting aside the inverted condition, the methods do not use the `ConcurrentHashMap` correctly. There is a race: a thread can read a count, find that it is within a limit, create a new session, and then increment the count, while in the meantime another thread has incremented the count, so the limit is actually exceeded. Instead, we increment all relevant counts optimistically before creating a new session, check whether any limit is violated, and if so, decrement all the incremented counts.

## How was this patch tested?

Tested by deploying the change on a cluster and setting `livy.server.thrift.limit.connections.per.user`. Verified that the number of connections reaches but does not exceed the limit. Also added basic unit tests that connection limits are enforced.

Author: Wing Yew Poon <wypoon@cloudera.com>

Closes #284 from wypoon/LIVY-752.
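The increment-first, check, then roll-back pattern the commit describes can be sketched as follows. This is a minimal, self-contained illustration of the technique, not Livy's actual code; the class name `ConnectionLimiter` and its methods are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class ConnectionLimiter {
    private final ConcurrentHashMap<String, AtomicLong> connectionsCount =
        new ConcurrentHashMap<>();
    private final long limitPerUser;

    ConnectionLimiter(long limitPerUser) {
        this.limitPerUser = limitPerUser;
    }

    /**
     * Optimistically increment the user's count first, then check the limit.
     * If the limit is now exceeded, roll the increment back and reject.
     * This avoids the check-then-act race of "read count, compare, then
     * increment", where two threads can both pass the check.
     */
    boolean tryAcquire(String user) {
        long newCount = connectionsCount
            .computeIfAbsent(user, k -> new AtomicLong(0))
            .incrementAndGet();
        if (limitPerUser > 0 && newCount > limitPerUser) {
            connectionsCount.get(user).decrementAndGet(); // roll back
            return false;
        }
        return true;
    }

    /** Release one connection held by the user, if any were counted. */
    void release(String user) {
        AtomicLong count = connectionsCount.get(user);
        if (count != null) {
            count.decrementAndGet();
        }
    }
}
```

Because the count is incremented before the limit check, the worst case is a transient over-count that is immediately rolled back, rather than an admitted connection over the limit.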
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in Apache Hadoop YARN.
Pull requests are welcome! But before you begin, please check out the Contributing section on the Community page of our website.
Guides and documentation on getting started using Livy, example code snippets, and Livy API documentation can be found at livy.incubator.apache.org.
To build Livy, you will need:

Debian/Ubuntu:
  * mvn (from `maven` package or maven3 tarball)

Redhat/CentOS:
  * mvn (from `maven` package or maven3 tarball)

MacOS:

Required python packages for building Livy:
To run Livy, you will also need a Spark installation. You can get Spark releases at https://spark.apache.org/downloads.html.
Livy requires Spark 2.2+. You can switch to a different version of Spark by setting the SPARK_HOME
environment variable in the Livy server process, without needing to rebuild Livy.
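For example (the Spark path below is illustrative; point it at whatever Spark installation you want the server to use):

```shell
# Select the Spark installation Livy should run against.
# /opt/spark-2.4.5 is a hypothetical example path.
export SPARK_HOME=/opt/spark-2.4.5
echo "Livy will use Spark at $SPARK_HOME"

# Then start (or restart) the Livy server so it picks up the new value:
#   $LIVY_HOME/bin/livy-server start
```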
Livy is built using Apache Maven. To check out and build Livy, run:
```
git clone https://github.com/apache/incubator-livy.git
cd incubator-livy
mvn package
```
By default Livy is built against Apache Spark 2.2.0, but the version of Spark used when running Livy does not need to match the version used to build it; Livy internally handles the differences between Spark versions. The Livy package itself does not bundle a Spark distribution, and it will work with any supported version of Spark without needing to be rebuilt.