| Performance Notes |
| ================= |
| The Python driver for Cassandra offers several methods for executing queries. |
| You can synchronously block for queries to complete using |
| :meth:`.Session.execute()`, you can obtain asynchronous request futures through |
| :meth:`.Session.execute_async()`, and you can attach a callback to the future |
| with :meth:`.ResponseFuture.add_callback()`. |
| |
| Examples of multiple request patterns can be found in the benchmark scripts included in the driver project. |
| |
| The choice of execution pattern will depend on the application context. For applications dealing with multiple |
| requests in a given context, the recommended pattern is to use concurrent asynchronous |
| requests with callbacks. For many use cases, you don't need to implement this pattern yourself. |
| :meth:`cassandra.concurrent.execute_concurrent` and :meth:`cassandra.concurrent.execute_concurrent_with_args` |
| provide this pattern with a synchronous API and tunable concurrency. |
| |
| Due to the GIL and limited concurrency, the driver can become CPU-bound pretty quickly. The sections below |
| discuss further runtime and design considerations for mitigating this limitation. |
| |
| PyPy |
| ---- |
| `PyPy <http://pypy.org>`_ is an alternative Python runtime which uses a JIT compiler to |
| reduce CPU consumption. This leads to a huge improvement in the driver performance, |
| more than doubling throughput for many workloads. |
| |
| Cython Extensions |
| ----------------- |
| `Cython <http://cython.org/>`_ is an optimizing compiler and language that can be used to compile the core files and |
| optional extensions for the driver. Cython is not a strict dependency, but the extensions will be built by default. |
| |
| See :doc:`installation` for details on controlling this build. |
| |
| multiprocessing |
| --------------- |
| All of the patterns discussed above may be used over multiple processes using the |
| `multiprocessing <http://docs.python.org/2/library/multiprocessing.html>`_ |
| module. Multiple processes will scale better than multiple threads, so if high throughput is your goal, |
| consider this option. |
| |
| Be sure to **never share any** :class:`~.Cluster`, :class:`~.Session`, |
| **or** :class:`~.ResponseFuture` **objects across multiple processes**. These |
| objects should all be created after forking the process, not before. |
| |
| For further discussion and simple examples using the driver with ``multiprocessing``, |
| see `this blog post <http://www.datastax.com/dev/blog/datastax-python-driver-multiprocessing-example-for-improved-bulk-data-throughput>`_. |