 Performance Notes
 =================
 The Python driver for Cassandra offers several methods for executing queries.
 You can synchronously block for queries to complete using
 :meth:`.Session.execute()`, you can obtain asynchronous request futures through
 :meth:`.Session.execute_async()`, and you can attach a callback to the future
 with :meth:`.ResponseFuture.add_callback()`.
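
 As a rough illustration, the three styles above might be wrapped as follows. This is a
 minimal sketch rather than code from the driver project: the function names are hypothetical,
 and ``session`` is assumed to be a connected :class:`~.Session`.

 .. code-block:: python

     def run_blocking(session, query):
         """Block until the query completes, then return its rows as a list."""
         return list(session.execute(query))


     def run_with_future(session, query):
         """Start the query, leave room for other work, then wait for the rows."""
         future = session.execute_async(query)
         # ... unrelated work could run here while the request is in flight ...
         return list(future.result())


     def run_with_callbacks(session, query, on_rows, on_error):
         """Start the query and let the driver call a handler when it finishes."""
         future = session.execute_async(query)
         future.add_callback(on_rows)   # called with the result on success
         future.add_errback(on_error)   # called with the exception on failure
         return future

 Callbacks are invoked from the driver's own thread rather than the caller's, so handlers
 passed to ``run_with_callbacks`` should be short and non-blocking.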

 Examples of multiple request patterns can be found in the benchmark scripts included in the driver project.

 The choice of execution pattern depends on the application context. For applications that issue multiple
 requests at a time, the recommended pattern is to use concurrent asynchronous
 requests with callbacks. For many use cases, you don't need to implement this pattern yourself:
 :func:`cassandra.concurrent.execute_concurrent` and :func:`cassandra.concurrent.execute_concurrent_with_args`
 provide it behind a synchronous API with tunable concurrency.
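
 The sketch below shows one way ``execute_concurrent_with_args`` might be used for a bulk insert.
 The ``users`` table, the function names, and the concurrency value of 50 are assumptions made
 for illustration; ``session`` is assumed to be a connected :class:`~.Session`.

 .. code-block:: python

     def insert_users(session, users, concurrency=50):
         """Insert (id, name) pairs concurrently and return the raw results."""
         # Imported here so summarize() below can be used without the driver.
         from cassandra.concurrent import execute_concurrent_with_args

         # Hypothetical table; substitute a statement for your own schema.
         insert = session.prepare("INSERT INTO users (id, name) VALUES (?, ?)")
         # raise_on_first_error=False collects failures instead of raising.
         return execute_concurrent_with_args(
             session, insert, users,
             concurrency=concurrency, raise_on_first_error=False)


     def summarize(results):
         """Split (success, result_or_exc) pairs into a success count and errors."""
         errors = [outcome for success, outcome in results if not success]
         return len(results) - len(errors), errors

 Each result is a ``(success, result_or_exc)`` pair, so ``summarize(insert_users(session, rows))``
 would report how many inserts succeeded along with any exceptions that were raised.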

 Because the GIL limits how much work can run concurrently in a single process, the driver can become
 CPU-bound quickly. The sections below discuss further runtime and design considerations for
 mitigating this limitation.

 PyPy
 ----
 `PyPy <http://pypy.org>`_ is an alternative Python runtime that uses a JIT compiler to
 reduce CPU consumption. This substantially improves driver performance,
 more than doubling throughput for many workloads.

 Cython Extensions
 -----------------
 `Cython <http://cython.org/>`_ is an optimizing compiler and language that can be used to compile the core files and
 optional extensions for the driver. Cython is not a strict dependency, but the extensions will be built by default.

 See :doc:`installation` for details on controlling this build.

 multiprocessing
 ---------------
 All of the patterns discussed above may be used over multiple processes using the
 `multiprocessing <http://docs.python.org/2/library/multiprocessing.html>`_
 module. Each process has its own interpreter and GIL, so multiple processes scale better than
 multiple threads. If high throughput is your goal, consider this option.

 Be sure to **never share any** :class:`~.Cluster`, :class:`~.Session`,
 **or** :class:`~.ResponseFuture` **objects across multiple processes**. These
 objects should all be created after forking the process, not before.
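
 One way to follow this rule is to connect inside a pool initializer, which runs in each worker
 after it has been started. The following is a minimal sketch under assumed names: the ``demo``
 keyspace, the ``users`` table, the contact point, and all function names are hypothetical.

 .. code-block:: python

     import multiprocessing

     _session = None  # one Session per worker process, set by init_worker()


     def init_worker(contact_points, keyspace):
         """Runs inside each worker, so the Cluster is created after the fork."""
         global _session
         from cassandra.cluster import Cluster
         _session = Cluster(list(contact_points)).connect(keyspace)


     def fetch_user(user_id, session=None):
         """Look up one user name with this process's own session."""
         session = session if session is not None else _session
         # Hypothetical table and columns.
         row = session.execute(
             "SELECT name FROM users WHERE id = %s", (user_id,)).one()
         return row.name if row is not None else None


     def fetch_users(user_ids, contact_points=("127.0.0.1",),
                     keyspace="demo", processes=4):
         """Spread lookups over worker processes; only plain values cross back."""
         with multiprocessing.Pool(processes, initializer=init_worker,
                                   initargs=(contact_points, keyspace)) as pool:
             return pool.map(fetch_user, user_ids)

 Only picklable values (here, user IDs and names) pass between the parent and the workers;
 no driver object ever crosses a process boundary.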

 For further discussion and simple examples using the driver with ``multiprocessing``,
 see `this blog post <http://www.datastax.com/dev/blog/datastax-python-driver-multiprocessing-example-for-improved-bulk-data-throughput>`_.