predictionio Package Documentation
====================================
.. automodule:: predictionio
The SDK comprises two clients:

1. EventClient, for importing data into the PredictionIO platform.
2. EngineClient, for querying a PredictionIO engine instance: submitting
   queries and extracting prediction results.
The SDK also provides a FileExporter for you to write events to a JSON file
in the same way as EventClient. The JSON file can be used by "pio import"
for batch data import.
Please read `PredictionIO Event API <http://docs.prediction.io/datacollection/eventapi/>`_ for an explanation of
how the SDK can be used to import events.
predictionio.EventClient Class
------------------------------
.. autoclass:: EventClient
   :members:

.. note::

   The "threads" parameter specifies the number of connection threads to
   the PredictionIO server. The minimum is 1. The client object will spawn
   the specified number of threads; each of them establishes a
   connection with the PredictionIO server and handles requests
   concurrently.

.. note::

   If you ONLY use `blocking request methods`,
   setting "threads" to 1 is enough (a higher number will not improve
   anything, since every request blocks). However, if you want
   to take full advantage of
   `asynchronous request methods`, you should
   specify a larger number for "threads" to increase the performance of
   handling concurrent requests (although setting "threads" to 1 will still
   work). The optimal setting depends on your system and application
   requirements.
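To illustrate the idea behind the "threads" parameter, here is a minimal
pure-Python sketch of a connection-thread pool: N worker threads consume a
shared request queue concurrently. This is a hypothetical stand-in, not the
SDK's actual implementation; ``MiniPool``, ``submit``, and the doubling
"work" are made up for illustration::

```python
import queue
import threading

# Hypothetical stand-in for the client's connection-thread pool (not the
# SDK's actual implementation): "threads" workers consume a shared
# request queue, each simulating one connection to the server.
class MiniPool:
    def __init__(self, threads=1):
        self.q = queue.Queue()
        self.results = []
        self.lock = threading.Lock()
        self.workers = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(max(1, threads))
        ]
        for w in self.workers:
            w.start()

    def _work(self):
        while True:
            job = self.q.get()
            if job is None:          # sentinel: shut this worker down
                self.q.task_done()
                return
            with self.lock:
                self.results.append(job * 2)  # pretend to "send" a request
            self.q.task_done()

    def submit(self, job):
        # enqueue without blocking the caller; a worker picks it up
        self.q.put(job)

    def close(self):
        # one sentinel per worker, then wait for all of them to finish
        for _ in self.workers:
            self.q.put(None)
        for w in self.workers:
            w.join()

pool = MiniPool(threads=3)
for i in range(10):
    pool.submit(i)
pool.close()
print(sorted(pool.results))  # all 10 jobs handled by 3 workers
```

With a single worker the same code still completes; extra workers only help
when many requests are in flight at once, which is why "threads" matters
mainly for the asynchronous methods.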
predictionio.EngineClient Class
-------------------------------
.. autoclass:: EngineClient
   :members:

predictionio.AsyncRequest Class
-------------------------------
.. autoclass:: AsyncRequest
   :members:
predictionio.FileExporter Class
-------------------------------
.. autoclass:: FileExporter
   :members:
predictionio SDK Usage Notes
----------------------------
Asynchronous Requests
^^^^^^^^^^^^^^^^^^^^^
In addition to the normal `blocking (synchronous) request methods`,
this SDK also provides `non-blocking (asynchronous) request methods`.
All methods
prefixed with 'a' are asynchronous (e.g., :meth:`~EventClient.aset_user`,
:meth:`~EventClient.aset_item`). Asynchronous requests are handled by separate
threads in the background, so you can generate multiple requests at the same
time without waiting for any of them to finish. These methods return
immediately without waiting for results, allowing your code to proceed with
other work. The idea is to break a normal blocking request (such as
:meth:`~EventClient.set_user`) into two steps:

1. generate the request (e.g., by calling :meth:`~EngineClient.asend_query`);
2. get the request's response by calling :meth:`~AsyncRequest.get_response`.

This allows you to do other work between these two steps.
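The two-step pattern can be sketched in plain Python. This is a hypothetical,
minimal stand-in for the SDK's AsyncRequest behavior, not its actual
implementation; ``MiniAsyncRequest`` and ``fake_query`` are invented names::

```python
import threading

# Hypothetical stand-in for the AsyncRequest pattern: step 1 starts the
# work in a background thread; step 2 blocks in get_response() only when
# the result is actually needed.
class MiniAsyncRequest:
    def __init__(self, func, *args):
        self._result = None
        self._error = None
        self._thread = threading.Thread(target=self._run, args=(func, args))
        self._thread.start()

    def _run(self, func, args):
        try:
            self._result = func(*args)
        except Exception as e:
            self._error = e

    def get_response(self):
        self._thread.join()              # wait only now, at step 2
        if self._error is not None:
            raise self._error            # surface any failure to the caller
        return self._result

def fake_query(data):
    # stand-in for sending a query to an engine instance
    return {"itemScores": [{"item": str(i), "score": 1.0}
                           for i in range(data["n"])]}

request = MiniAsyncRequest(fake_query, {"uid": "1", "n": 3})  # step 1
# ...do other work here while the request runs in the background...
result = request.get_response()                               # step 2
print(len(result["itemScores"]))  # prints 3
```

Note how the caller pays the waiting cost only at ``get_response()``, which
is what makes it possible to overlap many requests.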
.. note::

   If, for performance or application-specific reasons, you do not care
   whether a request succeeds, you can simply skip step 2.

.. note::

   If you do care about the request status or need the returned data,
   call :meth:`~AsyncRequest.get_response` at a later time with the
   AsyncRequest object returned in step 1.
Please refer to the documentation of :ref:`asynchronous request methods <async-methods-label>` for more details.
For example, the following code first generates an asynchronous request to
retrieve recommendations, then gets the result at a later time::

    >>> # Generate an asynchronous request and return an AsyncRequest object
    >>> engine_client = EngineClient()
    >>> request = engine_client.asend_query(data={"uid": "1", "n": 3})
    >>> # <...you can do other things here...>
    >>> try:
    ...     result = request.get_response()  # check the request status and get the returned data
    ... except Exception:
    ...     pass  # <log the error>
Batch Import Data with EventClient
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When you import a large amount of data at once, you can also use the
asynchronous request methods to generate many requests up front and then
check their status at a later time, minimizing total run time.
For example, to import 100,000 user records::

    >>> # generate 100,000 asynchronous requests and store the AsyncRequest objects
    >>> event_client = EventClient(access_key=<YOUR_ACCESS_KEY>)
    >>> for i in range(100000):
    ...     event_client.aset_user(user_record[i].uid)
    >>>
    >>> # <...you can do other things here...>
    >>>
    >>> # calling close() will block until all requests are processed
    >>> event_client.close()
Alternatively, you can use blocking requests to import a large amount of data, but this has significantly lower performance::

    >>> for i in range(100000):
    ...     try:
    ...         client.set_user(user_record[i].uid)
    ...     except Exception:
    ...         pass  # <log the error>
Batch Import Data with FileExporter and "pio import"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can use FileExporter to create events and write them to a JSON file which
can be used by "pio import". Please see `Importing Data in Batch <http://docs.prediction.io/datacollection/batchimport/>`_ for more details.
Note that this method is much faster than batch import with EventClient.
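To show the kind of file "pio import" consumes, here is a sketch that writes
newline-delimited event JSON by hand. The field names ("event", "entityType",
"entityId", "properties", "eventTime") follow PredictionIO's documented event
format, but the exact output produced by FileExporter may differ in detail;
the sample events and ``age`` property are invented for illustration::

```python
import json
import tempfile

# Sample events in PredictionIO's event JSON shape (values are made up).
events = [
    {"event": "$set", "entityType": "user", "entityId": str(i),
     "properties": {"age": 20 + i},
     "eventTime": "2014-09-09T16:17:42.937-08:00"}
    for i in range(3)
]

# Write one JSON object per line (newline-delimited JSON), the layout
# expected for batch import files.
with tempfile.NamedTemporaryFile("w", suffix=".json",
                                 delete=False) as f:
    path = f.name
    for e in events:
        f.write(json.dumps(e) + "\n")

# Read the file back to verify every line parses as JSON.
with open(path) as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # prints 3
```

Each line being an independent JSON object is what lets the import tool
stream large files without loading them whole.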