| :py:mod:`airflow.providers.salesforce.hooks.salesforce` |
| ======================================================= |
| |
| .. py:module:: airflow.providers.salesforce.hooks.salesforce |
| |
| .. autoapi-nested-parse:: |
| |
| This module contains a Salesforce Hook which allows you to connect to your Salesforce instance, |
| retrieve data from it, and write that data to a file for other uses. |
| |
| .. note:: this hook also relies on the simple_salesforce package: |
| https://github.com/simple-salesforce/simple-salesforce |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.salesforce.hooks.salesforce.SalesforceHook |
| |
| |
| |
| |
| Attributes |
| ~~~~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.salesforce.hooks.salesforce.log |
| |
| |
| .. py:data:: log |
| |
| |
| |
| |
| .. py:class:: SalesforceHook(salesforce_conn_id = default_conn_name, session_id = None, session = None) |
| |
| Bases: :py:obj:`airflow.hooks.base.BaseHook` |
| |
| Creates new connection to Salesforce and allows you to pull data out of SFDC and save it to a file. |
| |
| You can then use that file with other Airflow operators to move the data into another data source. |
| |
| :param conn_id: The name of the connection that has the parameters needed to connect to Salesforce. |
| The connection should be of type `Salesforce`. |
| :param session_id: The access token for a given HTTP request session. |
| :param session: A custom HTTP request session. This enables the use of requests Session features not |
| otherwise exposed by `simple_salesforce`. |
| |
| .. note:: |
| A connection to Salesforce can be created via several authentication options: |
| |
| * Password: Provide Username, Password, and Security Token |
| * Direct Session: Provide a `session_id` and either Instance or Instance URL |
| * OAuth 2.0 JWT: Provide a Consumer Key and either a Private Key or Private Key File Path |
| * IP Filtering: Provide Username, Password, and an Organization ID |
| |
| If in sandbox, enter a Domain value of 'test'. |
| |
| .. py:attribute:: conn_name_attr |
| :annotation: = salesforce_conn_id |
| |
| |
| |
| .. py:attribute:: default_conn_name |
| :annotation: = salesforce_default |
| |
| |
| |
| .. py:attribute:: conn_type |
| :annotation: = salesforce |
| |
| |
| |
| .. py:attribute:: hook_name |
| :annotation: = Salesforce |
| |
| |
| |
| .. py:method:: get_connection_form_widgets() |
| :staticmethod: |
| |
| Returns connection widgets to add to connection form |
| |
| |
| .. py:method:: get_ui_field_behaviour() |
| :staticmethod: |
| |
| Returns custom field behaviour |
| |
| |
| .. py:method:: conn(self) |
| |
| Returns a Salesforce instance. (cached) |
| |
| |
| .. py:method:: get_conn(self) |
| |
| Returns a Salesforce instance. (cached) |
| |
| |
| .. py:method:: make_query(self, query, include_deleted = False, query_params = None) |
| |
| Make a query to Salesforce. |
| |
| :param query: The query to make to Salesforce. |
| :param include_deleted: True if the query should include deleted records. |
| :param query_params: Additional optional arguments |
| :return: The query result. |
| :rtype: dict |
| |
| |
| .. py:method:: describe_object(self, obj) |
| |
| Get the description of an object from Salesforce. |
| This description is the object's schema and |
| some extra metadata that Salesforce stores for each object. |
| |
| :param obj: The name of the Salesforce object that we are getting a description of. |
| :return: the description of the Salesforce object. |
| :rtype: dict |
| |
| |
| .. py:method:: get_available_fields(self, obj) |
| |
| Get a list of all available fields for an object. |
| |
| :param obj: The name of the Salesforce object that we are getting a description of. |
| :return: the names of the fields. |
| :rtype: list(str) |
| |
| |
| .. py:method:: get_object_from_salesforce(self, obj, fields) |
| |
| Get all instances of the `object` from Salesforce. |
| For each model, only get the fields specified in fields. |
| |
| All we really do underneath the hood is run: |
| SELECT <fields> FROM <obj>; |
| |
| :param obj: The object name to get from Salesforce. |
| :param fields: The fields to get from the object. |
| :return: all instances of the object from Salesforce. |
| :rtype: dict |
| |
| |
| .. py:method:: write_object_to_file(self, query_results, filename, fmt = 'csv', coerce_to_timestamp = False, record_time_added = False) |
| |
| Write query results to file. |
| |
| Acceptable formats are: |
| - csv: |
| comma-separated-values file. This is the default format. |
| - json: |
| JSON array. Each element in the array is a different row. |
| - ndjson: |
| JSON array but each element is new-line delimited instead of comma delimited like in `json` |
| |
| This requires a significant amount of cleanup. |
| Pandas doesn't handle output to CSV and json in a uniform way. |
| This is especially painful for datetime types. |
| Pandas wants to write them as strings in CSV, but as millisecond Unix timestamps. |
| |
| By default, this function will try and leave all values as they are represented in Salesforce. |
| You use the `coerce_to_timestamp` flag to force all datetimes to become Unix timestamps (UTC). |
| This is can be greatly beneficial as it will make all of your datetime fields look the same, |
| and makes it easier to work with in other database environments |
| |
| :param query_results: the results from a SQL query |
| :param filename: the name of the file where the data should be dumped to |
| :param fmt: the format you want the output in. Default: 'csv' |
| :param coerce_to_timestamp: True if you want all datetime fields to be converted into Unix timestamps. |
| False if you want them to be left in the same format as they were in Salesforce. |
| Leaving the value as False will result in datetimes being strings. Default: False |
| :param record_time_added: True if you want to add a Unix timestamp field |
| to the resulting data that marks when the data was fetched from Salesforce. Default: False |
| :return: the dataframe that gets written to the file. |
| :rtype: pandas.Dataframe |
| |
| |
| .. py:method:: object_to_df(self, query_results, coerce_to_timestamp = False, record_time_added = False) |
| |
| Export query results to dataframe. |
| |
| By default, this function will try and leave all values as they are represented in Salesforce. |
| You use the `coerce_to_timestamp` flag to force all datetimes to become Unix timestamps (UTC). |
| This is can be greatly beneficial as it will make all of your datetime fields look the same, |
| and makes it easier to work with in other database environments |
| |
| :param query_results: the results from a SQL query |
| :param coerce_to_timestamp: True if you want all datetime fields to be converted into Unix timestamps. |
| False if you want them to be left in the same format as they were in Salesforce. |
| Leaving the value as False will result in datetimes being strings. Default: False |
| :param record_time_added: True if you want to add a Unix timestamp field |
| to the resulting data that marks when the data was fetched from Salesforce. Default: False |
| :return: the dataframe. |
| :rtype: pandas.Dataframe |
| |
| |
| |