| :py:mod:`airflow.providers.apache.spark.hooks.spark_sql` |
| ======================================================== |
| |
| .. py:module:: airflow.providers.apache.spark.hooks.spark_sql |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.apache.spark.hooks.spark_sql.SparkSqlHook |
| |
| |
| |
| |
| .. py:class:: SparkSqlHook(sql, conf = None, conn_id = default_conn_name, total_executor_cores = None, executor_cores = None, executor_memory = None, keytab = None, principal = None, master = None, name = 'default-name', num_executors = None, verbose = True, yarn_queue = None) |
| |
| Bases: :py:obj:`airflow.hooks.base.BaseHook` |
| |
| This hook wraps the ``spark-sql`` binary, which must be available |
| on the ``PATH``. |
| |
| :param sql: The SQL query to execute |
| :param conf: An arbitrary Spark configuration property |
| :param conn_id: The ID of the connection to use |
| :param total_executor_cores: (Standalone & Mesos only) Total cores for all executors |
| (Default: all the available cores on the worker) |
| :param executor_cores: (Standalone & YARN only) Number of cores per |
| executor (Default: 2) |
| :param executor_memory: Memory per executor (e.g. 1000M, 2G) (Default: 1G) |
| :param keytab: Full path to the file that contains the keytab |
| :param principal: The Kerberos principal to use with the keytab |
| :param master: spark://host:port, mesos://host:port, yarn, or local |
| (Default: The ``host`` and ``port`` set in the Connection, or ``"yarn"``) |
| :param name: Name of the job (Default: ``"default-name"``) |
| :param num_executors: Number of executors to launch |
| :param verbose: Whether to pass the verbose flag to spark-sql |
| :param yarn_queue: The YARN queue to submit to |
| (Default: The ``queue`` value set in the Connection, or ``"default"``) |
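The parameters above map closely onto command-line flags of ``spark-sql``. The following is a minimal sketch of that mapping, not the hook's actual implementation: ``build_spark_sql_command`` is a hypothetical helper, it assumes ``conf`` is a single ``PROP=VALUE`` string, and it always passes the query inline with ``-e``.

```python
from typing import List, Optional


def build_spark_sql_command(
    sql: str,
    conf: Optional[str] = None,
    total_executor_cores: Optional[int] = None,
    executor_cores: Optional[int] = None,
    executor_memory: Optional[str] = None,
    keytab: Optional[str] = None,
    principal: Optional[str] = None,
    master: str = "yarn",
    name: str = "default-name",
    num_executors: Optional[int] = None,
    verbose: bool = True,
    yarn_queue: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: turn hook-style parameters into a spark-sql argv list."""
    cmd = ["spark-sql"]

    # Flags that are only emitted when a value was supplied.
    optional_flags = [
        ("--conf", conf),  # assumption: a single PROP=VALUE string
        ("--total-executor-cores", total_executor_cores),
        ("--executor-cores", executor_cores),
        ("--executor-memory", executor_memory),
        ("--keytab", keytab),
        ("--principal", principal),
        ("--num-executors", num_executors),
        ("--queue", yarn_queue),
    ]
    for flag, value in optional_flags:
        if value is not None:
            cmd += [flag, str(value)]

    cmd += ["--master", master, "--name", name]
    if verbose:
        cmd.append("--verbose")

    # Assumption: the query is always passed inline with -e.
    cmd += ["-e", sql]
    return cmd


example_cmd = build_spark_sql_command(
    "SELECT 1", master="local", executor_memory="2G", num_executors=4, verbose=False
)
print(" ".join(example_cmd))
```

Returning a list rather than a single string keeps each argument separate, so the result can be handed to ``subprocess.Popen`` without shell quoting.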
| |
| .. py:attribute:: conn_name_attr |
| :annotation: = conn_id |
| |
| |
| |
| .. py:attribute:: default_conn_name |
| :annotation: = spark_sql_default |
| |
| |
| |
| .. py:attribute:: conn_type |
| :annotation: = spark_sql |
| |
| |
| |
| .. py:attribute:: hook_name |
| :annotation: = Spark SQL |
| |
| |
| |
| .. py:method:: get_conn() |
| |
| Returns connection for the hook. |
| |
| |
| .. py:method:: run_query(cmd = '', **kwargs) |
| |
| Run the ``spark-sql`` command through ``Popen``; this is the call that actually executes the query. |
| |
| :param cmd: command to append to the spark-sql command |
| :param kwargs: extra keyword arguments passed to Popen (see ``subprocess.Popen``) |
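To illustrate the general pattern of running a command through ``subprocess.Popen`` and forwarding extra keyword arguments, here is a minimal sketch. It is an assumption about the overall shape, not the hook's code: ``run_and_stream`` is a hypothetical function, and a short Python one-liner stands in for ``spark-sql`` so the example runs without Spark installed.

```python
import subprocess
import sys
from typing import List


def run_and_stream(cmd: List[str], **kwargs) -> List[str]:
    """Run cmd, collect merged stdout/stderr line by line, raise on a non-zero exit."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,   # decode output as text
        **kwargs,                  # extra Popen arguments, e.g. cwd or env
    )
    lines = []
    for line in proc.stdout:
        lines.append(line.rstrip("\n"))
    returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError(f"Command {cmd} exited with return code {returncode}")
    return lines


# Stand-in for the spark-sql binary so the sketch is runnable anywhere.
output = run_and_stream([sys.executable, "-c", "print('query finished')"])
print(output)
```

Reading the output line by line, rather than waiting for the process to finish, lets a long-running query report progress while it executes.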
| |
| |
| .. py:method:: kill() |
| |
| Kill the running Spark job. |
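Killing a job that was started as a subprocess typically means checking whether the process is still alive and then terminating it. The sketch below shows that pattern with a sleeping Python process standing in for a Spark job; ``kill_if_running`` is a hypothetical helper, and whether the hook's ``kill()`` behaves exactly this way is an assumption.

```python
import subprocess
import sys


def kill_if_running(proc: subprocess.Popen) -> bool:
    """Kill proc if it has not exited yet; return True if a kill was issued."""
    if proc.poll() is None:  # None means the process is still running
        proc.kill()
        proc.wait()          # reap the process so no zombie is left behind
        return True
    return False


# Stand-in for a long-running spark-sql process.
job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
killed = kill_if_running(job)
print(killed)
```

Calling the helper a second time is harmless, because ``poll()`` reports the exit status once the process has ended.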
| |
| |
| |