.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
Qubole
======
`Qubole <https://www.qubole.com/>`__ is an open, simple, and secure data lake platform for machine learning, streaming, and ad hoc analytics.
Qubole delivers a self-service platform for big data analytics built on the Amazon Web Services, Microsoft, and Google clouds.
Airflow provides operators to execute tasks (commands) on Qubole Data Service (QDS) and to perform checks against Qubole commands.
It also provides sensors that wait for a file, folder, or partition to be present in cloud storage, checking for its presence via the QDS APIs.
Prerequisite Tasks
^^^^^^^^^^^^^^^^^^
.. include:: /operators/_partials/prerequisite_tasks.rst
.. _howto/operator:QuboleOperator:
Execute tasks
^^^^^^^^^^^^^
To run the following commands, use
:class:`~airflow.providers.qubole.operators.qubole.QuboleOperator`.
Run Hive command
""""""""""""""""
To run a query that shows all tables, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_hive_query]
:end-before: [END howto_operator_qubole_run_hive_query]
You can also run a script stored in a bucket by passing the path to the query file:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_hive_script]
:end-before: [END howto_operator_qubole_run_hive_script]
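The two Hive variants above differ only in whether the query is given inline or as a script path. The following is a minimal sketch of that choice, not the provider's own code: ``hive_command_kwargs`` is a hypothetical helper, the bucket path is a placeholder, and the rule that exactly one of ``query`` and ``script_location`` is given is an assumption about how QDS validates Hive commands.

```python
from typing import Any, Dict, Optional


def hive_command_kwargs(
    task_id: str,
    query: Optional[str] = None,
    script_location: Optional[str] = None,
    cluster_label: str = "default",
) -> Dict[str, Any]:
    """Build keyword arguments for a Hive command (hypothetical helper)."""
    # Assumption: a Hive command takes either an inline query or a script
    # path, never both and never neither.
    if (query is None) == (script_location is None):
        raise ValueError("Provide exactly one of query or script_location")

    kwargs: Dict[str, Any] = {
        "task_id": task_id,
        "command_type": "hivecmd",
        "cluster_label": cluster_label,
    }
    if query is not None:
        kwargs["query"] = query
    else:
        kwargs["script_location"] = script_location
    return kwargs


# Inline query, then a script stored in a (placeholder) bucket.
inline = hive_command_kwargs("hive_show_table", query="show tables")
from_script = hive_command_kwargs(
    "hive_from_script",
    script_location="s3://example-bucket/scripts/show_table.hql",
)

# With Airflow and the Qubole provider installed, the dictionaries can be
# unpacked into the operator:
#   from airflow.providers.qubole.operators.qubole import QuboleOperator
#   hive_show_table = QuboleOperator(**inline)
```

Keeping the arguments in a plain dictionary makes them easy to check in a unit test before any task is submitted to QDS.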
Run Hadoop command
""""""""""""""""""
To run a JAR file in your Hadoop cluster, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_hadoop_jar]
:end-before: [END howto_operator_qubole_run_hadoop_jar]
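A Hadoop JAR command is passed to the operator as one ``sub_command`` string that starts with ``jar``. As a rough sketch, the string can be assembled from its parts so that long commands stay readable. ``hadoop_jar_sub_command`` is a hypothetical helper, and the bucket path and streaming options are placeholders rather than values from the example DAG.

```python
from typing import Any, Dict


def hadoop_jar_sub_command(jar_path: str, *args: str) -> str:
    """Join the JAR path and its arguments into one sub-command string."""
    return " ".join(["jar", jar_path, *args])


sub_command = hadoop_jar_sub_command(
    "s3://example-bucket/jars/streaming.jar",  # placeholder path
    "-mapper", "wc",
    "-numReduceTasks", "0",
)

hadoop_kwargs: Dict[str, Any] = {
    "task_id": "hadoop_jar_cmd",
    "command_type": "hadoopcmd",
    "sub_command": sub_command,
}

# With Airflow and the Qubole provider installed:
#   from airflow.providers.qubole.operators.qubole import QuboleOperator
#   hadoop_jar_cmd = QuboleOperator(**hadoop_kwargs)
```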
Run Pig command
"""""""""""""""
To run a *Pig Latin* script in your Hadoop cluster, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_pig_script]
:end-before: [END howto_operator_qubole_run_pig_script]
Run Shell command
"""""""""""""""""
To run a shell script, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_shell_script]
:end-before: [END howto_operator_qubole_run_shell_script]
Run Presto command
""""""""""""""""""
To run a query using Presto, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_presto_query]
:end-before: [END howto_operator_qubole_run_presto_query]
Run DB commands
"""""""""""""""
To run a query on a `DbTap <https://docs.qubole.com/en/latest/rest-api/dbtap_api/index.html>`_, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_db_tap_query]
:end-before: [END howto_operator_qubole_run_db_tap_query]
To run a DB export command, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_db_export]
:end-before: [END howto_operator_qubole_run_db_export]
To run a DB import command, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_db_import]
:end-before: [END howto_operator_qubole_run_db_import]
Run Spark commands
""""""""""""""""""
To run a Scala script as a Spark job, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_operator_qubole_run_spark_scala]
:end-before: [END howto_operator_qubole_run_spark_scala]
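For a Spark command, the Scala source is handed to the operator as a string together with its language. The sketch below only illustrates the shape of those arguments. The ``HelloSpark`` program and the task name are made up for this example, and the ``--class`` value is assumed to name the object that holds ``main``.

```python
from textwrap import dedent
from typing import Any, Dict

# Hypothetical Scala program, used only to show how source text is passed.
SCALA_PROGRAM = dedent(
    """
    object HelloSpark {
      def main(args: Array[String]): Unit = {
        println("Hello from Spark")
      }
    }
    """
).strip()

spark_kwargs: Dict[str, Any] = {
    "task_id": "spark_cmd",
    "command_type": "sparkcmd",
    "program": SCALA_PROGRAM,
    "language": "scala",
    # Assumption: extra arguments are forwarded to spark-submit.
    "arguments": "--class HelloSpark",
}

# With Airflow and the Qubole provider installed:
#   from airflow.providers.qubole.operators.qubole import QuboleOperator
#   spark_cmd = QuboleOperator(**spark_kwargs)
```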
.. _howto/operator:QuboleFileSensor:
File sensor
^^^^^^^^^^^
Usage examples of
:class:`~airflow.providers.qubole.sensors.qubole.QuboleFileSensor`.
File or directory existence
"""""""""""""""""""""""""""
To wait for a file or directory to exist in the cluster, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_sensor_qubole_run_file_sensor]
:end-before: [END howto_sensor_qubole_run_file_sensor]
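The file sensor receives the paths to check through its ``data`` argument, and entries may contain Jinja expressions based on the run date. The sketch below builds such a payload and shows what a date-based expression evaluates to. ``file_sensor_data`` is a hypothetical helper, the bucket paths are placeholders, and it is an assumption here that the sensor succeeds only once every listed file exists.

```python
from datetime import date
from typing import Dict, Iterable, List


def file_sensor_data(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Wrap the paths in the payload shape passed as ``data``."""
    return {"files": list(paths)}


data = file_sensor_data(
    [
        "s3://example-bucket/jars/job.jar",
        # Rendered by Airflow at run time; ``ds`` is the run date as a string.
        "s3://example-bucket/data/{{ ds.split('-')[2] }}.tsv",
    ]
)

# The same expression in plain Python: ``ds`` has the form YYYY-MM-DD,
# so index 2 after splitting on "-" is the day of the month.
ds = date(2021, 3, 15).isoformat()
day_of_month = ds.split("-")[2]

# With Airflow and the Qubole provider installed:
#   from airflow.providers.qubole.sensors.qubole import QuboleFileSensor
#   check_file = QuboleFileSensor(
#       task_id="check_file", poke_interval=60, timeout=600, data=data
#   )
```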
.. _howto/operator:QubolePartitionSensor:
Partition sensor
^^^^^^^^^^^^^^^^
Usage examples of
:class:`~airflow.providers.qubole.sensors.qubole.QubolePartitionSensor`.
Partition existence
"""""""""""""""""""
To wait for a table partition to exist in the cluster, use:
.. exampleinclude:: /../../airflow/providers/qubole/example_dags/example_qubole.py
:language: python
:dedent: 4
:start-after: [START howto_sensor_qubole_run_partition_sensor]
:end-before: [END howto_sensor_qubole_run_partition_sensor]
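The partition sensor's ``data`` argument names a schema, a table, and a list of partition columns with the values to look for. To make the payload easier to reason about, the sketch below lists the partition specifications it describes. ``expected_partitions`` is a hypothetical helper written for this guide, not part of the provider, and it assumes that every combination of the listed values is checked.

```python
from itertools import product
from typing import Any, Dict, List

data: Dict[str, Any] = {
    "schema": "default",
    "table": "my_partitioned_table",  # placeholder table name
    "columns": [
        {"column": "month", "values": ["12"]},
        {"column": "day", "values": ["12", "13"]},
    ],
}


def expected_partitions(columns: List[Dict[str, Any]]) -> List[str]:
    """List partition specs such as ``month=12/day=13`` for the payload."""
    names = [col["column"] for col in columns]
    value_lists = [col["values"] for col in columns]
    # Assumption: one spec per combination of values across the columns.
    return [
        "/".join(f"{name}={value}" for name, value in zip(names, combo))
        for combo in product(*value_lists)
    ]


partitions = expected_partitions(data["columns"])

# With Airflow and the Qubole provider installed:
#   from airflow.providers.qubole.sensors.qubole import QubolePartitionSensor
#   check_partition = QubolePartitionSensor(
#       task_id="check_partition", poke_interval=10, timeout=60, data=data
#   )
```

In a DAG, the literal values would usually be Jinja expressions derived from the run date.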
Reference
^^^^^^^^^
For further information, look at:
* `Qubole Data Service Python SDK <https://github.com/qubole/qds-sdk-py>`__
* `Product Documentation <https://docs.qubole.com/en/latest/>`__