:py:mod:`airflow.providers.apache.hive.transfers.mysql_to_hive`
===============================================================

.. py:module:: airflow.providers.apache.hive.transfers.mysql_to_hive

.. autoapi-nested-parse::

   This module contains an operator to move data from MySQL to Hive.


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   airflow.providers.apache.hive.transfers.mysql_to_hive.MySqlToHiveOperator


.. py:class:: MySqlToHiveOperator(*, sql, hive_table, create = True, recreate = False, partition = None, delimiter = chr(1), quoting = None, quotechar = '"', escapechar = None, mysql_conn_id = 'mysql_default', hive_cli_conn_id = 'hive_cli_default', tblproperties = None, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Moves data from MySQL to Hive. The operator runs your query against
   MySQL, stages the result in a local file, and then loads that file into
   a Hive table. If the ``create`` or ``recreate`` arguments are set to
   ``True``, ``CREATE TABLE`` and ``DROP TABLE`` statements are generated.
   Hive data types are inferred from the cursor's metadata. Note that the
   table generated in Hive uses ``STORED AS textfile``, which is not the
   most efficient serialization format. If a large amount of data is
   loaded and/or the table is queried heavily, you may want to use this
   operator only to stage the data into a temporary table before loading
   it into its final destination using a ``HiveOperator``.

   :param sql: SQL query to execute against the MySQL database. (templated)
   :param hive_table: target Hive table; use dot notation to target a
      specific database. (templated)
   :param create: whether to create the table if it doesn't exist
   :param recreate: whether to drop and recreate the table at every
      execution
   :param partition: target partition as a dict of partition columns
      and values. (templated)
   :param delimiter: field delimiter in the staged file
   :param quoting: controls when quotes should be generated by the csv
      writer; it can take any of the ``csv.QUOTE_*`` constants.
   :param quotechar: one-character string used to quote fields
      containing special characters.
   :param escapechar: one-character string used by the csv writer to
      escape the delimiter or quotechar.
   :param mysql_conn_id: source MySQL connection
   :param hive_cli_conn_id: Reference to the
      :ref:`Hive CLI connection id <howto/connection:hive_cli>`.
   :param tblproperties: TBLPROPERTIES of the Hive table being created

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['sql', 'partition', 'hive_table']

   .. py:attribute:: template_ext
      :annotation: :Sequence[str] = ['.sql']

   .. py:attribute:: template_fields_renderers

   .. py:attribute:: ui_color
      :annotation: = #a0e08c

   .. py:method:: type_map(cls, mysql_type)
      :classmethod:

      Maps MySQL type to Hive type.

   .. py:method:: execute(self, context)

      This is the main method to derive when creating an operator.
      Context is the same dictionary used when rendering jinja templates.

      Refer to get_template_context for more context.
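
The ``delimiter``, ``quoting``, ``quotechar`` and ``escapechar`` parameters mirror the options of Python's built-in ``csv`` writer, which handles the local staging file. The following is a minimal sketch of what that staging step looks like in principle; it is an illustration of the parameters' effect, not the operator's actual code, and the sample rows are invented.

```python
import csv
import io

# Invented sample rows standing in for a MySQL cursor's results.
rows = [(1, "plain"), (2, 'needs "quotes"')]

buf = io.StringIO()
writer = csv.writer(
    buf,
    delimiter=chr(1),            # the operator's default field delimiter
    quoting=csv.QUOTE_MINIMAL,   # one of the csv.QUOTE_* constants
    quotechar='"',               # the operator's default quotechar
)
writer.writerows(rows)
staged = buf.getvalue()
# Fields are separated by the 0x01 byte; the second row's field is quoted
# because it contains the quotechar, which QUOTE_MINIMAL doubles.
```

Passing ``quoting=csv.QUOTE_NONE`` together with an ``escapechar`` would instead escape special characters rather than quote them.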
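The ``type_map`` classmethod above converts MySQL column types (as reported by the cursor's metadata) into Hive column types for the generated ``CREATE TABLE`` statement. The sketch below conveys the idea only: the mapping table here is a hypothetical assumption, not the provider's authoritative one, which operates on MySQLdb field-type codes and lives in the provider source.

```python
# Hypothetical, illustrative mapping from MySQL type names to Hive types.
# The real type_map in the provider source is the authoritative reference.
HYPOTHETICAL_TYPE_MAP = {
    "TINYINT": "SMALLINT",
    "INT": "INT",
    "BIGINT": "BIGINT",
    "DOUBLE": "DOUBLE",
    "TIMESTAMP": "TIMESTAMP",
}


def map_mysql_type(mysql_type: str) -> str:
    # Anything unmapped falls back to STRING, the safe catch-all for a
    # textfile-backed Hive table.
    return HYPOTHETICAL_TYPE_MAP.get(mysql_type.upper(), "STRING")
```

Because the generated table is ``STORED AS textfile``, falling back to ``STRING`` is always loadable, at the cost of losing type information for unmapped columns.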