:py:mod:`airflow.providers.apache.hdfs.hooks.webhdfs`
=====================================================
.. py:module:: airflow.providers.apache.hdfs.hooks.webhdfs
.. autoapi-nested-parse::
Hook for Web HDFS
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.apache.hdfs.hooks.webhdfs.WebHDFSHook
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.providers.apache.hdfs.hooks.webhdfs.log
.. py:data:: log
.. py:exception:: AirflowWebHDFSHookException
Bases: :py:obj:`airflow.exceptions.AirflowException`
Exception specific for WebHDFS hook
.. py:class:: WebHDFSHook(webhdfs_conn_id = 'webhdfs_default', proxy_user = None)
Bases: :py:obj:`airflow.hooks.base.BaseHook`
Interact with HDFS. This class is a wrapper around the hdfscli library.
:param webhdfs_conn_id: The connection id for the webhdfs client to connect to.
:param proxy_user: The user used to authenticate.
.. py:method:: get_conn(self)
Establishes a connection depending on the security mode set via config or environment variable.
:return: An hdfscli InsecureClient or KerberosClient object.
:rtype: hdfs.InsecureClient or hdfs.ext.kerberos.KerberosClient
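A minimal usage sketch, assuming an Airflow installation with the Apache HDFS provider and a configured ``webhdfs_default`` connection. The function name ``list_root`` is hypothetical, and the import sits inside the function so that defining it does not require Airflow to be importable.

```python
def list_root(conn_id="webhdfs_default"):
    """Return the entry names directly under the HDFS root directory."""
    # Imported lazily: needs the Apache HDFS provider package for Airflow.
    from airflow.providers.apache.hdfs.hooks.webhdfs import WebHDFSHook

    hook = WebHDFSHook(webhdfs_conn_id=conn_id)
    # get_conn() returns an hdfscli client (InsecureClient or KerberosClient).
    client = hook.get_conn()
    # Client.list() comes from hdfscli and returns the names in a directory.
    return client.list("/")
```

Calling ``list_root()`` only succeeds against a reachable WebHDFS endpoint, so it is left uncalled here.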
.. py:method:: check_for_path(self, hdfs_path)
Check for the existence of a path in HDFS by querying FileStatus.
:param hdfs_path: The path to check.
:return: True if the path exists and False if not.
:rtype: bool
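The FileStatus check can be sketched without a cluster. The example below assumes the hdfscli convention that ``status(path, strict=False)`` returns ``None`` for a missing path. ``FakeClient`` and ``path_exists`` are hypothetical names used only for illustration, and the fake raises ``FileNotFoundError`` in strict mode as a simplification of the library's own error type.

```python
class FakeClient:
    """Hypothetical stand-in for an hdfscli client, for illustration only."""

    def __init__(self, existing_paths):
        self._existing = set(existing_paths)

    def status(self, hdfs_path, strict=True):
        # Assumed hdfscli behaviour: with strict=False a missing path
        # yields None instead of raising an error.
        if hdfs_path in self._existing:
            return {"type": "FILE", "pathSuffix": ""}
        if strict:
            raise FileNotFoundError(hdfs_path)
        return None


def path_exists(client, hdfs_path):
    """Return True if the client reports a FileStatus for hdfs_path."""
    return client.status(hdfs_path, strict=False) is not None


client = FakeClient(["/data/events.csv"])
print(path_exists(client, "/data/events.csv"))
print(path_exists(client, "/data/missing.csv"))
```

With a real hook, the equivalent call is ``hook.check_for_path("/data/events.csv")``.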
.. py:method:: load_file(self, source, destination, overwrite = True, parallelism = 1, **kwargs)
Uploads a file to HDFS.
:param source: Local path to file or folder.
If it's a folder, all the files inside of it will be uploaded.
.. note:: This implies that folders empty of files will not be created remotely.
:param destination: Target HDFS path.
If it already exists and is a directory, files will be uploaded inside.
:param overwrite: Overwrite any existing file or directory.
:param parallelism: Number of threads to use for parallelization.
A value of ``0`` (or negative) uses as many threads as there are files.
:param kwargs: Keyword arguments forwarded to :meth:`hdfs.client.Client.upload`.
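Two of the documented rules can be illustrated locally: only files are collected from a source folder, so an empty folder contributes nothing to upload, and a ``parallelism`` of ``0`` or less means one thread per file. This is a sketch of those rules as described above, not the library's implementation. ``files_to_upload`` and ``effective_threads`` are hypothetical helper names.

```python
import os
import tempfile


def files_to_upload(source):
    """List the local files under source, which may be a file or a folder."""
    if os.path.isfile(source):
        return [source]
    found = []
    for root, _dirs, files in os.walk(source):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def effective_threads(parallelism, n_files):
    """Documented rule: 0 or a negative value means one thread per file."""
    return n_files if parallelism <= 0 else parallelism


with tempfile.TemporaryDirectory() as source:
    os.mkdir(os.path.join(source, "empty"))  # holds no files
    os.mkdir(os.path.join(source, "logs"))
    for rel in ("a.txt", os.path.join("logs", "b.txt")):
        with open(os.path.join(source, rel), "w") as handle:
            handle.write("demo\n")

    # Paths relative to the source folder; "empty" never appears.
    relative = [os.path.relpath(p, source) for p in files_to_upload(source)]

print(relative)
print(effective_threads(0, len(relative)))
```

With a real hook, the corresponding call would be ``hook.load_file(source, destination, overwrite=True, parallelism=0)``, where ``source`` and ``destination`` are placeholders for a local path and an HDFS path.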