blob: 1d87359a9699fe63f6339bb9f113c43878a3613d [file] [log] [blame]
:py:mod:`airflow.providers.apache.hdfs.sensors.hdfs`
====================================================
.. py:module:: airflow.providers.apache.hdfs.sensors.hdfs
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensor
airflow.providers.apache.hdfs.sensors.hdfs.HdfsRegexSensor
airflow.providers.apache.hdfs.sensors.hdfs.HdfsFolderSensor
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.providers.apache.hdfs.sensors.hdfs.log
.. py:data:: log
.. py:class:: HdfsSensor(*, filepath, hdfs_conn_id = 'hdfs_default', ignored_ext = None, ignore_copying = True, file_size = None, hook = HDFSHook, **kwargs)
Bases: :py:obj:`airflow.sensors.base.BaseSensorOperator`
Waits for a file or folder to land in HDFS
:param filepath: The route to a stored file.
:param hdfs_conn_id: The Airflow connection used for HDFS credentials.
:param ignored_ext: This is the list of ignored extensions.
:param ignore_copying: Shall we ignore?
:param file_size: This is the size of the file.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:HdfsSensor`
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['filepath']
.. py:attribute:: ui_color
.. py:method:: filter_for_filesize(result, size = None)
:staticmethod:
Will test the filepath result and test if its size is at least self.filesize
:param result: a list of dicts returned by Snakebite ls
:param size: the file size in MB a file should be at least to trigger True
:return: (bool) depending on the matching criteria
.. py:method:: filter_for_ignored_ext(result, ignored_ext, ignore_copying)
:staticmethod:
Will filter if instructed to do so the result to remove matching criteria
:param result: list of dicts returned by Snakebite ls
:param ignored_ext: list of ignored extensions
:param ignore_copying: shall we ignore ?
:return: list of dicts which were not removed
:rtype: list[dict]
.. py:method:: poke(self, context)
Get a snakebite client connection and check for file.
.. py:class:: HdfsRegexSensor(regex, *args, **kwargs)
Bases: :py:obj:`HdfsSensor`
Waits for matching files by matching on regex
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:HdfsRegexSensor`
.. py:method:: poke(self, context)
Poke matching files in a directory with self.regex
:return: Bool depending on the search criteria
.. py:class:: HdfsFolderSensor(be_empty = False, *args, **kwargs)
Bases: :py:obj:`HdfsSensor`
Waits for a non-empty directory
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:HdfsFolderSensor`
.. py:method:: poke(self, context)
Poke for a non empty directory
:return: Bool depending on the search criteria