| :mod:`airflow.providers.microsoft.azure.hooks.wasb` |
| =================================================== |
| |
| .. py:module:: airflow.providers.microsoft.azure.hooks.wasb |
| |
| .. autoapi-nested-parse:: |
| |
| This module contains integration with Azure Blob Storage. |
| |
| It communicates via the Windows Azure Storage Blob protocol. Make sure that an |
| Airflow connection of type ``wasb`` exists. Authorization can be done by supplying a |
| login (=Storage account name) and password (=KEY), or login and SAS token in the extra |
| field (see connection ``wasb_default`` for an example). |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| .. py:class:: WasbHook(wasb_conn_id: str = default_conn_name, public_read: bool = False) |
| |
| Bases: :class:`airflow.hooks.base.BaseHook` |
| |
| Interacts with Azure Blob Storage through the ``wasb://`` protocol. |
| |
| The following parameters must be set on the connection stored in the Airflow metadata database: ``account_name`` and ``account_key``. |
| |
| Additional options passed in the 'extra' field of the connection will be |
| passed to the ``BlockBlobService()`` constructor. For example, authenticate |
| using a SAS token by adding ``{"sas_token": "YOUR_TOKEN"}``. |
| |
| :param wasb_conn_id: Reference to the wasb connection. |
| :type wasb_conn_id: str |
| :param public_read: Whether anonymous public read access should be used. Defaults to False. |
| :type public_read: bool |
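As a minimal sketch of the SAS-token option described above, the snippet below builds the JSON string that could be placed in the connection's 'extra' field. The helper name ``build_wasb_extra`` is hypothetical and not part of the provider; only the ``sas_token`` key is taken from the text above.

```python
import json


def build_wasb_extra(sas_token: str) -> str:
    """Serialize the 'extra' field for a wasb connection (hypothetical helper)."""
    return json.dumps({"sas_token": sas_token})


# Placeholder token, as in the description above; substitute a real SAS token.
extra = build_wasb_extra("YOUR_TOKEN")
print(extra)
```

The resulting string can be stored in the 'extra' field of the ``wasb`` connection alongside the login (storage account name).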
| |
| .. attribute:: conn_name_attr |
| :annotation: = wasb_conn_id |
| |
| |
| |
| .. attribute:: default_conn_name |
| :annotation: = wasb_default |
| |
| |
| |
| .. attribute:: conn_type |
| :annotation: = wasb |
| |
| |
| |
| .. attribute:: hook_name |
| :annotation: = Azure Blob Storage |
| |
| |
| |
| |
| .. method:: get_conn(self) |
| |
| Return the BlobServiceClient object. |
| |
| |
| |
| |
| .. method:: _get_container_client(self, container_name: str) |
| |
| Instantiates a container client |
| |
| :param container_name: The name of the container |
| :type container_name: str |
| :return: ContainerClient |
| |
| |
| |
| |
| .. method:: _get_blob_client(self, container_name: str, blob_name: str) |
| |
| Instantiates a blob client |
| |
| :param container_name: The name of the blob container |
| :type container_name: str |
| :param blob_name: The name of the blob. The blob need not already exist. |
| :type blob_name: str |
| |
| |
| |
| |
| .. method:: check_for_blob(self, container_name: str, blob_name: str, **kwargs) |
| |
| Check if a blob exists on Azure Blob Storage. |
| |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param kwargs: Optional keyword arguments that ``BlobClient.get_blob_properties`` takes. |
| :type kwargs: object |
| :return: True if the blob exists, False otherwise. |
| :rtype: bool |
| |
| |
| |
| |
| .. method:: check_for_prefix(self, container_name: str, prefix: str, **kwargs) |
| |
| Check if a prefix exists on Azure Blob storage. |
| |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param prefix: Prefix of the blob. |
| :type prefix: str |
| :param kwargs: Optional keyword arguments that ``ContainerClient.walk_blobs`` takes |
| :type kwargs: object |
| :return: True if blobs matching the prefix exist, False otherwise. |
| :rtype: bool |
| |
| |
| |
| |
| .. method:: get_blobs_list(self, container_name: str, prefix: Optional[str] = None, include: Optional[List[str]] = None, delimiter: Optional[str] = '/', **kwargs) |
| |
| List blobs in a given container |
| |
| :param container_name: The name of the container |
| :type container_name: str |
| :param prefix: Filters the results to return only blobs whose names |
| begin with the specified prefix. |
| :type prefix: str |
| :param include: Specifies one or more additional datasets to include in the |
| response. Options include: ``snapshots``, ``metadata``, ``uncommittedblobs``, |
| ``copy``, ``deleted``. |
| :type include: List[str] |
| :param delimiter: Filters objects based on the delimiter (e.g. ``.csv``). |
| :type delimiter: str |
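The following sketch shows one way to combine ``get_blobs_list`` with client-side filtering. It assumes the method returns an iterable of blob names as strings, which the description above does not state explicitly. The helper ``list_blobs_with_suffix`` and the container and blob names are hypothetical, and a ``Mock`` stands in for the hook so the example runs without Azure credentials.

```python
from typing import List
from unittest.mock import Mock


def list_blobs_with_suffix(hook, container_name: str, prefix: str, suffix: str) -> List[str]:
    """Return blob names under ``prefix`` that end with ``suffix`` (hypothetical helper)."""
    # Assumption: get_blobs_list yields blob names as plain strings.
    names = hook.get_blobs_list(container_name, prefix=prefix)
    return [name for name in names if name.endswith(suffix)]


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
hook.get_blobs_list.return_value = ["data/a.csv", "data/b.json", "data/c.csv"]

csv_blobs = list_blobs_with_suffix(hook, "my-container", "data/", ".csv")
print(csv_blobs)
```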
| |
| |
| |
| |
| .. method:: load_file(self, file_path: str, container_name: str, blob_name: str, **kwargs) |
| |
| Upload a file to Azure Blob Storage. |
| |
| :param file_path: Path to the file to load. |
| :type file_path: str |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param kwargs: Optional keyword arguments that ``BlobClient.upload_blob()`` takes. |
| :type kwargs: object |
| |
| |
| |
| |
| .. method:: load_string(self, string_data: str, container_name: str, blob_name: str, **kwargs) |
| |
| Upload a string to Azure Blob Storage. |
| |
| :param string_data: String to load. |
| :type string_data: str |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param kwargs: Optional keyword arguments that ``BlobClient.upload_blob()`` takes. |
| :type kwargs: object |
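A common pattern is to upload a string only when the target blob is missing. The sketch below combines ``check_for_blob`` and ``load_string`` using the signatures documented above. The helper ``upload_text_if_absent`` and the container and blob names are hypothetical, and a ``Mock`` replaces the hook so the example runs without Azure credentials.

```python
from unittest.mock import Mock


def upload_text_if_absent(hook, container_name: str, blob_name: str, text: str) -> bool:
    """Upload ``text`` unless the blob already exists; return True if uploaded."""
    if hook.check_for_blob(container_name, blob_name):
        return False
    hook.load_string(text, container_name, blob_name)
    return True


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
hook.check_for_blob.return_value = False  # pretend the blob does not exist yet

uploaded = upload_text_if_absent(hook, "my-container", "reports/today.txt", "hello")
```

Note that the existence check and the upload are two separate calls, so this sketch does not guard against another writer creating the blob in between.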
| |
| |
| |
| |
| .. method:: get_file(self, file_path: str, container_name: str, blob_name: str, **kwargs) |
| |
| Download a file from Azure Blob Storage. |
| |
| :param file_path: Path to the file to download. |
| :type file_path: str |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param kwargs: Optional keyword arguments that ``BlobClient.download_blob()`` takes. |
| :type kwargs: object |
| |
| |
| |
| |
| .. method:: read_file(self, container_name: str, blob_name: str, **kwargs) |
| |
| Read a file from Azure Blob Storage and return as a string. |
| |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param kwargs: Optional keyword arguments that ``BlobClient.download_blob()`` takes. |
| :type kwargs: object |
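Because ``read_file`` returns the blob content as a string, it pairs naturally with a parser. The sketch below decodes a JSON blob. The helper ``read_json_blob`` and the blob name are hypothetical, and a ``Mock`` stands in for the hook so the example runs without Azure credentials.

```python
import json
from unittest.mock import Mock


def read_json_blob(hook, container_name: str, blob_name: str):
    """Read a blob as text and parse it as JSON (hypothetical helper)."""
    return json.loads(hook.read_file(container_name, blob_name))


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
hook.read_file.return_value = '{"rows": 3}'

config = read_json_blob(hook, "my-container", "config.json")
print(config["rows"])
```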
| |
| |
| |
| |
| .. method:: upload(self, container_name, blob_name, data, blob_type: str = 'BlockBlob', length: Optional[int] = None, **kwargs) |
| |
| Creates a new blob from a data source with automatic chunking. |
| |
| :param container_name: The name of the container to upload data to |
| :type container_name: str |
| :param blob_name: The name of the blob to upload. This need not exist in the container |
| :type blob_name: str |
| :param data: The blob data to upload |
| :param blob_type: The type of the blob. This can be either ``BlockBlob``, |
| ``PageBlob`` or ``AppendBlob``. The default value is ``BlockBlob``. |
| :type blob_type: storage.BlobType |
| :param length: Number of bytes to read from the stream. This is optional, |
| but should be supplied for optimal performance. |
| :type length: int |
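The ``length`` parameter is optional but recommended. A minimal sketch of passing it for an in-memory payload follows. The helper ``upload_bytes`` and the names used are hypothetical, and a ``Mock`` stands in for the hook so the example runs without Azure credentials.

```python
from unittest.mock import Mock


def upload_bytes(hook, container_name: str, blob_name: str, payload: bytes) -> None:
    """Upload an in-memory payload as a block blob, supplying its length."""
    hook.upload(
        container_name,
        blob_name,
        payload,
        blob_type="BlockBlob",  # the documented default, spelled out for clarity
        length=len(payload),    # known up front for in-memory data
    )


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
payload = b"col_a,col_b\n1,2\n"
upload_bytes(hook, "my-container", "data/sample.csv", payload)
```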
| |
| |
| |
| |
| .. method:: download(self, container_name, blob_name, offset: Optional[int] = None, length: Optional[int] = None, **kwargs) |
| |
| Downloads a blob and returns it as a ``StorageStreamDownloader``. |
| |
| :param container_name: The name of the container containing the blob |
| :type container_name: str |
| :param blob_name: The name of the blob to download |
| :type blob_name: str |
| :param offset: Start of byte range to use for downloading a section of the blob. |
| Must be set if length is provided. |
| :type offset: int |
| :param length: Number of bytes to read from the stream. |
| :type length: int |
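The rule that ``offset`` must be set whenever ``length`` is provided can be enforced before calling ``download``. The sketch below does so and then reads the bytes with ``readall()``, which is assumed to be available on the returned ``StorageStreamDownloader`` (as in azure-storage-blob v12). The helper ``download_range`` is hypothetical, and a ``Mock`` stands in for the hook so the example runs without Azure credentials.

```python
from typing import Optional
from unittest.mock import Mock


def download_range(
    hook,
    container_name: str,
    blob_name: str,
    offset: Optional[int] = None,
    length: Optional[int] = None,
) -> bytes:
    """Download all or part of a blob and return its bytes (hypothetical helper)."""
    if length is not None and offset is None:
        raise ValueError("offset must be set when length is provided")
    downloader = hook.download(container_name, blob_name, offset=offset, length=length)
    # Assumption: the downloader exposes readall(), as StorageStreamDownloader does.
    return downloader.readall()


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
hook.download.return_value.readall.return_value = b"hello"

chunk = download_range(hook, "my-container", "logs/app.log", offset=0, length=5)
```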
| |
| |
| |
| |
| .. method:: create_container(self, container_name: str) |
| |
| Create the container if it does not already exist. |
| |
| :param container_name: The name of the container to create |
| :type container_name: str |
| |
| |
| |
| |
| .. method:: delete_container(self, container_name: str) |
| |
| Delete a container object |
| |
| :param container_name: The name of the container |
| :type container_name: str |
| |
| |
| |
| |
| .. method:: delete_blobs(self, container_name: str, *blobs, **kwargs) |
| |
| Marks the specified blobs or snapshots for deletion. |
| |
| :param container_name: The name of the container containing the blobs |
| :type container_name: str |
| :param blobs: The blobs to delete. This can be a single blob, or multiple values |
| can be supplied, where each value is either the name of the blob (str) or BlobProperties. |
| :type blobs: Union[str, BlobProperties] |
| |
| |
| |
| |
| .. method:: delete_file(self, container_name: str, blob_name: str, is_prefix: bool = False, ignore_if_missing: bool = False, **kwargs) |
| |
| Delete a file from Azure Blob Storage. |
| |
| :param container_name: Name of the container. |
| :type container_name: str |
| :param blob_name: Name of the blob. |
| :type blob_name: str |
| :param is_prefix: If blob_name is a prefix, delete all matching files |
| :type is_prefix: bool |
| :param ignore_if_missing: If True, return success even if the |
| blob does not exist. |
| :type ignore_if_missing: bool |
| :param kwargs: Optional keyword arguments that ``ContainerClient.delete_blobs()`` takes. |
| :type kwargs: object |
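To illustrate how ``is_prefix`` and ``ignore_if_missing`` combine, the sketch below removes every blob under a prefix and tolerates the case where nothing matches. The helper ``delete_prefix`` and the names used are hypothetical, and a ``Mock`` stands in for the hook so the example runs without Azure credentials.

```python
from unittest.mock import Mock


def delete_prefix(hook, container_name: str, prefix: str) -> None:
    """Delete all blobs whose names start with ``prefix`` (hypothetical helper)."""
    hook.delete_file(
        container_name,
        prefix,
        is_prefix=True,          # treat blob_name as a prefix
        ignore_if_missing=True,  # succeed even when no blob matches
    )


# Stand-in for a WasbHook instance; in a DAG one would pass
# WasbHook(wasb_conn_id="wasb_default") instead.
hook = Mock()
delete_prefix(hook, "my-container", "tmp/")
```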
| |
| |
| |
| |