| datafusion.context |
| ================== |
| |
| .. py:module:: datafusion.context |
| |
| .. autoapi-nested-parse:: |
| |
Session Context and its associated configuration.
| |
| |
| |
| Classes |
| ------- |
| |
| .. autoapisummary:: |
| |
| datafusion.context.ArrowArrayExportable |
| datafusion.context.ArrowStreamExportable |
| datafusion.context.CatalogProviderExportable |
| datafusion.context.RuntimeConfig |
| datafusion.context.RuntimeEnvBuilder |
| datafusion.context.SQLOptions |
| datafusion.context.SessionConfig |
| datafusion.context.SessionContext |
| datafusion.context.TableProviderExportable |
| |
| |
| Module Contents |
| --------------- |
| |
| .. py:class:: ArrowArrayExportable |
| |
| Bases: :py:obj:`Protocol` |
| |
| |
| Type hint for object exporting Arrow C Array via Arrow PyCapsule Interface. |
| |
| https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html |
| |
| |
| .. py:method:: __arrow_c_array__(requested_schema: object | None = None) -> tuple[object, object] |
| |
| |
| .. py:class:: ArrowStreamExportable |
| |
| Bases: :py:obj:`Protocol` |
| |
| |
| Type hint for object exporting Arrow C Stream via Arrow PyCapsule Interface. |
| |
| https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html |
| |
| |
| .. py:method:: __arrow_c_stream__(requested_schema: object | None = None) -> object |
| |
| |
| .. py:class:: CatalogProviderExportable |
| |
| Bases: :py:obj:`Protocol` |
| |
| |
Type hint for object that has a ``__datafusion_catalog_provider__`` PyCapsule.
| |
| https://docs.rs/datafusion/latest/datafusion/catalog/trait.CatalogProvider.html |
| |
| |
| .. py:method:: __datafusion_catalog_provider__() -> object |
| |
| |
| .. py:class:: RuntimeConfig |
| |
| Bases: :py:obj:`RuntimeEnvBuilder` |
| |
| |
See :py:class:`RuntimeEnvBuilder`.
| |
| Create a new :py:class:`RuntimeEnvBuilder` with default values. |
| |
| |
| .. py:class:: RuntimeEnvBuilder |
| |
| Runtime configuration options. |
| |
| Create a new :py:class:`RuntimeEnvBuilder` with default values. |
| |
| |
| .. py:method:: with_disk_manager_disabled() -> RuntimeEnvBuilder |
| |
Disable the disk manager; attempts to create temporary files will error.
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_disk_manager_os() -> RuntimeEnvBuilder |
| |
| Use the operating system's temporary directory for disk manager. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_disk_manager_specified(*paths: str | pathlib.Path) -> RuntimeEnvBuilder |
| |
| Use the specified paths for the disk manager's temporary files. |
| |
| :param paths: Paths to use for the disk manager's temporary files. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
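
Example usage (a minimal sketch; the directory paths are illustrative)::

    runtime = RuntimeEnvBuilder().with_disk_manager_specified(
        "/tmp/scratch1", "/tmp/scratch2"
    )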
| |
| |
| |
| .. py:method:: with_fair_spill_pool(size: int) -> RuntimeEnvBuilder |
| |
| Use a fair spill pool with the specified size. |
| |
| This pool works best when you know beforehand the query has multiple spillable |
| operators that will likely all need to spill. Sometimes it will cause spills |
| even when there was sufficient memory (reserved for other operators) to avoid |
| doing so:: |
| |
| ┌───────────────────────z──────────────────────z───────────────┐ |
| │ z z │ |
| │ z z │ |
| │ Spillable z Unspillable z Free │ |
| │ Memory z Memory z Memory │ |
| │ z z │ |
| │ z z │ |
| └───────────────────────z──────────────────────z───────────────┘ |
| |
| :param size: Size of the memory pool in bytes. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
Example usage::
| |
| config = RuntimeEnvBuilder().with_fair_spill_pool(1024) |
| |
| |
| |
| .. py:method:: with_greedy_memory_pool(size: int) -> RuntimeEnvBuilder |
| |
| Use a greedy memory pool with the specified size. |
| |
| This pool works well for queries that do not need to spill or have a single |
| spillable operator. See :py:func:`with_fair_spill_pool` if there are |
| multiple spillable operators that all will spill. |
| |
| :param size: Size of the memory pool in bytes. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
| Example usage:: |
| |
| config = RuntimeEnvBuilder().with_greedy_memory_pool(1024) |
| |
| |
| |
| .. py:method:: with_temp_file_path(path: str | pathlib.Path) -> RuntimeEnvBuilder |
| |
| Use the specified path to create any needed temporary files. |
| |
| :param path: Path to use for temporary files. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
| Example usage:: |
| |
| config = RuntimeEnvBuilder().with_temp_file_path("/tmp") |
| |
| |
| |
| .. py:method:: with_unbounded_memory_pool() -> RuntimeEnvBuilder |
| |
| Use an unbounded memory pool. |
| |
| :returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting. |
| |
| |
| |
| .. py:attribute:: config_internal |
| |
| |
| .. py:class:: SQLOptions |
| |
| Options to be used when performing SQL queries. |
| |
| Create a new :py:class:`SQLOptions` with default values. |
| |
| The default values are: |
| - DDL commands are allowed |
| - DML commands are allowed |
| - Statements are allowed |
| |
| |
| .. py:method:: with_allow_ddl(allow: bool = True) -> SQLOptions |
| |
| Should DDL (Data Definition Language) commands be run? |
| |
| Examples of DDL commands include ``CREATE TABLE`` and ``DROP TABLE``. |
| |
| :param allow: Allow DDL commands to be run. |
| |
| :returns: A new :py:class:`SQLOptions` object with the updated setting. |
| |
| Example usage:: |
| |
| options = SQLOptions().with_allow_ddl(True) |
| |
| |
| |
| .. py:method:: with_allow_dml(allow: bool = True) -> SQLOptions |
| |
| Should DML (Data Manipulation Language) commands be run? |
| |
| Examples of DML commands include ``INSERT INTO`` and ``DELETE``. |
| |
| :param allow: Allow DML commands to be run. |
| |
| :returns: A new :py:class:`SQLOptions` object with the updated setting. |
| |
| Example usage:: |
| |
| options = SQLOptions().with_allow_dml(True) |
| |
| |
| |
| .. py:method:: with_allow_statements(allow: bool = True) -> SQLOptions |
| |
| Should statements such as ``SET VARIABLE`` and ``BEGIN TRANSACTION`` be run? |
| |
| :param allow: Allow statements to be run. |
| |
:returns: A new :py:class:`SQLOptions` object with the updated setting.
| |
| Example usage:: |
| |
| options = SQLOptions().with_allow_statements(True) |
| |
| |
| |
| .. py:attribute:: options_internal |
| |
| |
| .. py:class:: SessionConfig(config_options: dict[str, str] | None = None) |
| |
| Session configuration options. |
| |
| Create a new :py:class:`SessionConfig` with the given configuration options. |
| |
| :param config_options: Configuration options. |
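
Example usage (a sketch; the configuration key and values shown are illustrative)::

    config = (
        SessionConfig({"datafusion.execution.parquet.pushdown_filters": "true"})
        .with_batch_size(8192)
        .with_target_partitions(8)
    )
    ctx = SessionContext(config)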
| |
| |
| .. py:method:: set(key: str, value: str) -> SessionConfig |
| |
| Set a configuration option. |
| |
:param key: Option key.
:param value: Option value.
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
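
Example usage (a sketch; the configuration key is illustrative)::

    config = SessionConfig().set("datafusion.execution.batch_size", "4096")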
| |
| |
| |
| .. py:method:: with_batch_size(batch_size: int) -> SessionConfig |
| |
| Customize batch size. |
| |
| :param batch_size: Batch size. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_create_default_catalog_and_schema(enabled: bool = True) -> SessionConfig |
| |
Control whether the default catalog and schema are automatically created.
| |
| :param enabled: Whether the default catalog and schema will be |
| automatically created. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_default_catalog_and_schema(catalog: str, schema: str) -> SessionConfig |
| |
| Select a name for the default catalog and schema. |
| |
| :param catalog: Catalog name. |
| :param schema: Schema name. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_information_schema(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the inclusion of ``information_schema`` virtual tables. |
| |
| :param enabled: Whether to include ``information_schema`` virtual tables. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_parquet_pruning(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the use of pruning predicate for parquet readers. |
| |
| Pruning predicates will enable the reader to skip row groups. |
| |
| :param enabled: Whether to use pruning predicate for parquet readers. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_aggregations(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the use of repartitioning for aggregations. |
| |
| Enabling this improves parallelism. |
| |
| :param enabled: Whether to use repartitioning for aggregations. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_file_min_size(size: int) -> SessionConfig |
| |
| Set minimum file range size for repartitioning scans. |
| |
| :param size: Minimum file range size. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_file_scans(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the use of repartitioning for file scans. |
| |
| :param enabled: Whether to use repartitioning for file scans. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_joins(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the use of repartitioning for joins to improve parallelism. |
| |
| :param enabled: Whether to use repartitioning for joins. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_sorts(enabled: bool = True) -> SessionConfig |
| |
Enable or disable the use of repartitioning for sorts.
| |
| This may improve parallelism. |
| |
:param enabled: Whether to use repartitioning for sorts.
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_repartition_windows(enabled: bool = True) -> SessionConfig |
| |
| Enable or disable the use of repartitioning for window functions. |
| |
| This may improve parallelism. |
| |
| :param enabled: Whether to use repartitioning for window functions. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:method:: with_target_partitions(target_partitions: int) -> SessionConfig |
| |
| Customize the number of target partitions for query execution. |
| |
| Increasing partitions can increase concurrency. |
| |
| :param target_partitions: Number of target partitions. |
| |
| :returns: A new :py:class:`SessionConfig` object with the updated setting. |
| |
| |
| |
| .. py:attribute:: config_internal |
| |
| |
| .. py:class:: SessionContext(config: SessionConfig | None = None, runtime: RuntimeEnvBuilder | None = None) |
| |
| This is the main interface for executing queries and creating DataFrames. |
| |
| See :ref:`user_guide_concepts` in the online documentation for more information. |
| |
| Main interface for executing queries with DataFusion. |
| |
Maintains the state of the connection between a user and an instance
of the DataFusion engine.
| |
| :param config: Session configuration options. |
| :param runtime: Runtime configuration options. |
| |
| Example usage: |
| |
| The following example demonstrates how to use the context to execute |
| a query against a CSV data source using the :py:class:`DataFrame` API:: |
| |
| from datafusion import SessionContext |
| |
| ctx = SessionContext() |
| df = ctx.read_csv("data.csv") |
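
A further sketch showing explicit configuration (the option values and memory
pool size are illustrative)::

    from datafusion.context import RuntimeEnvBuilder, SessionConfig, SessionContext

    config = SessionConfig().with_target_partitions(8)
    runtime = RuntimeEnvBuilder().with_greedy_memory_pool(1024 * 1024 * 1024)
    ctx = SessionContext(config, runtime)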
| |
| |
| .. py:method:: __repr__() -> str |
| |
| Print a string representation of the Session Context. |
| |
| |
| |
| .. py:method:: _convert_file_sort_order(file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None) -> list[list[datafusion._internal.expr.SortExpr]] | None |
| :staticmethod: |
| |
| |
| Convert nested ``SortKey`` sequences into raw sort expressions. |
| |
| Each ``SortKey`` can be a column name string, an ``Expr``, or a |
| ``SortExpr`` and will be converted using |
| :func:`datafusion.expr.sort_list_to_raw_sort_list`. |
| |
| |
| |
| .. py:method:: _convert_table_partition_cols(table_partition_cols: list[tuple[str, str | pyarrow.DataType]]) -> list[tuple[str, pyarrow.DataType]] |
| :staticmethod: |
| |
| |
| |
| .. py:method:: catalog(name: str = 'datafusion') -> datafusion.catalog.Catalog |
| |
| Retrieve a catalog by name. |
| |
| |
| |
| .. py:method:: catalog_names() -> set[str] |
| |
Return the set of catalog names registered in this context.
| |
| |
| |
| .. py:method:: create_dataframe(partitions: list[list[pyarrow.RecordBatch]], name: str | None = None, schema: pyarrow.Schema | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create and return a dataframe using the provided partitions. |
| |
| :param partitions: :py:class:`pa.RecordBatch` partitions to register. |
| :param name: Resultant dataframe name. |
| :param schema: Schema for the partitions. |
| |
:returns: DataFrame representation of the provided partitions.
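
Example usage (a minimal sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`)::

    import pyarrow as pa

    batch = pa.RecordBatch.from_arrays([pa.array([1, 2, 3])], names=["a"])
    df = ctx.create_dataframe([[batch]], name="t")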
| |
| |
| |
| .. py:method:: create_dataframe_from_logical_plan(plan: datafusion.plan.LogicalPlan) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from an existing plan. |
| |
| :param plan: Logical plan. |
| |
| :returns: DataFrame representation of the logical plan. |
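
A sketch, assuming ``ctx`` is an existing :py:class:`SessionContext` and that
``logical_plan()`` is available on the intermediate DataFrame::

    plan = ctx.sql("SELECT 1 AS one").logical_plan()
    df = ctx.create_dataframe_from_logical_plan(plan)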
| |
| |
| |
| .. py:method:: deregister_table(name: str) -> None |
| |
| Remove a table from the session. |
| |
| |
| |
| .. py:method:: empty_table() -> datafusion.dataframe.DataFrame |
| |
| Create an empty :py:class:`~datafusion.dataframe.DataFrame`. |
| |
| |
| |
| .. py:method:: enable_url_table() -> SessionContext |
| |
Control whether local files can be queried as tables.
| |
| :returns: A new :py:class:`SessionContext` object with url table enabled. |
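
Example usage (a sketch; ``data.csv`` is an illustrative local file)::

    ctx = SessionContext().enable_url_table()
    df = ctx.sql("SELECT * FROM 'data.csv'")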
| |
| |
| |
| .. py:method:: execute(plan: datafusion.plan.ExecutionPlan, partitions: int) -> datafusion.record_batch.RecordBatchStream |
| |
| Execute the ``plan`` and return the results. |
| |
| |
| |
| .. py:method:: from_arrow(data: ArrowStreamExportable | ArrowArrayExportable, name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow source. |
| |
| The Arrow data source can be any object that implements either |
| ``__arrow_c_stream__`` or ``__arrow_c_array__``. For the latter, it must return |
| a struct array. |
| |
The Arrow data source can be, for example, a Polars DataFrame, a pandas
DataFrame, or a PyArrow Table.
| |
| :param data: Arrow data source. |
| :param name: Name of the DataFrame. |
| |
| :returns: DataFrame representation of the Arrow table. |
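
Example usage (a minimal sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`; any object implementing the Arrow PyCapsule
Interface works the same way)::

    import pyarrow as pa

    table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    df = ctx.from_arrow(table, name="t")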
| |
| |
| |
| .. py:method:: from_arrow_table(data: pyarrow.Table, name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow table. |
| |
| This is an alias for :py:func:`from_arrow`. |
| |
| |
| |
| .. py:method:: from_pandas(data: pandas.DataFrame, name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from a Pandas DataFrame. |
| |
| :param data: Pandas DataFrame. |
| :param name: Name of the DataFrame. |
| |
| :returns: DataFrame representation of the Pandas DataFrame. |
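
Example usage (a sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext` and pandas is installed)::

    import pandas as pd

    pandas_df = pd.DataFrame({"a": [1, 2, 3]})
    df = ctx.from_pandas(pandas_df, name="t")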
| |
| |
| |
| .. py:method:: from_polars(data: polars.DataFrame, name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from a Polars DataFrame. |
| |
| :param data: Polars DataFrame. |
| :param name: Name of the DataFrame. |
| |
| :returns: DataFrame representation of the Polars DataFrame. |
| |
| |
| |
| .. py:method:: from_pydict(data: dict[str, list[Any]], name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from a dictionary. |
| |
| :param data: Dictionary of lists. |
| :param name: Name of the DataFrame. |
| |
| :returns: DataFrame representation of the dictionary of lists. |
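
Example usage (a minimal sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`)::

    df = ctx.from_pydict({"a": [1, 2, 3], "b": ["x", "y", "z"]}, name="t")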
| |
| |
| |
| .. py:method:: from_pylist(data: list[dict[str, Any]], name: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from a list. |
| |
| :param data: List of dictionaries. |
| :param name: Name of the DataFrame. |
| |
| :returns: DataFrame representation of the list of dictionaries. |
| |
| |
| |
| .. py:method:: global_ctx() -> SessionContext |
| :classmethod: |
| |
| |
Retrieve the global context as a :py:class:`SessionContext` wrapper.

:returns: A :py:class:`SessionContext` object that wraps the global ``SessionContextInternal``.
| |
| |
| |
| .. py:method:: read_avro(path: str | pathlib.Path, schema: pyarrow.Schema | None = None, file_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_extension: str = '.avro') -> datafusion.dataframe.DataFrame |
| |
Create a :py:class:`~datafusion.dataframe.DataFrame` for reading an Avro data source.
| |
| :param path: Path to the Avro file. |
| :param schema: The data source schema. |
| :param file_partition_cols: Partition columns. |
| :param file_extension: File extension to select. |
| |
:returns: DataFrame representation of the read Avro file.
| |
| |
| |
| .. py:method:: read_csv(path: str | pathlib.Path | list[str] | list[pathlib.Path], schema: pyarrow.Schema | None = None, has_header: bool = True, delimiter: str = ',', schema_infer_max_records: int = 1000, file_extension: str = '.csv', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Read a CSV data source. |
| |
| :param path: Path to the CSV file |
| :param schema: An optional schema representing the CSV files. If None, the |
| CSV reader will try to infer it based on data in file. |
:param has_header: Whether the CSV file has a header. If schema inference
| is run on a file with no headers, default column names are |
| created. |
| :param delimiter: An optional column delimiter. |
| :param schema_infer_max_records: Maximum number of rows to read from CSV |
| files for schema inference if needed. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param table_partition_cols: Partition columns. |
| :param file_compression_type: File compression type. |
| |
:returns: DataFrame representation of the read CSV files.
| |
| |
| |
| .. py:method:: read_json(path: str | pathlib.Path, schema: pyarrow.Schema | None = None, schema_infer_max_records: int = 1000, file_extension: str = '.json', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> datafusion.dataframe.DataFrame |
| |
| Read a line-delimited JSON data source. |
| |
| :param path: Path to the JSON file. |
| :param schema: The data source schema. |
| :param schema_infer_max_records: Maximum number of rows to read from JSON |
| files for schema inference if needed. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param table_partition_cols: Partition columns. |
| :param file_compression_type: File compression type. |
| |
| :returns: DataFrame representation of the read JSON files. |
| |
| |
| |
| .. py:method:: read_parquet(path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, parquet_pruning: bool = True, file_extension: str = '.parquet', skip_metadata: bool = True, schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> datafusion.dataframe.DataFrame |
| |
Read a Parquet source into a :py:class:`~datafusion.dataframe.DataFrame`.
| |
| :param path: Path to the Parquet file. |
| :param table_partition_cols: Partition columns. |
| :param parquet_pruning: Whether the parquet reader should use the predicate |
| to prune row groups. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param skip_metadata: Whether the parquet reader should skip any metadata |
| that may be in the file schema. This can help avoid schema |
| conflicts due to metadata. |
| :param schema: An optional schema representing the parquet files. If None, |
| the parquet reader will try to infer it based on data in the |
| file. |
| :param file_sort_order: Sort order for the file. Each sort key can be |
| specified as a column name (``str``), an expression |
| (``Expr``), or a ``SortExpr``. |
| |
:returns: DataFrame representation of the read Parquet files.
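
Example usage (a sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`; the file paths and sort column are illustrative)::

    df = ctx.read_parquet("data.parquet")
    sorted_df = ctx.read_parquet("sorted.parquet", file_sort_order=[["a"]])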
| |
| |
| |
| .. py:method:: read_table(table: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> datafusion.dataframe.DataFrame |
| |
| Creates a :py:class:`~datafusion.dataframe.DataFrame` from a table. |
| |
| |
| |
| .. py:method:: register_avro(name: str, path: str | pathlib.Path, schema: pyarrow.Schema | None = None, file_extension: str = '.avro', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None) -> None |
| |
| Register an Avro file as a table. |
| |
The registered table can be referenced from SQL statements executed against
| this context. |
| |
| :param name: Name of the table to register. |
| :param path: Path to the Avro file. |
| :param schema: The data source schema. |
| :param file_extension: File extension to select. |
| :param table_partition_cols: Partition columns. |
| |
| |
| |
| .. py:method:: register_catalog_provider(name: str, provider: CatalogProviderExportable | datafusion.catalog.CatalogProvider | datafusion.catalog.Catalog) -> None |
| |
| Register a catalog provider. |
| |
| |
| |
| .. py:method:: register_csv(name: str, path: str | pathlib.Path | list[str | pathlib.Path], schema: pyarrow.Schema | None = None, has_header: bool = True, delimiter: str = ',', schema_infer_max_records: int = 1000, file_extension: str = '.csv', file_compression_type: str | None = None) -> None |
| |
| Register a CSV file as a table. |
| |
The registered table can be referenced from SQL statements executed against
this context.
| |
| :param name: Name of the table to register. |
| :param path: Path to the CSV file. It also accepts a list of Paths. |
| :param schema: An optional schema representing the CSV file. If None, the |
| CSV reader will try to infer it based on data in file. |
:param has_header: Whether the CSV file has a header. If schema inference
| is run on a file with no headers, default column names are |
| created. |
| :param delimiter: An optional column delimiter. |
| :param schema_infer_max_records: Maximum number of rows to read from CSV |
| files for schema inference if needed. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param file_compression_type: File compression type. |
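
Example usage (a sketch; the table name and file path are illustrative)::

    ctx.register_csv("sales", "sales.csv")
    df = ctx.sql("SELECT COUNT(*) FROM sales")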
| |
| |
| |
| .. py:method:: register_dataset(name: str, dataset: pyarrow.dataset.Dataset) -> None |
| |
| Register a :py:class:`pa.dataset.Dataset` as a table. |
| |
| :param name: Name of the table to register. |
| :param dataset: PyArrow dataset. |
| |
| |
| |
| .. py:method:: register_json(name: str, path: str | pathlib.Path, schema: pyarrow.Schema | None = None, schema_infer_max_records: int = 1000, file_extension: str = '.json', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> None |
| |
| Register a JSON file as a table. |
| |
The registered table can be referenced from SQL statements executed
| against this context. |
| |
| :param name: Name of the table to register. |
| :param path: Path to the JSON file. |
| :param schema: The data source schema. |
| :param schema_infer_max_records: Maximum number of rows to read from JSON |
| files for schema inference if needed. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param table_partition_cols: Partition columns. |
| :param file_compression_type: File compression type. |
| |
| |
| |
| .. py:method:: register_listing_table(name: str, path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_extension: str = '.parquet', schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> None |
| |
| Register multiple files as a single table. |
| |
| Registers a :py:class:`~datafusion.catalog.Table` that can assemble multiple |
| files from locations in an :py:class:`~datafusion.object_store.ObjectStore` |
| instance. |
| |
| :param name: Name of the resultant table. |
| :param path: Path to the file to register. |
| :param table_partition_cols: Partition columns. |
| :param file_extension: File extension of the provided table. |
| :param schema: The data source schema. |
| :param file_sort_order: Sort order for the file. Each sort key can be |
| specified as a column name (``str``), an expression |
| (``Expr``), or a ``SortExpr``. |
| |
| |
| |
| .. py:method:: register_object_store(schema: str, store: Any, host: str | None = None) -> None |
| |
| Add a new object store into the session. |
| |
| :param schema: The data source schema. |
| :param store: The :py:class:`~datafusion.object_store.ObjectStore` to register. |
| :param host: URL for the host. |
| |
| |
| |
| .. py:method:: register_parquet(name: str, path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, parquet_pruning: bool = True, file_extension: str = '.parquet', skip_metadata: bool = True, schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> None |
| |
| Register a Parquet file as a table. |
| |
The registered table can be referenced from SQL statements executed
| against this context. |
| |
| :param name: Name of the table to register. |
| :param path: Path to the Parquet file. |
| :param table_partition_cols: Partition columns. |
| :param parquet_pruning: Whether the parquet reader should use the |
| predicate to prune row groups. |
| :param file_extension: File extension; only files with this extension are |
| selected for data input. |
| :param skip_metadata: Whether the parquet reader should skip any metadata |
| that may be in the file schema. This can help avoid schema |
| conflicts due to metadata. |
| :param schema: The data source schema. |
| :param file_sort_order: Sort order for the file. Each sort key can be |
| specified as a column name (``str``), an expression |
| (``Expr``), or a ``SortExpr``. |
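
Example usage (a sketch; the table name, file path, and sort column are
illustrative)::

    ctx.register_parquet(
        "events",
        "events.parquet",
        file_sort_order=[["event_time"]],
    )
    df = ctx.sql("SELECT * FROM events ORDER BY event_time")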
| |
| |
| |
| .. py:method:: register_record_batches(name: str, partitions: list[list[pyarrow.RecordBatch]]) -> None |
| |
| Register record batches as a table. |
| |
| This function will convert the provided partitions into a table and |
| register it into the session using the given name. |
| |
| :param name: Name of the resultant table. |
| :param partitions: Record batches to register as a table. |
| |
| |
| |
| .. py:method:: register_table(name: str, table: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> None |
| |
| Register a :py:class:`~datafusion.Table` with this context. |
| |
| The registered table can be referenced from SQL statements executed against |
| this context. |
| |
| :param name: Name of the resultant table. |
| :param table: Any object that can be converted into a :class:`Table`. |
| |
| |
| |
| .. py:method:: register_table_provider(name: str, provider: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> None |
| |
| Register a table provider. |
| |
| Deprecated: use :meth:`register_table` instead. |
| |
| |
| |
| .. py:method:: register_udaf(udaf: datafusion.user_defined.AggregateUDF) -> None |
| |
| Register a user-defined aggregation function (UDAF) with the context. |
| |
| |
| |
| .. py:method:: register_udf(udf: datafusion.user_defined.ScalarUDF) -> None |
| |
| Register a user-defined function (UDF) with the context. |
| |
| |
| |
| .. py:method:: register_udtf(func: datafusion.user_defined.TableFunction) -> None |
| |
Register a user-defined table function (UDTF) with the context.
| |
| |
| |
| .. py:method:: register_udwf(udwf: datafusion.user_defined.WindowUDF) -> None |
| |
| Register a user-defined window function (UDWF) with the context. |
| |
| |
| |
| .. py:method:: register_view(name: str, df: datafusion.dataframe.DataFrame) -> None |
| |
| Register a :py:class:`~datafusion.dataframe.DataFrame` as a view. |
| |
| :param name: The name to register the view under. |
| :type name: str |
| :param df: The DataFrame to be converted into a view and registered. |
| :type df: DataFrame |
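
Example usage (a minimal sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`)::

    df = ctx.sql("SELECT 1 AS a")
    ctx.register_view("my_view", df)
    result = ctx.sql("SELECT a FROM my_view")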
| |
| |
| |
| .. py:method:: session_id() -> str |
| |
| Return an id that uniquely identifies this :py:class:`SessionContext`. |
| |
| |
| |
| .. py:method:: sql(query: str, options: SQLOptions | None = None) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.DataFrame` from SQL query text. |
| |
Note: This API implements DDL statements such as ``CREATE TABLE`` and
``CREATE VIEW`` and DML statements such as ``INSERT INTO`` with an
in-memory default implementation. See
:py:func:`~datafusion.context.SessionContext.sql_with_options`.
| |
| :param query: SQL query text. |
| :param options: If provided, the query will be validated against these options. |
| |
| :returns: DataFrame representation of the SQL query. |
| |
| |
| |
| .. py:method:: sql_with_options(query: str, options: SQLOptions) -> datafusion.dataframe.DataFrame |
| |
| Create a :py:class:`~datafusion.dataframe.DataFrame` from SQL query text. |
| |
| This function will first validate that the query is allowed by the |
| provided options. |
| |
| :param query: SQL query text. |
| :param options: SQL options. |
| |
| :returns: DataFrame representation of the SQL query. |
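
Example usage (a sketch, assuming ``ctx`` is an existing
:py:class:`SessionContext`; the query is illustrative and is expected to be
rejected because DDL is disallowed)::

    options = SQLOptions().with_allow_ddl(False)
    ctx.sql_with_options("CREATE TABLE t (a INT)", options)  # expected to raise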
| |
| |
| |
| .. py:method:: table(name: str) -> datafusion.dataframe.DataFrame |
| |
| Retrieve a previously registered table by name. |
| |
| |
| |
| .. py:method:: table_exist(name: str) -> bool |
| |
| Return whether a table with the given name exists. |
| |
| |
| |
| .. py:attribute:: ctx |
| |
| |
| .. py:class:: TableProviderExportable |
| |
| Bases: :py:obj:`Protocol` |
| |
| |
Type hint for object that has a ``__datafusion_table_provider__`` PyCapsule.
| |
| https://datafusion.apache.org/python/user-guide/io/table_provider.html |
| |
| |
| .. py:method:: __datafusion_table_provider__() -> object |
| |
| |