datafusion.context
==================
.. py:module:: datafusion.context
.. autoapi-nested-parse::
Session Context and its associated configuration.
Classes
-------
.. autoapisummary::
datafusion.context.ArrowArrayExportable
datafusion.context.ArrowStreamExportable
datafusion.context.CatalogProviderExportable
datafusion.context.RuntimeConfig
datafusion.context.RuntimeEnvBuilder
datafusion.context.SQLOptions
datafusion.context.SessionConfig
datafusion.context.SessionContext
datafusion.context.TableProviderExportable
Module Contents
---------------
.. py:class:: ArrowArrayExportable
Bases: :py:obj:`Protocol`
Type hint for object exporting Arrow C Array via Arrow PyCapsule Interface.
https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html
.. py:method:: __arrow_c_array__(requested_schema: object | None = None) -> tuple[object, object]
.. py:class:: ArrowStreamExportable
Bases: :py:obj:`Protocol`
Type hint for object exporting Arrow C Stream via Arrow PyCapsule Interface.
https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html
.. py:method:: __arrow_c_stream__(requested_schema: object | None = None) -> object
.. py:class:: CatalogProviderExportable
Bases: :py:obj:`Protocol`
Type hint for object that has ``__datafusion_catalog_provider__`` PyCapsule.
https://docs.rs/datafusion/latest/datafusion/catalog/trait.CatalogProvider.html
.. py:method:: __datafusion_catalog_provider__() -> object
.. py:class:: RuntimeConfig
Bases: :py:obj:`RuntimeEnvBuilder`
See :py:class:`RuntimeEnvBuilder`.
Create a new :py:class:`RuntimeEnvBuilder` with default values.
.. py:class:: RuntimeEnvBuilder
Runtime configuration options.
Create a new :py:class:`RuntimeEnvBuilder` with default values.
.. py:method:: with_disk_manager_disabled() -> RuntimeEnvBuilder
Disable the disk manager; attempts to create temporary files will error.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
.. py:method:: with_disk_manager_os() -> RuntimeEnvBuilder
Use the operating system's temporary directory for the disk manager.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
.. py:method:: with_disk_manager_specified(*paths: str | pathlib.Path) -> RuntimeEnvBuilder
Use the specified paths for the disk manager's temporary files.
:param paths: Paths to use for the disk manager's temporary files.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
.. py:method:: with_fair_spill_pool(size: int) -> RuntimeEnvBuilder
Use a fair spill pool with the specified size.
This pool works best when you know beforehand the query has multiple spillable
operators that will likely all need to spill. Sometimes it will cause spills
even when there was sufficient memory (reserved for other operators) to avoid
doing so::
┌───────────────────────z──────────────────────z───────────────┐
│ z z │
│ z z │
│ Spillable z Unspillable z Free │
│ Memory z Memory z Memory │
│ z z │
│ z z │
└───────────────────────z──────────────────────z───────────────┘
:param size: Size of the memory pool in bytes.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
Example usage::
config = RuntimeEnvBuilder().with_fair_spill_pool(1024)
.. py:method:: with_greedy_memory_pool(size: int) -> RuntimeEnvBuilder
Use a greedy memory pool with the specified size.
This pool works well for queries that do not need to spill or have a single
spillable operator. See :py:func:`with_fair_spill_pool` if there are
multiple spillable operators that all will spill.
:param size: Size of the memory pool in bytes.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
Example usage::
config = RuntimeEnvBuilder().with_greedy_memory_pool(1024)
.. py:method:: with_temp_file_path(path: str | pathlib.Path) -> RuntimeEnvBuilder
Use the specified path to create any needed temporary files.
:param path: Path to use for temporary files.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
Example usage::
config = RuntimeEnvBuilder().with_temp_file_path("/tmp")
.. py:method:: with_unbounded_memory_pool() -> RuntimeEnvBuilder
Use an unbounded memory pool.
:returns: A new :py:class:`RuntimeEnvBuilder` object with the updated setting.
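Example usage (a minimal sketch; the 2 GiB pool size and the spill path are illustrative)::
    from datafusion import RuntimeEnvBuilder, SessionContext

    # Cap query memory at 2 GiB and spill to a dedicated directory when exceeded
    runtime = (
        RuntimeEnvBuilder()
        .with_greedy_memory_pool(2 * 1024**3)
        .with_temp_file_path("/tmp/df-spill")
    )
    ctx = SessionContext(runtime=runtime)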
.. py:attribute:: config_internal
.. py:class:: SQLOptions
Options to be used when performing SQL queries.
Create a new :py:class:`SQLOptions` with default values.
The default values are:
- DDL commands are allowed
- DML commands are allowed
- Statements are allowed
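Example usage (a minimal sketch of a read-only configuration)::
    from datafusion import SQLOptions, SessionContext

    ctx = SessionContext()
    # Reject DDL, DML, and statements so only plain queries run
    options = (
        SQLOptions()
        .with_allow_ddl(False)
        .with_allow_dml(False)
        .with_allow_statements(False)
    )
    df = ctx.sql_with_options("SELECT 1 AS one", options)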
.. py:method:: with_allow_ddl(allow: bool = True) -> SQLOptions
Should DDL (Data Definition Language) commands be run?
Examples of DDL commands include ``CREATE TABLE`` and ``DROP TABLE``.
:param allow: Allow DDL commands to be run.
:returns: A new :py:class:`SQLOptions` object with the updated setting.
Example usage::
options = SQLOptions().with_allow_ddl(True)
.. py:method:: with_allow_dml(allow: bool = True) -> SQLOptions
Should DML (Data Manipulation Language) commands be run?
Examples of DML commands include ``INSERT INTO`` and ``DELETE``.
:param allow: Allow DML commands to be run.
:returns: A new :py:class:`SQLOptions` object with the updated setting.
Example usage::
options = SQLOptions().with_allow_dml(True)
.. py:method:: with_allow_statements(allow: bool = True) -> SQLOptions
Should statements such as ``SET VARIABLE`` and ``BEGIN TRANSACTION`` be run?
:param allow: Allow statements to be run.
:returns: A new :py:class:`SQLOptions` object with the updated setting.
Example usage::
options = SQLOptions().with_allow_statements(True)
.. py:attribute:: options_internal
.. py:class:: SessionConfig(config_options: dict[str, str] | None = None)
Session configuration options.
Create a new :py:class:`SessionConfig` with the given configuration options.
:param config_options: Configuration options.
.. py:method:: set(key: str, value: str) -> SessionConfig
Set a configuration option.
:param key: Option key.
:param value: Option value.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
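Example usage (a minimal sketch; the configuration key shown is illustrative)::
    from datafusion import SessionConfig

    config = SessionConfig().set("datafusion.execution.batch_size", "4096")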
.. py:method:: with_batch_size(batch_size: int) -> SessionConfig
Customize batch size.
:param batch_size: Batch size.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_create_default_catalog_and_schema(enabled: bool = True) -> SessionConfig
Control if the default catalog and schema will be automatically created.
:param enabled: Whether the default catalog and schema will be
automatically created.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_default_catalog_and_schema(catalog: str, schema: str) -> SessionConfig
Select a name for the default catalog and schema.
:param catalog: Catalog name.
:param schema: Schema name.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_information_schema(enabled: bool = True) -> SessionConfig
Enable or disable the inclusion of ``information_schema`` virtual tables.
:param enabled: Whether to include ``information_schema`` virtual tables.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_parquet_pruning(enabled: bool = True) -> SessionConfig
Enable or disable the use of pruning predicate for parquet readers.
Pruning predicates will enable the reader to skip row groups.
:param enabled: Whether to use pruning predicate for parquet readers.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_aggregations(enabled: bool = True) -> SessionConfig
Enable or disable the use of repartitioning for aggregations.
Enabling this improves parallelism.
:param enabled: Whether to use repartitioning for aggregations.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_file_min_size(size: int) -> SessionConfig
Set minimum file range size for repartitioning scans.
:param size: Minimum file range size.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_file_scans(enabled: bool = True) -> SessionConfig
Enable or disable the use of repartitioning for file scans.
:param enabled: Whether to use repartitioning for file scans.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_joins(enabled: bool = True) -> SessionConfig
Enable or disable the use of repartitioning for joins to improve parallelism.
:param enabled: Whether to use repartitioning for joins.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_sorts(enabled: bool = True) -> SessionConfig
Enable or disable the use of repartitioning for sorts.
This may improve parallelism.
:param enabled: Whether to use repartitioning for sorts.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_repartition_windows(enabled: bool = True) -> SessionConfig
Enable or disable the use of repartitioning for window functions.
This may improve parallelism.
:param enabled: Whether to use repartitioning for window functions.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
.. py:method:: with_target_partitions(target_partitions: int) -> SessionConfig
Customize the number of target partitions for query execution.
Increasing partitions can increase concurrency.
:param target_partitions: Number of target partitions.
:returns: A new :py:class:`SessionConfig` object with the updated setting.
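Example usage (a minimal sketch chaining several builder methods)::
    from datafusion import SessionConfig, SessionContext

    config = (
        SessionConfig()
        .with_target_partitions(8)
        .with_batch_size(4096)
        .with_information_schema(True)
    )
    ctx = SessionContext(config)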
.. py:attribute:: config_internal
.. py:class:: SessionContext(config: SessionConfig | None = None, runtime: RuntimeEnvBuilder | None = None)
This is the main interface for executing queries and creating DataFrames.
See :ref:`user_guide_concepts` in the online documentation for more information.
Main interface for executing queries with DataFusion.
Maintains the state of the connection between a user and an instance of the
DataFusion engine.
:param config: Session configuration options.
:param runtime: Runtime configuration options.
Example usage:
The following example demonstrates how to use the context to execute
a query against a CSV data source using the :py:class:`DataFrame` API::
from datafusion import SessionContext
ctx = SessionContext()
df = ctx.read_csv("data.csv")
.. py:method:: __repr__() -> str
Return a string representation of the Session Context.
.. py:method:: _convert_file_sort_order(file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None) -> list[list[datafusion._internal.expr.SortExpr]] | None
:staticmethod:
Convert nested ``SortKey`` sequences into raw sort expressions.
Each ``SortKey`` can be a column name string, an ``Expr``, or a
``SortExpr`` and will be converted using
:func:`datafusion.expr.sort_list_to_raw_sort_list`.
.. py:method:: _convert_table_partition_cols(table_partition_cols: list[tuple[str, str | pyarrow.DataType]]) -> list[tuple[str, pyarrow.DataType]]
:staticmethod:
.. py:method:: catalog(name: str = 'datafusion') -> datafusion.catalog.Catalog
Retrieve a catalog by name.
.. py:method:: catalog_names() -> set[str]
Return the set of catalog names registered in this context.
.. py:method:: create_dataframe(partitions: list[list[pyarrow.RecordBatch]], name: str | None = None, schema: pyarrow.Schema | None = None) -> datafusion.dataframe.DataFrame
Create and return a dataframe using the provided partitions.
:param partitions: :py:class:`pa.RecordBatch` partitions to register.
:param name: Resultant dataframe name.
:param schema: Schema for the partitions.
:returns: DataFrame representation of the partitioned record batches.
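Example usage (a minimal sketch)::
    import pyarrow as pa
    from datafusion import SessionContext

    ctx = SessionContext()
    batch = pa.RecordBatch.from_pydict({"a": [1, 2, 3]})
    df = ctx.create_dataframe([[batch]], name="t")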
.. py:method:: create_dataframe_from_logical_plan(plan: datafusion.plan.LogicalPlan) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from an existing plan.
:param plan: Logical plan.
:returns: DataFrame representation of the logical plan.
.. py:method:: deregister_table(name: str) -> None
Remove a table from the session.
.. py:method:: empty_table() -> datafusion.dataframe.DataFrame
Create an empty :py:class:`~datafusion.dataframe.DataFrame`.
.. py:method:: enable_url_table() -> SessionContext
Control whether local files can be queried as tables.
:returns: A new :py:class:`SessionContext` object with URL tables enabled.
.. py:method:: execute(plan: datafusion.plan.ExecutionPlan, partitions: int) -> datafusion.record_batch.RecordBatchStream
Execute the ``plan`` and return the results.
.. py:method:: from_arrow(data: ArrowStreamExportable | ArrowArrayExportable, name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow source.
The Arrow data source can be any object that implements either
``__arrow_c_stream__`` or ``__arrow_c_array__``. For the latter, the object
must export a struct array. Examples include PyArrow Tables, Pandas
DataFrames, and Polars DataFrames.
:param data: Arrow data source.
:param name: Name of the DataFrame.
:returns: DataFrame representation of the Arrow table.
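Example usage (a minimal sketch using a PyArrow table)::
    import pyarrow as pa
    from datafusion import SessionContext

    ctx = SessionContext()
    table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    df = ctx.from_arrow(table, name="arrow_t")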
.. py:method:: from_arrow_table(data: pyarrow.Table, name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from an Arrow table.
This is an alias for :py:func:`from_arrow`.
.. py:method:: from_pandas(data: pandas.DataFrame, name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from a Pandas DataFrame.
:param data: Pandas DataFrame.
:param name: Name of the DataFrame.
:returns: DataFrame representation of the Pandas DataFrame.
.. py:method:: from_polars(data: polars.DataFrame, name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from a Polars DataFrame.
:param data: Polars DataFrame.
:param name: Name of the DataFrame.
:returns: DataFrame representation of the Polars DataFrame.
.. py:method:: from_pydict(data: dict[str, list[Any]], name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from a dictionary.
:param data: Dictionary of lists.
:param name: Name of the DataFrame.
:returns: DataFrame representation of the dictionary of lists.
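Example usage (a minimal sketch)::
    from datafusion import SessionContext

    ctx = SessionContext()
    df = ctx.from_pydict({"a": [1, 2], "b": ["x", "y"]}, name="t")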
.. py:method:: from_pylist(data: list[dict[str, Any]], name: str | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from a list.
:param data: List of dictionaries.
:param name: Name of the DataFrame.
:returns: DataFrame representation of the list of dictionaries.
.. py:method:: global_ctx() -> SessionContext
:classmethod:
Retrieve the global context as a :py:class:`SessionContext` wrapper.
:returns: A :py:class:`SessionContext` object that wraps the global ``SessionContextInternal``.
.. py:method:: read_avro(path: str | pathlib.Path, schema: pyarrow.Schema | None = None, file_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_extension: str = '.avro') -> datafusion.dataframe.DataFrame
Create a :py:class:`DataFrame` for reading an Avro data source.
:param path: Path to the Avro file.
:param schema: The data source schema.
:param file_partition_cols: Partition columns.
:param file_extension: File extension to select.
:returns: DataFrame representation of the read Avro file.
.. py:method:: read_csv(path: str | pathlib.Path | list[str] | list[pathlib.Path], schema: pyarrow.Schema | None = None, has_header: bool = True, delimiter: str = ',', schema_infer_max_records: int = 1000, file_extension: str = '.csv', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> datafusion.dataframe.DataFrame
Read a CSV data source.
:param path: Path to the CSV file.
:param schema: An optional schema representing the CSV files. If None, the
CSV reader will try to infer it based on the data in the file.
:param has_header: Whether the CSV file has a header. If schema inference
is run on a file with no headers, default column names are
created.
:param delimiter: An optional column delimiter.
:param schema_infer_max_records: Maximum number of rows to read from CSV
files for schema inference if needed.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param table_partition_cols: Partition columns.
:param file_compression_type: File compression type.
:returns: DataFrame representation of the read CSV files.
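Example usage (a minimal sketch; ``data.csv`` is a hypothetical file)::
    from datafusion import SessionContext

    ctx = SessionContext()
    df = ctx.read_csv("data.csv", has_header=True, delimiter=",")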
.. py:method:: read_json(path: str | pathlib.Path, schema: pyarrow.Schema | None = None, schema_infer_max_records: int = 1000, file_extension: str = '.json', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> datafusion.dataframe.DataFrame
Read a line-delimited JSON data source.
:param path: Path to the JSON file.
:param schema: The data source schema.
:param schema_infer_max_records: Maximum number of rows to read from JSON
files for schema inference if needed.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param table_partition_cols: Partition columns.
:param file_compression_type: File compression type.
:returns: DataFrame representation of the read JSON files.
.. py:method:: read_parquet(path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, parquet_pruning: bool = True, file_extension: str = '.parquet', skip_metadata: bool = True, schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> datafusion.dataframe.DataFrame
Read a Parquet source into a :py:class:`~datafusion.dataframe.DataFrame`.
:param path: Path to the Parquet file.
:param table_partition_cols: Partition columns.
:param parquet_pruning: Whether the parquet reader should use the predicate
to prune row groups.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param skip_metadata: Whether the parquet reader should skip any metadata
that may be in the file schema. This can help avoid schema
conflicts due to metadata.
:param schema: An optional schema representing the parquet files. If None,
the parquet reader will try to infer it based on data in the
file.
:param file_sort_order: Sort order for the file. Each sort key can be
specified as a column name (``str``), an expression
(``Expr``), or a ``SortExpr``.
:returns: DataFrame representation of the read Parquet files.
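Example usage (a minimal sketch; ``events.parquet`` and the ``ts`` column are hypothetical)::
    from datafusion import SessionContext

    ctx = SessionContext()
    # Declare that each file is already sorted by "ts" so sorts can be elided
    df = ctx.read_parquet("events.parquet", file_sort_order=[["ts"]])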
.. py:method:: read_table(table: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from a table.
.. py:method:: register_avro(name: str, path: str | pathlib.Path, schema: pyarrow.Schema | None = None, file_extension: str = '.avro', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None) -> None
Register an Avro file as a table.
The registered table can be referenced from SQL statements executed against
this context.
:param name: Name of the table to register.
:param path: Path to the Avro file.
:param schema: The data source schema.
:param file_extension: File extension to select.
:param table_partition_cols: Partition columns.
.. py:method:: register_catalog_provider(name: str, provider: CatalogProviderExportable | datafusion.catalog.CatalogProvider | datafusion.catalog.Catalog) -> None
Register a catalog provider.
.. py:method:: register_csv(name: str, path: str | pathlib.Path | list[str | pathlib.Path], schema: pyarrow.Schema | None = None, has_header: bool = True, delimiter: str = ',', schema_infer_max_records: int = 1000, file_extension: str = '.csv', file_compression_type: str | None = None) -> None
Register a CSV file as a table.
The registered table can be referenced from SQL statements executed against
this context.
:param name: Name of the table to register.
:param path: Path to the CSV file. It also accepts a list of Paths.
:param schema: An optional schema representing the CSV file. If None, the
CSV reader will try to infer it based on the data in the file.
:param has_header: Whether the CSV file has a header. If schema inference
is run on a file with no headers, default column names are
created.
:param delimiter: An optional column delimiter.
:param schema_infer_max_records: Maximum number of rows to read from CSV
files for schema inference if needed.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param file_compression_type: File compression type.
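Example usage (a minimal sketch; ``sales.csv`` is a hypothetical file)::
    from datafusion import SessionContext

    ctx = SessionContext()
    ctx.register_csv("sales", "sales.csv")
    df = ctx.sql("SELECT COUNT(*) FROM sales")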
.. py:method:: register_dataset(name: str, dataset: pyarrow.dataset.Dataset) -> None
Register a :py:class:`pa.dataset.Dataset` as a table.
:param name: Name of the table to register.
:param dataset: PyArrow dataset.
.. py:method:: register_json(name: str, path: str | pathlib.Path, schema: pyarrow.Schema | None = None, schema_infer_max_records: int = 1000, file_extension: str = '.json', table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_compression_type: str | None = None) -> None
Register a JSON file as a table.
The registered table can be referenced from SQL statements executed
against this context.
:param name: Name of the table to register.
:param path: Path to the JSON file.
:param schema: The data source schema.
:param schema_infer_max_records: Maximum number of rows to read from JSON
files for schema inference if needed.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param table_partition_cols: Partition columns.
:param file_compression_type: File compression type.
.. py:method:: register_listing_table(name: str, path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, file_extension: str = '.parquet', schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> None
Register multiple files as a single table.
Registers a :py:class:`~datafusion.catalog.Table` that can assemble multiple
files from locations in an :py:class:`~datafusion.object_store.ObjectStore`
instance.
:param name: Name of the resultant table.
:param path: Path to the file to register.
:param table_partition_cols: Partition columns.
:param file_extension: File extension of the provided table.
:param schema: The data source schema.
:param file_sort_order: Sort order for the file. Each sort key can be
specified as a column name (``str``), an expression
(``Expr``), or a ``SortExpr``.
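Example usage (a minimal sketch; the directory layout and ``year`` partition column are hypothetical)::
    import pyarrow as pa
    from datafusion import SessionContext

    ctx = SessionContext()
    # Treat every .parquet file under the directory as one partitioned table
    ctx.register_listing_table(
        "events",
        "data/events/",
        table_partition_cols=[("year", pa.string())],
    )
    df = ctx.sql("SELECT year, COUNT(*) FROM events GROUP BY year")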
.. py:method:: register_object_store(schema: str, store: Any, host: str | None = None) -> None
Add a new object store into the session.
:param schema: URL scheme under which the store is registered (for example ``s3://``).
:param store: The :py:class:`~datafusion.object_store.ObjectStore` to register.
:param host: URL for the host.
.. py:method:: register_parquet(name: str, path: str | pathlib.Path, table_partition_cols: list[tuple[str, str | pyarrow.DataType]] | None = None, parquet_pruning: bool = True, file_extension: str = '.parquet', skip_metadata: bool = True, schema: pyarrow.Schema | None = None, file_sort_order: collections.abc.Sequence[collections.abc.Sequence[datafusion.expr.SortKey]] | None = None) -> None
Register a Parquet file as a table.
The registered table can be referenced from SQL statements executed
against this context.
:param name: Name of the table to register.
:param path: Path to the Parquet file.
:param table_partition_cols: Partition columns.
:param parquet_pruning: Whether the parquet reader should use the
predicate to prune row groups.
:param file_extension: File extension; only files with this extension are
selected for data input.
:param skip_metadata: Whether the parquet reader should skip any metadata
that may be in the file schema. This can help avoid schema
conflicts due to metadata.
:param schema: The data source schema.
:param file_sort_order: Sort order for the file. Each sort key can be
specified as a column name (``str``), an expression
(``Expr``), or a ``SortExpr``.
.. py:method:: register_record_batches(name: str, partitions: list[list[pyarrow.RecordBatch]]) -> None
Register record batches as a table.
This function will convert the provided partitions into a table and
register it into the session using the given name.
:param name: Name of the resultant table.
:param partitions: Record batches to register as a table.
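Example usage (a minimal sketch)::
    import pyarrow as pa
    from datafusion import SessionContext

    ctx = SessionContext()
    batch = pa.RecordBatch.from_pydict({"a": [1, 2, 3]})
    ctx.register_record_batches("t", [[batch]])
    df = ctx.sql("SELECT a FROM t")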
.. py:method:: register_table(name: str, table: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> None
Register a :py:class:`~datafusion.catalog.Table` with this context.
The registered table can be referenced from SQL statements executed against
this context.
:param name: Name of the resultant table.
:param table: Any object that can be converted into a :class:`Table`.
.. py:method:: register_table_provider(name: str, provider: datafusion.catalog.Table | TableProviderExportable | datafusion.dataframe.DataFrame | pyarrow.dataset.Dataset) -> None
Register a table provider.
Deprecated: use :meth:`register_table` instead.
.. py:method:: register_udaf(udaf: datafusion.user_defined.AggregateUDF) -> None
Register a user-defined aggregation function (UDAF) with the context.
.. py:method:: register_udf(udf: datafusion.user_defined.ScalarUDF) -> None
Register a user-defined function (UDF) with the context.
.. py:method:: register_udtf(func: datafusion.user_defined.TableFunction) -> None
Register a user-defined table function (UDTF) with the context.
.. py:method:: register_udwf(udwf: datafusion.user_defined.WindowUDF) -> None
Register a user-defined window function (UDWF) with the context.
.. py:method:: register_view(name: str, df: datafusion.dataframe.DataFrame) -> None
Register a :py:class:`~datafusion.dataframe.DataFrame` as a view.
:param name: The name to register the view under.
:type name: str
:param df: The DataFrame to be converted into a view and registered.
:type df: DataFrame
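Example usage (a minimal sketch)::
    from datafusion import SessionContext

    ctx = SessionContext()
    df = ctx.from_pydict({"a": [1, 2, 3]}, name="base")
    ctx.register_view("v", df)
    result = ctx.sql("SELECT a FROM v WHERE a > 1")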
.. py:method:: session_id() -> str
Return an id that uniquely identifies this :py:class:`SessionContext`.
.. py:method:: sql(query: str, options: SQLOptions | None = None) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.DataFrame` from SQL query text.
Note: This API implements DDL statements such as ``CREATE TABLE`` and
``CREATE VIEW`` and DML statements such as ``INSERT INTO`` with an in-memory
default implementation. See
:py:func:`~datafusion.context.SessionContext.sql_with_options`.
:param query: SQL query text.
:param options: If provided, the query will be validated against these options.
:returns: DataFrame representation of the SQL query.
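Example usage (a minimal sketch validating the query against read-only options)::
    from datafusion import SQLOptions, SessionContext

    ctx = SessionContext()
    options = SQLOptions().with_allow_ddl(False)
    df = ctx.sql("SELECT 1 AS one", options=options)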
.. py:method:: sql_with_options(query: str, options: SQLOptions) -> datafusion.dataframe.DataFrame
Create a :py:class:`~datafusion.dataframe.DataFrame` from SQL query text.
This function will first validate that the query is allowed by the
provided options.
:param query: SQL query text.
:param options: SQL options.
:returns: DataFrame representation of the SQL query.
.. py:method:: table(name: str) -> datafusion.dataframe.DataFrame
Retrieve a previously registered table by name.
.. py:method:: table_exist(name: str) -> bool
Return whether a table with the given name exists.
.. py:attribute:: ctx
.. py:class:: TableProviderExportable
Bases: :py:obj:`Protocol`
Type hint for object that has ``__datafusion_table_provider__`` PyCapsule.
https://datafusion.apache.org/python/user-guide/io/table_provider.html
.. py:method:: __datafusion_table_provider__() -> object