File: superset/common/query_object.py
The QueryObject class represents a single database query specification constructed on the client. It describes what data to fetch (columns, metrics, filters) without specifying how to fetch it. The datasource connector is responsible for translating the QueryObject into the appropriate query language (SQL, GraphQL, etc.).
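As a rough sketch of that division of labor (constructor arguments mirror the fields documented below; to_dict() is covered in the methods section):

from superset.common.query_object import QueryObject

# Describe *what* to fetch: dimensions, aggregations, limits.
query_object = QueryObject(
    columns=["country"],
    metrics=["count"],
    row_limit=100,
)

# A datasource connector consumes the plain-dict form of the spec
# and decides *how* to fetch it, e.g. by compiling it to SQL.
query_dict = query_object.to_dict()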
datasource
Type: BaseDatasource | None
Description: Reference to the datasource (dataset, table, or query) from which to fetch data. This is the data source object itself, not just an identifier.
Default: None
Usage: Set automatically by the QueryContext when loading from the schema. The datasource provides metadata about available columns, metrics, and handles the actual query execution.
Example:
# Automatically set during query context creation
query_object.datasource  # <SqlaTable 'public.sales_data'>
columns
Type: list[Column]
Column Type: Union[AdhocColumn, str]
Description: List of dimensions (non-aggregated columns) to select in the query. These become GROUP BY columns if metrics are specified, or simple SELECT columns otherwise.
Default: [] (empty list)
Formats Supported:
"country" - references a physical column{ "label": "upper_name", "sqlExpression": "UPPER(name)", "hasCustomLabel": True }
Usage Notes:
- Used to initialize series_columns in timeseries queries
Related:
- column_names property: Returns list of column labels as strings
- series_columns: Subset of columns used for series limiting
Example:
query_object.columns = ["country", "state", "city"]

# Or with adhoc columns:
query_object.columns = [
    "country",
    {
        "label": "year",
        "sqlExpression": "EXTRACT(YEAR FROM created_at)",
        "hasCustomLabel": True,
    },
]
metrics
Type: list[Metric] | None
Metric Type: Union[AdhocMetric, str]
Description: List of aggregate expressions to compute. These become the SELECT clause aggregations in SQL queries.
Default: None
Formats Supported:
"count" - references a predefined metric{"label": "count"} - references a predefined metric (converted to string){ "expressionType": "SIMPLE", "aggregate": "SUM", "column": {"column_name": "revenue"}, "label": "Total Revenue", "hasCustomLabel": True }
{ "expressionType": "SQL", "sqlExpression": "SUM(price * quantity)", "label": "Total Sales", "hasCustomLabel": True }
Usage Notes:
- If None or empty, no aggregation is performed
- {"label": "..."} is automatically converted to a string
Related:
- metric_names property: Returns list of metric labels as strings
- is_rowcount: Alternative to metrics for counting rows
Example:
# Simple metric references
query_object.metrics = ["count", "sum__revenue"]

# Adhoc metrics
query_object.metrics = [
    {
        "expressionType": "SIMPLE",
        "aggregate": "AVG",
        "column": {"column_name": "price"},
        "label": "Average Price",
    },
    {
        "expressionType": "SQL",
        "sqlExpression": "SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END)",
        "label": "Completed Orders",
    },
]
is_timeseries
Type: bool
Description: Indicates whether this query is a timeseries query (data points over time). Affects how results are processed and displayed.
Default: Automatically determined: True if DTTM_ALIAS (the special time column constant) is in columns; otherwise it can be set explicitly
Usage Notes:
- Affects series_columns initialization
Example:
# Automatically set to True for timeseries
query_object.columns = ["__timestamp", "country"]
query_object.is_timeseries  # True

# Explicitly set
query_object.is_timeseries = True
is_rowcount
Type: bool
Description: When True, returns only the total row count instead of actual data. Used for preview operations or checking data volume.
Default: False
Example:
query_object.is_rowcount = True
# Query returns: {"count": 15234}
result_type
Type: ChartDataResultType | None
Values: "full", "samples", "query", "results", "post_processed", "columns", "timegrains", "drill_detail"
Description: Specifies what type of result to return. Controls which processing pipeline is used.
Default: None (inherits from QueryContext if not specified)
Types:
"full": Complete query execution with all data"samples": Sample rows from the datasource (no metrics, limited rows)"query": Return the query string without executing"results": Like “full” but with minimal metadata"post_processed": Full results for client-side post-processing"columns": Return column metadata only"timegrains": Return available time granularities"drill_detail": Return drill-through detail rowsExample:
query_object.result_type = ChartDataResultType.SAMPLES
# Returns sample rows without aggregation
filter
Type: list[QueryObjectFilterClause]
FilterClause Type:
{ "col": Union[str, AdhocColumn], # Column to filter "op": str, # Operator (e.g., "==", "IN", "LIKE") "val": FilterValues | None, # Value(s) to compare "grain": str | None, # Time grain for temporal filters "isExtra": bool | None # Added by dashboard filters }
Description: List of filter conditions to apply to the query. These become WHERE clause conditions in SQL.
Default: [] (empty list)
Supported Operators: ==, !=, >, <, >=, <=, IN, NOT IN, LIKE, ILIKE, REGEX, TEMPORAL_RANGE, IS NULL, IS NOT NULL, and more (see FilterOperator enum)
Usage Notes:
- isExtra: True indicates the filter was added by dashboard/native filters
- Temporal operators (e.g., TEMPORAL_RANGE) require grain to be set
Example:
query_object.filter = [
    {"col": "country", "op": "IN", "val": ["USA", "Canada", "Mexico"]},
    {"col": "revenue", "op": ">=", "val": 1000},
    {
        "col": "created_at",
        "op": "TEMPORAL_RANGE",
        "val": "Last 30 days",
        "grain": "P1D",
    },
]
extras
Type: dict[str, Any]
Description: Additional query parameters and modifiers. This is an extensible dictionary for extra query options.
Default: {} (empty dict)
Common Keys:
"where": Raw SQL WHERE clause (added via AND)"having": Raw SQL HAVING clause for aggregate filters"time_grain_sqla": Time granularity (e.g., "P1D" for 1 day)"relative_start": Start reference point ("today" or "now")"relative_end": End reference point ("today" or "now")"instant_time_comparison_range": For advanced time comparison featuresUsage Notes:
- time_grain_sqla controls the temporal aggregation level
- Raw SQL clauses (where, having) bypass some security layers
Example:
query_object.extras = {
    "where": "status = 'active' AND archived = false",
    "having": "SUM(revenue) > 10000",
    "time_grain_sqla": "P1W",  # Weekly granularity
    "relative_start": "now",
}
apply_fetch_values_predicate
Type: bool
Description: When True, applies additional WHERE clause predicates defined in the datasource configuration for fetching filter values.
Default: False
Example:
query_object.apply_fetch_values_predicate = True
# Applies any predicates configured on the datasource
applied_time_extras
Type: dict[str, str]
Description: Mapping of temporal extras that have been applied to the query. Used for tracking which time filters were actually used.
Default: {} (empty dict)
Example:
query_object.applied_time_extras = {
    "__time_range": "1 year ago : now",
    "__time_grain": "P1D",
}
groupby (DEPRECATED)
Status: ⚠️ DEPRECATED - Use columns instead
Type: N/A (automatically renamed to columns)
Description: Legacy field name for grouping columns. Automatically converted to columns during initialization.
Migration: Replace groupby with columns in all new code.
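A minimal sketch of the automatic rename (assuming the constructor still accepts the legacy keyword, as described above):

# Legacy keyword, converted to `columns` during initialization
query_object = QueryObject(groupby=["country"])
query_object.columns  # ["country"]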
granularity
Type: str | None
Description: Name of the temporal column to use for time-based operations (filtering, grouping). This is the primary time dimension for the query.
Default: None
Related:
extras["time_grain_sqla"]: Controls temporal aggregation granularitytime_range: The time range to filter bygranularity_sqla: Deprecated alias for granularityExample:
query_object.granularity = "order_date"
# All time operations will use the order_date column
granularity_sqla (DEPRECATED)
Status: ⚠️ DEPRECATED - Use granularity instead
Type: N/A (automatically renamed to granularity)
Description: Legacy SQL-specific field name for temporal column. Automatically converted during initialization.
Migration: Replace granularity_sqla with granularity in all new code.
time_range
Type: str | None
Description: Human-readable time range specification for filtering temporal data. Supports both relative and absolute formats.
Default: None
Supported Formats:
"Last 7 days", "Last week", "Last month", "Last quarter", "Last year""Last 30 days", "Last 6 months", "Next 2 weeks""2023-01-01 : 2023-12-31" (ISO 8601)"2023-01-01 : now", "1 year ago : now""No filter"Usage Notes:
- Parsed at runtime into from_dttm and to_dttm datetime objects
Related:
- from_dttm: Computed start datetime
- to_dttm: Computed end datetime
- granularity: Column to filter on
Example:
query_object.time_range = "Last 30 days"
# Computed at runtime: from_dttm = now() - 30 days, to_dttm = now()

query_object.time_range = "2023-01-01 : 2023-06-30"  # Explicit range
from_dttm
Type: datetime | None
Description: Computed start datetime for the time range filter. Automatically calculated from time_range.
Default: None
Example:
query_object.time_range = "Last 7 days"
# After processing:
query_object.from_dttm  # datetime(2024, 1, 15, 0, 0, 0)
to_dttm
Type: datetime | None
Description: Computed end datetime for the time range filter. Automatically calculated from time_range.
Default: None
Example:
query_object.time_range = "Last 7 days"
# After processing:
query_object.to_dttm  # datetime(2024, 1, 22, 23, 59, 59)
inner_from_dttm
Type: datetime | None
Description: Inner time range start for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.
Default: None
Usage Notes:
- Falls back to from_dttm for simple queries
Example:
# Set during time offset processing
query_object.inner_from_dttm = datetime(2024, 1, 1)
inner_to_dttm
Type: datetime | None
Description: Inner time range end for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.
Default: None
Usage Notes:
- Falls back to to_dttm for simple queries
Example:
# Set during time offset processing
query_object.inner_to_dttm = datetime(2024, 1, 31)
time_shift
Type: str | None
Description: Shifts the entire time range by a specified offset. Used for comparing data across different time periods.
Default: None
Supported Formats: Any parsedatetime-compatible string (e.g., "1 week ago", "3 months ago", "1 year ago")
Usage Notes:
- Applied to both from_dttm and to_dttm
- Different from time_offsets (which creates separate comparison queries)
Example:
query_object.time_range = "Last 7 days"
query_object.time_shift = "1 week ago"
# Shifts the entire 7-day window back by 1 week
time_offsets
Type: list[str]
Description: List of time offsets for creating comparison queries. Each offset generates an additional query with shifted time ranges, enabling time-over-time comparisons.
Default: [] (empty list)
Supported Formats:
"1 week ago", "1 year ago", "3 months ago""2023-01-01 : 2023-01-31""inherit" (uses the time range span)"2023-06-15" (compares to this specific date)Usage Notes:
- Offset metrics get suffixed labels (e.g., revenue → revenue__1 week ago)
- The date range format requires the DATE_RANGE_TIMESHIFTS_ENABLED feature flag
Related:
- series_limit applies to the main query, not offset queries
Example:
query_object.time_offsets = ["1 week ago", "1 year ago"]
# Generates 3 queries total:
# 1. Main query (this week)
# 2. Same query shifted back 1 week
# 3. Same query shifted back 1 year
# Results joined on time + other dimensions

# Or with date range (requires feature flag):
query_object.time_offsets = ["2023-01-01 : 2023-01-31"]
# Compares current period to January 2023
orderby
Type: list[OrderBy]
OrderBy Type: tuple[Union[Metric, Column], bool]
Description: List of ordering specifications. Each tuple contains a column/metric and a boolean indicating ascending order.
Default: [] (empty list)
Format: [(column_or_metric, is_ascending), ...]
Usage Notes:
- True = ascending order
- False = descending order
Related:
- order_desc: Default sort direction (deprecated in favor of explicit orderby)
Example:
query_object.orderby = [
    ("revenue", False),  # Order by revenue descending
    ("country", True),   # Then by country ascending
]
order_desc
Type: bool
Description: Default sort direction when orderby is not specified. Primarily affects how series are ordered.
Default: True (descending)
Usage Notes:
- Only applies when there is no explicit orderby
- Prefer explicit orderby for complex sorting
Example:
query_object.order_desc = False
# Results sorted in ascending order
row_limit
Type: int | None
Description: Maximum number of rows to return from the query. Acts as a SQL LIMIT clause.
Default: None (uses system default)
Range: >= 0 (0 means no limit)
Usage Notes:
- Different from series_limit (which limits timeseries)
- The system default comes from config["ROW_LIMIT"]
Related:
- row_offset: Works with row_limit for pagination
- series_limit: For limiting timeseries/series count
Example:
query_object.row_limit = 1000
# Returns at most 1000 rows
row_offset
Type: int
Description: Number of rows to skip before returning results. Acts as a SQL OFFSET clause for pagination.
Default: 0 (no offset)
Range: >= 0
Usage Notes:
- Use together with row_limit for pagination
Example:
# Page 3 of results (20 per page)
query_object.row_limit = 20
query_object.row_offset = 40  # Skip first 40 rows
series_columns
Type: list[Column]
Description: Subset of columns to use when limiting the number of series in timeseries queries. Defines which dimensions create distinct series.
Default: Automatically initialized based on context:
- If is_timeseries=True and metrics exist, uses all columns
Usage Notes:
- Must be a subset of columns
- Used with series_limit to control series count
Related:
- series_limit: Maximum number of series
- series_limit_metric: Metric to use for ranking series
- group_others_when_limit_reached: Whether to group remaining series
Example:
query_object.columns = ["country", "product", "date"]
query_object.series_columns = ["country", "product"]
# Creates one series per (country, product) combination
series_limit
Type: int
Description: Maximum number of series to return in a timeseries query. Series are ranked by series_limit_metric and top N are kept.
Default: 0 (no limit)
Usage Notes:
- Requires series_columns and series_limit_metric to be set
- See group_others_when_limit_reached for handling excluded series
Related:
- series_columns: Dimensions that define series
- series_limit_metric: Metric used for ranking
- timeseries_limit: Deprecated alias
Example:
query_object.series_limit = 10
query_object.series_limit_metric = "revenue"
query_object.series_columns = ["country"]
# Returns top 10 countries by revenue
series_limit_metric
Type: Metric | None
Description: The metric to use for ranking series when series_limit is applied. Determines which series are kept.
Default: None
Usage Notes:
- Only used when series_limit is set
Related:
- series_limit: Number of series to keep
- timeseries_limit_metric: Deprecated alias
Example:
query_object.series_limit_metric = "sum__revenue"
# Ranks series by total revenue
group_others_when_limit_reached
Type: bool
Description: When True and series limit is reached, groups all remaining series into an “Others” category. Prevents incomplete data visualization.
Default: False
Usage Notes:
- Only applies when series_limit is set
Example:
query_object.series_limit = 5
query_object.series_limit_metric = "count"
query_object.group_others_when_limit_reached = True
# Shows top 5 series + "Others" category with remaining aggregated
timeseries_limit (DEPRECATED)
Status: ⚠️ DEPRECATED - Use series_limit instead
Type: N/A (automatically renamed to series_limit)
Description: Legacy field name for series limit. Automatically converted during initialization.
Migration: Replace timeseries_limit with series_limit in all new code.
timeseries_limit_metric (DEPRECATED)
Status: ⚠️ DEPRECATED - Use series_limit_metric instead
Type: N/A (automatically renamed to series_limit_metric)
Description: Legacy field name for series limit metric. Automatically converted during initialization.
Migration: Replace timeseries_limit_metric with series_limit_metric in all new code.
post_processing
Type: list[dict[str, Any]]
Description: Ordered list of post-processing operations to apply to the query results. These transformations run on the DataFrame after SQL execution.
Default: [] (empty list)
Operation Format:
{ "operation": str, # Operation name (from pandas_postprocessing module) "options": dict # Operation-specific parameters }
Available Operations:
- aggregate: Group and aggregate data
- pivot: Pivot table transformation
- rolling: Rolling window calculations
- sort: Sort data
- select: Select/rename columns
- contribution: Calculate contribution percentages
- prophet: Time series forecasting
- boxplot: Statistical boxplot calculations
- histogram: Create histogram bins
- geohash_decode: Decode geohash to lat/lon
- geohash_encode: Encode lat/lon to geohash
- geodetic_parse: Parse geodetic coordinates
Usage Notes:
- Operations are implemented in superset.utils.pandas_postprocessing
Example:
query_object.post_processing = [
    {
        "operation": "pivot",
        "options": {
            "index": ["country"],
            "columns": ["product"],
            "aggregates": {"revenue": {"operator": "sum"}},
        },
    },
    {
        "operation": "sort",
        "options": {"columns": {"revenue": False}},  # Descending
    },
]
Histogram Example:
query_object.post_processing = [
    {
        "operation": "histogram",
        "options": {"column": "age", "bins": 25},
    }
]
annotation_layers
Type: list[dict[str, Any]]
Description: List of annotation layers to overlay on the chart. Annotations add contextual information like events, ranges, or reference data.
Default: [] (empty list, formula annotations filtered out)
Layer Structure:
{ "annotationType": str, # "FORMULA", "NATIVE", "line", "table" "name": str, # Layer name "value": Any, # Layer-specific value (ID, formula, etc.) "show": bool, # Whether to show the layer "sourceType": str, # "NATIVE", "line", "table", etc. "color": str, # Layer color "opacity": str, # "opacityLow", "opacityMedium", "opacityHigh" "style": str, # "solid", "dashed", "dotted", "longDashed" "width": float, # Line width "showMarkers": bool, # Show markers on line annotations "showLabel": bool, # Always show label "hideLine": bool, # Hide line (show markers only) "timeColumn": str, # Column with timestamps "intervalEndColumn": str, # For interval annotations "titleColumn": str, # Column for titles "descriptionColumns": list, # Columns for descriptions "overrides": dict # Override query properties }
Annotation Types:
"FORMULA": Simple formula overlays (e.g., constant line) - filtered out, don't affect query"NATIVE": Native Superset annotations stored in DB"line" / "table": Annotations from other charts/queriesUsage Notes:
Example:
query_object.annotation_layers = [
    {
        "annotationType": "NATIVE",
        "name": "Important Events",
        "value": 1,  # annotation_layer_id
        "show": True,
        "sourceType": "NATIVE",
        "color": "#ff0000",
    },
    {
        "annotationType": "line",
        "name": "Baseline",
        "value": 42,  # chart_id
        "show": True,
        "sourceType": "line",
        "style": "dashed",
        "overrides": {"time_range": "Last year"},
    },
]
url_params
Type: dict[str, str] (keys and values are strings)
Description: Optional query parameters passed from dashboard or Explore view URLs. Used for dynamic filtering and Jinja template variables.
Default: Not set (from kwargs)
Usage Notes:
- Accessible in Jinja templates via the url_param function
Example:
query_object.url_params = {
    "country_filter": "USA",
    "min_date": "2024-01-01",
}
# In a Jinja template:
# WHERE country = '{{ url_param("country_filter") }}'
where (DEPRECATED)
Status: ⚠️ DEPRECATED - Use extras["where"] instead
Type: Field moved to extras during initialization
Description: Raw SQL WHERE clause. Automatically moved to extras["where"].
Migration: Use extras["where"] directly instead of passing where parameter.
having (DEPRECATED)
Status: ⚠️ DEPRECATED - Use extras["having"] instead
Type: Field moved to extras during initialization
Description: Raw SQL HAVING clause. Automatically moved to extras["having"].
Migration: Use extras["having"] directly instead of passing having parameter.
metric_names
Type: list[str] (read-only property)
Description: Returns the metric labels as strings. Converts adhoc metrics to their labels.
Usage: Accessing metric names without dealing with adhoc metric dictionaries.
Example:
query_object.metrics = ["count", {"expressionType": "SQL", "label": "revenue", ...}] query_object.metric_names # ["count", "revenue"]
column_names
Type: list[str] (read-only property)
Description: Returns the column labels as strings. Converts adhoc columns to their labels.
Usage: Accessing column names without dealing with adhoc column dictionaries.
Example:
query_object.columns = ["country", {"label": "year", "sqlExpression": "...", ...}]
query_object.column_names  # ["country", "year"]
validate(raise_exceptions: bool = True) -> QueryObjectValidationError | None
Description: Validates the query object for correctness. Checks for duplicate labels, missing series columns, invalid time offsets, and sanitizes filters.
Parameters:
- raise_exceptions: If True, raises exception on validation error. If False, returns the error object.
Validates:
- No duplicate metric/column labels
- series_columns are a subset of columns
- time_offsets formats are valid
- Filters are sanitized
Returns: None if valid, or QueryObjectValidationError if raise_exceptions=False
Example:
try:
    query_object.validate()
except QueryObjectValidationError as e:
    print(f"Validation failed: {e.message}")
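Per the raise_exceptions parameter documented above, the non-raising variant returns the error object instead (a sketch):

error = query_object.validate(raise_exceptions=False)
if error is not None:
    print(f"Validation failed: {error.message}")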
to_dict() -> dict[str, Any]
Description: Serializes the QueryObject to a dictionary. Used for passing to datasource connectors and caching.
Returns: Dictionary representation of the query object
Usage: Called internally when executing queries on datasources.
Example:
query_dict = query_object.to_dict()
# {
#     "columns": ["country"],
#     "metrics": ["count"],
#     "filters": [...],
#     ...
# }
cache_key(**extra: Any) -> str
Description: Generates a unique cache key for this query. Uses MD5/SHA hash of query parameters.
Parameters:
- **extra: Additional key-value pairs to include in cache key
Key Components:
- to_dict() output
- time_range (not from_dttm/to_dttm - for relative time caching)
- datasource.uid
- result_type
- post_processing
- time_offsets
- annotation_layers (if present)
Returns: MD5/SHA hash string
Example:
cache_key = query_object.cache_key(
    time_offset="1 week ago",
    time_grain="P1D",
)
# "a3f5c8e9d..."
exec_post_processing(df: DataFrame) -> DataFrame
Description: Applies all post-processing operations to a DataFrame in sequence.
Parameters:
- df: Pandas DataFrame from query execution
Returns: Transformed DataFrame
Raises: InvalidPostProcessingError if operation is invalid
Example:
df = datasource.query(query_object.to_dict()).df
processed_df = query_object.exec_post_processing(df)
Column
Type: Union[AdhocColumn, str]
AdhocColumn Structure:
{ "label": str, # Column label "sqlExpression": str, # SQL expression "hasCustomLabel": bool, # Whether label is custom "columnType": str, # "BASE_AXIS" or "SERIES" "timeGrain": str # Time grain if temporal }
Metric
Type: Union[AdhocMetric, str]
AdhocMetric Structure:
{ "expressionType": "SIMPLE" | "SQL", "label": str, "hasCustomLabel": bool, # For SIMPLE: "aggregate": "SUM" | "AVG" | "COUNT" | "MIN" | "MAX" | "COUNT_DISTINCT", "column": AdhocMetricColumn, # For SQL: "sqlExpression": str }
OrderBy
Type: tuple[Union[Metric, Column], bool]
Format: (column_or_metric, is_ascending)
Example: ("revenue", False) means “ORDER BY revenue DESC”
Usage Examples

Basic aggregation query:
query_object = QueryObject(
    columns=["country", "city"],
    metrics=["count"],
    row_limit=100,
)
Timeseries query:
query_object = QueryObject(
    columns=["__timestamp", "country"],
    metrics=["sum__revenue"],
    granularity="order_date",
    time_range="Last 30 days",
    extras={"time_grain_sqla": "P1D"},
    is_timeseries=True,
)
Time comparison query:
query_object = QueryObject(
    columns=["country"],
    metrics=["sum__revenue"],
    time_range="Last 7 days",
    time_offsets=["1 week ago", "1 year ago"],
    granularity="order_date",
)
Histogram query:
query_object = QueryObject(
    columns=["age"],
    row_limit=10000,
    post_processing=[
        {
            "operation": "histogram",
            "options": {"column": "age", "bins": 25},
        }
    ],
)
Filtered and ordered query:
query_object = QueryObject(
    columns=["country", "product"],
    metrics=["sum__revenue", "count"],
    filter=[
        {"col": "status", "op": "==", "val": "completed"},
        {"col": "revenue", "op": ">=", "val": 100},
    ],
    orderby=[("sum__revenue", False)],  # Descending
    row_limit=50,
)