QueryObject Documentation

File: superset/common/query_object.py

The QueryObject class represents a single database query specification constructed on the client. It describes what data to fetch (columns, metrics, filters) without specifying how to fetch it. The datasource connector is responsible for translating the QueryObject into the appropriate query language (SQL, GraphQL, etc.).
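
A minimal sketch of that division of labor, reusing the datasource variable from the Methods examples below:

query_object = QueryObject(columns=["country"], metrics=["count"])
query_object.validate()
df = datasource.query(query_object.to_dict()).df  # the connector builds and runs the actual query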

Core Query Specification

datasource

Type: BaseDatasource | None

Description: Reference to the datasource (dataset, table, or query) from which to fetch data. This is the data source object itself, not just an identifier.

Default: None

Usage: Set automatically by the QueryContext when loading from the schema. The datasource provides metadata about available columns and metrics, and it handles the actual query execution.

Example:

# Automatically set during query context creation
query_object.datasource  # <SqlaTable 'public.sales_data'>

columns

Type: list[Column]

Column Type: Union[AdhocColumn, str]

Description: List of dimensions (non-aggregated columns) to select in the query. These become GROUP BY columns if metrics are specified, or simple SELECT columns otherwise.

Default: [] (empty list)

Formats Supported:

  1. String reference: "country" - references a physical column
  2. Adhoc column (dict):
    {
        "label": "upper_name",
        "sqlExpression": "UPPER(name)",
        "hasCustomLabel": True
    }
    

Usage Notes:

  • Formula annotations don't count as columns (filtered out)
  • Used as the default for series_columns in timeseries queries
  • Referenced in GROUP BY clauses when metrics are present

Related:

  • column_names property: Returns list of column labels as strings
  • series_columns: Subset of columns used for series limiting

Example:

query_object.columns = ["country", "state", "city"]

# Or with adhoc columns:
query_object.columns = [
    "country",
    {
        "label": "year",
        "sqlExpression": "EXTRACT(YEAR FROM created_at)",
        "hasCustomLabel": True
    }
]

metrics

Type: list[Metric] | None

Metric Type: Union[AdhocMetric, str]

Description: List of aggregate expressions to compute. These become the SELECT clause aggregations in SQL queries.

Default: None

Formats Supported:

  1. String reference: "count" - references a predefined metric
  2. Legacy format: {"label": "count"} - references a predefined metric (converted to string)
  3. Adhoc SIMPLE metric:
    {
        "expressionType": "SIMPLE",
        "aggregate": "SUM",
        "column": {"column_name": "revenue"},
        "label": "Total Revenue",
        "hasCustomLabel": True
    }
    
  4. Adhoc SQL metric:
    {
        "expressionType": "SQL",
        "sqlExpression": "SUM(price * quantity)",
        "label": "Total Sales",
        "hasCustomLabel": True
    }
    

Usage Notes:

  • When metrics are specified, queries automatically include GROUP BY
  • When None or empty, no aggregation is performed
  • Legacy format {"label": "..."} is automatically converted to string

Related:

  • metric_names property: Returns list of metric labels as strings
  • is_rowcount: Alternative to metrics for counting rows

Example:

# Simple metric references
query_object.metrics = ["count", "sum__revenue"]

# Adhoc metrics
query_object.metrics = [
    {
        "expressionType": "SIMPLE",
        "aggregate": "AVG",
        "column": {"column_name": "price"},
        "label": "Average Price"
    },
    {
        "expressionType": "SQL",
        "sqlExpression": "SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END)",
        "label": "Completed Orders"
    }
]
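
A sketch of the legacy-format normalization noted above, assuming it is applied during initialization:

query_object = QueryObject(metrics=[{"label": "count"}])
query_object.metrics  # ["count"] - legacy dict collapsed to a string reference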

Data Selection

is_timeseries

Type: bool

Description: Indicates whether this query is a timeseries query (data points over time). Affects how results are processed and displayed.

Default: Automatically determined: True if DTTM_ALIAS (the special time column constant) appears in columns; it can also be set explicitly.

Usage Notes:

  • Influences series_columns initialization
  • Affects post-processing behavior in some visualizations
  • Used to determine if series limiting should apply

Example:

# Automatically set to True for timeseries
query_object.columns = ["__timestamp", "country"]
query_object.is_timeseries  # True

# Explicitly set
query_object.is_timeseries = True

is_rowcount

Type: bool

Description: When True, returns only the total row count instead of actual data. Used for preview operations or checking data volume.

Default: False

Usage Notes:

  • Mutually exclusive with normal metric aggregation
  • Result contains just a count, no actual rows
  • Useful for pagination and data size checks

Example:

query_object.is_rowcount = True
# Query returns: {"count": 15234}

result_type

Type: ChartDataResultType | None

Values: "full", "samples", "query", "results", "post_processed", "columns", "timegrains", "drill_detail"

Description: Specifies what type of result to return. Controls which processing pipeline is used.

Default: None (inherits from QueryContext if not specified)

Types:

  • "full": Complete query execution with all data
  • "samples": Sample rows from the datasource (no metrics, limited rows)
  • "query": Return the query string without executing
  • "results": Like “full” but with minimal metadata
  • "post_processed": Full results for client-side post-processing
  • "columns": Return column metadata only
  • "timegrains": Return available time granularities
  • "drill_detail": Return drill-through detail rows

Example:

query_object.result_type = ChartDataResultType.SAMPLES
# Returns sample rows without aggregation

Filtering

filter

Type: list[QueryObjectFilterClause]

FilterClause Type:

{
    "col": Union[str, AdhocColumn],  # Column to filter
    "op": str,                        # Operator (e.g., "==", "IN", "LIKE")
    "val": FilterValues | None,       # Value(s) to compare
    "grain": str | None,              # Time grain for temporal filters
    "isExtra": bool | None            # Added by dashboard filters
}

Description: List of filter conditions to apply to the query. These become WHERE clause conditions in SQL.

Default: [] (empty list)

Supported Operators: ==, !=, >, <, >=, <=, IN, NOT IN, LIKE, ILIKE, REGEX, TEMPORAL_RANGE, IS NULL, IS NOT NULL, and more (see FilterOperator enum)

Usage Notes:

  • Filters are ANDed together
  • isExtra: True indicates filter was added by dashboard/native filters
  • Temporal filters (TEMPORAL_RANGE) require grain to be set
  • Filters go through Jinja template processing if datasource supports it
  • Sanitized for SQL injection before execution

Example:

query_object.filter = [
    {
        "col": "country",
        "op": "IN",
        "val": ["USA", "Canada", "Mexico"]
    },
    {
        "col": "revenue",
        "op": ">=",
        "val": 1000
    },
    {
        "col": "created_at",
        "op": "TEMPORAL_RANGE",
        "val": "Last 30 days",
        "grain": "P1D"
    }
]

extras

Type: dict[str, Any]

Description: Additional query parameters and modifiers. This is an extensible dictionary for extra query options.

Default: {} (empty dict)

Common Keys:

  • "where": Raw SQL WHERE clause (added via AND)
  • "having": Raw SQL HAVING clause for aggregate filters
  • "time_grain_sqla": Time granularity (e.g., "P1D" for 1 day)
  • "relative_start": Start reference point ("today" or "now")
  • "relative_end": End reference point ("today" or "now")
  • "instant_time_comparison_range": For advanced time comparison features

Usage Notes:

  • WHERE and HAVING clauses support Jinja templates
  • All SQL clauses are sanitized for security
  • time_grain_sqla controls temporal aggregation level
  • Warning: Direct SQL clauses (where, having) bypass some security layers

Example:

query_object.extras = {
    "where": "status = 'active' AND archived = false",
    "having": "SUM(revenue) > 10000",
    "time_grain_sqla": "P1W",  # Weekly granularity
    "relative_start": "now"
}

apply_fetch_values_predicate

Type: bool

Description: When True, applies additional WHERE clause predicates defined in the datasource configuration for fetching filter values.

Default: False

Usage Notes:

  • Used primarily when loading filter options
  • Applies datasource-specific filter predicates
  • Helps limit the domain of filter values

Example:

query_object.apply_fetch_values_predicate = True
# Applies any predicates configured on the datasource

applied_time_extras

Type: dict[str, str]

Description: Mapping of temporal extras that have been applied to the query. Used for tracking which time filters were actually used.

Default: {} (empty dict)

Usage Notes:

  • Populated during query execution
  • Used for displaying which time filters are active
  • Keys are typically time column names
  • Values are human-readable descriptions

Example:

query_object.applied_time_extras = {
    "__time_range": "1 year ago : now",
    "__time_grain": "P1D"
}

Aggregation & Metrics

groupby (DEPRECATED)

Status: ⚠️ DEPRECATED - Use columns instead

Type: N/A (automatically renamed to columns)

Description: Legacy field name for grouping columns. Automatically converted to columns during initialization.

Migration: Replace groupby with columns in all new code.
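
Migration sketch, assuming the legacy keyword is still accepted and renamed during initialization:

# Deprecated spelling
query_object = QueryObject(groupby=["country", "state"])
query_object.columns  # ["country", "state"]

# Preferred spelling
query_object = QueryObject(columns=["country", "state"])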


Time/Temporal

granularity

Type: str | None

Description: Name of the temporal column to use for time-based operations (filtering, grouping). This is the primary time dimension for the query.

Default: None

Usage Notes:

  • Used for temporal filtering and aggregation
  • Essential for timeseries queries
  • References a datetime column in the datasource
  • Used in time range filters

Related:

  • extras["time_grain_sqla"]: Controls temporal aggregation granularity
  • time_range: The time range to filter by
  • granularity_sqla: Deprecated alias for granularity

Example:

query_object.granularity = "order_date"
# All time operations will use the order_date column

granularity_sqla (DEPRECATED)

Status: ⚠️ DEPRECATED - Use granularity instead

Type: N/A (automatically renamed to granularity)

Description: Legacy SQL-specific field name for temporal column. Automatically converted during initialization.

Migration: Replace granularity_sqla with granularity in all new code.
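
Migration sketch, assuming the legacy keyword is renamed during initialization:

# Deprecated spelling
query_object = QueryObject(granularity_sqla="order_date")

# Preferred spelling
query_object = QueryObject(granularity="order_date")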


time_range

Type: str | None

Description: Human-readable time range specification for filtering temporal data. Supports both relative and absolute formats.

Default: None

Supported Formats:

  • Relative: "Last 7 days", "Last week", "Last month", "Last quarter", "Last year"
  • Relative with number: "Last 30 days", "Last 6 months", "Next 2 weeks"
  • Absolute: "2023-01-01 : 2023-12-31" (ISO 8601)
  • Mixed: "2023-01-01 : now", "1 year ago : now"
  • No filter: "No filter"

Usage Notes:

  • Parsed into from_dttm and to_dttm datetime objects
  • Relative times are resolved at query execution time
  • The time_range string, not the parsed datetime values, is used for cache key generation
  • Supports parsedatetime syntax

Related:

  • from_dttm: Computed start datetime
  • to_dttm: Computed end datetime
  • granularity: Column to filter on

Example:

query_object.time_range = "Last 30 days"
# Computed at runtime: from_dttm = now() - 30 days, to_dttm = now()

query_object.time_range = "2023-01-01 : 2023-06-30"
# Explicit range

from_dttm

Type: datetime | None

Description: Computed start datetime for the time range filter. Automatically calculated from time_range.

Default: None

Usage Notes:

  • Set automatically during query processing
  • Not included in cache key (time_range is used instead)
  • May be overridden for time offset queries

Example:

query_object.time_range = "Last 7 days"
# After processing:
query_object.from_dttm  # datetime(2024, 1, 15, 0, 0, 0)

to_dttm

Type: datetime | None

Description: Computed end datetime for the time range filter. Automatically calculated from time_range.

Default: None

Usage Notes:

  • Set automatically during query processing
  • Not included in cache key (time_range is used instead)
  • May be overridden for time offset queries

Example:

query_object.time_range = "Last 7 days"
# After processing:
query_object.to_dttm  # datetime(2024, 1, 22, 23, 59, 59)

inner_from_dttm

Type: datetime | None

Description: Inner time range start for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.

Default: None

Usage Notes:

  • Used in advanced time comparison scenarios
  • Typically same as from_dttm for simple queries
  • May differ when time offsets are applied

Example:

# Set during time offset processing
query_object.inner_from_dttm = datetime(2024, 1, 1)

inner_to_dttm

Type: datetime | None

Description: Inner time range end for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.

Default: None

Usage Notes:

  • Used in advanced time comparison scenarios
  • Typically same as to_dttm for simple queries
  • May differ when time offsets are applied

Example:

# Set during time offset processing
query_object.inner_to_dttm = datetime(2024, 1, 31)

time_shift

Type: str | None

Description: Shifts the entire time range by a specified offset. Used for comparing data across different time periods.

Default: None

Supported Formats: Any parsedatetime-compatible string (e.g., "1 week ago", "3 months ago", "1 year ago")

Usage Notes:

  • Applied to both from_dttm and to_dttm
  • Different from time_offsets (which creates separate queries)
  • Affects the main query time range

Example:

query_object.time_range = "Last 7 days"
query_object.time_shift = "1 week ago"
# Shifts the entire 7-day window back by 1 week

time_offsets

Type: list[str]

Description: List of time offsets for creating comparison queries. Each offset generates an additional query with shifted time ranges, enabling time-over-time comparisons.

Default: [] (empty list)

Supported Formats:

  1. Relative: "1 week ago", "1 year ago", "3 months ago"
  2. Date range (with feature flag): "2023-01-01 : 2023-01-31"
  3. Special: "inherit" (uses the time range span)
  4. Custom date: "2023-06-15" (compares to this specific date)

Usage Notes:

  • Creates separate queries for each offset
  • Results are joined with the main query results
  • Offset metrics are renamed (e.g., revenue → revenue__1 week ago)
  • Requires DATE_RANGE_TIMESHIFTS_ENABLED feature flag for date range format
  • Used for year-over-year, month-over-month comparisons

Related:

  • Time grain is required for proper joining of offset results
  • series_limit applies to the main query, not offset queries

Example:

query_object.time_offsets = ["1 week ago", "1 year ago"]
# Generates 3 queries total:
# 1. Main query (this week)
# 2. Same query shifted back 1 week
# 3. Same query shifted back 1 year
# Results joined on time + other dimensions

# Or with date range (requires feature flag):
query_object.time_offsets = ["2023-01-01 : 2023-01-31"]
# Compares current period to January 2023

Sorting & Limiting

orderby

Type: list[OrderBy]

OrderBy Type: tuple[Union[Metric, Column], bool]

Description: List of ordering specifications. Each tuple contains a column/metric and a boolean indicating ascending order.

Default: [] (empty list)

Format: [(column_or_metric, is_ascending), ...]

Usage Notes:

  • Boolean True = ascending order
  • Boolean False = descending order
  • Can order by both columns and metrics
  • Applied after aggregation

Related:

  • order_desc: Default sort direction (deprecated in favor of explicit orderby)

Example:

query_object.orderby = [
    ("revenue", False),  # Order by revenue descending
    ("country", True)    # Then by country ascending
]

order_desc

Type: bool

Description: Default sort direction when orderby is not specified. Primarily affects how series are ordered.

Default: True (descending)

Usage Notes:

  • Less flexible than orderby
  • Prefer using explicit orderby for complex sorting
  • Used mainly for backward compatibility

Example:

query_object.order_desc = False
# Results sorted in ascending order

row_limit

Type: int | None

Description: Maximum number of rows to return from the query. Acts as a SQL LIMIT clause.

Default: None (uses system default)

Range: >= 0 (0 means no limit)

Usage Notes:

  • Applied after all filtering and aggregation
  • Different from series_limit (which limits timeseries)
  • System default from config["ROW_LIMIT"]
  • Can be overridden per-query

Related:

  • row_offset: Works with row_limit for pagination
  • series_limit: For limiting timeseries/series count

Example:

query_object.row_limit = 1000
# Returns at most 1000 rows

row_offset

Type: int

Description: Number of rows to skip before returning results. Acts as a SQL OFFSET clause for pagination.

Default: 0 (no offset)

Range: >= 0

Usage Notes:

  • Used with row_limit for pagination
  • Applied after ordering
  • Useful for infinite scroll or paginated tables

Example:

# Page 3 of results (20 per page)
query_object.row_limit = 20
query_object.row_offset = 40  # Skip first 40 rows

Series Limiting (Timeseries)

series_columns

Type: list[Column]

Description: Subset of columns to use when limiting the number of series in timeseries queries. Defines which dimensions create distinct series.

Default: Automatically initialized based on context:

  • If explicitly provided, uses those columns
  • If is_timeseries=True and metrics exist, uses all columns
  • Otherwise, empty list

Usage Notes:

  • All series_columns must be present in columns
  • Used with series_limit to control series count
  • Validated during query validation
  • Creates one series per unique combination of series_column values

Related:

  • series_limit: Maximum number of series
  • series_limit_metric: Metric to use for ranking series
  • group_others_when_limit_reached: Whether to group remaining series

Example:

query_object.columns = ["country", "product", "date"]
query_object.series_columns = ["country", "product"]
# Creates one series per (country, product) combination
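
The automatic default described above can be sketched as follows (assuming the constructor arguments shown):

query_object = QueryObject(
    columns=["country", "product"],
    metrics=["count"],
    is_timeseries=True,
)
query_object.series_columns  # ["country", "product"] - defaults to all columns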

series_limit

Type: int

Description: Maximum number of series to return in a timeseries query. Series are ranked by series_limit_metric and top N are kept.

Default: 0 (no limit)

Usage Notes:

  • Only applies to timeseries queries
  • Requires series_columns and series_limit_metric to be set
  • Top series are selected by the specified metric
  • See group_others_when_limit_reached for handling excluded series

Related:

  • series_columns: Dimensions that define series
  • series_limit_metric: Metric used for ranking
  • timeseries_limit: Deprecated alias

Example:

query_object.series_limit = 10
query_object.series_limit_metric = "revenue"
query_object.series_columns = ["country"]
# Returns top 10 countries by revenue

series_limit_metric

Type: Metric | None

Description: The metric to use for ranking series when series_limit is applied. Determines which series are kept.

Default: None

Usage Notes:

  • Required when series_limit is set
  • Must be one of the metrics in the query
  • Series are ranked in descending order by this metric
  • Can be a string reference or adhoc metric

Related:

  • series_limit: Number of series to keep
  • timeseries_limit_metric: Deprecated alias

Example:

query_object.series_limit_metric = "sum__revenue"
# Ranks series by total revenue

group_others_when_limit_reached

Type: bool

Description: When True and the series limit is reached, groups all remaining series into an "Others" category. Prevents incomplete data visualization.

Default: False

Usage Notes:

  • Only relevant when series_limit is set
  • Aggregates metrics for excluded series
  • Helps show complete totals while limiting series count
  • "Others" category appears as a separate series

Example:

query_object.series_limit = 5
query_object.series_limit_metric = "count"
query_object.group_others_when_limit_reached = True
# Shows top 5 series + "Others" category with remaining aggregated

timeseries_limit (DEPRECATED)

Status: ⚠️ DEPRECATED - Use series_limit instead

Type: N/A (automatically renamed to series_limit)

Description: Legacy field name for series limit. Automatically converted during initialization.

Migration: Replace timeseries_limit with series_limit in all new code.


timeseries_limit_metric (DEPRECATED)

Status: ⚠️ DEPRECATED - Use series_limit_metric instead

Type: N/A (automatically renamed to series_limit_metric)

Description: Legacy field name for series limit metric. Automatically converted during initialization.

Migration: Replace timeseries_limit_metric with series_limit_metric in all new code.
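
Migration sketch covering both legacy series-limit fields, assuming they are renamed during initialization:

# Deprecated spelling
query_object = QueryObject(timeseries_limit=10, timeseries_limit_metric="revenue")

# Preferred spelling
query_object = QueryObject(series_limit=10, series_limit_metric="revenue")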


Post-Processing

post_processing

Type: list[dict[str, Any]]

Description: Ordered list of post-processing operations to apply to the query results. These transformations run on the DataFrame after SQL execution.

Default: [] (empty list)

Operation Format:

{
    "operation": str,      # Operation name (from pandas_postprocessing module)
    "options": dict        # Operation-specific parameters
}

Available Operations:

  • aggregate: Group and aggregate data
  • pivot: Pivot table transformation
  • rolling: Rolling window calculations
  • sort: Sort data
  • select: Select/rename columns
  • contribution: Calculate contribution percentages
  • prophet: Time series forecasting
  • boxplot: Statistical boxplot calculations
  • histogram: Create histogram bins
  • geohash_decode: Decode geohash to lat/lon
  • geohash_encode: Encode lat/lon to geohash
  • geodetic_parse: Parse geodetic coordinates
  • And more...

Usage Notes:

  • Operations applied in sequence
  • Each operation receives output of previous operation
  • All operations are from superset.utils.pandas_postprocessing
  • Validation ensures operation names are valid
  • Useful for client-side transformations without re-querying

Example:

query_object.post_processing = [
    {
        "operation": "pivot",
        "options": {
            "index": ["country"],
            "columns": ["product"],
            "aggregates": {
                "revenue": {"operator": "sum"}
            }
        }
    },
    {
        "operation": "sort",
        "options": {
            "columns": {"revenue": False}  # Descending
        }
    }
]

Histogram Example:

query_object.post_processing = [
    {
        "operation": "histogram",
        "options": {
            "column": "age",
            "bins": 25
        }
    }
]

Annotations

annotation_layers

Type: list[dict[str, Any]]

Description: List of annotation layers to overlay on the chart. Annotations add contextual information like events, ranges, or reference data.

Default: [] (empty list, formula annotations filtered out)

Layer Structure:

{
    "annotationType": str,          # "FORMULA", "NATIVE", "line", "table"
    "name": str,                    # Layer name
    "value": Any,                   # Layer-specific value (ID, formula, etc.)
    "show": bool,                   # Whether to show the layer
    "sourceType": str,              # "NATIVE", "line", "table", etc.
    "color": str,                   # Layer color
    "opacity": str,                 # "opacityLow", "opacityMedium", "opacityHigh"
    "style": str,                   # "solid", "dashed", "dotted", "longDashed"
    "width": float,                 # Line width
    "showMarkers": bool,            # Show markers on line annotations
    "showLabel": bool,              # Always show label
    "hideLine": bool,               # Hide line (show markers only)
    "timeColumn": str,              # Column with timestamps
    "intervalEndColumn": str,       # For interval annotations
    "titleColumn": str,             # Column for titles
    "descriptionColumns": list,     # Columns for descriptions
    "overrides": dict               # Override query properties
}

Annotation Types:

  • "FORMULA": Simple formula overlays (e.g., constant line) - filtered out, don't affect query
  • "NATIVE": Native Superset annotations stored in DB
  • "line" / "table": Annotations from other charts/queries

Usage Notes:

  • Formula annotations are filtered out (don't affect payload)
  • Other annotations trigger additional data fetches
  • Annotations can override time range, granularity, etc.
  • Used primarily for time series visualizations

Example:

query_object.annotation_layers = [
    {
        "annotationType": "NATIVE",
        "name": "Important Events",
        "value": 1,  # annotation_layer_id
        "show": True,
        "sourceType": "NATIVE",
        "color": "#ff0000"
    },
    {
        "annotationType": "line",
        "name": "Baseline",
        "value": 42,  # chart_id
        "show": True,
        "sourceType": "line",
        "style": "dashed",
        "overrides": {
            "time_range": "Last year"
        }
    }
]

Query Execution Control

url_params

Type: dict[str, str]

Description: Optional query parameters passed from dashboard or Explore view URLs. Used for dynamic filtering and Jinja template variables.

Default: Not set (from kwargs)

Usage Notes:

  • Available in Jinja templates via url_param function
  • Typically set by dashboard filters or URL parameters
  • Keys and values are always strings
  • Can be used for dynamic query customization

Example:

query_object.url_params = {
    "country_filter": "USA",
    "min_date": "2024-01-01"
}

# In Jinja template:
# WHERE country = '{{ url_param("country_filter") }}'

Deprecated Fields

where (DEPRECATED)

Status: ⚠️ DEPRECATED - Use extras["where"] instead

Type: Field moved to extras during initialization

Description: Raw SQL WHERE clause. Automatically moved to extras["where"].

Migration: Use extras["where"] directly instead of passing where parameter.


having (DEPRECATED)

Status: ⚠️ DEPRECATED - Use extras["having"] instead

Type: Field moved to extras during initialization

Description: Raw SQL HAVING clause. Automatically moved to extras["having"].

Migration: Use extras["having"] directly instead of passing having parameter.


Properties

metric_names

Type: list[str] (read-only property)

Description: Returns the metric labels as strings. Converts adhoc metrics to their labels.

Usage: Accessing metric names without dealing with adhoc metric dictionaries.

Example:

query_object.metrics = ["count", {"expressionType": "SQL", "label": "revenue", ...}]
query_object.metric_names  # ["count", "revenue"]

column_names

Type: list[str] (read-only property)

Description: Returns the column labels as strings. Converts adhoc columns to their labels.

Usage: Accessing column names without dealing with adhoc column dictionaries.

Example:

query_object.columns = ["country", {"label": "year", "sqlExpression": "...", ...}]
query_object.column_names  # ["country", "year"]

Methods

validate(raise_exceptions: bool = True) -> QueryObjectValidationError | None

Description: Validates the query object for correctness. Checks for duplicate labels, missing series columns, invalid time offsets, and sanitizes filters.

Parameters:

  • raise_exceptions: If True, raises an exception on validation error; if False, returns the error object.

Validates:

  1. No missing series columns
  2. No duplicate column/metric labels
  3. Valid time offset configurations
  4. Sanitizes WHERE/HAVING clauses for SQL injection

Returns: None if valid; a QueryObjectValidationError when invalid and raise_exceptions=False

Example:

try:
    query_object.validate()
except QueryObjectValidationError as e:
    print(f"Validation failed: {e.message}")

to_dict() -> dict[str, Any]

Description: Serializes the QueryObject to a dictionary. Used for passing to datasource connectors and caching.

Returns: Dictionary representation of the query object

Usage: Called internally when executing queries on datasources.

Example:

query_dict = query_object.to_dict()
# {
#   "columns": ["country"],
#   "metrics": ["count"],
#   "filters": [...],
#   ...
# }

cache_key(**extra: Any) -> str

Description: Generates a unique cache key for this query. Uses MD5/SHA hash of query parameters.

Parameters:

  • **extra: Additional key-value pairs to include in cache key

Key Components:

  • All query parameters from to_dict()
  • time_range (not from_dttm/to_dttm, so relative ranges produce stable cache keys)
  • datasource.uid
  • result_type
  • post_processing
  • time_offsets
  • annotation_layers (if present)
  • User impersonation key (if enabled)

Returns: MD5/SHA hash string

Example:

cache_key = query_object.cache_key(
    time_offset="1 week ago",
    time_grain="P1D"
)
# "a3f5c8e9d..."

exec_post_processing(df: DataFrame) -> DataFrame

Description: Applies all post-processing operations to a DataFrame in sequence.

Parameters:

  • df: Pandas DataFrame from query execution

Returns: Transformed DataFrame

Raises: InvalidPostProcessingError if operation is invalid

Example:

df = datasource.query(query_object.to_dict()).df
processed_df = query_object.exec_post_processing(df)

Type Definitions

Column

Type: Union[AdhocColumn, str]

AdhocColumn Structure:

{
    "label": str,              # Column label
    "sqlExpression": str,      # SQL expression
    "hasCustomLabel": bool,    # Whether label is custom
    "columnType": str,         # "BASE_AXIS" or "SERIES"
    "timeGrain": str           # Time grain if temporal
}
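
A sketch of a temporal x-axis column built from these fields (the x_axis name is illustrative):

x_axis = {
    "label": "order_date",
    "sqlExpression": "order_date",
    "hasCustomLabel": False,
    "columnType": "BASE_AXIS",
    "timeGrain": "P1D",
}
query_object.columns = [x_axis, "country"]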

Metric

Type: Union[AdhocMetric, str]

AdhocMetric Structure:

{
    "expressionType": "SIMPLE" | "SQL",
    "label": str,
    "hasCustomLabel": bool,

    # For SIMPLE:
    "aggregate": "SUM" | "AVG" | "COUNT" | "MIN" | "MAX" | "COUNT_DISTINCT",
    "column": AdhocMetricColumn,

    # For SQL:
    "sqlExpression": str
}

OrderBy

Type: tuple[Union[Metric, Column], bool]

Format: (column_or_metric, is_ascending)

Example: ("revenue", False) means “ORDER BY revenue DESC”


Common Usage Patterns

Basic Query

query_object = QueryObject(
    columns=["country", "city"],
    metrics=["count"],
    row_limit=100
)

Timeseries Query

query_object = QueryObject(
    columns=["__timestamp", "country"],
    metrics=["sum__revenue"],
    granularity="order_date",
    time_range="Last 30 days",
    extras={"time_grain_sqla": "P1D"},
    is_timeseries=True
)

Time Comparison Query

query_object = QueryObject(
    columns=["country"],
    metrics=["sum__revenue"],
    time_range="Last 7 days",
    time_offsets=["1 week ago", "1 year ago"],
    granularity="order_date"
)

Post-Processed Query

query_object = QueryObject(
    columns=["age"],
    row_limit=10000,
    post_processing=[
        {
            "operation": "histogram",
            "options": {"column": "age", "bins": 25}
        }
    ]
)

Filtered and Sorted Query

query_object = QueryObject(
    columns=["country", "product"],
    metrics=["sum__revenue", "count"],
    filter=[
        {"col": "status", "op": "==", "val": "completed"},
        {"col": "revenue", "op": ">=", "val": 100}
    ],
    orderby=[("sum__revenue", False)],  # Descending
    row_limit=50
)