File: superset/common/query_object.py
The QueryObject class represents a single database query specification constructed on the client. It describes what data to fetch (columns, metrics, filters) without specifying how to fetch it. The datasource connector is responsible for translating the QueryObject into the appropriate query language (SQL, GraphQL, etc.).
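As a rough sketch of that division of labor (constructor arguments mirror the fields documented below; to_dict() is covered in the methods section):

from superset.common.query_object import QueryObject

# Describe *what* to fetch: dimensions, aggregations, limits.
query_object = QueryObject(
    columns=["country"],
    metrics=["count"],
    row_limit=100,
)

# A datasource connector consumes the plain-dict form of the spec
# and decides *how* to fetch it, e.g. by compiling it to SQL.
query_dict = query_object.to_dict()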
datasource
Type: BaseDatasource | None
Description: Reference to the datasource (dataset, table, or query) from which to fetch data. This is the data source object itself, not just an identifier.
Default: None
Usage: Set automatically by the QueryContext when loading from the schema. The datasource provides metadata about available columns, metrics, and handles the actual query execution.
Example:
# Automatically set during query context creation
query_object.datasource  # <SqlaTable 'public.sales_data'>
columns
Type: list[Column]
Column Type: Union[AdhocColumn, str]
Description: List of dimensions (non-aggregated columns) to select in the query. These become GROUP BY columns if metrics are specified, or simple SELECT columns otherwise.
Default: [] (empty list)
Formats Supported:
"country" - references a physical column{ "label": "upper_name", "sqlExpression": "UPPER(name)", "hasCustomLabel": True }
Usage Notes:
- Used to initialize series_columns in timeseries queries
Related:
- column_names property: Returns list of column labels as strings
- series_columns: Subset of columns used for series limiting
Example:
query_object.columns = ["country", "state", "city"]

# Or with adhoc columns:
query_object.columns = [
    "country",
    {
        "label": "year",
        "sqlExpression": "EXTRACT(YEAR FROM created_at)",
        "hasCustomLabel": True,
    },
]
metrics
Type: list[Metric] | None
Metric Type: Union[AdhocMetric, str]
Description: List of aggregate expressions to compute. These become the SELECT clause aggregations in SQL queries.
Default: None
Formats Supported:
"count" - references a predefined metric{"label": "count"} - references a predefined metric (converted to string){ "expressionType": "SIMPLE", "aggregate": "SUM", "column": {"column_name": "revenue"}, "label": "Total Revenue", "hasCustomLabel": True }
{ "expressionType": "SQL", "sqlExpression": "SUM(price * quantity)", "label": "Total Sales", "hasCustomLabel": True }
Usage Notes:
- If None or empty, no aggregation is performed
- {"label": "..."} is automatically converted to a string
Related:
- metric_names property: Returns list of metric labels as strings
- is_rowcount: Alternative to metrics for counting rows
Example:
# Simple metric references
query_object.metrics = ["count", "sum__revenue"]

# Adhoc metrics
query_object.metrics = [
    {
        "expressionType": "SIMPLE",
        "aggregate": "AVG",
        "column": {"column_name": "price"},
        "label": "Average Price",
    },
    {
        "expressionType": "SQL",
        "sqlExpression": "SUM(CASE WHEN status='completed' THEN 1 ELSE 0 END)",
        "label": "Completed Orders",
    },
]
is_timeseries
Type: bool
Description: Indicates whether this query is a timeseries query (data points over time). Affects how results are processed and displayed.
Default: Automatically determined: True if DTTM_ALIAS (the special time column constant) is in columns; otherwise it can be set explicitly
Usage Notes:
- Affects series_columns initialization
Example:
# Automatically set to True for timeseries
query_object.columns = ["__timestamp", "country"]
query_object.is_timeseries  # True

# Explicitly set
query_object.is_timeseries = True
is_rowcount
Type: bool
Description: When True, returns only the total row count instead of actual data. Used for preview operations or checking data volume.
Default: False
Example:
query_object.is_rowcount = True
# Query returns: {"count": 15234}
result_type
Type: ChartDataResultType | None
Values: "full", "samples", "query", "results", "post_processed", "columns", "timegrains", "drill_detail"
Description: Specifies what type of result to return. Controls which processing pipeline is used.
Default: None (inherits from QueryContext if not specified)
Types:
"full": Complete query execution with all data"samples": Sample rows from the datasource (no metrics, limited rows)"query": Return the query string without executing"results": Like “full” but with minimal metadata"post_processed": Full results for client-side post-processing"columns": Return column metadata only"timegrains": Return available time granularities"drill_detail": Return drill-through detail rowsExample:
query_object.result_type = ChartDataResultType.SAMPLES
# Returns sample rows without aggregation
filter
Type: list[QueryObjectFilterClause]
FilterClause Type:
{ "col": Union[str, AdhocColumn], # Column to filter "op": str, # Operator (e.g., "==", "IN", "LIKE") "val": FilterValues | None, # Value(s) to compare "grain": str | None, # Time grain for temporal filters "isExtra": bool | None # Added by dashboard filters }
Description: List of filter conditions to apply to the query. These become WHERE clause conditions in SQL.
Default: [] (empty list)
Supported Operators: ==, !=, >, <, >=, <=, IN, NOT IN, LIKE, ILIKE, REGEX, TEMPORAL_RANGE, IS NULL, IS NOT NULL, and more (see FilterOperator enum)
Usage Notes:
- isExtra: True indicates the filter was added by dashboard/native filters
- Temporal operators (e.g., TEMPORAL_RANGE) require grain to be set
Example:
query_object.filter = [
    {"col": "country", "op": "IN", "val": ["USA", "Canada", "Mexico"]},
    {"col": "revenue", "op": ">=", "val": 1000},
    {
        "col": "created_at",
        "op": "TEMPORAL_RANGE",
        "val": "Last 30 days",
        "grain": "P1D",
    },
]
extras
Type: dict[str, Any]
Description: Additional query parameters and modifiers. This is an extensible dictionary for extra query options.
Default: {} (empty dict)
Common Keys:
"where": Raw SQL WHERE clause (added via AND)"having": Raw SQL HAVING clause for aggregate filters"time_grain_sqla": Time granularity (e.g., "P1D" for 1 day)"relative_start": Start reference point ("today" or "now")"relative_end": End reference point ("today" or "now")"instant_time_comparison_range": For advanced time comparison featuresUsage Notes:
- time_grain_sqla controls the temporal aggregation level
- Raw SQL clauses (where, having) bypass some security layers
Example:
query_object.extras = {
    "where": "status = 'active' AND archived = false",
    "having": "SUM(revenue) > 10000",
    "time_grain_sqla": "P1W",  # Weekly granularity
    "relative_start": "now",
}
apply_fetch_values_predicate
Type: bool
Description: When True, applies additional WHERE clause predicates defined in the datasource configuration for fetching filter values.
Default: False
Example:
query_object.apply_fetch_values_predicate = True
# Applies any predicates configured on the datasource
applied_time_extras
Type: dict[str, str]
Description: Mapping of temporal extras that have been applied to the query. Used for tracking which time filters were actually used.
Default: {} (empty dict)
Example:
query_object.applied_time_extras = {
    "__time_range": "1 year ago : now",
    "__time_grain": "P1D",
}
groupby (DEPRECATED)
Status: ⚠️ DEPRECATED - Use columns instead
Type: N/A (automatically renamed to columns)
Description: Legacy field name for grouping columns. Automatically converted to columns during initialization.
Migration: Replace groupby with columns in all new code.
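A minimal sketch of the automatic rename (assuming the constructor still accepts the legacy keyword, as described above):

# Legacy keyword, converted to `columns` during initialization
query_object = QueryObject(groupby=["country"])
query_object.columns  # ["country"]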
granularity
Type: str | None
Description: Name of the temporal column to use for time-based operations (filtering, grouping). This is the primary time dimension for the query.
Default: None
Related:
extras["time_grain_sqla"]: Controls temporal aggregation granularitytime_range: The time range to filter bygranularity_sqla: Deprecated alias for granularityExample:
query_object.granularity = "order_date"
# All time operations will use the order_date column
granularity_sqla (DEPRECATED)
Status: ⚠️ DEPRECATED - Use granularity instead
Type: N/A (automatically renamed to granularity)
Description: Legacy SQL-specific field name for temporal column. Automatically converted during initialization.
Migration: Replace granularity_sqla with granularity in all new code.
time_range
Type: str | None
Description: Human-readable time range specification for filtering temporal data. Supports both relative and absolute formats.
Default: None
Supported Formats:
"Last 7 days", "Last week", "Last month", "Last quarter", "Last year""Last 30 days", "Last 6 months", "Next 2 weeks""2023-01-01 : 2023-12-31" (ISO 8601)"2023-01-01 : now", "1 year ago : now""No filter"Usage Notes:
- Parsed at runtime into from_dttm and to_dttm datetime objects
Related:
- from_dttm: Computed start datetime
- to_dttm: Computed end datetime
- granularity: Column to filter on
Example:
query_object.time_range = "Last 30 days"
# Computed at runtime: from_dttm = now() - 30 days, to_dttm = now()

query_object.time_range = "2023-01-01 : 2023-06-30"  # Explicit range
from_dttm
Type: datetime | None
Description: Computed start datetime for the time range filter. Automatically calculated from time_range.
Default: None
Example:
query_object.time_range = "Last 7 days"
# After processing:
query_object.from_dttm  # datetime(2024, 1, 15, 0, 0, 0)
to_dttm
Type: datetime | None
Description: Computed end datetime for the time range filter. Automatically calculated from time_range.
Default: None
Example:
query_object.time_range = "Last 7 days"
# After processing:
query_object.to_dttm  # datetime(2024, 1, 22, 23, 59, 59)
inner_from_dttm
Type: datetime | None
Description: Inner time range start for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.
Default: None
Usage Notes:
- Falls back to from_dttm for simple queries
Example:
# Set during time offset processing
query_object.inner_from_dttm = datetime(2024, 1, 1)
inner_to_dttm
Type: datetime | None
Description: Inner time range end for nested temporal operations. Used when applying time filters to queries that don't have time as a dimension.
Default: None
Usage Notes:
- Falls back to to_dttm for simple queries
Example:
# Set during time offset processing
query_object.inner_to_dttm = datetime(2024, 1, 31)
time_shift
Type: str | None
Description: Shifts the entire time range by a specified offset. Used for comparing data across different time periods.
Default: None
Supported Formats: Any parsedatetime-compatible string (e.g., "1 week ago", "3 months ago", "1 year ago")
Usage Notes:
- Applied to both from_dttm and to_dttm
- Different from time_offsets (which creates separate comparison queries)
Example:
query_object.time_range = "Last 7 days"
query_object.time_shift = "1 week ago"
# Shifts the entire 7-day window back by 1 week
time_offsets
Type: list[str]
Description: List of time offsets for creating comparison queries. Each offset generates an additional query with shifted time ranges, enabling time-over-time comparisons.
Default: [] (empty list)
Supported Formats:
"1 week ago", "1 year ago", "3 months ago""2023-01-01 : 2023-01-31""inherit" (uses the time range span)"2023-06-15" (compares to this specific date)Usage Notes:
- Offset metrics get suffixed labels (e.g., revenue → revenue__1 week ago)
- The date range format requires the DATE_RANGE_TIMESHIFTS_ENABLED feature flag
Related:
- series_limit applies to the main query, not offset queries
Example:
query_object.time_offsets = ["1 week ago", "1 year ago"]
# Generates 3 queries total:
# 1. Main query (this week)
# 2. Same query shifted back 1 week
# 3. Same query shifted back 1 year
# Results joined on time + other dimensions

# Or with date range (requires feature flag):
query_object.time_offsets = ["2023-01-01 : 2023-01-31"]
# Compares current period to January 2023
orderby
Type: list[OrderBy]
OrderBy Type: tuple[Union[Metric, Column], bool]
Description: List of ordering specifications. Each tuple contains a column/metric and a boolean indicating ascending order.
Default: [] (empty list)
Format: [(column_or_metric, is_ascending), ...]
Usage Notes:
- True = ascending order
- False = descending order
Related:
- order_desc: Default sort direction (deprecated in favor of explicit orderby)
Example:
query_object.orderby = [
    ("revenue", False),  # Order by revenue descending
    ("country", True),   # Then by country ascending
]
order_desc
Type: bool
Description: Default sort direction when orderby is not specified. Primarily affects how series are ordered.
Default: True (descending)
Usage Notes:
- Only applies when there is no explicit orderby
- Prefer explicit orderby for complex sorting
Example:
query_object.order_desc = False
# Results sorted in ascending order
row_limit
Type: int | None
Description: Maximum number of rows to return from the query. Acts as a SQL LIMIT clause.
Default: None (uses system default)
Range: >= 0 (0 means no limit)
Usage Notes:
- Different from series_limit (which limits timeseries)
- The system default comes from config["ROW_LIMIT"]
Related:
- row_offset: Works with row_limit for pagination
- series_limit: For limiting timeseries/series count
Example:
query_object.row_limit = 1000
# Returns at most 1000 rows
row_offset
Type: int
Description: Number of rows to skip before returning results. Acts as a SQL OFFSET clause for pagination.
Default: 0 (no offset)
Range: >= 0
Usage Notes:
- Use together with row_limit for pagination
Example:
# Page 3 of results (20 per page)
query_object.row_limit = 20
query_object.row_offset = 40  # Skip first 40 rows
series_columns
Type: list[Column]
Description: Subset of columns to use when limiting the number of series in timeseries queries. Defines which dimensions create distinct series.
Default: Automatically initialized based on context:
- If is_timeseries=True and metrics exist, uses all columns
Usage Notes:
- Must be a subset of columns
- Used with series_limit to control series count
Related:
- series_limit: Maximum number of series
- series_limit_metric: Metric to use for ranking series
- group_others_when_limit_reached: Whether to group remaining series
Example:
query_object.columns = ["country", "product", "date"]
query_object.series_columns = ["country", "product"]
# Creates one series per (country, product) combination
series_limit
Type: int
Description: Maximum number of series to return in a timeseries query. Series are ranked by series_limit_metric and top N are kept.
Default: 0 (no limit)
Usage Notes:
- Requires series_columns and series_limit_metric to be set
- See group_others_when_limit_reached for handling excluded series
Related:
- series_columns: Dimensions that define series
- series_limit_metric: Metric used for ranking
- timeseries_limit: Deprecated alias
Example:
query_object.series_limit = 10
query_object.series_limit_metric = "revenue"
query_object.series_columns = ["country"]
# Returns top 10 countries by revenue
series_limit_metric
Type: Metric | None
Description: The metric to use for ranking series when series_limit is applied. Determines which series are kept.
Default: None
Usage Notes:
- Only used when series_limit is set
Related:
- series_limit: Number of series to keep
- timeseries_limit_metric: Deprecated alias
Example:
query_object.series_limit_metric = "sum__revenue"
# Ranks series by total revenue
group_others_when_limit_reached
Type: bool
Description: When True and series limit is reached, groups all remaining series into an “Others” category. Prevents incomplete data visualization.
Default: False
Usage Notes:
- Only applies when series_limit is set
Example:
query_object.series_limit = 5
query_object.series_limit_metric = "count"
query_object.group_others_when_limit_reached = True
# Shows top 5 series + "Others" category with remaining aggregated
timeseries_limit (DEPRECATED)
Status: ⚠️ DEPRECATED - Use series_limit instead
Type: N/A (automatically renamed to series_limit)
Description: Legacy field name for series limit. Automatically converted during initialization.
Migration: Replace timeseries_limit with series_limit in all new code.
timeseries_limit_metric (DEPRECATED)
Status: ⚠️ DEPRECATED - Use series_limit_metric instead
Type: N/A (automatically renamed to series_limit_metric)
Description: Legacy field name for series limit metric. Automatically converted during initialization.
Migration: Replace timeseries_limit_metric with series_limit_metric in all new code.
post_processing
Type: list[dict[str, Any]]
Description: Ordered list of post-processing operations to apply to the query results. These transformations run on the DataFrame after SQL execution.
Default: [] (empty list)
Operation Format:
{ "operation": str, # Operation name (from pandas_postprocessing module) "options": dict # Operation-specific parameters }
Available Operations:
- aggregate: Group and aggregate data
- pivot: Pivot table transformation
- rolling: Rolling window calculations
- sort: Sort data
- select: Select/rename columns
- contribution: Calculate contribution percentages
- prophet: Time series forecasting
- boxplot: Statistical boxplot calculations
- histogram: Create histogram bins
- geohash_decode: Decode geohash to lat/lon
- geohash_encode: Encode lat/lon to geohash
- geodetic_parse: Parse geodetic coordinates
Usage Notes:
- Operations are implemented in superset.utils.pandas_postprocessing
Example:
query_object.post_processing = [
    {
        "operation": "pivot",
        "options": {
            "index": ["country"],
            "columns": ["product"],
            "aggregates": {"revenue": {"operator": "sum"}},
        },
    },
    {
        "operation": "sort",
        "options": {"columns": {"revenue": False}},  # Descending
    },
]
Histogram Example:
query_object.post_processing = [
    {
        "operation": "histogram",
        "options": {"column": "age", "bins": 25},
    }
]
annotation_layers
Type: list[dict[str, Any]]
Description: List of annotation layers to overlay on the chart. Annotations add contextual information like events, ranges, or reference data.
Default: [] (empty list, formula annotations filtered out)
Layer Structure:
{ "annotationType": str, # "FORMULA", "NATIVE", "line", "table" "name": str, # Layer name "value": Any, # Layer-specific value (ID, formula, etc.) "show": bool, # Whether to show the layer "sourceType": str, # "NATIVE", "line", "table", etc. "color": str, # Layer color "opacity": str, # "opacityLow", "opacityMedium", "opacityHigh" "style": str, # "solid", "dashed", "dotted", "longDashed" "width": float, # Line width "showMarkers": bool, # Show markers on line annotations "showLabel": bool, # Always show label "hideLine": bool, # Hide line (show markers only) "timeColumn": str, # Column with timestamps "intervalEndColumn": str, # For interval annotations "titleColumn": str, # Column for titles "descriptionColumns": list, # Columns for descriptions "overrides": dict # Override query properties }
Annotation Types:
"FORMULA": Simple formula overlays (e.g., constant line) - filtered out, don't affect query"NATIVE": Native Superset annotations stored in DB"line" / "table": Annotations from other charts/queriesUsage Notes:
Example:
query_object.annotation_layers = [
    {
        "annotationType": "NATIVE",
        "name": "Important Events",
        "value": 1,  # annotation_layer_id
        "show": True,
        "sourceType": "NATIVE",
        "color": "#ff0000",
    },
    {
        "annotationType": "line",
        "name": "Baseline",
        "value": 42,  # chart_id
        "show": True,
        "sourceType": "line",
        "style": "dashed",
        "overrides": {"time_range": "Last year"},
    },
]
url_params
Type: dict[str, str] (keys and values are strings)
Description: Optional query parameters passed from dashboard or Explore view URLs. Used for dynamic filtering and Jinja template variables.
Default: Not set (from kwargs)
Usage Notes:
- Accessible in Jinja templates via the url_param function
Example:
query_object.url_params = {
    "country_filter": "USA",
    "min_date": "2024-01-01",
}
# In a Jinja template:
# WHERE country = '{{ url_param("country_filter") }}'
where (DEPRECATED)
Status: ⚠️ DEPRECATED - Use extras["where"] instead
Type: Field moved to extras during initialization
Description: Raw SQL WHERE clause. Automatically moved to extras["where"].
Migration: Use extras["where"] directly instead of passing where parameter.
having (DEPRECATED)
Status: ⚠️ DEPRECATED - Use extras["having"] instead
Type: Field moved to extras during initialization
Description: Raw SQL HAVING clause. Automatically moved to extras["having"].
Migration: Use extras["having"] directly instead of passing having parameter.
metric_names
Type: list[str] (read-only property)
Description: Returns the metric labels as strings. Converts adhoc metrics to their labels.
Usage: Accessing metric names without dealing with adhoc metric dictionaries.
Example:
query_object.metrics = ["count", {"expressionType": "SQL", "label": "revenue", ...}] query_object.metric_names # ["count", "revenue"]
column_names
Type: list[str] (read-only property)
Description: Returns the column labels as strings. Converts adhoc columns to their labels.
Usage: Accessing column names without dealing with adhoc column dictionaries.
Example:
query_object.columns = ["country", {"label": "year", "sqlExpression": "...", ...}]
query_object.column_names  # ["country", "year"]
validate(raise_exceptions: bool = True) -> QueryObjectValidationError | None
Description: Validates the query object for correctness. Checks for duplicate labels, missing series columns, invalid time offsets, and sanitizes filters.
Parameters:
- raise_exceptions: If True, raises exception on validation error. If False, returns the error object.
Validates:
- No duplicate metric/column labels
- series_columns are a subset of columns
- time_offsets formats are valid
- Filters are sanitized
Returns: None if valid, or QueryObjectValidationError if raise_exceptions=False
Example:
try:
    query_object.validate()
except QueryObjectValidationError as e:
    print(f"Validation failed: {e.message}")
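Per the raise_exceptions parameter documented above, the non-raising variant returns the error object instead (a sketch):

error = query_object.validate(raise_exceptions=False)
if error is not None:
    print(f"Validation failed: {error.message}")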
to_dict() -> dict[str, Any]
Description: Serializes the QueryObject to a dictionary. Used for passing to datasource connectors and caching.
Returns: Dictionary representation of the query object
Usage: Called internally when executing queries on datasources.
Example:
query_dict = query_object.to_dict()
# {
#     "columns": ["country"],
#     "metrics": ["count"],
#     "filters": [...],
#     ...
# }
cache_key(**extra: Any) -> str
Description: Generates a unique cache key for this query. Uses MD5/SHA hash of query parameters.
Parameters:
- **extra: Additional key-value pairs to include in cache key
Key Components:
- to_dict() output
- time_range (not from_dttm/to_dttm - for relative time caching)
- datasource.uid
- result_type
- post_processing
- time_offsets
- annotation_layers (if present)
Returns: MD5/SHA hash string
Example:
cache_key = query_object.cache_key(
    time_offset="1 week ago",
    time_grain="P1D",
)
# "a3f5c8e9d..."
exec_post_processing(df: DataFrame) -> DataFrame
Description: Applies all post-processing operations to a DataFrame in sequence.
Parameters:
- df: Pandas DataFrame from query execution
Returns: Transformed DataFrame
Raises: InvalidPostProcessingError if operation is invalid
Example:
df = datasource.query(query_object.to_dict()).df
processed_df = query_object.exec_post_processing(df)
Column
Type: Union[AdhocColumn, str]
AdhocColumn Structure:
{ "label": str, # Column label "sqlExpression": str, # SQL expression "hasCustomLabel": bool, # Whether label is custom "columnType": str, # "BASE_AXIS" or "SERIES" "timeGrain": str # Time grain if temporal }
Metric
Type: Union[AdhocMetric, str]
AdhocMetric Structure:
{ "expressionType": "SIMPLE" | "SQL", "label": str, "hasCustomLabel": bool, # For SIMPLE: "aggregate": "SUM" | "AVG" | "COUNT" | "MIN" | "MAX" | "COUNT_DISTINCT", "column": AdhocMetricColumn, # For SQL: "sqlExpression": str }
OrderBy
Type: tuple[Union[Metric, Column], bool]
Format: (column_or_metric, is_ascending)
Example: ("revenue", False) means “ORDER BY revenue DESC”
Usage Examples

Basic aggregation query:
query_object = QueryObject(
    columns=["country", "city"],
    metrics=["count"],
    row_limit=100,
)
Timeseries query:
query_object = QueryObject(
    columns=["__timestamp", "country"],
    metrics=["sum__revenue"],
    granularity="order_date",
    time_range="Last 30 days",
    extras={"time_grain_sqla": "P1D"},
    is_timeseries=True,
)
Time comparison query:
query_object = QueryObject(
    columns=["country"],
    metrics=["sum__revenue"],
    time_range="Last 7 days",
    time_offsets=["1 week ago", "1 year ago"],
    granularity="order_date",
)
Histogram query:
query_object = QueryObject(
    columns=["age"],
    row_limit=10000,
    post_processing=[
        {
            "operation": "histogram",
            "options": {"column": "age", "bins": 25},
        }
    ],
)
Filtered and ordered query:
query_object = QueryObject(
    columns=["country", "product"],
    metrics=["sum__revenue", "count"],
    filter=[
        {"col": "status", "op": "==", "val": "completed"},
        {"col": "revenue", "op": ">=", "val": 100},
    ],
    orderby=[("sum__revenue", False)],  # Descending
    row_limit=50,
)