:::info Apache Druid supports two query languages: Druid SQL and native queries. This document describes the native language. For information about functions available in SQL, refer to the SQL documentation. :::
Virtual columns are queryable column “views” created from a set of columns during a query.
A virtual column can potentially draw from multiple underlying columns, although a virtual column always presents itself as a single column.
Virtual columns can be referenced by their output names to be used as dimensions or as inputs to filters and aggregators.
Each Apache Druid query can accept a list of virtual columns as a parameter. The following scan query is provided as an example:
{
 "queryType": "scan",
 "dataSource": "page_data",
 "columns":[],
 "virtualColumns": [
    {
      "type": "expression",
      "name": "fooPage",
      "expression": "concat('foo' + page)",
      "outputType": "STRING"
    },
    {
      "type": "expression",
      "name": "tripleWordCount",
      "expression": "wordCount * 3",
      "outputType": "LONG"
    }
  ],
 "intervals": [
   "2013-01-01/2019-01-02"
 ]
}
Expression virtual columns use Druid's native expression system to allow defining query time transforms of inputs from one or more columns.
The expression virtual column has the following syntax:
{ "type": "expression", "name": <name of the virtual column>, "expression": <row expression>, "outputType": <output value type of expression> }
| property | description | required? | 
|---|---|---|
| type | Must be "expression"to indicate that this is an expression virtual column. | yes | 
| name | The name of the virtual column. | yes | 
| expression | An expression that takes a row as input and outputs a value for the virtual column. | yes | 
| outputType | The expression's output will be coerced to this type. Can be LONG, FLOAT, DOUBLE, STRING, ARRAY types, or COMPLEX types. | no, default is FLOAT | 
The nested field virtual column is an optimized virtual column that can provide direct access into various paths of a COMPLEX<json> column, including using their indexes.
This virtual column is used for the SQL operators JSON_VALUE (if processFromRaw is set to false) or JSON_QUERY (if processFromRaw is true), and accepts ‘JSONPath’ or ‘jq’ syntax string representations of paths, or a parsed list of “path parts” in order to determine what should be selected from the column.
You can define a nested field virtual column with any of the following equivalent syntaxes. The examples all produce the same output value, with each example showing a different way to specify how to access the nested value. The first is using JSONPath syntax path, the second with a jq path, and the third uses pathParts.
{ "type": "nested-field", "columnName": "shipTo", "outputName": "v0", "expectedType": "STRING", "path": "$.phoneNumbers[1].number" }
{ "type": "nested-field", "columnName": "shipTo", "outputName": "v1", "expectedType": "STRING", "path": ".phoneNumbers[1].number", "useJqSyntax": true }
{ "type": "nested-field", "columnName": "shipTo", "outputName": "v2", "expectedType": "STRING", "pathParts": [ { "type": "field", "field": "phoneNumbers" }, { "type": "arrayElement", "index": 1 }, { "type": "field", "field": "number" } ] }
| property | description | required? | 
|---|---|---|
| type | Must be "nested-field"to indicate that this is a nested field virtual column. | yes | 
| columnName | The name of the COMPLEX<json>input column. | yes | 
| outputName | The name of the virtual column. | yes | 
| expectedType | The native Druid output type of the column, Druid will coerce output to this type if it does not match the underlying data. This can be STRING,LONG,FLOAT,DOUBLE, orCOMPLEX<json>. ExtractingARRAYtypes is not yet supported. | no, default STRING | 
| pathParts | The parsed path parts used to locate the nested values. pathwill be translated intopathPartsinternally. One ofpathorpathPartsmust be set | no, if pathis defined | 
| processFromRaw | If set to true, the virtual column will process the “raw” JSON data to extract values rather than using an optimized “literal” value selector. This option allows extracting non-literal values (such as nested JSON objects or arrays) as a COMPLEX<json>at the cost of much slower performance. | no, default false | 
| path | ‘JSONPath’ (or ‘jq’) syntax path. One of pathorpathPartsmust be set. | no, if pathPartsis defined | 
| useJqSyntax | If true, parse pathusing ‘jq’ syntax instead of ‘JSONPath’. | no, default is false | 
Specify pathParts as an array of objects that describe each component of the path to traverse. Each object can take the following properties:
| property | description | required? | 
|---|---|---|
| type | Must be ‘field’ or ‘arrayElement’. Use fieldwhen accessing a specific field in a nested structure. UsearrayElementwhen accessing a specific integer position of an array (zero based). | yes | 
| field | The name of the ‘field’ in a ‘field’ typepath part | yes, if typeis ‘field’ | 
| index | The array element index if typeisarrayElement | yes, if typeis ‘arrayElement’ | 
See Nested columns for more information on ingesting and storing nested data.
This virtual column provides an alternative way to use ‘list filtered’ dimension spec as a virtual column. It has optimized access to the underlying column value indexes that can provide a small performance improvement in some cases.
{ "type": "mv-filtered", "name": "filteredDim3", "delegate": "dim3", "values": ["hello", "world"], "isAllowList": true }
| property | description | required? | 
|---|---|---|
| type | Must be "mv-filtered"to indicate that this is a list filtered virtual column. | yes | 
| name | The output name of the virtual column | yes | 
| delegate | The name of the multi-value STRING input column to filter | yes | 
| values | Set of STRING values to allow or deny | yes | 
| isAllowList | If true, the output of the virtual column will be limited to the set specified by values, else it will provide all values except those specified. | No, default true |