Drill has several XML configuration options that let you control how it interprets XML files.
XML data often contains a considerable amount of nesting that is not useful for data analysis. The dataLevel parameter sets the nesting level at which the data actually starts. Levels start at 1.
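To make the idea of a nesting level concrete, here is a minimal Python sketch that selects the elements found at a given depth of an XML document. It is only an illustration of the concept and does not reproduce Drill's XML reader: the sample document, the function names, and the assumption that the root element counts as level 1 are all hypothetical, so the exact rows Drill returns for a given dataLevel may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = """<response>
  <results>
    <book><title>Dune</title><year>1965</year></book>
    <book><title>Emma</title><year>1815</year></book>
  </results>
</response>"""


def elements_at_level(xml_text, data_level):
    """Return the elements at the given nesting level.

    Assumption: the root element is level 1, its children are level 2, etc.
    """
    if data_level < 1:
        raise ValueError("levels start at 1")
    current = [ET.fromstring(xml_text)]
    # Descend one generation per level below the root.
    for _ in range(data_level - 1):
        current = [child for parent in current for child in parent]
    return current


def rows_at_level(xml_text, data_level):
    """Turn each element at data_level into a dict of its child tags and text."""
    return [
        {child.tag: child.text for child in elem}
        for elem in elements_at_level(xml_text, data_level)
    ]


if __name__ == "__main__":
    # With this sample, the useful records sit three levels deep.
    for row in rows_at_level(SAMPLE, 3):
        print(row)
```

In this sketch, levels 1 and 2 only contain wrapper elements, which is the kind of nesting that a higher data level is meant to skip.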
One of the challenges of querying APIs is inconsistent data. Drill allows you to provide a schema for individual endpoints. You can do this in one of three ways:
Note: At the time of writing, Drill's XML reader only supports provided schemas with scalar data types.
You can set either of these options on a per-endpoint basis as shown below:
"xmlOptions": { "dataLevel": 1 }
Or,
"xmlOptions": { "dataLevel": 2, "schema": { "type": "tuple_schema", "columns": [ { "name": "custom_field", "type": "VARCHAR" } ] } }
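Hand-written JSON is easy to break with a missing quote or brace, so it can help to generate the fragment programmatically. The following Python sketch builds an xmlOptions block and serializes it with the standard json module. The helper name build_xml_options and its arguments are hypothetical and not part of Drill. The keys (dataLevel, schema, tuple_schema, columns, name, type) are taken from the examples above.

```python
import json


def build_xml_options(data_level, columns=None):
    """Build an xmlOptions fragment (hypothetical helper, not part of Drill).

    columns: optional list of (name, type) pairs describing scalar columns.
    """
    if data_level < 1:
        raise ValueError("dataLevel starts at 1")
    options = {"dataLevel": data_level}
    if columns:
        # Mirrors the schema layout shown in the example configuration.
        options["schema"] = {
            "type": "tuple_schema",
            "columns": [
                {"name": name, "type": col_type} for name, col_type in columns
            ],
        }
    return {"xmlOptions": options}


fragment = build_xml_options(2, [("custom_field", "VARCHAR")])

if __name__ == "__main__":
    # json.dumps always emits well-formed JSON, so quoting errors cannot occur.
    print(json.dumps(fragment, indent=2))
```

The output is a complete JSON object. To use it, copy the "xmlOptions" member into the endpoint's configuration.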