See the current usage using datafusion-cli --help:
Apache Arrow <dev@arrow.apache.org>
Command Line Client for DataFusion query engine.

USAGE:
    datafusion-cli [OPTIONS]

OPTIONS:
    -b, --batch-size <BATCH_SIZE>
            The batch size of each query, or use DataFusion default
    -c, --command <COMMAND>...
            Execute the given command string(s), then exit
        --color
            Enables console syntax highlighting
    -f, --file <FILE>...
            Execute commands from file(s), then exit
        --format <FORMAT>
            [default: table] [possible values: csv, tsv, table, json, nd-json]
    -h, --help
            Print help information
    -m, --memory-limit <MEMORY_LIMIT>
            The memory pool limitation (e.g. '10g'), default to None (no limit)
        --maxrows <MAXROWS>
            The max number of rows to display for 'Table' format
            [possible values: numbers(0/10/...), inf(no limit)] [default: 40]
        --mem-pool-type <MEM_POOL_TYPE>
            Specify the memory pool type 'greedy' or 'fair', default to 'greedy'
        --top-memory-consumers <TOP_MEMORY_CONSUMERS>
            The number of top memory consumers to display when query fails due to
            memory exhaustion. To disable memory consumer tracking, set this value
            to 0 [default: 3]
    -d, --disk-limit <DISK_LIMIT>
            Available disk space for spilling queries (e.g. '10g'), default to None
            (uses DataFusion's default value of '100g')
    -p, --data-path <DATA_PATH>
            Path to your data, default to current directory
    -q, --quiet
            Reduce printing other than the results and work quietly
    -r, --rc <RC>...
            Run the provided files on startup instead of ~/.datafusionrc
    -V, --version
            Print version information
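As a rough illustration of how these options combine, the sketch below runs a single query non-interactively. It uses only the -q, --format, and -c options listed in the help text. The query string is an illustrative placeholder, and the snippet assumes datafusion-cli is installed and on your PATH.

```shell
# Illustrative query; replace it with your own SQL.
query='SELECT 1 AS x;'

# -q            reduce printing other than the results
# --format csv  print results as CSV instead of the default table
# -c            execute the given command string, then exit
if command -v datafusion-cli >/dev/null 2>&1; then
    datafusion-cli -q --format csv -c "$query"
else
    # Assumption: datafusion-cli may not be installed in this environment.
    echo "datafusion-cli not found on PATH" >&2
fi
```

Placing -c last keeps the query string from being confused with other options, since -c accepts more than one command string.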
Available commands inside DataFusion CLI are:
> \q
> \?
> \d
> \d table_name
> \quiet [true|false]
> \h
> \h function
In addition to the normal SQL supported in DataFusion, datafusion-cli also supports additional statements and commands:
SHOW ALL [VERBOSE]: Show configuration options

> show all;
+-------------------------------------------------+---------+
| name                                            | value   |
+-------------------------------------------------+---------+
| datafusion.execution.batch_size                 | 8192    |
| datafusion.execution.coalesce_batches           | true    |
| datafusion.execution.time_zone                  | UTC     |
| datafusion.explain.logical_plan_only            | false   |
| datafusion.explain.physical_plan_only           | false   |
| datafusion.optimizer.filter_null_join_keys      | false   |
| datafusion.optimizer.skip_failed_rules          | true    |
+-------------------------------------------------+---------+
SHOW <OPTION>: Show specific configuration option

> show datafusion.execution.batch_size;
+-------------------------------------------------+---------+
| name                                            | value   |
+-------------------------------------------------+---------+
| datafusion.execution.batch_size                 | 8192    |
+-------------------------------------------------+---------+
SET <OPTION> TO <VALUE>: Set a configuration option

> SET datafusion.execution.batch_size to 1024;
All available configuration options can be seen using SHOW ALL as described above.
You can also set configuration options through environment variables. For each option, datafusion-cli looks for an environment variable whose name is the option name in upper case with every . converted to _.
For example, to set datafusion.execution.batch_size to 1024 you would set the DATAFUSION_EXECUTION_BATCH_SIZE environment variable appropriately:
$ DATAFUSION_EXECUTION_BATCH_SIZE=1024 datafusion-cli
DataFusion CLI v12.0.0
> show all;
+-------------------------------------------------+---------+
| name                                            | value   |
+-------------------------------------------------+---------+
| datafusion.execution.batch_size                 | 1024    |
| datafusion.execution.coalesce_batches           | true    |
| datafusion.execution.time_zone                  | UTC     |
| datafusion.explain.logical_plan_only            | false   |
| datafusion.explain.physical_plan_only           | false   |
| datafusion.optimizer.filter_null_join_keys      | false   |
| datafusion.optimizer.skip_failed_rules          | true    |
+-------------------------------------------------+---------+
8 rows in set. Query took 0.002 seconds.
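The naming rule for environment variables (upper case, every . converted to _) can be sketched as a small shell function. The helper name option_to_env_var is hypothetical and is not part of datafusion-cli. It only illustrates the mapping described above.

```shell
# Hypothetical helper (not provided by datafusion-cli): derive the
# environment variable name for a configuration option by upper-casing
# the option name and replacing every '.' with '_'.
option_to_env_var() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_'
}

option_to_env_var datafusion.execution.batch_size
# prints DATAFUSION_EXECUTION_BATCH_SIZE
```

The printed name is what you would place in front of the datafusion-cli invocation, as in the session shown above.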
You can change the configuration options using the SET statement as well:

$ datafusion-cli
DataFusion CLI v13.0.0
> show datafusion.execution.batch_size;
+---------------------------------+---------+
| name                            | value   |
+---------------------------------+---------+
| datafusion.execution.batch_size | 8192    |
+---------------------------------+---------+
1 row in set. Query took 0.011 seconds.

> set datafusion.execution.batch_size to 1024;
0 rows in set. Query took 0.000 seconds.

> show datafusion.execution.batch_size;
+---------------------------------+---------+
| name                            | value   |
+---------------------------------+---------+
| datafusion.execution.batch_size | 1024    |
+---------------------------------+---------+
1 row in set. Query took 0.005 seconds.
datafusion-cli comes with built-in functions that are not included in the DataFusion SQL engine; see the DataFusion CLI specific functions section for details.