commit | 3e825a718c1ae25ca3e3a7ba397c096af0e1c0a5 | |
---|---|---|
author | Andrew Lamb <andrew@nerdnetworks.org> | Tue Apr 06 07:11:09 2021 -0400 |
committer | Andrew Lamb <andrew@nerdnetworks.org> | Tue Apr 06 07:11:09 2021 -0400 |
tree | 04097869c89e28e2bd0cab110975ec9b894df30c | |
parent | 0de0de792c7eb3ce6a663a9c056883361628ec95 | |
ARROW-12109: [Rust][DataFusion] Implement SHOW COLUMNS

# Rationale

Accessing the list of columns via `select * from information_schema.columns` (introduced in https://github.com/apache/arrow/pull/9840) is a lot to type

See the doc for background: https://docs.google.com/document/d/12cpZUSNPqVH9Z0BBx6O8REu7TFqL-NPPAYCUPpDls1k/edit#

This is a sister PR to `SHOW TABLES` here: https://github.com/apache/arrow/pull/9847

# Proposal

Add support for `SHOW COLUMNS FROM <table>` command.

Following the MySQL syntax supported by sqlparser: https://dev.mysql.com/doc/refman/8.0/en/show-columns.html

# Example Use

Setup:

```
echo "1,Foo,44.9" > /tmp/table.csv
echo "2,Bar,22.1" >> /tmp/table.csv
cargo run --bin datafusion-cli
```

Then run:

```
> CREATE EXTERNAL TABLE t(a int, b varchar, c float)
STORED AS CSV
LOCATION '/tmp/table.csv';
0 rows in set. Query took 0 seconds.

> show columns from t;
+---------------+--------------+------------+-------------+-----------+-------------+
| table_catalog | table_schema | table_name | column_name | data_type | is_nullable |
+---------------+--------------+------------+-------------+-----------+-------------+
| datafusion    | public       | t          | a           | Int32     | NO          |
| datafusion    | public       | t          | b           | Utf8      | NO          |
| datafusion    | public       | t          | c           | Float32   | NO          |
+---------------+--------------+------------+-------------+-----------+-------------+
3 row in set. Query took 0 seconds.
```

# Commentary

Note that the identifiers are case sensitive (which is a more general problem that affects all name resolution, not just `SHOW COLUMNS`). Ideally this should also work:

```
> show columns from T;
Plan("Unknown relation for SHOW COLUMNS: T")

> select * from T;
Plan("Table or CTE with name \'T\' not found")
```

Closes #9866 from alamb/alamb/show_columns

Authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
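The rationale above presents `SHOW COLUMNS FROM <table>` as a shorter way to query `information_schema.columns`. As a rough illustration of that idea, the sketch below expands the shorthand into an equivalent `SELECT`. The function `show_columns_to_select` is hypothetical and is not taken from this commit or from DataFusion. It assumes a naive whitespace tokenizer rather than sqlparser, and it assumes that filtering on `table_name` alone is sufficient.

```rust
/// Hypothetical helper (not DataFusion's actual code): expands a
/// `SHOW COLUMNS FROM <table>` statement into a query against
/// `information_schema.columns`.
///
/// Assumption: the statement is tokenized on whitespace only, so quoted
/// or schema-qualified table names are not handled.
fn show_columns_to_select(statement: &str) -> Result<String, String> {
    // Drop surrounding whitespace and an optional trailing semicolon.
    let trimmed = statement.trim().trim_end_matches(';').trim();
    let tokens: Vec<&str> = trimmed.split_whitespace().collect();

    match tokens.as_slice() {
        // Keywords are matched case-insensitively; the table name is not.
        [show, columns, from, table]
            if show.eq_ignore_ascii_case("show")
                && columns.eq_ignore_ascii_case("columns")
                && from.eq_ignore_ascii_case("from") =>
        {
            // Escape single quotes so the name is a valid SQL string literal.
            let escaped = table.replace('\'', "''");
            Ok(format!(
                "SELECT * FROM information_schema.columns WHERE table_name = '{}'",
                escaped
            ))
        }
        _ => Err(format!("Unsupported statement: {}", trimmed)),
    }
}
```

Calling `show_columns_to_select("show columns from t;")` yields a query filtered on `table_name = 't'`. The table name is passed through unchanged, so `T` and `t` stay distinct, which mirrors the case-sensitivity caveat in the commentary.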
Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git master.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: