| commit | 76ec175d20678771ee69e22b46b530d4b73217af | |
|---|---|---|
| author | Nickolay Ponomarev <asqueella@gmail.com> | Mon Jan 21 01:28:59 2019 +0300 |
| committer | Nickolay Ponomarev <asqueella@gmail.com> | Thu Jan 31 03:57:17 2019 +0300 |
| tree | f87acb95d6d9f9f9de46fa5f350a72fd29f52552 | |
| parent | 536fa6e428d733071c352fbd105388834197385a | |

Support table aliases without `AS` (7/8)

...as in `FROM foo bar WHERE bar.x > 1`.

To avoid ambiguity as to whether a token is an alias or a keyword, we maintain a blacklist of keywords that can follow a "table factor", to prevent parsing them as an alias. This "context-specific reserved keyword" approach lets us accept more SQL that's valid in some dialects than a list of globally reserved keywords would. Also, some dialects (e.g. Oracle) apparently don't reserve some keywords (like JOIN), while presumably they won't accept them as an alias (`FROM foo JOIN` meaning `FROM foo AS JOIN`).
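The approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the keyword list is abbreviated, tokens are plain string slices rather than the crate's token type, and the names `RESERVED_FOR_TABLE_ALIAS`, `is_word`, `is_reserved`, and `parse_optional_alias` are hypothetical.

```rust
/// Keywords that may follow a table factor, so a bare occurrence of one of
/// them must not be read as an alias. Hypothetical, abbreviated list.
const RESERVED_FOR_TABLE_ALIAS: &[&str] = &[
    "WHERE", "GROUP", "ORDER", "HAVING", "LIMIT", "UNION",
    "JOIN", "INNER", "LEFT", "RIGHT", "FULL", "CROSS", "ON",
];

/// True if the token looks like an identifier or keyword (not punctuation).
fn is_word(tok: &str) -> bool {
    tok.chars().next().map_or(false, |c| c.is_alphabetic() || c == '_')
}

/// True if the word is reserved in the "after a table factor" context.
fn is_reserved(word: &str) -> bool {
    RESERVED_FOR_TABLE_ALIAS
        .iter()
        .any(|kw| kw.eq_ignore_ascii_case(word))
}

/// Given the tokens that follow a table name, return the alias (if any)
/// together with the number of tokens consumed.
fn parse_optional_alias<'a>(
    tokens: &[&'a str],
) -> Result<(Option<&'a str>, usize), String> {
    match tokens.first() {
        // Explicit `AS alias`: the token after AS is taken as the alias.
        Some(t) if t.eq_ignore_ascii_case("AS") => match tokens.get(1) {
            Some(alias) => Ok((Some(*alias), 2)),
            None => Err("expected an alias after AS".to_string()),
        },
        // Bare word: an alias only if it is not reserved in this context.
        Some(t) if is_word(t) && !is_reserved(t) => Ok((Some(*t), 1)),
        // Reserved keyword, punctuation, or end of input: no alias.
        _ => Ok((None, 0)),
    }
}
```

With this sketch, the tokens following `foo` in `FROM foo bar WHERE bar.x > 1` yield the alias `bar`, while in `FROM foo JOIN t2` the word `JOIN` is left for the caller to parse as a keyword.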
The goal of this project is to build a SQL lexer and parser capable of parsing SQL that conforms to the ANSI SQL:2011 standard, while also making it easy to support custom dialects, so that this crate can be used as a foundation for vendor-specific parsers.
This parser is currently being used by the DataFusion query engine and LocustDB.
The current code is capable of parsing some trivial SELECT and CREATE TABLE statements.
```rust
let sql = "SELECT a, b, 123, myfunc(b) \
           FROM table_1 \
           WHERE a > b AND b < 100 \
           ORDER BY a DESC, b";

let dialect = GenericSqlDialect{}; // or AnsiSqlDialect, or your own dialect ...

let ast = Parser::parse_sql(&dialect,sql.to_string()).unwrap();

println!("AST: {:?}", ast);
```
This outputs:

```
AST: SQLSelect { projection: [SQLIdentifier("a"), SQLIdentifier("b"), SQLLiteralLong(123), SQLFunction { id: "myfunc", args: [SQLIdentifier("b")] }], relation: Some(SQLIdentifier("table_1")), selection: Some(SQLBinaryExpr { left: SQLBinaryExpr { left: SQLIdentifier("a"), op: Gt, right: SQLIdentifier("b") }, op: And, right: SQLBinaryExpr { left: SQLIdentifier("b"), op: Lt, right: SQLLiteralLong(100) } }), order_by: Some([SQLOrderBy { expr: SQLIdentifier("a"), asc: false }, SQLOrderBy { expr: SQLIdentifier("b"), asc: true }]), group_by: None, having: None, limit: None }
```
This parser is implemented using the Pratt Parser design, which is a top-down operator-precedence parser.
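To make the idea concrete, here is a minimal sketch of a Pratt-style expression parser. It is an illustration only and not this crate's implementation: the `Token`, `Expr`, and `PrattParser` types, the method names, and the precedence values are all hypothetical, and only identifiers, integers, and a handful of binary operators are handled.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Number(i64),
    Op(&'static str),
}

#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    Number(i64),
    Binary { left: Box<Expr>, op: &'static str, right: Box<Expr> },
}

struct PrattParser {
    tokens: Vec<Token>,
    pos: usize,
}

impl PrattParser {
    fn new(tokens: Vec<Token>) -> Self {
        PrattParser { tokens, pos: 0 }
    }

    /// Binding power of an infix operator; 0 means "not an infix operator".
    /// The values are assumptions chosen for this sketch.
    fn precedence(op: &str) -> u8 {
        match op {
            "OR" => 5,
            "AND" => 10,
            "<" | ">" | "=" => 20,
            "+" | "-" => 30,
            "*" | "/" => 40,
            _ => 0,
        }
    }

    /// Precedence of the next token, without consuming it.
    fn next_precedence(&self) -> u8 {
        match self.tokens.get(self.pos) {
            Some(Token::Op(op)) => Self::precedence(op),
            _ => 0,
        }
    }

    /// Parse an expression, absorbing only operators that bind tighter
    /// than `min_prec`.
    fn parse_expr(&mut self, min_prec: u8) -> Result<Expr, String> {
        let mut left = self.parse_prefix()?;
        while self.next_precedence() > min_prec {
            // next_precedence() > 0 guarantees the current token is an Op.
            let op = match self.tokens[self.pos] {
                Token::Op(op) => op,
                _ => unreachable!(),
            };
            self.pos += 1;
            // Recursing with the operator's own precedence makes operators
            // of equal precedence left-associative.
            let right = self.parse_expr(Self::precedence(op))?;
            left = Expr::Binary { left: Box::new(left), op, right: Box::new(right) };
        }
        Ok(left)
    }

    /// Parse an operand: an identifier or an integer literal.
    fn parse_prefix(&mut self) -> Result<Expr, String> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        match tok {
            Some(Token::Ident(name)) => Ok(Expr::Ident(name)),
            Some(Token::Number(n)) => Ok(Expr::Number(n)),
            other => Err(format!("expected an operand, found {:?}", other)),
        }
    }
}
```

Calling `PrattParser::new(tokens).parse_expr(0)` on the tokens for `a > b AND b < 100` groups the two comparisons under a single `AND` node, which is the same shape as the `selection` field in the example output above. The whole precedence table lives in one small function, so a dialect could in principle adjust it without touching the parsing loop.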
I am a fan of this design pattern over parser generators for the following reasons:
This is a work in progress, but I have started some notes on writing a custom SQL parser.
Contributors are welcome! Please see the current issues and feel free to file more!