(cargo-release)  version 0.1.4
(cargo-release) version 0.1.4
1 file changed
tree: a6accc1f6b93dcc315ea6573c60216df40901c97
  1. docs/
  2. examples/
  3. src/
  4. .gitignore
  5. .travis.yml
  6. Cargo.toml
  7. LICENSE.TXT
  8. README.md
README.md

ANSI SQL Lexer and Parser for Rust

License Version Build Status Coverage Status Gitter chat

The primary goal of this project is to build a SQL lexer and parser capable of parsing SQL that conforms with the ANSI SQL:2011 standard.

A secondary goal is to make it easy for others to use this library as a foundation for building custom SQL parsers for vendor-specific dialects.

This parser is currently being used by the DataFusion query engine.

Example

The current code is capable of parsing some trivial SELECT and CREATE TABLE statements.

let sql = "SELECT a, b, 123, myfunc(b) \
    FROM table_1 \
    WHERE a > b AND b < 100 \
    ORDER BY a DESC, b";

let ast = Parser::parse_sql(sql.to_string()).unwrap();

println!("AST: {:?}", ast);

This outputs

AST: SQLSelect { projection: [SQLIdentifier("a"), SQLIdentifier("b"), SQLLiteralLong(123), SQLFunction { id: "myfunc", args: [SQLIdentifier("b")] }], relation: Some(SQLIdentifier("table_1")), selection: Some(SQLBinaryExpr { left: SQLBinaryExpr { left: SQLIdentifier("a"), op: Gt, right: SQLIdentifier("b") }, op: And, right: SQLBinaryExpr { left: SQLIdentifier("b"), op: Lt, right: SQLLiteralLong(100) } }), order_by: Some([SQLOrderBy { expr: SQLIdentifier("a"), asc: false }, SQLOrderBy { expr: SQLIdentifier("b"), asc: true }]), group_by: None, having: None, limit: None }

Design

This parser is implemented using the Pratt Parser design, which is a top-down operator-precedence parser.

I am a fan of this design pattern over parser generators for the following reasons:

  • Code is simple to write and can be concise and elegant (this is far from true for this current implementation unfortunately, but I hope to fix that using some macros)
  • Performance is generally better than code generated by parser generators
  • Debugging is much easier with hand-written code
  • It is far easier to extend and make dialect-specific extensions compared to using a parser generator

Supporting custom SQL dialects

This is a work in progress but I started some notes on writing a custom SQL parser.

Contributing

Contributors are welcome! Please see the current issues and feel free to file more!