WARNING: This document is a work in progress, just like JSONSelect itself. View or contribute to the latest version on GitHub.

JSONSelect

  1. introduction
  2. levels
  3. language overview
  4. grouping
  5. selectors
  6. pseudo classes
  7. expressions
  8. combinators
  9. grammar
  10. conformance tests
  11. references

Introduction

JSONSelect defines a language very similar in syntax and structure to CSS3 Selectors. JSONSelect expressions are patterns which can be matched against JSON documents.

Potential applications of JSONSelect include:

  • Simplified programmatic matching of nodes within JSON documents.
  • Stream filtering, allowing efficient and incremental matching of documents.
  • As a query language for a document database.

Levels

The specification of JSONSelect is broken into three levels. Higher levels include more powerful constructs, and are likewise more complicated to implement and use.

JSONSelect Level 1 is a small subset of CSS3. Every feature is derived from a CSS construct that directly maps to JSON. A Level 1 implementation is not particularly complicated, yet it provides basic querying features.

JSONSelect Level 2 builds upon Level 1, adapting more complex CSS constructs. These allow expressions to include constraints such as patterns that match against values and constraints that consider a node's siblings. Level 2 is still a direct adaptation of CSS, but includes constructs whose semantic meaning is significantly changed.

JSONSelect Level 3 adds constructs which do not necessarily have a direct analog in CSS; they are included to increase the power and convenience of the selector language. These include aliases, wholly new pseudo class functions, and more blue sky dreaming.

Language Overview

Grouping

Selectors

Pseudo Classes

Expressions

Combinators

Grammar

(Adapted from CSS3 and json.org)

selectors_group
  : selector [ `,` selector ]*
  ;
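
As an illustration of the selectors_group rule, the following is a minimal sketch of splitting a group into its individual selectors. It assumes that commas inside a json_string or inside the parentheses of a pseudo class function (such as `:has`) do not separate selectors, and that a backslash always escapes the character after it. The function name split_selectors_group is hypothetical and not part of any JSONSelect implementation.

```python
def split_selectors_group(text):
    """Split a selectors_group on top-level commas (hypothetical helper).

    Commas inside double-quoted strings or inside parentheses are kept.
    """
    parts = []
    depth = 0          # current parenthesis nesting depth
    in_string = False  # True while inside a json_string
    start = 0          # index where the current selector begins
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2     # skip the backslash and the character it escapes
            continue
        if in_string:
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
        i += 1
    parts.append(text[start:].strip())
    return parts
```

The sketch does no validation: it only finds the boundaries, and each resulting piece would still need to be parsed against the selector rule.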

selector
  : simple_selector_sequence [ combinator simple_selector_sequence ]*
  ;

combinator
  : `>` | \s+
  ;

simple_selector_sequence
  /* why allow multiple HASH entities in the grammar? */
  : [ type_selector | universal ]
    [ class | pseudo ]*
  | [ class | pseudo ]+
  ;

type_selector
  : `object` | `array` | `number` | `string` | `boolean` | `null`
  ;
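
The six type names above correspond to the six JSON value types. The sketch below shows one way a Python implementation might classify decoded values and collect the nodes of a given type. It assumes, by analogy with CSS type selectors, that a bare type selector matches every node of that type anywhere in the document; the names json_type and select_by_type are hypothetical.

```python
def json_type(value):
    """Return the type_selector name for a decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool subclasses int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def select_by_type(type_name, node):
    """Yield node and its descendants whose type is type_name (assumed semantics)."""
    if json_type(node) == type_name:
        yield node
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        children = ()
    for child in children:
        yield from select_by_type(type_name, child)
```

The boolean check has to come before the number check because Python treats True and False as integers.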

universal
  : '*'
  ;

class
  : `.` name
  | `.` json_string
  ;

pseudo
  /* Note that pseudo-elements are restricted to one per selector and */
  /* occur only in the last simple_selector_sequence. */
  : `:` pseudo_class_name
  | `:` nth_function_name `(` nth_expression `)`
  | `:has` `(`  selectors_group `)`
  | `:expr` `(`  expr `)`
  | `:contains` `(`  json_string `)`
  | `:val` `(` val `)`
  ;

pseudo_class_name
  : `root` | `first-child` | `last-child` | `only-child`
  ;

nth_function_name
  : `nth-child` | `nth-last-child`
  ;

nth_expression
  /* expression is of the form "an+b" */
  : TODO
  ;
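
The grammar for nth_expression is still to be written, but the "an+b" form itself can be illustrated. The sketch below assumes the CSS3 interpretation: a child at 1-based position p matches when p = a*n + b for some integer n >= 0. The function name matches_an_plus_b is hypothetical, and parsing the textual form into the integers a and b is left out.

```python
def matches_an_plus_b(a, b, position):
    """Return True if position == a*n + b for some integer n >= 0.

    Positions are 1-based, following the CSS3 convention (an assumption here).
    """
    if position < 1:
        return False
    diff = position - b
    if a == 0:
        return diff == 0
    # diff must be a whole multiple of a, and the multiplier n must not be negative
    return diff % a == 0 and diff // a >= 0
```

Under this reading, a=2 and b=1 selects the odd positions, and a=-1 and b=3 selects the first three positions.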

expr
  : expr binop expr
  | '(' expr ')'
  | val
  ;

binop
  : '*' | '/' | '%' | '+' | '-' | '<=' | '>=' | '$='
  | '^=' | '*=' | '>' | '<' | '=' | '!=' | '&&' | '||'
  ;

val
  : json_number | json_string | 'true' | 'false' | 'null' | 'x'
  ;
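
To make the binop list more concrete, here is a sketch of an operator table in Python. The semantics are assumptions, since this document does not define them yet: `$=`, `^=` and `*=` are read as "ends with", "starts with" and "contains", by analogy with CSS3 attribute selectors, and `&&` and `||` are read as plain boolean operators. Operator precedence is not specified by the grammar and is not modelled. The names BINOPS and apply_binop are hypothetical.

```python
import operator

# Assumed meaning of each binop; none of this is specified by the grammar itself.
BINOPS = {
    "*": operator.mul,
    "/": operator.truediv,
    "%": operator.mod,
    "+": operator.add,
    "-": operator.sub,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
    "!=": operator.ne,
    "$=": lambda lhs, rhs: lhs.endswith(rhs),    # string ends with
    "^=": lambda lhs, rhs: lhs.startswith(rhs),  # string starts with
    "*=": lambda lhs, rhs: rhs in lhs,           # string contains
    "&&": lambda lhs, rhs: bool(lhs) and bool(rhs),
    "||": lambda lhs, rhs: bool(lhs) or bool(rhs),
}


def apply_binop(op, lhs, rhs):
    """Apply one binary operator to two already-evaluated operands."""
    return BINOPS[op](lhs, rhs)
```

An evaluator built on this table would substitute the value of the node under test for `x` before applying the operators; that substitution is also an assumption about what `x` stands for.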

json_string
  : `"` json_chars* `"`
  ;

json_chars
  : any-Unicode-character-except-"-or-\-or-control-character
  |  `\"`
  |  `\\`
  |  `\/`
  |  `\b`
  |  `\f`
  |  `\n`
  |  `\r`
  |  `\t`
  |   \u four-hex-digits
  ;
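
Because json_string follows the string syntax from json.org, a tokenizer can recognize the token with a regular expression and hand the matched text to a JSON parser for decoding. The following sketch does this with Python's standard re and json modules; the names JSON_STRING and read_json_string are hypothetical.

```python
import json
import re

# Mirrors the json_string / json_chars rules above.
JSON_STRING = re.compile(
    r'"(?:[^"\\\x00-\x1f]'     # any character except ", \ or a control character
    r'|\\["\\/bfnrt]'          # the two-character escapes
    r'|\\u[0-9a-fA-F]{4})*"'   # \u followed by four hex digits
)


def read_json_string(text, pos=0):
    """Read a json_string starting at pos.

    Returns (decoded_value, end_index), or None if no json_string starts there.
    """
    match = JSON_STRING.match(text, pos)
    if match is None:
        return None
    return json.loads(match.group(0)), match.end()
```

Returning the end index lets a caller continue tokenizing the rest of the selector from where the string stopped.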

name
  : nmstart nmchar*
  ;

nmstart
  : escape | [_a-zA-Z] | nonascii
  ;

nmchar
  : [_a-zA-Z0-9-]
  | escape
  | nonascii
  ;

escape
  : \\[^\r\n\f0-9a-fA-F]
  ;

nonascii
  : [^\0-\177]
  ;
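
The name, nmstart, nmchar, escape and nonascii rules translate almost directly into a regular expression. The sketch below is one such translation for Python's re module. It assumes that nonascii means any character above octal 177 (0x7f), as in the CSS3 grammar this section is adapted from; the names NAME and is_name are hypothetical.

```python
import re

ESCAPE = r'\\[^\r\n\f0-9a-fA-F]'   # backslash followed by a non-hex, non-newline character
NONASCII = r'[^\x00-\x7f]'         # assumed: anything above octal 177
NMSTART = rf'(?:{ESCAPE}|[_a-zA-Z]|{NONASCII})'
NMCHAR = rf'(?:[_a-zA-Z0-9-]|{ESCAPE}|{NONASCII})'
NAME = re.compile(rf'{NMSTART}{NMCHAR}*')


def is_name(text):
    """Return True if the whole of text is a valid name token."""
    return NAME.fullmatch(text) is not None
```

A key that is not a valid name, for example one that starts with a digit, can still be written with the `.` json_string form of the class rule.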

Conformance Tests

See https://github.com/lloyd/JSONSelectTests

References

In no particular order.