#Generator functions

##What are generator functions?

Generator functions are functions that can return a sequence of values rather than a single value. In many programming languages, they look like functions but behave like iterators.

In Quickstep, a generator function takes a list of constant arguments at query compile time, and populates a relation at query run time. One example is the generate_series function, which in the simplest case takes two integer arguments (start, end) and generates a single-column table with values start, start+1, ..., end.
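
To make the semantics of the simplest two-argument form concrete, here is a minimal C++ sketch of the values that end up in the single column. `GenerateSeriesValues` is a hypothetical helper written for illustration and is not part of Quickstep. The sketch also assumes that a range with start greater than end produces no rows, which the text above does not specify.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not Quickstep code: returns the values that
// generate_series(start, end) would place in its single column.
std::vector<std::int64_t> GenerateSeriesValues(std::int64_t start,
                                               std::int64_t end) {
  std::vector<std::int64_t> column;
  // Assumption of this sketch: an empty range yields an empty relation.
  if (start > end) {
    return column;
  }
  // Stop by comparing against end before incrementing, so the loop
  // does not overflow when end is the largest representable value.
  for (std::int64_t value = start;; ++value) {
    column.push_back(value);
    if (value == end) {
      break;
    }
  }
  return column;
}
```

Calling `GenerateSeriesValues(1, 5)` yields the five values 1 through 5, one per row of the generated relation.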

##Use generator functions

Usage of a generator function is similar to that of a normal function except that it is placed inside a SQL query's FROM clause.

For example:

SELECT * FROM generate_series(1, 5);

And the output will be:

|              1|
|              2|
|              3|
|              4|
|              5|

Here generate_series(1, 5) stands for a relation with 5 rows.

A more complex example is:

SELECT i, j, i*j
FROM generate_series(1, 3) AS gs1(i),
     generate_series(1, 3) AS gs2(j)
WHERE i <= j;

This query is a theta-join between the two generated relations: its join condition, i <= j, is a general comparison rather than an equality. The output will be:

|i          |j          |(i*j)      |
|          1|          1|          1|
|          1|          2|          2|
|          1|          3|          3|
|          2|          2|          4|
|          2|          3|          6|
|          3|          3|          9|
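
The result can be reproduced with a plain nested loop, which is the most direct way to evaluate a theta-join. The sketch below uses a hypothetical helper, `ThetaJoinSeries`, and says nothing about the join algorithm Quickstep actually chooses. A real engine may also return the rows in a different order, since the query has no ORDER BY.

```cpp
#include <tuple>
#include <vector>

// Hypothetical helper, not Quickstep code: evaluates
//   SELECT i, j, i*j
//   FROM generate_series(1, n) AS gs1(i), generate_series(1, n) AS gs2(j)
//   WHERE i <= j
// with a nested loop over both generated relations.
std::vector<std::tuple<int, int, int>> ThetaJoinSeries(int n) {
  std::vector<std::tuple<int, int, int>> rows;
  for (int i = 1; i <= n; ++i) {      // rows of gs1
    for (int j = 1; j <= n; ++j) {    // rows of gs2
      if (i <= j) {                   // the theta-join predicate
        rows.emplace_back(i, j, i * j);
      }
    }
  }
  return rows;
}
```

With `n = 3` the loop keeps six of the nine candidate pairs, matching the six rows in the table above.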

##Overview of the design that supports the feature

  1. Given a SQL query, each generator function call is first parsed into an abstract syntax tree node that carries the function's name and arguments.
     * See ParseGeneratorTableReference, the abstract syntax tree node for generator functions.
  1. Then, in Resolver, the generator function is resolved and its arguments are validated. If this succeeds, we obtain a function handle that contains the actual implementation of the function. A logical plan node is then created to hold the function handle and to represent the relation that the function will generate.
     * See GeneratorFunctionFactory, the factory class that resolves a function from its name.
     * See GeneratorFunction, the abstract function class that checks the validity of arguments and creates the function handle.
     * See GeneratorFunctionHandle, the function handle class that provides the actual implementation of a particular generator function.
     * See logical::TableGenerator, the logical plan node class that represents the relation that a generator function will create.
  1. The logical plan node logical::TableGenerator is then transformed into a physical plan node physical::TableGenerator.
     * See physical::TableGenerator, the physical plan node class that represents the relation that a generator function will create.
  1. At the final stage of query planning, the physical plan node is transformed into a relational operator.
     * See TableGeneratorOperator, the relational operator class that produces run-time work orders.
  1. At query execution time, the function handle's populateColumns() method is invoked to actually generate the relation.
     * See TableGeneratorWorkOrder, the work order class that invokes the generator function to populate a relation at run time.
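
The division of labor between factory, function, and handle can be illustrated with a heavily simplified, self-contained sketch. Every type and function name below (`SketchHandle`, `SketchFunction`, `SketchGenerateSeries`, `SketchFactory`, `RunSketch`) is a hypothetical stand-in and does not reflect Quickstep's actual class signatures. The sketch also collapses the parser, the plan nodes, and the work order into one function, and uses a single concrete handle class instead of an abstract one.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a function handle: holds validated constant arguments
// and produces the rows at run time.
class SketchHandle {
 public:
  SketchHandle(std::int64_t start, std::int64_t end)
      : start_(start), end_(end) {}
  void populateColumn(std::vector<std::int64_t> *column) const {
    for (std::int64_t v = start_; v <= end_; ++v) column->push_back(v);
  }

 private:
  std::int64_t start_;
  std::int64_t end_;
};

// Stand-in for the abstract function: validates arguments at compile
// time and creates a handle.
class SketchFunction {
 public:
  virtual ~SketchFunction() = default;
  virtual std::unique_ptr<SketchHandle> createHandle(
      const std::vector<std::int64_t> &args) const = 0;
};

class SketchGenerateSeries : public SketchFunction {
 public:
  static const SketchGenerateSeries &Instance() {
    static const SketchGenerateSeries instance;
    return instance;
  }
  std::unique_ptr<SketchHandle> createHandle(
      const std::vector<std::int64_t> &args) const override {
    if (args.size() != 2) {
      throw std::invalid_argument("generate_series expects (start, end)");
    }
    return std::make_unique<SketchHandle>(args[0], args[1]);
  }

 private:
  SketchGenerateSeries() {}
};

// Stand-in for the factory: resolves a function from its name.
class SketchFactory {
 public:
  SketchFactory() {
    functions_["generate_series"] = &SketchGenerateSeries::Instance();
  }
  const SketchFunction *getByName(const std::string &name) const {
    const auto it = functions_.find(name);
    return it == functions_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, const SketchFunction *> functions_;
};

// Resolve and validate first (query compile time), then populate the
// relation (query run time).
std::vector<std::int64_t> RunSketch(const std::string &name,
                                    const std::vector<std::int64_t> &args) {
  const SketchFactory factory;
  const SketchFunction *function = factory.getByName(name);
  if (function == nullptr) {
    throw std::invalid_argument("unknown generator function: " + name);
  }
  const std::unique_ptr<SketchHandle> handle = function->createHandle(args);
  std::vector<std::int64_t> column;
  handle->populateColumn(&column);
  return column;
}
```

The point of the split is that argument errors surface when the handle is created, before any execution starts, while the handle itself carries everything needed to generate rows later.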

##Implement a new generator function

There are just three steps to implement a new generator function.

  1. Subclass GeneratorFunctionHandle and implement all virtual methods.
     * See GenerateSeriesHandle as an example.
  1. Subclass GeneratorFunction and implement all virtual methods. Also define a static Instance() method in the subclass that returns a reference to a singleton instance.
     * See GenerateSeries as an example.
  1. Register the GeneratorFunction subclass with GeneratorFunctionFactory by adding a new line to GeneratorFunctionFactory's constructor:
GeneratorFunctionFactory::GeneratorFunctionFactory() {
  // Register all generator functions here.