#Generator functions
##What are generator functions?
Generator functions are functions that can return a sequence of values rather than a single value. In [many programming languages](https://en.wikipedia.org/wiki/Generator_%28computer_programming%29), they look like functions but behave like iterators.
In Quickstep, a generator function takes a list of constant arguments at query compile time, and populates a relation at query run time. One example is the `generate_series` function, which in the simplest case takes two integer arguments `(start, end)` and generates a single-column table with values `start, start+1, ..., end`.
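The semantics of the simplest `generate_series` form can be sketched in C++ as follows. This only illustrates the behavior described above and is not Quickstep's implementation: the name `GenerateSeriesSketch` and the use of `std::vector<int>` as a stand-in for a single-column relation are assumptions, as is the requirement that `start <= end`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for generate_series(start, end): returns the values
// start, start+1, ..., end as one column. Assumes start <= end.
std::vector<int> GenerateSeriesSketch(int start, int end) {
  assert(start <= end);
  std::vector<int> column;
  // Stop on equality rather than testing `value <= end`, so the counter is
  // never incremented past `end`.
  for (int value = start;; ++value) {
    column.push_back(value);
    if (value == end) {
      break;
    }
  }
  return column;
}
```

Both bounds are inclusive, so `GenerateSeriesSketch(1, 5)` yields five values.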
##Use generator functions
Usage of a generator function is similar to that of a normal function except that it is placed inside a SQL query's `FROM` clause.
For example:
```
SELECT *
FROM generate_series(1, 5);
```
And the output will be:
```
+---------------+
|generate_series|
+---------------+
|              1|
|              2|
|              3|
|              4|
|              5|
+---------------+
```
Here `generate_series(1, 5)` stands for a relation with 5 rows.
A more complex example is:
```
SELECT i, j, i*j
FROM generate_series(1, 3) AS gs1(i),
     generate_series(1, 3) AS gs2(j)
WHERE i <= j;
```
It is a theta-join between two generated relations. The output will be:
```
+-----------+-----------+-----------+
|i          |j          |(i*j)      |
+-----------+-----------+-----------+
|          1|          1|          1|
|          1|          2|          2|
|          1|          3|          3|
|          2|          2|          4|
|          2|          3|          6|
|          3|          3|          9|
+-----------+-----------+-----------+
```
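To make the join semantics concrete, the query above can be read as a nested loop over the two generated relations with `i <= j` as the filter. The following is a minimal sketch of that reading; `ThetaJoinSketch` and `Row` are hypothetical names, and a real query engine is free to choose a different join algorithm and a different row order.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// One output row of the query: (i, j, i*j).
using Row = std::tuple<int, int, int>;

// Hypothetical nested-loop evaluation of
//   SELECT i, j, i*j
//   FROM generate_series(1, n) AS gs1(i), generate_series(1, n) AS gs2(j)
//   WHERE i <= j;
std::vector<Row> ThetaJoinSketch(int n) {
  assert(n >= 1);
  std::vector<Row> rows;
  for (int i = 1; i <= n; ++i) {      // rows of gs1
    for (int j = 1; j <= n; ++j) {    // rows of gs2
      if (i <= j) {                   // join predicate
        rows.emplace_back(i, j, i * j);
      }
    }
  }
  return rows;
}
```

With `n = 3`, the loop keeps 6 of the 9 candidate pairs, matching the six rows in the output above.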
##Overview of the design that supports the feature
1. Given a SQL query, each generator function call is first parsed into an abstract syntax tree node that carries the function's name and arguments.
- *See `ParseGeneratorTableReference` as the abstract syntax tree node for generator functions.*
2. Then, in `Resolver`, the generator function is resolved and its arguments are validated. If this succeeds, we obtain a function handle that contains the actual implementation of the function. A logical plan node is then created to hold the function handle and to represent the relation that the function will generate.
- *See `GeneratorFunctionFactory` as the factory class that resolves a function from its name.*
- *See `GeneratorFunction` as the abstract function class that checks the validity of arguments and creates the function handle.*
- *See `GeneratorFunctionHandle` as the function handle class that provides the actual implementation of a particular generator function.*
- *See `logical::TableGenerator` as the logical plan node class that represents the relation that a generator function will create.*
3. The logical plan node `logical::TableGenerator` is then transformed into a physical plan node `physical::TableGenerator`.
- *See `physical::TableGenerator` as the physical plan node class that represents the relation that a generator function will create.*
4. At the final stage of query planning, the physical plan node is transformed into a relational operator.
- *See `TableGeneratorOperator` as the relational operator class that represents an operator that will produce run-time work orders.*
5. At the time of query execution, the function handle's `populateColumns()` method is invoked to actually generate a relation.
- *See `TableGeneratorWorkOrder` as the work order class that actually invokes the generator function to populate a relation at run time.*
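The split between compile-time resolution and run-time population in the steps above can be sketched as follows. The class names `GeneratorFunction`, `GeneratorFunctionHandle`, `GenerateSeries`, `GenerateSeriesHandle` and the method `populateColumns()` come from this document, but everything else is an assumption made for illustration: the signatures, the `createHandle` method name, the integer-only arguments, the `Column` alias and the `PlanAndExecute` helper. The sketch also skips the parser, the logical and physical plan nodes, and the work orders.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {  // Simplified stand-ins, not Quickstep's real declarations.

using Column = std::vector<int>;

// Holds validated arguments and populates the relation at run time (step 5).
class GeneratorFunctionHandle {
 public:
  virtual ~GeneratorFunctionHandle() = default;
  virtual void populateColumns(Column *out) const = 0;
};

class GenerateSeriesHandle : public GeneratorFunctionHandle {
 public:
  GenerateSeriesHandle(int start, int end) : start_(start), end_(end) {}
  void populateColumns(Column *out) const override {
    assert(out != nullptr);
    for (int v = start_;; ++v) {  // start_ <= end_ was checked on creation
      out->push_back(v);
      if (v == end_) break;
    }
  }

 private:
  const int start_;
  const int end_;
};

// Validates the constant arguments and creates the handle (step 2).
class GeneratorFunction {
 public:
  virtual ~GeneratorFunction() = default;
  virtual std::unique_ptr<GeneratorFunctionHandle> createHandle(
      const std::vector<int> &args) const = 0;
};

class GenerateSeries : public GeneratorFunction {
 public:
  std::unique_ptr<GeneratorFunctionHandle> createHandle(
      const std::vector<int> &args) const override {
    if (args.size() != 2 || args[0] > args[1]) {
      throw std::invalid_argument("generate_series expects (start, end)");
    }
    return std::make_unique<GenerateSeriesHandle>(args[0], args[1]);
  }
};

// Resolve by name, validate, then run: a compressed view of steps 2 and 5.
Column PlanAndExecute(const std::string &name, const std::vector<int> &args) {
  static const GenerateSeries generate_series{};
  const std::map<std::string, const GeneratorFunction *> functions = {
      {"generate_series", &generate_series}};
  const auto it = functions.find(name);
  if (it == functions.end()) throw std::invalid_argument("unknown function");
  const auto handle = it->second->createHandle(args);  // query compile time
  Column relation;
  handle->populateColumns(&relation);                  // query run time
  return relation;
}

}  // namespace sketch
```

In this sketch, bad arguments are rejected when the handle is created, so the run-time step only ever sees a validated handle.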
##Implement a new generator function
There are just three steps to implement a new generator function.
1. Subclass **`GeneratorFunctionHandle`** and implement all virtual methods.
- *See `GenerateSeriesHandle` as an example.*
2. Subclass **`GeneratorFunction`** and implement all virtual methods. Also define a static `Instance()` method in the subclass to return a singleton instance reference.
- *See `GenerateSeries` as an example.*
3. Register the **`GeneratorFunction`** subclass with **`GeneratorFunctionFactory`** by adding a new line to `GeneratorFunctionFactory`'s constructor:
```
GeneratorFunctionFactory::GeneratorFunctionFactory() {
  ...
  // Register all generator functions here.
  REGISTER_GENERATOR_FUNCTION_(YourGeneratorFunctionSubclass);
  ...
}
```
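As a rough illustration of how steps 2 and 3 fit together, the sketch below shows one way a registration macro could map a function's name to its singleton instance. It is written under stated assumptions and does not reproduce Quickstep's code: the macro `SKETCH_REGISTER_GENERATOR_FUNCTION_`, the `CountDown` subclass, the `getName()` and `getByName()` methods, and the `func_map_` member are all hypothetical, and the real `REGISTER_GENERATOR_FUNCTION_` may work differently.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {  // Hypothetical stand-ins, not Quickstep's real classes.

class GeneratorFunction {
 public:
  virtual ~GeneratorFunction() = default;
  virtual std::string getName() const = 0;
};

// Step 2: a subclass that exposes a static Instance() singleton accessor.
class CountDown : public GeneratorFunction {
 public:
  static const CountDown &Instance() {
    static const CountDown instance;
    return instance;
  }
  std::string getName() const override { return "count_down"; }

 private:
  CountDown() {}  // Only Instance() may construct the singleton.
};

// Assumed shape of a registration macro: map the function's name to its
// singleton instance.
#define SKETCH_REGISTER_GENERATOR_FUNCTION_(FunctionClass)  \
  func_map_.emplace(FunctionClass::Instance().getName(),    \
                    &FunctionClass::Instance())

class GeneratorFunctionFactory {
 public:
  GeneratorFunctionFactory() {
    // Step 3: one registration line per generator function.
    SKETCH_REGISTER_GENERATOR_FUNCTION_(CountDown);
    assert(func_map_.size() == 1 && "each name must register exactly once");
  }

  // Returns nullptr when no function is registered under `name`.
  const GeneratorFunction *getByName(const std::string &name) const {
    const auto it = func_map_.find(name);
    return it == func_map_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, const GeneratorFunction *> func_map_;
};

}  // namespace sketch
```

Keying the map by name lets a resolver look a function up directly from the name that appears in the query.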