blob: fb0a1e13f3dbba7ca74f96e684dbe47c2f12cb14 [file] [log] [blame]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
===============================================================================
Code Generation Interface
===============================================================================
The codegen directory houses code which is compiled with LLVM code generation
utilities. The point of code generation is to have code that is generated at
run time which is optimized to run on data specific to usage that can only be
described at run time. For instance, code which projects rows during a scan
relies on the types of the data stored in each of the columns, but these are
only determined by a run time schema. To alleviate this issue, a row projector
can be compiled with schema-specific machine code to run on the current rows.
Note the following classes, whose headers are LLVM-independent and thus intended
to be used by the rest of project without introducing additional dependencies:
CompilationManager (compilation_manager.h)
RowProjector (row_projector.h)
(Other classes also avoid LLVM headers, but they have little external use).
CompilationManager
------------------
The compilation manager takes care of asynchronous compilation tasks. It
accepts requests to compile new objects. If the requested object is already
cached, then the compiled object is returned. Otherwise, the compilation request
is enqueued and eventually carried out.
The manager can be accessed (and thus compiled code requests can be made)
by using the GetSingleton() method. Yes - there's a universal singleton for
compilation management. See the header for details.
The manager allows for waiting for all current compilations to finish, and can
register its metrics (which include code cache performance) upon request.
No cleanup is necessary for the CompilationManager. It registers a shutdown method
with the exit handler.
Generated objects
-----------------
* codegen::RowProjector - A row projector has the same interface as a
common::RowProjector, but supports a narrower scope of row types and arenas.
It does not allow its schema to be reset (indeed, that's the point of compiling
to a specific schema). The row projector's behavior is fully determined by
the base and projection schemas. As such, the compilation manager expects those
two items when retrieving a row projector.
================================================================================
Code Generation Implementation Details
================================================================================
Code generation works by creating what is essentially an assembly language
file for the desired object, then handing off that assembly to the LLVM
MCJIT compiler. The LLVM backend handles generating target-dependent machine
code. After code generation, the machine code, which is represented as a
shared object in memory, is dynamically linked to the invoking application
(i.e., this one), and the newly generated code becomes available.
Overview of LLVM-interfacing classes
------------------------------------
Most of the interfacing with LLVM is handled by the CodeGenerator
(code_generator.h) and ModuleBuilder (module_builder.h) classes. The CodeGenerator
takes care of setting up static intializations that LLVM is dependent on and
provides an interface which wraps around various calls to LLVM compilation
functions.
The ModuleBuilder takes care of the one-time construction of a module, which is
LLVM's unit of code. A module is its own namespace containing functions that
are compiled together. Currently, LLVM does not support having multiple
modules per execution engine so the code is coupled with an ExecutionEngine
instance which owns the generated code behind the scenes (the ExecutionEngine is
the LLVM class responsible for actual compilation and running of the dynamically
linked code). Note throughout the directory the execution engine is referred to
(actually typedef-ed as) a JITCodeOwner, because to every single class except
the ModuleBuilder that is all the execution engine is good for. Once the
destructor to a JITCodeOwner object is called, the associated data is deleted.
In turn, the ModuleBuilder provides a minimal interface to code-generating
classes (classes that accept data specific to a certain request and create the
LLVM IR - the assembly that was mentioned earlier - that is appropriate for
the specific data). The classes fill up the module with the desired assembly.
Sequence of operation
---------------------
The parts come together as follows (in the case that the code cache is empty).
1. External component requests some compiled object for certain runtime-
dependent data (e.g. a row projector for a base and projection schemas).
2. The CompilationManager accepts the request, but finds no such object
is cached.
3. The CompilationManager enqueues a request to compile said object to its
own threadpool, and responds with failure to the external component.
4. Eventually, a thread becomes available to take on the compilation task. The
task is dequeued and the CodeGenerator's compilation method for the request is
called.
5. The code generator checks that code generation is enabled, and makes a call
to the appropriate code-generating classes.
6. The classes rely on the ModuleBuilder to compile their code, after which
they return pointers to the requested functions.
Code-generating classes
-----------------------
As mentioned in steps (5) and (6), the code-generating classes are responsible
for generating the LLVM IR which is compiled at run time for whatever specific
requests the external components have.
The "code-generating classes" implement the JITWrapper (jit_wrapper.h) interface.
The base class requires an owning reference to a JITCodeOwner, intended to be the
owner of the JIT-compiled code that the JITWrapper derived class refers to.
On top of containing the JITCodeOwner and pointers to JIT-compiled functions,
the JITWrapper also provides methods which enable code caching. Caching compiled
code is essential because compilation times are prohibitively slow, so satisfying
any single request with freshly compiled code is not an option. As such, each
piece of compiled code should be associated with some run time determined data.
In the case of a row projector, this data is a pair of schemas, for the base
and the projection. In order to work for arbitrary types (so we do not need
multiple code caches for each different compiled object), the JITWrapper
implementation must be able to provide a byte string key encoding of its
associated data. This provides the key for the aforementioned cache. Similarly,
there should be a static method which allows encoding such a key without
generating a new instance (every time there is a request made to the manager,
the manager needs to generate the byte string key to look it up in the cache).
For instance, the JITWrapper for RowProjector code, RowProjectorFunctions, has
the following method:
static Status EncodeKey(const Schema& base, const Schema& proj,
faststring* out);
For any given input (pair of schemas), the JITWrapper generates a unique key
so that the cache can be looked up for the generated row projector in later
requests (the manager handles the cache lookups).
In order to keep one homogeneous cache of all the generated code, the keys
need to be unique across classes, which is difficult to maintain because the
encodings could conflict by accident. For this reason, a type identifier should
be prefixed to the beginning of every key. This identifier is an enum, with
values for each JITWrapper derived type, thus guaranteeing uniqueness between
classes.
Guide to creating new codegenned classes
----------------------------------------
To add new classes with code generation, one needs to generate the appropriate
JITWrapper and update the higher-level classes.
First, the inputs to code generation need to be established (henceforth referred
to as just "inputs").
1. Making a new JITWrapper
A new JITWrapper should derive from the JITWrapper class and expose a static
key-generation method which returns a key given the inputs for the class. To
satisfy the prefix condition, a new enum value must be added in
JITWrapper::JITWrapperType.
The JITWrapper derived class should have a creation method that generates
a shared reference to an instance of itself. The JITWrappers should only
be handled through shared references because this ensures that the code owner
within the class is kept alive exactly as long as references to code pointing with
it exist (the derived class is the only class that should contain members which
are pointers to the desired compiled functions for the given input).
The actual creation of the compiled code is perhaps the hardest part. See the
section below.
2. Updating top-level classes
On top of adding the new enum value in the JITWrapper enumeration, several other
top-level classes should provide the interfaces necessary to use the new
codegen class (the layer of interface classes enables separate components
of kudu to be independent of LLVM headers).
In the CodeGenerator, there should be a Compile...(inputs) function which
creates a scoped_refptr to the derived JITWrapper class by invoking the
class' creation method. Note that the CodeGenerator should also print
the appropriate LLVM disassembly if the flag is activated.
The compilation manager should likewise offer a Request...(inputs) function
that returns the requested compiled functions by looking up the cache for the
inputs by generating a key with the static encoding method mentioned above. If the
cache lookup fails, the manager should submit a new compilation request. The
cache hit metrics should be incremented appropriately.
Guide to code generation
------------------------
The resources at the bottom of this document provide a good reference for
LLVM IR. However, there should be little need to use much LLVM IR because the
majority of the LLVM code can be precompiled.
If you wish to execute certain functions A, B, or C based on the input data which
takes on values 1, 2, or 3, then do the following:
1. Write A, B, and C in an extern "C" namespace (to avoid name mangling) in
codegen/precompiled.cc.
2. When creating your derived JITWrapper class, create a ModuleBuilder. The
builder should load your functions A, B, and C automatically.
3. Create an LLVM IR function dependent on the inputs. I.e., if the input
for code generation is 1, then the desired function would be A. In that case,
request the module builder for a function called "A". The builder, when compiled,
will offer a pointer to the compiled function.
Note in the above example the only utility of code generation is avoiding
a couple of branches which decide on A, B, or C based on input data 1, 2, or 3.
Code generation gets much more mileage from constant propagation. To utilize this,
one needs to generate a new function in LLVM IR at run time which passes
arguments to the precompiled functions, with hopefully some relevant constants
based on the input data. When LLVM compiles the module, it will propagate those
constants, creating more efficient machine code.
To create a function in a module at run time, you need to use a
ModuleBuilder::LLVMBuilder. The builder emits LLVM IR dynamically. It is an
alias for the llvm::IRBuilder<> class, whose API is available in the links at
the bottom of this document. A worked example is available in row_projector.cc.
Useful resources
----------------
http://llvm.org/docs/doxygen/html/index.html
http://llvm.org/docs/tutorial/
http://llvm.org/docs/LangRef.html
Debugging
---------
Debug info is available by printing the generated code. See the flags declared
in code_generator.cc for further details.