src/kudu/codegen/README - kudu - Git at Google


 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

 ===============================================================================
 Code Generation Interface
 ===============================================================================

 The codegen directory houses code which is compiled with LLVM code generation
 utilities. The point of code generation is to have code that is generated at
 run time which is optimized to run on data specific to usage that can only be
 described at run time. For instance, code which projects rows during a scan
 relies on the types of the data stored in each of the columns, but these are
 only determined by a run time schema. To alleviate this issue, a row projector
 can be compiled with schema-specific machine code to run on the current rows.

 Note the following classes, whose headers are LLVM-independent and thus intended
 to be used by the rest of project without introducing additional dependencies:

 CompilationManager (compilation_manager.h)
 RowProjector (row_projector.h)

 (Other classes also avoid LLVM headers, but they have little external use).

 CompilationManager
 ------------------

 The compilation manager takes care of asynchronous compilation tasks. It
 accepts requests to compile new objects. If the requested object is already
 cached, then the compiled object is returned. Otherwise, the compilation request
 is enqueued and eventually carried out.

 The manager can be accessed (and thus compiled code requests can be made)
 by using the GetSingleton() method. Yes - there's a universal singleton for
 compilation management. See the header for details.

 The manager allows for waiting for all current compilations to finish, and can
 register its metrics (which include code cache performance) upon request.

 No cleanup is necessary for the CompilationManager. It registers a shutdown method
 with the exit handler.

 Generated objects
 -----------------

 * codegen::RowProjector - A row projector has the same interface as a
 common::RowProjector, but supports a narrower scope of row types and arenas.
 It does not allow its schema to be reset (indeed, that's the point of compiling
 to a specific schema). The row projector's behavior is fully determined by
 the base and projection schemas. As such, the compilation manager expects those
 two items when retrieving a row projector.

 ================================================================================
 Code Generation Implementation Details
 ================================================================================

 Code generation works by creating what is essentially an assembly language
 file for the desired object, then handing off that assembly to the LLVM
 MCJIT compiler. The LLVM backend handles generating target-dependent machine
 code. After code generation, the machine code, which is represented as a
 shared object in memory, is dynamically linked to the invoking application
 (i.e., this one), and the newly generated code becomes available.

 Overview of LLVM-interfacing classes
 ------------------------------------

 Most of the interfacing with LLVM is handled by the CodeGenerator
 (code_generator.h) and ModuleBuilder (module_builder.h) classes. The CodeGenerator
 takes care of setting up static intializations that LLVM is dependent on and
 provides an interface which wraps around various calls to LLVM compilation
 functions.

 The ModuleBuilder takes care of the one-time construction of a module, which is
 LLVM's unit of code. A module is its own namespace containing functions that
 are compiled together. Currently, LLVM does not support having multiple
 modules per execution engine so the code is coupled with an ExecutionEngine
 instance which owns the generated code behind the scenes (the ExecutionEngine is
 the LLVM class responsible for actual compilation and running of the dynamically
 linked code). Note throughout the directory the execution engine is referred to
 (actually typedef-ed as) a JITCodeOwner, because to every single class except
 the ModuleBuilder that is all the execution engine is good for. Once the
 destructor to a JITCodeOwner object is called, the associated data is deleted.

 In turn, the ModuleBuilder provides a minimal interface to code-generating
 classes (classes that accept data specific to a certain request and create the
 LLVM IR - the assembly that was mentioned earlier - that is appropriate for
 the specific data). The classes fill up the module with the desired assembly.

 Sequence of operation
 ---------------------

 The parts come together as follows (in the case that the code cache is empty).

 1. External component requests some compiled object for certain runtime-
 dependent data (e.g. a row projector for a base and projection schemas).
 2. The CompilationManager accepts the request, but finds no such object
 is cached.
 3. The CompilationManager enqueues a request to compile said object to its
 own threadpool, and responds with failure to the external component.
 4. Eventually, a thread becomes available to take on the compilation task. The
 task is dequeued and the CodeGenerator's compilation method for the request is
 called.
 5. The code generator checks that code generation is enabled, and makes a call
 to the appropriate code-generating classes.
 6. The classes rely on the ModuleBuilder to compile their code, after which
 they return pointers to the requested functions.

 Code-generating classes
 -----------------------

 As mentioned in steps (5) and (6), the code-generating classes are responsible
 for generating the LLVM IR which is compiled at run time for whatever specific
 requests the external components have.

 The "code-generating classes" implement the JITWrapper (jit_wrapper.h) interface.
 The base class requires an owning reference to a JITCodeOwner, intended to be the
 owner of the JIT-compiled code that the JITWrapper derived class refers to.

 On top of containing the JITCodeOwner and pointers to JIT-compiled functions,
 the JITWrapper also provides methods which enable code caching. Caching compiled
 code is essential because compilation times are prohibitively slow, so satisfying
 any single request with freshly compiled code is not an option. As such, each
 piece of compiled code should be associated with some run time determined data.

 In the case of a row projector, this data is a pair of schemas, for the base
 and the projection. In order to work for arbitrary types (so we do not need
 multiple code caches for each different compiled object), the JITWrapper
 implementation must be able to provide a byte string key encoding of its
 associated data. This provides the key for the aforementioned cache. Similarly,
 there should be a static method which allows encoding such a key without
 generating a new instance (every time there is a request made to the manager,
 the manager needs to generate the byte string key to look it up in the cache).

 For instance, the JITWrapper for RowProjector code, RowProjectorFunctions, has
 the following method:

 static Status EncodeKey(const Schema& base, const Schema& proj,
                         faststring* out);

 For any given input (pair of schemas), the JITWrapper generates a unique key
 so that the cache can be looked up for the generated row projector in later
 requests (the manager handles the cache lookups).

 In order to keep one homogeneous cache of all the generated code, the keys
 need to be unique across classes, which is difficult to maintain because the
 encodings could conflict by accident. For this reason, a type identifier should
 be prefixed to the beginning of every key. This identifier is an enum, with
 values for each JITWrapper derived type, thus guaranteeing uniqueness between
 classes.

 Guide to creating new codegenned classes
 ----------------------------------------

 To add new classes with code generation, one needs to generate the appropriate
 JITWrapper and update the higher-level classes.

 First, the inputs to code generation need to be established (henceforth referred
 to as just "inputs").

 1. Making a new JITWrapper

 A new JITWrapper should derive from the JITWrapper class and expose a static
 key-generation method which returns a key given the inputs for the class. To
 satisfy the prefix condition, a new enum value must be added in
 JITWrapper::JITWrapperType.

 The JITWrapper derived class should have a creation method that generates
 a shared reference to an instance of itself. The JITWrappers should only
 be handled through shared references because this ensures that the code owner
 within the class is kept alive exactly as long as references to code pointing with
 it exist (the derived class is the only class that should contain members which
 are pointers to the desired compiled functions for the given input).

 The actual creation of the compiled code is perhaps the hardest part. See the
 section below.

 2. Updating top-level classes

 On top of adding the new enum value in the JITWrapper enumeration, several other
 top-level classes should provide the interfaces necessary to use the new
 codegen class (the layer of interface classes enables separate components
 of kudu to be independent of LLVM headers).

 In the CodeGenerator, there should be a Compile...(inputs) function which
 creates a scoped_refptr to the derived JITWrapper class by invoking the
 class' creation method. Note that the CodeGenerator should also print
 the appropriate LLVM disassembly if the flag is activated.

 The compilation manager should likewise offer a Request...(inputs) function
 that returns the requested compiled functions by looking up the cache for the
 inputs by generating a key with the static encoding method mentioned above. If the
 cache lookup fails, the manager should submit a new compilation request. The
 cache hit metrics should be incremented appropriately.

 Guide to code generation
 ------------------------

 The resources at the bottom of this document provide a good reference for
 LLVM IR. However, there should be little need to use much LLVM IR because the
 majority of the LLVM code can be precompiled.

 If you wish to execute certain functions A, B, or C based on the input data which
 takes on values 1, 2, or 3, then do the following:

 1. Write A, B, and C in an extern "C" namespace (to avoid name mangling) in
 codegen/precompiled.cc.
 2. When creating your derived JITWrapper class, create a ModuleBuilder. The
 builder should load your functions A, B, and C automatically.
 3. Create an LLVM IR function dependent on the inputs. I.e., if the input
 for code generation is 1, then the desired function would be A. In that case,
 request the module builder for a function called "A". The builder, when compiled,
 will offer a pointer to the compiled function.

 Note in the above example the only utility of code generation is avoiding
 a couple of branches which decide on A, B, or C based on input data 1, 2, or 3.

 Code generation gets much more mileage from constant propagation. To utilize this,
 one needs to generate a new function in LLVM IR at run time which passes
 arguments to the precompiled functions, with hopefully some relevant constants
 based on the input data. When LLVM compiles the module, it will propagate those
 constants, creating more efficient machine code.

 To create a function in a module at run time, you need to use a
 ModuleBuilder::LLVMBuilder. The builder emits LLVM IR dynamically. It is an
 alias for the llvm::IRBuilder<> class, whose API is available in the links at
 the bottom of this document. A worked example is available in row_projector.cc.

 Useful resources
 ----------------
 http://llvm.org/docs/doxygen/html/index.html
 http://llvm.org/docs/tutorial/
 http://llvm.org/docs/LangRef.html

 Debugging
 ---------

 Debug info is available by printing the generated code. See the flags declared
 in code_generator.cc for further details.

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.

	===============================================================================
	Code Generation Interface
	===============================================================================

	The codegen directory houses code which is compiled with LLVM code generation
	utilities. The point of code generation is to have code that is generated at
	run time which is optimized to run on data specific to usage that can only be
	described at run time. For instance, code which projects rows during a scan
	relies on the types of the data stored in each of the columns, but these are
	only determined by a run time schema. To alleviate this issue, a row projector
	can be compiled with schema-specific machine code to run on the current rows.

	Note the following classes, whose headers are LLVM-independent and thus intended
	to be used by the rest of project without introducing additional dependencies:

	CompilationManager (compilation_manager.h)
	RowProjector (row_projector.h)

	(Other classes also avoid LLVM headers, but they have little external use).

	CompilationManager
	------------------

	The compilation manager takes care of asynchronous compilation tasks. It
	accepts requests to compile new objects. If the requested object is already
	cached, then the compiled object is returned. Otherwise, the compilation request
	is enqueued and eventually carried out.

	The manager can be accessed (and thus compiled code requests can be made)
	by using the GetSingleton() method. Yes - there's a universal singleton for
	compilation management. See the header for details.

	The manager allows for waiting for all current compilations to finish, and can
	register its metrics (which include code cache performance) upon request.

	No cleanup is necessary for the CompilationManager. It registers a shutdown method
	with the exit handler.

	Generated objects
	-----------------

	* codegen::RowProjector - A row projector has the same interface as a
	common::RowProjector, but supports a narrower scope of row types and arenas.
	It does not allow its schema to be reset (indeed, that's the point of compiling
	to a specific schema). The row projector's behavior is fully determined by
	the base and projection schemas. As such, the compilation manager expects those
	two items when retrieving a row projector.

	================================================================================
	Code Generation Implementation Details
	================================================================================

	Code generation works by creating what is essentially an assembly language
	file for the desired object, then handing off that assembly to the LLVM
	MCJIT compiler. The LLVM backend handles generating target-dependent machine
	code. After code generation, the machine code, which is represented as a
	shared object in memory, is dynamically linked to the invoking application
	(i.e., this one), and the newly generated code becomes available.

	Overview of LLVM-interfacing classes
	------------------------------------

	Most of the interfacing with LLVM is handled by the CodeGenerator
	(code_generator.h) and ModuleBuilder (module_builder.h) classes. The CodeGenerator
	takes care of setting up static intializations that LLVM is dependent on and
	provides an interface which wraps around various calls to LLVM compilation
	functions.

	The ModuleBuilder takes care of the one-time construction of a module, which is
	LLVM's unit of code. A module is its own namespace containing functions that
	are compiled together. Currently, LLVM does not support having multiple
	modules per execution engine so the code is coupled with an ExecutionEngine
	instance which owns the generated code behind the scenes (the ExecutionEngine is
	the LLVM class responsible for actual compilation and running of the dynamically
	linked code). Note throughout the directory the execution engine is referred to
	(actually typedef-ed as) a JITCodeOwner, because to every single class except
	the ModuleBuilder that is all the execution engine is good for. Once the
	destructor to a JITCodeOwner object is called, the associated data is deleted.

	In turn, the ModuleBuilder provides a minimal interface to code-generating
	classes (classes that accept data specific to a certain request and create the
	LLVM IR - the assembly that was mentioned earlier - that is appropriate for
	the specific data). The classes fill up the module with the desired assembly.

	Sequence of operation
	---------------------

	The parts come together as follows (in the case that the code cache is empty).

	1. External component requests some compiled object for certain runtime-
	dependent data (e.g. a row projector for a base and projection schemas).
	2. The CompilationManager accepts the request, but finds no such object
	is cached.
	3. The CompilationManager enqueues a request to compile said object to its
	own threadpool, and responds with failure to the external component.
	4. Eventually, a thread becomes available to take on the compilation task. The
	task is dequeued and the CodeGenerator's compilation method for the request is
	called.
	5. The code generator checks that code generation is enabled, and makes a call
	to the appropriate code-generating classes.
	6. The classes rely on the ModuleBuilder to compile their code, after which
	they return pointers to the requested functions.

	Code-generating classes
	-----------------------

	As mentioned in steps (5) and (6), the code-generating classes are responsible
	for generating the LLVM IR which is compiled at run time for whatever specific
	requests the external components have.

	The "code-generating classes" implement the JITWrapper (jit_wrapper.h) interface.
	The base class requires an owning reference to a JITCodeOwner, intended to be the
	owner of the JIT-compiled code that the JITWrapper derived class refers to.

	On top of containing the JITCodeOwner and pointers to JIT-compiled functions,
	the JITWrapper also provides methods which enable code caching. Caching compiled
	code is essential because compilation times are prohibitively slow, so satisfying
	any single request with freshly compiled code is not an option. As such, each
	piece of compiled code should be associated with some run time determined data.

	In the case of a row projector, this data is a pair of schemas, for the base
	and the projection. In order to work for arbitrary types (so we do not need
	multiple code caches for each different compiled object), the JITWrapper
	implementation must be able to provide a byte string key encoding of its
	associated data. This provides the key for the aforementioned cache. Similarly,
	there should be a static method which allows encoding such a key without
	generating a new instance (every time there is a request made to the manager,
	the manager needs to generate the byte string key to look it up in the cache).

	For instance, the JITWrapper for RowProjector code, RowProjectorFunctions, has
	the following method:

	static Status EncodeKey(const Schema& base, const Schema& proj,
	faststring* out);

	For any given input (pair of schemas), the JITWrapper generates a unique key
	so that the cache can be looked up for the generated row projector in later
	requests (the manager handles the cache lookups).

	In order to keep one homogeneous cache of all the generated code, the keys
	need to be unique across classes, which is difficult to maintain because the
	encodings could conflict by accident. For this reason, a type identifier should
	be prefixed to the beginning of every key. This identifier is an enum, with
	values for each JITWrapper derived type, thus guaranteeing uniqueness between
	classes.

	Guide to creating new codegenned classes
	----------------------------------------

	To add new classes with code generation, one needs to generate the appropriate
	JITWrapper and update the higher-level classes.

	First, the inputs to code generation need to be established (henceforth referred
	to as just "inputs").

	1. Making a new JITWrapper

	A new JITWrapper should derive from the JITWrapper class and expose a static
	key-generation method which returns a key given the inputs for the class. To
	satisfy the prefix condition, a new enum value must be added in
	JITWrapper::JITWrapperType.

	The JITWrapper derived class should have a creation method that generates
	a shared reference to an instance of itself. The JITWrappers should only
	be handled through shared references because this ensures that the code owner
	within the class is kept alive exactly as long as references to code pointing with
	it exist (the derived class is the only class that should contain members which
	are pointers to the desired compiled functions for the given input).

	The actual creation of the compiled code is perhaps the hardest part. See the
	section below.

	2. Updating top-level classes

	On top of adding the new enum value in the JITWrapper enumeration, several other
	top-level classes should provide the interfaces necessary to use the new
	codegen class (the layer of interface classes enables separate components
	of kudu to be independent of LLVM headers).

	In the CodeGenerator, there should be a Compile...(inputs) function which
	creates a scoped_refptr to the derived JITWrapper class by invoking the
	class' creation method. Note that the CodeGenerator should also print
	the appropriate LLVM disassembly if the flag is activated.

	The compilation manager should likewise offer a Request...(inputs) function
	that returns the requested compiled functions by looking up the cache for the
	inputs by generating a key with the static encoding method mentioned above. If the
	cache lookup fails, the manager should submit a new compilation request. The
	cache hit metrics should be incremented appropriately.

	Guide to code generation
	------------------------

	The resources at the bottom of this document provide a good reference for
	LLVM IR. However, there should be little need to use much LLVM IR because the
	majority of the LLVM code can be precompiled.

	If you wish to execute certain functions A, B, or C based on the input data which
	takes on values 1, 2, or 3, then do the following:

	1. Write A, B, and C in an extern "C" namespace (to avoid name mangling) in
	codegen/precompiled.cc.
	2. When creating your derived JITWrapper class, create a ModuleBuilder. The
	builder should load your functions A, B, and C automatically.
	3. Create an LLVM IR function dependent on the inputs. I.e., if the input
	for code generation is 1, then the desired function would be A. In that case,
	request the module builder for a function called "A". The builder, when compiled,
	will offer a pointer to the compiled function.

	Note in the above example the only utility of code generation is avoiding
	a couple of branches which decide on A, B, or C based on input data 1, 2, or 3.

	Code generation gets much more mileage from constant propagation. To utilize this,
	one needs to generate a new function in LLVM IR at run time which passes
	arguments to the precompiled functions, with hopefully some relevant constants
	based on the input data. When LLVM compiles the module, it will propagate those
	constants, creating more efficient machine code.

	To create a function in a module at run time, you need to use a
	ModuleBuilder::LLVMBuilder. The builder emits LLVM IR dynamically. It is an
	alias for the llvm::IRBuilder<> class, whose API is available in the links at
	the bottom of this document. A worked example is available in row_projector.cc.

	Useful resources
	----------------
	http://llvm.org/docs/doxygen/html/index.html
	http://llvm.org/docs/tutorial/
	http://llvm.org/docs/LangRef.html

	Debugging
	---------

	Debug info is available by printing the generated code. See the flags declared
	in code_generator.cc for further details.