This document describes the initial design of the LLVM-based code generator backend.
The LLVM generator reuses the existing operator fusion optimizer. It has to compile LLVM IR and execute it from C++-based operator templates. As a first step, I will add support for cellwise operations on dense matrices.
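As a reference for what a fused cellwise operator has to compute, the Java sketch below applies one scalar expression to every cell of a dense matrix in a single pass. It only illustrates the semantics, not the planned implementation: CellwiseReference and apply are hypothetical names, and the matrix is assumed to be stored as a flat row-major double array.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical reference helper; not part of the existing code base.
final class CellwiseReference {
    // Applies f to every cell of a dense matrix held in a flat array
    // (row-major storage assumed). Returns a new array of the same length.
    static double[] apply(double[] a, DoubleUnaryOperator f) {
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = f.applyAsDouble(a[i]);
        }
        return c;
    }
}
```

For example, CellwiseReference.apply(x, v -> (v + 1) * 2) evaluates the two operations "+ 1" and "* 2" in one loop without materializing the intermediate matrix, which is the effect the fused LLVM kernel should reproduce.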
I will add a folder src/main/llvm that holds the LLVM header files and the jni_bridge files (header and cpp) used to interact through JNI. I will also reuse the existing helper functions GET/RELEASE_ARRAY to handle input arrays passed from Java code. Eventually, however, only a native proxy shared library will be added to the repository to avoid unnecessary dependencies, while the LLVM libraries will be loaded from the native library path, similar to the native BLAS libraries.
The following method will be exposed:
I will add a CMakeLists.txt to support compilation and linking, as was done for the CUDA files. Following the LLVM documentation, I've made a simple example (version 10.0.0) of how to use the LLVM API that can be found here. Technical note: I don't know which LLVM version is the best choice, since it has changed about every two months over the last year and I could not find any recommendations online, so any suggestions regarding this choice would be useful.
The SpoofLLVMContext class will have the following structure:
class SpoofLLVMContext {
private:
    std::unique_ptr<LLVMContext> context;
    std::unique_ptr<SMDiagnostic> error;
    std::string targetTriple;                          // target hardware specification
    // loaded spoof operators; held by pointer because llvm::Module is not copyable
    std::map<std::string, std::unique_ptr<Module>> loadedModules;
    std::unique_ptr<ExecutionEngine> executionEngine;  // runtime executor

public:
    bool loadModule(const std::string& modulePath);
    // execute an operation from a loaded module
    GenericValue executeModuleFunction(const std::string& functionName, GenericValue* params);
};
I will add the needed LLVM headers manually through Maven to handle the build process, as was done for the CUDA files.
I will introduce the new GeneratorAPI value “LLVM” inside the SpoofCompiler class. After that I will add:
I will create a folder llvm inside the cplan folder (hops/codegen/cplan/llvm) and, in it, a CellWise class that follows the structure of java/CellWise but returns LLVM IR code as a template when the getTemplate(SpoofCellwise.CellType ct) method is called. Then, following the structure of the CUDA implementation, I will create a SpoofLLVM class that stores the name of the generated CNodeTpl. These SpoofLLVM instances will be stored in a new HashMap<String, SpoofLLVM> data structure inside CodeGenUtils. SpoofLLVM will have a native method for passing the operands and executing the computation.
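A minimal sketch of how these pieces could fit together is shown below. Everything beyond the names taken from the text above is an assumption made for illustration: the local CellType enum stands in for SpoofCellwise.CellType, CellWiseLLVM and LLVMOperatorRegistry are hypothetical class names (the latter stands in for the new map in CodeGenUtils), the {{NAME}} and {{BODY}} placeholders are invented here because '%' already marks local values in LLVM IR, and keeping the instantiated IR inside SpoofLLVM is a choice of this sketch. The native execution method is left out. The IR uses typed pointers (double*), as in the LLVM 10 era; newer releases use opaque pointers instead.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for SpoofCellwise.CellType; only the case used in this sketch is listed.
enum CellType { NO_AGG }

// Hypothetical LLVM counterpart of java/CellWise: returns IR text instead of Java source.
final class CellWiseLLVM {
    // Loop over a dense input a of length len, writing one result per cell to c.
    // {{BODY}} must define %out from %in.
    private static final String TEMPLATE_NO_AGG = String.join("\n",
        "define void @{{NAME}}(double* %a, double* %c, i64 %len) {",
        "entry:",
        "  br label %head",
        "head:",
        "  %i = phi i64 [ 0, %entry ], [ %next, %body ]",
        "  %more = icmp slt i64 %i, %len",
        "  br i1 %more, label %body, label %exit",
        "body:",
        "  %pa = getelementptr double, double* %a, i64 %i",
        "  %in = load double, double* %pa",
        "{{BODY}}",
        "  %pc = getelementptr double, double* %c, i64 %i",
        "  store double %out, double* %pc",
        "  %next = add i64 %i, 1",
        "  br label %head",
        "exit:",
        "  ret void",
        "}");

    String getTemplate(CellType ct) {
        if (ct != CellType.NO_AGG) {
            throw new UnsupportedOperationException("not covered by this sketch: " + ct);
        }
        return TEMPLATE_NO_AGG;
    }
}

// Stores the name of the generated CNodeTpl; the ir field is an addition of this sketch.
final class SpoofLLVM {
    final String name;
    final String ir;

    SpoofLLVM(String name, String ir) {
        this.name = name;
        this.ir = ir;
    }
}

// Stands in for the new HashMap<String, SpoofLLVM> inside CodeGenUtils.
final class LLVMOperatorRegistry {
    private static final Map<String, SpoofLLVM> OPS = new HashMap<>();

    // Instantiates the template for the given operator name and body, then registers it.
    static SpoofLLVM register(String name, CellType ct, String body) {
        String ir = new CellWiseLLVM().getTemplate(ct)
            .replace("{{NAME}}", name)
            .replace("{{BODY}}", body);
        SpoofLLVM op = new SpoofLLVM(name, ir);
        OPS.put(name, op);
        return op;
    }

    static SpoofLLVM get(String name) {
        return OPS.get(name);
    }
}
```

As a usage example, LLVMOperatorRegistry.register("TMP1", CellType.NO_AGG, "  %out = fmul double %in, 2.0") would yield IR for a function @TMP1 that doubles every cell; the native side could then look the operator up by name, compile the IR, and run it.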
I will first implement the syntactic part and then the runtime part, in the following steps: