The Nemo IR can be flexibly modified, both in its logical structure and annotations, through an interface called Nemo optimization pass. An optimization pass is basically a function that takes an Nemo IR and outputs an optimized Nemo IR.
The modification during compile-time can be categorized in different ways:
After the compilation and compile-time optimizations, the Nemo IR gets laid out as a physical execution plan to be submitted to and executed by the Nemo Execution Runtime. While execution, an run-time optimization pass can be performed to perform dynamic optimizations, like solving data skew, using runtime statistics. It takes the old Nemo IR and metric data of runtime statistics, and sends the newly optimized Nemo IR to execution runtime for the physical plan to be updated accordingly.
Below are some example optimization passes that are used for different use cases:
Reshaping passes:
Annotating passes:
An optimization policy is composed of a specific combination of optimization passes.
Using a carefully chosen series of optimization passes, we can optimize an application to exploit specific deployement characteristics, by providing appropriate configurations and plan for the execution runtime. A complete series of optimization passes is called a policy, which together performs a specific goal.
For example, to optimize an application to run on evictable transient resources, we can use a specialized executor placement pass, that places computations appropriately on different types of resources, and data flow model pass, that determines the fashion in which each computation should fetch its input data, with a number of other passes for further optimization.
Using different optimization policies for specific goals enables users to flexibly customize and perform data processing for different deployment characteristics. This greatly simplifies the work by replacing the work of exploring and rewriting system internals for modifying runtime behaviors with a simple process of using pluggable policies. It also makes it possible for the system to promptly meet new requirements through easy extension of system capabilities.