Substrait is a project that aims to create a well-defined, cross-language specification for data compute operations. Because it is still under active development, it lacks representations for some of the compute operations Gluten needs. In addition, some existing representations need minor modifications to meet Gluten's requirements.
In Gluten, the base version of Substrait is v0.23.0. This page records all of Gluten's changes to the Substrait proto files for reference. Upstreaming these changes to Substrait is preferred; for changes that cannot be upstreamed, alternatives such as AdvancedExtension can be considered.
- Added JsonReadOptions and TextReadOptions in FileOrFiles (#1584).
- Changed JOIN_TYPE_SEMI to JOIN_TYPE_LEFT_SEMI and JOIN_TYPE_RIGHT_SEMI (#408).
- Added WindowRel, added column_name and window_type in WindowFunction, changed Unbounded in WindowFunction into Unbounded_Preceding and Unbounded_Following, and added WindowType (#485).
- Added output_schema in RelRoot (#1901).
- Added ExpandRel (#1361).
- Added GenerateRel (#574).
- Added PartitionColumn in LocalFiles (#2405).
- Added WriteRel (#3690).
- Added TopNRel (#5409).
- Added ref field in window bound Preceding and Following (#5626).
- Added BucketSpec field in WriteRel (#8386).
- Added StreamKafka in ReadRel (#8321).
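
To illustrate why the single JOIN_TYPE_SEMI value was split in two (#408), the following is a minimal Python sketch of the usual semi-join semantics: a left semi join keeps left-side rows that have a match on the right, and a right semi join keeps right-side rows that have a match on the left. The `JoinType` enum, the `semi_join` helper, and the sample rows are hypothetical and exist only for this example; they are not Gluten or Substrait APIs, and the enum values are plain strings rather than the values in the proto file.

```python
from enum import Enum


class JoinType(Enum):
    # Hypothetical mirror of the two enum names listed above.
    # The string values are placeholders, not the proto field values.
    LEFT_SEMI = "JOIN_TYPE_LEFT_SEMI"
    RIGHT_SEMI = "JOIN_TYPE_RIGHT_SEMI"


def semi_join(left, right, key, join_type):
    """Return rows from one side that have at least one key match on the other."""
    if join_type is JoinType.LEFT_SEMI:
        probe, build = left, right   # output rows come from the left input
    elif join_type is JoinType.RIGHT_SEMI:
        probe, build = right, left   # output rows come from the right input
    else:
        raise ValueError(f"unsupported join type: {join_type}")
    # Collect the keys present on the non-output side, then filter the output side.
    build_keys = {row[key] for row in build}
    return [row for row in probe if row[key] in build_keys]


# Illustrative sample data (assumed for this sketch).
orders = [{"id": 1, "cust": "a"}, {"id": 2, "cust": "b"}, {"id": 3, "cust": "a"}]
customers = [{"cust": "a"}, {"cust": "c"}]

left_result = semi_join(orders, customers, "cust", JoinType.LEFT_SEMI)
right_result = semi_join(orders, customers, "cust", JoinType.RIGHT_SEMI)
```

With a single semi-join value, a plan cannot say which input supplies the output rows; two explicit values make that choice part of the plan itself.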