blob: c91f2a3fb84707cf3e0f174d43d6b76cd35b0aba [file] [log] [blame] [view]
---
layout: page
title: Substrait Modifications
nav_order: 9
parent: Developer Overview
---
# Substrait Modifications in Gluten
Substrait is a project aiming to create a well-defined, cross-language specification for data compute operations.
Since it is still under active development, there are some lacking representations for Gluten needed computing
operations. At the same time, some existing representations need to be modified a bit to satisfy the needs of computing.
In Gluten, the base version of Substrait is `v0.23.0`. This page records all the Gluten changes to Substrait proto
files for reference. It is preferred to upstream these changes to Substrait, but for those cannot be upstreamed,
alternatives like `AdvancedExtension` could be considered.
## Modifications to algebra.proto
* Added `JsonReadOptions` and `TextReadOptions` in `FileOrFiles`([#1584](https://github.com/apache/gluten/pull/1584)).
* Changed join type `JOIN_TYPE_SEMI` to `JOIN_TYPE_LEFT_SEMI` and `JOIN_TYPE_RIGHT_SEMI`([#408](https://github.com/apache/gluten/pull/408)).
* Added `WindowRel`, added `column_name` and `window_type` in `WindowFunction`,
changed `Unbounded` in `WindowFunction` into `Unbounded_Preceding` and `Unbounded_Following`, and added WindowType([#485](https://github.com/apache/gluten/pull/485)).
* Added `output_schema` in RelRoot([#1901](https://github.com/apache/gluten/pull/1901)).
* Added `ExpandRel`([#1361](https://github.com/apache/gluten/pull/1361)).
* Added `GenerateRel`([#574](https://github.com/apache/gluten/pull/574)).
* Added `PartitionColumn` in `LocalFiles`([#2405](https://github.com/apache/gluten/pull/2405)).
* Added `WriteRel` ([#3690](https://github.com/apache/gluten/pull/3690)).
* Added `TopNRel` ([#5409](https://github.com/apache/gluten/pull/5409)).
* Added `ref` field in window bound `Preceding` and `Following` ([#5626](https://github.com/apache/gluten/pull/5626)).
* Added `BucketSpec` field in `WriteRel`([#8386](https://github.com/apache/gluten/pull/8386))
* Added `StreamKafka` in `ReadRel`([#8321](https://github.com/apache/gluten/pull/8321))
## Modifications to type.proto
* Added `Nothing` in `Type`([#791](https://github.com/apache/gluten/pull/791)).
* Added `names` in `Struct`([#1878](https://github.com/apache/gluten/pull/1878)).
* Added `PartitionColumns` in `NamedStruct`([#320](https://github.com/apache/gluten/pull/320)).
* Remove `PartitionColumns` and add `column_types` in `NamedStruct`([#2405](https://github.com/apache/gluten/pull/2405)).