| The two Parquet files (nullable.parq and nonnullable.parq) were generated |
| as described in testdata/data/schemas/nested/README. |
| |
| The two ORC files (nullable.orc and nonnullable.orc) were generated with orc-tools, |
| which can convert JSON files to ORC format. However, nullable.json and nonnullable.json |
| must first be modified to match the input format it requires: the file must not be a |
| single JSON array. Instead, it should contain one JSON object per row, joined by '\n'. |
| The commands below assume the modified files are named nullable_orc.json and |
| nonnullable_orc.json. |
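
The reshaping described above can be sketched in Python. This is only an illustration, not how the checked-in files were produced. It assumes that nullable.json and nonnullable.json each hold a single top-level array of row objects, and the helper name array_to_json_lines is hypothetical. It covers only the array-to-lines step; any other adjustment the JSON might need is out of scope.

```python
import json
from pathlib import Path


def array_to_json_lines(src: Path, dst: Path) -> int:
    """Rewrite a JSON file holding one top-level array as one object per line.

    Returns the number of rows written. (Hypothetical helper, for illustration.)
    """
    rows = json.loads(src.read_text(encoding="utf-8"))
    if not isinstance(rows, list):
        raise ValueError(f"{src} does not contain a top-level JSON array")
    # json.dumps without indent never emits a raw newline, so each row
    # stays on exactly one line.
    lines = [json.dumps(row) for row in rows]
    # Rows are joined by '\n', as the text above describes.
    dst.write_text("\n".join(lines), encoding="utf-8")
    return len(rows)


if __name__ == "__main__":
    # Assumed input/output names, following the naming used in this README.
    for name in ("nullable", "nonnullable"):
        src = Path(f"{name}.json")
        if src.exists():
            count = array_to_json_lines(src, Path(f"{name}_orc.json"))
            print(f"{name}_orc.json: {count} rows")
```

Run from the directory that holds the JSON files, this would write nullable_orc.json and nonnullable_orc.json next to the originals and leave the originals untouched.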
| |
| The ORC files can be regenerated by running the following commands in the current directory: |
| |
| wget "https://search.maven.org/remotecontent?filepath=org/apache/orc/orc-tools/1.5.4/orc-tools-1.5.4-uber.jar" \ |
| -O orc-tools-1.5.4-uber.jar |
| |
| java -jar orc-tools-1.5.4-uber.jar convert \ |
| -s "struct<id:bigint,int_array:array<int>,int_array_Array:array<array<int>>,int_map:map<string,int>,int_Map_Array:array<map<string,int>>,nested_struct:struct<A:int,b:array<int>,C:struct<d:array<array<struct<E:int,F:string>>>>,g:map<string,struct<H:struct<i:array<double>>>>>>" \ |
| -o nullable.orc \ |
| nullable_orc.json |
| |
| java -jar orc-tools-1.5.4-uber.jar convert \ |
| -s "struct<ID:bigint,Int_Array:array<int>,int_array_array:array<array<int>>,Int_Map:map<string,int>,int_map_array:array<map<string,int>>,nested_Struct:struct<a:int,B:array<int>,c:struct<D:array<array<struct<e:int,f:string>>>>,G:map<string,struct<h:struct<i:array<double>>>>>>" \ |
| -o nonnullable.orc \ |
| nonnullable_orc.json |