| ================================================================================================ |
| Benchmark for performance of JSON parsing |
| ================================================================================================ |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| JSON schema inferring: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| No encoding 2188 2222 52 2.3 437.5 1.0X |
| UTF-8 is set 4801 4804 3 1.0 960.3 0.5X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| count a short column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| No encoding 1970 1977 6 2.5 394.0 1.0X |
| UTF-8 is set 4490 4507 18 1.1 897.9 0.4X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| count a wide column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| No encoding 4286 4299 13 0.2 4286.2 1.0X |
| UTF-8 is set 4468 4485 17 0.2 4467.9 1.0X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| select wide row: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| No encoding 9089 9187 96 0.0 181776.3 1.0X |
| UTF-8 is set 10274 10302 37 0.0 205480.9 0.9X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Select a subset of 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Select 10 columns 1621 1635 12 0.6 1620.8 1.0X |
| Select 1 column 1129 1143 18 0.9 1128.8 1.4X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| creation of JSON parser per line: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Short column without encoding 632 633 2 1.6 631.9 1.0X |
| Short column with UTF-8 1115 1119 6 0.9 1114.7 0.6X |
| Wide column without encoding 5330 5358 27 0.2 5329.6 0.1X |
| Wide column with UTF-8 6811 6828 15 0.1 6811.0 0.1X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| JSON functions: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Text read 55 58 4 18.1 55.3 1.0X |
| from_json 1101 1107 6 0.9 1101.1 0.1X |
| json_tuple 1006 1012 8 1.0 1006.3 0.1X |
| get_json_object wholestage off 1054 1056 3 0.9 1053.8 0.1X |
| get_json_object wholestage on 985 988 2 1.0 985.3 0.1X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Dataset of json strings: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Text read 229 229 0 21.8 45.8 1.0X |
| schema inferring 1717 1724 9 2.9 343.5 0.1X |
| parsing 2575 2587 11 1.9 514.9 0.1X |
| |
| Preparing data for benchmarking ... |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Json files in the per-line mode: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Text read 568 574 7 8.8 113.6 1.0X |
| Schema inferring 2362 2371 9 2.1 472.5 0.2X |
| Parsing without charset 2838 2841 3 1.8 567.5 0.2X |
| Parsing with UTF-8 5374 5389 14 0.9 1074.8 0.1X |
| |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| Create a dataset of timestamps 103 107 4 9.7 102.6 1.0X |
| to_json(timestamp) 632 634 2 1.6 631.5 0.2X |
| write timestamps to files 666 670 3 1.5 666.3 0.2X |
| Create a dataset of dates 124 126 2 8.1 123.9 0.8X |
| to_json(date) 453 455 2 2.2 452.8 0.2X |
| write dates to files 452 454 3 2.2 451.9 0.2X |
| |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| read timestamp text from files 148 149 2 6.8 147.6 1.0X |
| read timestamps from files 1093 1095 1 0.9 1093.2 0.1X |
| infer timestamps from files 2033 2037 6 0.5 2032.7 0.1X |
| read date text from files 137 139 3 7.3 136.8 1.1X |
| read date from files 715 717 2 1.4 715.2 0.2X |
| timestamp strings 130 131 1 7.7 129.8 1.1X |
| parse timestamps from Dataset[String] 1235 1237 2 0.8 1235.3 0.1X |
| infer timestamps from Dataset[String] 2147 2158 18 0.5 2147.3 0.1X |
| date strings 197 200 3 5.1 197.1 0.7X |
| parse dates from Dataset[String] 984 987 4 1.0 984.0 0.1X |
| from_json(timestamp) 1712 1721 7 0.6 1712.5 0.1X |
| from_json(date) 1470 1471 1 0.7 1470.1 0.1X |
| infer error timestamps from Dataset[String] with default format 1346 1351 5 0.7 1346.5 0.1X |
| infer error timestamps from Dataset[String] with user-provided format 1350 1353 2 0.7 1350.3 0.1X |
| infer error timestamps from Dataset[String] with legacy format 1377 1382 8 0.7 1376.8 0.1X |
| |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Filters pushdown: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| w/o filters 5608 5621 11 0.0 56080.0 1.0X |
| pushdown disabled 5437 5450 17 0.0 54365.8 1.0X |
| w/ filters 666 675 8 0.2 6663.8 8.4X |
| |
| OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Linux 6.11.0-1018-azure |
| AMD EPYC 7763 64-Core Processor |
| Partial JSON results: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| ------------------------------------------------------------------------------------------------------------------------ |
| parse invalid JSON 2354 2528 294 0.0 235361.2 1.0X |
| |
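The derived columns in the tables above appear to follow from the best time alone. The sketch below recomputes Rate(M/s), Per Row(ns), and Relative for the "w/ filters" row of the "Filters pushdown" table. The helper name `derive_columns` is hypothetical, and the row count of 100,000 is an assumption: the output does not print it, so it is inferred here from the ratio of Best Time to Per Row.

```python
def derive_columns(best_ms, rows, baseline_best_ms):
    """Recompute (Rate(M/s), Per Row(ns), Relative) from a best time in ms.

    `rows` is an assumed row count; the benchmark output does not print it.
    """
    per_row_ns = best_ms * 1e6 / rows              # ms -> ns, spread over all rows
    rate_m_per_s = rows / (best_ms / 1e3) / 1e6    # rows per second, in millions
    relative = baseline_best_ms / best_ms          # > 1.0 means faster than the baseline row
    return round(rate_m_per_s, 1), round(per_row_ns, 1), round(relative, 1)


# "w/ filters" (best 666 ms) against the "w/o filters" baseline (best 5608 ms),
# assuming 100,000 rows.
rate, per_row, relative = derive_columns(666, 100_000, 5608)
print(rate, per_row, relative)
```

Under that assumption the rate and the relative speedup match the table (0.2 M/s and 8.4X). The recomputed per-row time is 6660.0 ns rather than the reported 6663.8 ns, presumably because the table derives it from the unrounded time while Best Time(ms) is printed as a whole number.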
| |