================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 26649          26975         422          0.0      532974.9       1.0X
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               77261          78006        1256          0.0       77260.7       1.0X
Select 100 columns                                28285          28729         415          0.0       28284.7       2.7X
Select one column                                 22758          22984         203          0.0       22758.4       3.4X
count()                                            4594           4671          72          0.2        4593.5      16.8X
Select 100 columns, one bad input field           44143          44966         930          0.0       44142.6       1.8X
Select 100 columns, corrupt record field          50907          51857         999          0.0       50907.2       1.5X
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       12711          13119         395          0.8        1271.1       1.0X
Select 1 column + count()                          8115           8191          68          1.2         811.5       1.6X
count()                                            2759           2766           8          3.6         275.9       4.6X
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                     1408           1423          17          7.1         140.8       1.0X
to_csv(timestamp)                                 10168          10290         138          1.0        1016.8       0.1X
write timestamps to files                          9474           9593         103          1.1         947.4       0.1X
Create a dataset of dates                          1328           1394          96          7.5         132.8       1.1X
to_csv(date)                                       6294           6448         212          1.6         629.4       0.2X
write dates to files                               5401           5452          44          1.9         540.1       0.3X
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Read dates and timestamps:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                     1750           1820          60          5.7         175.0       1.0X
read timestamps from files                        23206          23263          51          0.4        2320.6       0.1X
infer timestamps from files                       47985          48634         566          0.2        4798.5       0.0X
read date text from files                          1757           1767          11          5.7         175.7       1.0X
read date from files                               8430           8455          22          1.2         843.0       0.2X
infer date from files                             18182          18681         610          0.5        1818.2       0.1X
timestamp strings                                  1930           1993          85          5.2         193.0       0.9X
parse timestamps from Dataset[String]             24227          24352         108          0.4        2422.7       0.1X
infer timestamps from Dataset[String]             48124          48502         328          0.2        4812.4       0.0X
date strings                                       2126           2200          83          4.7         212.6       0.8X
parse dates from Dataset[String]                   9386           9449          55          1.1         938.6       0.2X
from_csv(timestamp)                               24146          24210          76          0.4        2414.6       0.1X
from_csv(date)                                     8563           8674         114          1.2         856.3       0.2X
OpenJDK 64-Bit Server VM 17.0.1+12-LTS on Linux 5.11.0-1022-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                       14825          14935         100          0.0      148247.4       1.0X
pushdown disabled                                 14777          14829          62          0.0      147768.1       1.0X
w/ filters                                          981            985           4          0.1        9810.0      15.1X