================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 24317          24376          95          0.0      486343.1       1.0X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               56420          56835         704          0.0       56420.3       1.0X
Select 100 columns                                20565          20673         113          0.0       20564.7       2.7X
Select one column                                 17105          17145          38          0.1       17105.4       3.3X
count()                                            3378           3428          68          0.3        3378.0      16.7X
Select 100 columns, one bad input field           24702          24731          37          0.0       24702.1       2.3X
Select 100 columns, corrupt record field          28027          28093          91          0.0       28026.7       2.0X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       10764          10804          35          0.9        1076.4       1.0X
Select 1 column + count()                          7422           7424           1          1.3         742.2       1.5X
count()                                            1679           1682           3          6.0         167.9       6.4X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                      829            834           7         12.1          82.9       1.0X
to_csv(timestamp)                                  5601           5649          49          1.8         560.1       0.1X
write timestamps to files                          5733           5743          11          1.7         573.3       0.1X
Create a dataset of dates                           923            931           8         10.8          92.3       0.9X
to_csv(date)                                       4069           4071           4          2.5         406.9       0.2X
write dates to files                               4030           4035           6          2.5         403.0       0.2X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Read dates and timestamps:                                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                                                  1157           1161           4          8.6         115.7       1.0X
read timestamps from files                                                     11666          11677          12          0.9        1166.6       0.1X
infer timestamps from files                                                    23313          23345          47          0.4        2331.3       0.0X
read date text from files                                                       1061           1072          10          9.4         106.1       1.1X
read date from files                                                           10393          10406          11          1.0        1039.3       0.1X
infer date from files                                                          20923          20949          27          0.5        2092.3       0.1X
timestamp strings                                                               1215           1220           5          8.2         121.5       1.0X
parse timestamps from Dataset[String]                                          13441          13464          22          0.7        1344.1       0.1X
infer timestamps from Dataset[String]                                          24868          24942          91          0.4        2486.8       0.0X
date strings                                                                    1681           1682           1          5.9         168.1       0.7X
parse dates from Dataset[String]                                               12086          12095           8          0.8        1208.6       0.1X
from_csv(timestamp)                                                            11219          11323          92          0.9        1121.9       0.1X
from_csv(date)                                                                 10647          10658          10          0.9        1064.7       0.1X
infer error timestamps from Dataset[String] with default format                14771          14788          17          0.7        1477.1       0.1X
infer error timestamps from Dataset[String] with user-provided format          14792          14816          23          0.7        1479.2       0.1X
infer error timestamps from Dataset[String] with legacy format                 14780          14818          33          0.7        1478.0       0.1X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                        4307           4312           4          0.0       43066.8       1.0X
pushdown disabled                                  4358           4388          26          0.0       43575.3       1.0X
w/ filters                                          727            734           8          0.1        7267.3       5.9X
OpenJDK 64-Bit Server VM 21.0.8+9-LTS on Linux 6.11.0-1018-azure
AMD EPYC 7763 64-Core Processor
Interval:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Read as Intervals                                   761            764           2          0.4        2537.6       1.0X
Read Raw Strings                                    336            337           1          0.9        1120.9       2.3X