commit | f0bdcdbcc4c2dde60abc94561589fe9bd1585259 | |
---|---|---|
author | Dominik Moritz <domoritz@gmail.com> | Mon Apr 12 16:28:30 2021 -0700 |
committer | Neal Richardson <neal.p.richardson@gmail.com> | Mon Apr 12 16:28:30 2021 -0700 |
tree | 034ab82db21838390910ffe9ac4da533d81ab2ad | |
parent | b385fcd82659bea6ef030fba142e62855b467d9c | |
ARROW-12303: [JS] Use iterator instead of yield

Thanks to @ankoh for the suggestion to use iterators and @trxcllnt for suggesting to remove recursions in the iterators.

Running `yarn build -t es2015 -m cjs && node perf/index.js`

Master

```
Running apache-arrow performance tests...
Parse "tracks": Table.from x 4,386 ops/sec ±17.33% (61 runs sampled) avg: 0.23ms 1.38% of a frame @ 60FPS
Parse "tracks": readBatches x 7,813 ops/sec ±1.48% (87 runs sampled) avg: 0.13ms 0.78% of a frame @ 60FPS
Get "tracks" values by index: name: 'lat', length: 1000000, type: Float32 x 38.18 ops/sec ±0.79% (57 runs sampled) avg: 26.19ms 157.14% of a frame @ 60FPS
Get "tracks" values by index: name: 'lng', length: 1000000, type: Float32 x 36.92 ops/sec ±2.63% (48 runs sampled) avg: 27.09ms 162.54% of a frame @ 60FPS
Get "tracks" values by index: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.25 ops/sec ±17.42% (5 runs sampled) avg: 4004.23ms 24025.38% of a frame @ 60FPS
Get "tracks" values by index: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.20 ops/sec ±20.92% (5 runs sampled) avg: 4902.57ms 29415.42% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 21.37 ops/sec ±3.48% (39 runs sampled) avg: 46.79ms 280.74% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 22.65 ops/sec ±0.86% (40 runs sampled) avg: 44.16ms 264.96% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.28 ops/sec ±4.71% (5 runs sampled) avg: 3587.66ms 21525.96% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.27 ops/sec ±3.62% (5 runs sampled) avg: 3646.73ms 21880.38% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 561 ops/sec ±3.52% (76 runs sampled) avg: 1.78ms 10.68% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 567 ops/sec ±2.70% (46 runs sampled) avg: 1.76ms 10.56% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.28 ops/sec ±7.41% (5 runs sampled) avg: 3631.37ms 21788.22% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.23 ops/sec ±7.36% (5 runs sampled) avg: 4296.18ms 25777.08% of a frame @ 60FPS
Slice "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 1,996,897 ops/sec ±0.57% (92 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 2,114,550 ops/sec ±1.11% (83 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 2,354,063 ops/sec ±0.99% (83 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 2,230,019 ops/sec ±1.67% (84 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Table Iterate "tracks": length: 1000000 x 11.14 ops/sec ±2.99% (32 runs sampled) avg: 89.73ms 538.38% of a frame @ 60FPS
DataFrame Count By "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 437 ops/sec ±0.98% (87 runs sampled) avg: 2.29ms 13.74% of a frame @ 60FPS
DataFrame Count By "tracks": name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 436 ops/sec ±0.92% (84 runs sampled) avg: 2.29ms 13.74% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'lat', length: 1000000, type: Float32, test: gt, value: 0 x 63.33 ops/sec ±1.33% (63 runs sampled) avg: 15.79ms 94.74% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'lng', length: 1000000, type: Float32, test: gt, value: 0 x 64.66 ops/sec ±1.33% (64 runs sampled) avg: 15.47ms 92.82% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle x 76.14 ops/sec ±1.09% (63 runs sampled) avg: 13.13ms 78.78% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'lat', length: 1000000, type: Float32, test: gt, value: 0 x 157 ops/sec ±1.26% (77 runs sampled) avg: 6.37ms 38.22% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'lng', length: 1000000, type: Float32, test: gt, value: 0 x 148 ops/sec ±1.52% (73 runs sampled) avg: 6.76ms 40.56% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle x 0.28 ops/sec ±17.18% (5 runs sampled) avg: 3543.03ms 21258.18% of a frame @ 60FPS
```

This branch

```
Running apache-arrow performance tests...
Parse "tracks": Table.from x 8,357 ops/sec ±7.74% (84 runs sampled) avg: 0.12ms 0.72% of a frame @ 60FPS
Parse "tracks": readBatches x 8,842 ops/sec ±1.79% (87 runs sampled) avg: 0.11ms 0.66% of a frame @ 60FPS
Get "tracks" values by index: name: 'lat', length: 1000000, type: Float32 x 38.98 ops/sec ±1.39% (50 runs sampled) avg: 25.66ms 153.96% of a frame @ 60FPS
Get "tracks" values by index: name: 'lng', length: 1000000, type: Float32 x 39.00 ops/sec ±1.88% (50 runs sampled) avg: 25.64ms 153.84% of a frame @ 60FPS
Get "tracks" values by index: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.29 ops/sec ±3.27% (5 runs sampled) avg: 3495.78ms 20974.68% of a frame @ 60FPS
Get "tracks" values by index: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.22 ops/sec ±4.01% (5 runs sampled) avg: 4592.66ms 27555.96% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 57.56 ops/sec ±1.73% (57 runs sampled) avg: 17.37ms 104.22% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 58.53 ops/sec ±1.04% (57 runs sampled) avg: 17.08ms 102.48% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.28 ops/sec ±2.04% (5 runs sampled) avg: 3618.14ms 21708.84% of a frame @ 60FPS
Iterate "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.27 ops/sec ±2.79% (5 runs sampled) avg: 3662.97ms 21977.82% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 615 ops/sec ±3.79% (65 runs sampled) avg: 1.63ms 9.78% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 627 ops/sec ±2.56% (48 runs sampled) avg: 1.59ms 9.54% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 0.26 ops/sec ±2.74% (5 runs sampled) avg: 3790.33ms 22741.98% of a frame @ 60FPS
Slice toArray "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 0.26 ops/sec ±1.50% (5 runs sampled) avg: 3776.96ms 22661.76% of a frame @ 60FPS
Slice "tracks" vectors: name: 'lat', length: 1000000, type: Float32 x 2,212,355 ops/sec ±1.29% (87 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'lng', length: 1000000, type: Float32 x 2,215,643 ops/sec ±0.94% (84 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 2,445,650 ops/sec ±1.21% (86 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Slice "tracks" vectors: name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 2,432,760 ops/sec ±0.92% (88 runs sampled) avg: 0ms 0% of a frame @ 60FPS
Table Iterate "tracks": length: 1000000 x 25.91 ops/sec ±1.61% (45 runs sampled) avg: 38.59ms 231.54% of a frame @ 60FPS
DataFrame Count By "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8> x 456 ops/sec ±1.12% (83 runs sampled) avg: 2.19ms 13.14% of a frame @ 60FPS
DataFrame Count By "tracks": name: 'destination', length: 1000000, type: Dictionary<Int8, Utf8> x 459 ops/sec ±0.77% (85 runs sampled) avg: 2.18ms 13.08% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'lat', length: 1000000, type: Float32, test: gt, value: 0 x 61.19 ops/sec ±2.16% (63 runs sampled) avg: 16.34ms 98.04% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'lng', length: 1000000, type: Float32, test: gt, value: 0 x 62.97 ops/sec ±1.29% (63 runs sampled) avg: 15.88ms 95.28% of a frame @ 60FPS
DataFrame Filter-Scan Count "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle x 72.84 ops/sec ±1.08% (70 runs sampled) avg: 13.73ms 82.38% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'lat', length: 1000000, type: Float32, test: gt, value: 0 x 164 ops/sec ±1.05% (79 runs sampled) avg: 6.09ms 36.54% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'lng', length: 1000000, type: Float32, test: gt, value: 0 x 166 ops/sec ±1.27% (80 runs sampled) avg: 6.02ms 36.12% of a frame @ 60FPS
DataFrame Direct Count "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle x 0.25 ops/sec ±1.56% (5 runs sampled) avg: 3947.73ms 23686.38% of a frame @ 60FPS
DataFrame Filter-Iterate "tracks": name: 'lat', length: 1000000, type: Float32, test: gt, value: 0 x 32.07 ops/sec ±1.53% (53 runs sampled) avg: 31.18ms 187.08% of a frame @ 60FPS
DataFrame Filter-Iterate "tracks": name: 'lng', length: 1000000, type: Float32, test: gt, value: 0 x 31.75 ops/sec ±0.90% (53 runs sampled) avg: 31.5ms 189% of a frame @ 60FPS
DataFrame Filter-Iterate "tracks": name: 'origin', length: 1000000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle x 50.33 ops/sec ±0.95% (61 runs sampled) avg: 19.87ms 119.22% of a frame @ 60FPS
```

Closes #9962 from domoritz/iter

Authored-by: Dominik Moritz <domoritz@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
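
The general idea named in the commit message, replacing a generator (`yield`) with a hand-written iterator that does not recurse, might look roughly like the sketch below. This is an illustration only, not the code from this commit: `Chunk`, `iterateWithYield`, and `ChunkedIterator` are hypothetical names, and a plain array of `Float32Array` chunks stands in for Arrow's vector types.

```typescript
// Hypothetical stand-in for one chunk of a column; not Arrow's Vector class.
type Chunk = Float32Array;

// Generator-based version: concise, but every value is produced via `yield`.
function* iterateWithYield(chunks: Chunk[]): IterableIterator<number> {
    for (const chunk of chunks) {
        for (let i = 0; i < chunk.length; i++) {
            yield chunk[i];
        }
    }
}

// Hand-written iterator: keeps explicit (chunkIndex, offset) state and
// advances with a flat loop, so there is no generator and no recursion.
class ChunkedIterator implements IterableIterator<number> {
    private chunks: Chunk[];
    private chunkIndex = 0;
    private offset = 0;

    constructor(chunks: Chunk[]) {
        this.chunks = chunks;
    }

    next(): IteratorResult<number> {
        // Skip exhausted or empty chunks with a loop instead of calling next() again.
        while (this.chunkIndex < this.chunks.length) {
            const chunk = this.chunks[this.chunkIndex];
            if (this.offset < chunk.length) {
                return { done: false, value: chunk[this.offset++] };
            }
            this.chunkIndex++;
            this.offset = 0;
        }
        return { done: true, value: undefined };
    }

    [Symbol.iterator](): IterableIterator<number> {
        return this;
    }
}

// Both forms visit the same values in the same order.
const chunks: Chunk[] = [Float32Array.of(1, 2), new Float32Array(0), Float32Array.of(3)];
const fromGenerator: number[] = Array.from(iterateWithYield(chunks));
const fromIterator: number[] = Array.from(new ChunkedIterator(chunks));
```

Both versions satisfy the same iteration protocol, so callers using `for...of` or `Array.from` need no changes. Whether the hand-written form is faster depends on the engine and build target, which is what the benchmark runs above measure for the actual change.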
Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git master.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: