commit | c4844e3ea3d0308c513728193a404988ffbdcd71 | |
---|---|---|
author | Dewey Dunnington <dewey@dunnington.ca> | Thu Jan 25 11:14:42 2024 -0400 |
committer | GitHub <noreply@github.com> | Thu Jan 25 11:14:42 2024 -0400 |
tree | 71f13aaea04056790c375d98e291e0f15da8c064 | |
parent | e0a5d9d9c4188dd1fe81ead7fa9737f6322098d9 | |
feat: Add decimal support to integration tester (#361)

This PR adds decimal support to the integration test utility. Because the integration test JSON format represents decimal buffers as strings containing the integer representation of the decimal, nanoarrow needed an implementation of arbitrarily large integer to/from string conversion. I adapted this from Arrow C++ (links in comments next to the implementation) with a few differences to avoid porting the complete int128 implementation and the C++ standard library.

- [x] Parse strings containing arbitrarily large integers into decimal words
- [x] Convert decimal words into arbitrarily large integer strings
- [x] Wire the converters into the integration tester

The gaps in test coverage are from big-endian parts, which are tested as part of weekly verification and which I tested locally with:

```shell
export NANOARROW_ARCH=s390x
docker compose run --rm verify
```

With `archery integration --with-cpp=true --with-nanoarrow=true --run-c-data`, the decimal tests now pass:

<details>

```
########################################################## C Data Interface: C++ exporting, C++ importing ########################################################## ====================================================================== Testing C ArrowSchema from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_zerolength' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_large_offsets' 
====================================================================== ====================================================================== Testing C ArrowSchema from file 'null' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null_trivial' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal256' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'datetime' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duration' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval_mdn' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map_non_canonical' -- Skipping test because producer C++ does not support C ArrowSchema ====================================================================== 
====================================================================== Testing C ArrowSchema from file 'nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'recursive_nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'union' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'custom_metadata' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duplicate_fieldnames' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary_unsigned' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'run_end_encoded' ====================================================================== ====================================================================== Testing C 
ArrowSchema from file 'binary_view' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'extension' -- Skipping test because producer C++ does not support C ArrowSchema ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_zerolength' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null_trivial' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal256' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'datetime' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duration' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval_mdn' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map_non_canonical' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'recursive_nested' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'union' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'custom_metadata' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duplicate_fieldnames' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary_unsigned' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_dictionary' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'run_end_encoded' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'binary_view' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'extension' -- Skipping test because producer C++ does not support C ArrowArray ====================================================================== ########################################################## C Data Interface: C++ exporting, nanoarrow importing ########################################################## ====================================================================== Testing C ArrowSchema from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_zerolength' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null' 
====================================================================== ====================================================================== Testing C ArrowSchema from file 'null_trivial' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal256' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'datetime' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duration' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval_mdn' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map_non_canonical' -- Skipping test because producer C++ does not support C ArrowSchema ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested' 
====================================================================== ====================================================================== Testing C ArrowSchema from file 'recursive_nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'union' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'custom_metadata' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duplicate_fieldnames' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary_unsigned' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'run_end_encoded' Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in _run_c_schema_test_case do_run() File 
"/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 454, in do_run importer.import_schema_and_compare_to_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 138, in import_schema_and_compare_to_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'binary_view' Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in _run_c_schema_test_case do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 454, in do_run importer.import_schema_and_compare_to_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 138, in import_schema_and_compare_to_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'extension' -- Skipping test because producer C++ does not support C ArrowSchema 
====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_zerolength' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null_trivial' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal256' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'datetime' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duration' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval_mdn' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map_non_canonical' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'recursive_nested' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'union' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'custom_metadata' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duplicate_fieldnames' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary_unsigned' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'run_end_encoded' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 501, in do_run importer.import_batch_and_compare_to_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 144, in import_batch_and_compare_to_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowArray from file 'binary_view' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 501, in do_run importer.import_batch_and_compare_to_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 144, in import_batch_and_compare_to_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C ArrowArray from file 'extension' -- Skipping test because producer C++ does not support C ArrowArray ====================================================================== ########################################################## C Data Interface: nanoarrow exporting, C++ importing ########################################################## ====================================================================== Testing C ArrowSchema from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_zerolength' ====================================================================== 
====================================================================== Testing C ArrowSchema from file 'primitive_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null_trivial' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal256' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'datetime' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duration' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval_mdn' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map_non_canonical' -- Skipping 
test because consumer C++ does not support C ArrowSchema ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'recursive_nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'union' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'custom_metadata' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duplicate_fieldnames' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary_unsigned' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'run_end_encoded' Traceback (most recent call 
last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in _run_c_schema_test_case do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 453, in do_run exporter.export_schema_from_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 117, in export_schema_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'binary_view' Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in _run_c_schema_test_case do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 453, in do_run exporter.export_schema_from_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 117, in export_schema_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C 
ArrowSchema from file 'extension' -- Skipping test because consumer C++ does not support C ArrowSchema ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_zerolength' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null_trivial' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal256' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'datetime' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duration' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval_mdn' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map_non_canonical' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'recursive_nested' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'union' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'custom_metadata' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duplicate_fieldnames' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary_unsigned' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'run_end_encoded' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 498, in do_run exporter.export_batch_from_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 123, in export_batch_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowArray from file 'binary_view' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 498, in do_run exporter.export_batch_from_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 123, in export_batch_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C ArrowArray from file 'extension' -- Skipping test because consumer C++ does not support C ArrowArray ====================================================================== ########################################################## C Data Interface: nanoarrow exporting, nanoarrow importing ########################################################## ====================================================================== Testing C ArrowSchema from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'primitive_zerolength' ====================================================================== ====================================================================== 
Testing C ArrowSchema from file 'primitive_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'null_trivial' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'decimal256' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'datetime' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duration' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'interval_mdn' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'map_non_canonical' ====================================================================== 
====================================================================== Testing C ArrowSchema from file 'nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'recursive_nested' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_large_offsets' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'union' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'custom_metadata' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'duplicate_fieldnames' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'dictionary_unsigned' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'nested_dictionary' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'run_end_encoded' Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in 
_run_c_schema_test_case do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 453, in do_run exporter.export_schema_from_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 117, in export_schema_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'binary_view' Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 471, in _run_c_schema_test_case do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 453, in do_run exporter.export_schema_from_json(json_path, c_schema_ptr) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 117, in export_schema_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C ArrowSchema from file 'extension' ====================================================================== 
====================================================================== Testing C ArrowArray from file 'primitive_no_batches' ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_zerolength' ... with record batch #0 ... with record batch #1 ... with record batch #2 ====================================================================== ====================================================================== Testing C ArrowArray from file 'primitive_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'null_trivial' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'decimal256' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'datetime' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duration' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'interval_mdn' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'map_non_canonical' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'recursive_nested' ... with record batch #0 ... 
with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_large_offsets' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'union' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'custom_metadata' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'duplicate_fieldnames' ... with record batch #0 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'dictionary_unsigned' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'nested_dictionary' ... with record batch #0 ... with record batch #1 ====================================================================== ====================================================================== Testing C ArrowArray from file 'run_end_encoded' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 498, in do_run exporter.export_batch_from_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 123, in export_batch_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' ====================================================================== ====================================================================== Testing C ArrowArray from file 'binary_view' ... 
with record batch #0 Traceback (most recent call last): File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 521, in _run_c_array_test_cases do_run() File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/runner.py", line 498, in do_run exporter.export_batch_from_json(json_path, File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 123, in export_batch_from_json self._check_nanoarrow_error(na_error) File "/Users/deweydunnington/Desktop/rscratch/arrow/dev/archery/archery/integration/tester_nanoarrow.py", line 109, in _check_nanoarrow_error raise RuntimeError(f"nanoarrow C Data Integration call failed: {error}") RuntimeError: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' ====================================================================== ====================================================================== Testing C ArrowArray from file 'extension' ... with record batch #0 ... 
with record batch #1 ====================================================================== ################# FAILURES ################# FAILED TEST: run_end_encoded C++ producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: binary_view C++ producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' FAILED TEST: run_end_encoded C++ producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: binary_view C++ producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' FAILED TEST: run_end_encoded nanoarrow producing, C++ consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: binary_view nanoarrow producing, C++ consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' FAILED TEST: run_end_encoded nanoarrow producing, C++ consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: binary_view nanoarrow producing, C++ consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' FAILED TEST: run_end_encoded nanoarrow producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: binary_view nanoarrow producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' FAILED TEST: run_end_encoded nanoarrow producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'runendencoded' FAILED TEST: 
binary_view nanoarrow producing, nanoarrow consuming <class 'RuntimeError'>: nanoarrow C Data Integration call failed: Unsupported Type name: 'binaryview' 12 failures, 9 skips ``` </details>
The nanoarrow library is a set of helper functions to interpret and generate Arrow C Data Interface and Arrow C Stream Interface structures. The library is in active early development and users should update regularly from the main branch of this repository.
Whereas the current suite of Arrow implementations provides the basis for a comprehensive data analysis toolkit, this library is intended to support clients that wish to produce or interpret Arrow C Data and/or Arrow C Stream structures where linking to a higher-level Arrow binding is difficult or impossible.
The nanoarrow C library is intended to be copied and vendored. This can be done using CMake or by using the bundled nanoarrow.h/nanoarrow.c distribution available in the dist/ directory in this repository. Examples of both can be found in the examples/ directory in this repository.
A simple producer example:
    #include "nanoarrow.h"

    // Builds an int32 array containing [1, 2, 3] and a matching schema.
    int make_simple_array(struct ArrowArray* array_out, struct ArrowSchema* schema_out) {
      struct ArrowError error;
      // Mark both outputs as not yet populated
      array_out->release = NULL;
      schema_out->release = NULL;

      NANOARROW_RETURN_NOT_OK(ArrowArrayInitFromType(array_out, NANOARROW_TYPE_INT32));

      // Append the values, then finalize the buffers
      NANOARROW_RETURN_NOT_OK(ArrowArrayStartAppending(array_out));
      NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 1));
      NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 2));
      NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 3));
      NANOARROW_RETURN_NOT_OK(ArrowArrayFinishBuildingDefault(array_out, &error));

      NANOARROW_RETURN_NOT_OK(ArrowSchemaInitFromType(schema_out, NANOARROW_TYPE_INT32));

      return NANOARROW_OK;
    }
A simple consumer example:
    #include <errno.h>
    #include <stdio.h>

    #include "nanoarrow.h"

    // Prints each element of an int32 array, one value per line.
    int print_simple_array(struct ArrowArray* array, struct ArrowSchema* schema) {
      struct ArrowError error;
      struct ArrowArrayView array_view;
      NANOARROW_RETURN_NOT_OK(ArrowArrayViewInitFromSchema(&array_view, schema, &error));

      // Stop early instead of reading values of an unexpected type
      if (array_view.storage_type != NANOARROW_TYPE_INT32) {
        printf("Array has storage that is not int32\n");
        ArrowArrayViewReset(&array_view);
        return EINVAL;
      }

      int result = ArrowArrayViewSetArray(&array_view, array, &error);
      if (result != NANOARROW_OK) {
        ArrowArrayViewReset(&array_view);
        return result;
      }

      for (int64_t i = 0; i < array->length; i++) {
        printf("%d\n", (int)ArrowArrayViewGetIntUnsafe(&array_view, i));
      }

      ArrowArrayViewReset(&array_view);
      return NANOARROW_OK;
    }
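For context, the `struct ArrowArray` and `struct ArrowSchema` values passed around in the examples above are plain C structs defined by the Arrow C Data Interface. The sketch below does not use nanoarrow at all: it copies the two struct definitions from the specification and fills them in by hand for the same three-element int32 array, which may help show roughly what the helper functions take care of. The names `make_int32_by_hand`, `sum_int32`, `release_array_by_hand`, and `release_schema_by_hand` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Struct definitions as given in the Arrow C Data Interface specification. */
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Hypothetical release callback: the schema owns no heap memory here. */
static void release_schema_by_hand(struct ArrowSchema* schema) {
  schema->release = NULL;
}

/* Hypothetical release callback: frees the data buffer and the buffer list. */
static void release_array_by_hand(struct ArrowArray* array) {
  free((void*)array->buffers[1]);
  free(array->buffers);
  array->release = NULL;
}

/* Hypothetical helper: builds the int32 array [1, 2, 3] without nanoarrow. */
int make_int32_by_hand(struct ArrowArray* array, struct ArrowSchema* schema) {
  int32_t* values = malloc(3 * sizeof(int32_t));
  const void** buffers = malloc(2 * sizeof(void*));
  if (values == NULL || buffers == NULL) {
    free(values);
    free(buffers);
    return -1;
  }
  values[0] = 1;
  values[1] = 2;
  values[2] = 3;
  buffers[0] = NULL;   /* validity bitmap may be omitted when null_count == 0 */
  buffers[1] = values; /* data buffer */

  /* "i" is the format string for int32; 2 is ARROW_FLAG_NULLABLE. */
  *schema = (struct ArrowSchema){
      .format = "i", .name = "", .flags = 2, .release = release_schema_by_hand};
  *array = (struct ArrowArray){
      .length = 3, .n_buffers = 2, .buffers = buffers,
      .release = release_array_by_hand};
  return 0;
}

/* Hypothetical consumer: sums the values, assuming int32 storage, no nulls. */
int64_t sum_int32(const struct ArrowArray* array) {
  assert(array->release != NULL); /* a released array must not be read */
  const int32_t* data = (const int32_t*)array->buffers[1];
  int64_t total = 0;
  for (int64_t i = 0; i < array->length; i++) {
    total += data[array->offset + i];
  }
  return total;
}
```

A caller would invoke `make_int32_by_hand(&array, &schema)`, read the values, and then call `array.release(&array)` and `schema.release(&schema)`. Writing this by hand for nested types, validity bitmaps, and error handling grows quickly, which is the gap the nanoarrow helpers are meant to fill.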