Check field nullability for custom extension types (#69)

For custom extension types (currently supported automatically for `Char`
and `Symbol`), we were not taking the field's nullability into account.
As a result, a column such as `['a', missing]` was deserialized with
element type `Char` instead of `Union{Char, Missing}`. The fix is to have
the `ArrowTypes.extensiontype` function also take the `field` argument
and check its nullability before returning.
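
A minimal round-trip sketch of the case described above, assuming the
`Arrow.write(io, tbl)` and `Arrow.Table(io)` entry points and an in-memory
`IOBuffer`; the column name `col` is only illustrative:

```julia
using Arrow

# A one-column table whose column mixes a Char with `missing`.
tbl = (col = ['a', missing],)

# Serialize to an in-memory buffer, then rewind and read it back.
io = IOBuffer()
Arrow.write(io, tbl)
seekstart(io)
t = Arrow.Table(io)

# With the nullability check in place, the element type should
# include Missing rather than being plain Char.
@show eltype(t.col)
```
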
2 files changed
tree: 7121a95dfe343a176ae457b2eec281915dc3ea2e
  1. .github/
  2. docs/
  3. src/
  4. test/
  5. .gitignore
  6. .travis.yml
  7. LICENSE.md
  8. Project.toml
  9. README.md
README.md

Arrow


This is a pure Julia implementation of the Apache Arrow data standard. The package provides Julia `AbstractVector` objects that reference data conforming to the Arrow standard, so Arrow-formatted data can be used directly with a great deal of existing Julia code.

Please see this document for a description of the Arrow memory layout.

Format Support

This implementation supports the 1.0 version of the specification, including support for:

  • All primitive data types
  • All nested data types
  • Dictionary encodings and messages
  • Extension types
  • Streaming and file formats, record batch messages, and replacement and isdelta dictionary messages

It currently doesn't include support for:

  • Tensors or sparse tensors
  • Flight RPC
  • C data interface

Third-party data formats:

  • CSV and Parquet support via the existing CSV.jl and Parquet.jl packages
  • Other Tables.jl-compatible packages automatically supported (DataFrames.jl, JSONTables.jl, JuliaDB.jl, SQLite.jl, MySQL.jl, JDBC.jl, ODBC.jl, XLSX.jl, etc.)
  • No current Julia packages support ORC or Avro data formats
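
As a rough illustration of the Tables.jl interoperability listed above, the sketch below writes a plain vector of `NamedTuple`s (a row table that Tables.jl recognizes) to an in-memory buffer and reads it back. It assumes the `Arrow.write(io, tbl)` and `Arrow.Table(io)` entry points; the field names `id` and `name` are hypothetical:

```julia
using Arrow

# A vector of NamedTuples is treated as a row-oriented table by Tables.jl.
rows = [(id = 1, name = "alpha"), (id = 2, name = "beta")]

# Write the rows in Arrow format to a buffer, then rewind and read them back.
io = IOBuffer()
Arrow.write(io, rows)
seekstart(io)
t = Arrow.Table(io)

# Columns are accessible by name and behave like AbstractVectors.
ids = collect(t.id)
names = collect(t.name)
```

Any other Tables.jl-compatible source should be usable in place of `rows` in the same way.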

See the full documentation for details on reading and writing Arrow data.