tag	c37d0116e86e29cf53596d6915de7bc1e61f8230
tagger	github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	Fri Apr 16 19:02:01 2021 +0000
object	1af27ba8acc137509d6d7bc2882177d77115359a

## Arrow v1.4.0

[Diff since v1.3.0](https://github.com/JuliaData/Arrow.jl/compare/v1.3.0...v1.4.0)

**Closed issues:**
- reconsidering the current type registration/serialization mechanism (and its internal usage) (#88)
- provide mechanism to free metadata stored in OBJ_METADATA? (#90)
- Arrow.write slow perf with ZonedDateTime (#95)
- Implement DataAPI pool/dict encoding methods for DictEncoded (#120)
- Slower materialization Feather vs Arrow (#131)
- Usage with MPI (#151)
- Reading CSV (#157)
- Reading an Arrow file with no message batches after the schema seems to produce a partly initialized Table? (#158)
- DictEncoded methods for refpool, refarray and levels (#159)
- MethodError `Int64(::Arrow.Timestamp...` when reading arrow file saved by `pandas`. (#166)
- Improve printing? (#168)

**Merged pull requests:**
- Add refpool, refarray and levels for DictEncoded (#161) (@dmbates)
- Tweak promoteunion to always avoid abstract types (#162) (@quinnj)
- Restructure ArrowTypes so it can be registered as its own package (#163) (@quinnj)
- DataAPI methods (#164) (@quinnj)
- Don't store table metadata globally (#165) (@quinnj)
- document guarantee that `getmetadata` returns alias not copy (#169) (@jrevels)
- add missing setmedata! method for Arrow.Table (#170) (@jrevels)
- use actual deprecation for `registertype!` (#171) (@ericphanson)
- Warn when converting Arrow.Timestamps to Dates.DateTime or ZonedDateTime (#172) (@quinnj)
- Introduce Arrow.ToTimestamp for performant ZonedDateTime encoding (#173) (@quinnj)
- Fix () -> {} typo in docs (#174) (@etpinard)
- Fix case when ipc stream has no record batches, only schema (#175) (@quinnj)
- Fix slight perf hit when checking validity bitmap (#176) (@quinnj)

commit	1af27ba8acc137509d6d7bc2882177d77115359a
author	Jacob Quinn <quinn.jacobd@gmail.com>	Fri Apr 16 12:37:50 2021 -0600
committer	GitHub <noreply@github.com>	Fri Apr 16 12:37:50 2021 -0600
tree	9820b26ac436a09b8aed2da8489fc9412cc0aab1
parent	bdd0e5473cffe0f1eec6c0752f909dcdf77cac07

tree: 9820b26ac436a09b8aed2da8489fc9412cc0aab1

README.md

Arrow

This is a pure Julia implementation of the Apache Arrow data standard. This package provides Julia AbstractVector objects for referencing data that conforms to the Arrow standard. This allows users to seamlessly interface Arrow-formatted data with a great deal of existing Julia code.
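As a rough illustration of what "AbstractVector objects" means in practice, the sketch below writes a small table to disk and reads it back. It assumes the `Arrow.write` and `Arrow.Table` entry points described in the package documentation; the column names and the file path are hypothetical and chosen only for this example.

```julia
using Arrow

# A NamedTuple of vectors is one of the simplest Tables.jl-compatible sources.
source = (id = [1, 2, 3], name = ["a", "b", "c"])

# Hypothetical file location, used only for this example.
path = joinpath(mktempdir(), "example.arrow")
Arrow.write(path, source)

# Reading the file back gives a table whose columns are AbstractVectors.
table = Arrow.Table(path)
ids = table.id

# Because the columns are AbstractVectors, generic Julia code works on them directly.
total = sum(ids)
names_upper = map(uppercase, table.name)
```

The columns returned here reference the Arrow data rather than being plain `Vector`s, so calling `collect` on a column is one way to get an ordinary Julia array when a copy is needed.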

Please see this document for a description of the Arrow memory layout.

Installation

The package can be installed by typing the following in a Julia REPL:

julia> using Pkg; Pkg.add("Arrow")

Alternatively, to use the official Apache code, which follows the official Apache release process, run:

julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")

Difference between this code and the apache/arrow/julia/Arrow repository

The code in the apache/arrow repository is officially part of the apache/arrow project and as such follows the project's regulated release cadence and standard community voting protocols. The JuliaData/Arrow.jl repository can be viewed as a sort of “dev” or “latest” branch of this code that may release more frequently, but without following official Apache release guidelines. The two repositories are synced, however, so any bugfix patches in JuliaData will be upstreamed to apache/arrow for each release.

Format Support

This implementation supports the 1.0 version of the specification, including support for:

All primitive data types
All nested data types
Dictionary encodings and messages
Extension types
Streaming, file, record batch, and replacement and isdelta dictionary messages

It currently doesn't include support for:

Tensors or sparse tensors
Flight RPC
C data interface

Third-party data formats:

CSV and Parquet support via the existing CSV.jl and Parquet.jl packages
Other Tables.jl-compatible packages automatically supported (DataFrames.jl, JSONTables.jl, JuliaDB.jl, SQLite.jl, MySQL.jl, JDBC.jl, ODBC.jl, XLSX.jl, etc.)
No current Julia packages support ORC or Avro data formats

See the full documentation for details on reading and writing Arrow data.
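For a quick sense of the in-memory path, here is a minimal sketch that writes a table to an `IOBuffer` and reads it back from the resulting bytes. It assumes `Arrow.write` accepts an `IO` and `Arrow.Table` accepts a `Vector{UInt8}`, as the package documentation describes; the column names are hypothetical.

```julia
using Arrow

# Hypothetical example data; any Tables.jl-compatible source should work.
source = (x = [1.5, 2.5], flag = [true, false])

# Write the table into an in-memory buffer instead of a file.
io = IOBuffer()
Arrow.write(io, source)
bytes = take!(io)

# Read the table back directly from the raw bytes.
table = Arrow.Table(bytes)
xs = collect(table.x)
flags = collect(table.flag)
```

Working on bytes like this can be handy when Arrow data arrives over a socket or from another process rather than from disk.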