Apache Arrow 0.10.0 (6 August 2018)

This is a major release.

Download

Contributors

$ git shortlog -sn apache-arrow-0.9.0..apache-arrow-0.10.0
    70  Antoine Pitrou
    49  Kouhei Sutou
    40  Korn, Uwe
    37  Wes McKinney
    32  Krisztián Szűcs
    30  Andy Grove
    20  Philipp Moritz
    13  Phillip Cloud
    11  Bryan Cutler
    11  yosuke shiro
     7  Dimitri Vorona
     6  Zhijun Fu
     5  Bruce Mitchener
     5  Joshua Storck
     5  Robert Nishihara
     5  ptaylor
     4  Maximilian Roos
     4  Sebastien Binet
     3  Alex
     3  Brian Hulette
     3  Chao Sun
     3  Dominik Moritz
     3  Kenji Okimoto
     3  Marco Neumann
     3  Yuhong Guo
     2  Abhi
     2  Dhruv Madeka
     2  Dmitry Kalinkin
     2  Donal Simmie
     2  Frank Wessels
     2  Julius Neuffer
     2  Manabu Ejima
     2  Omer Katz
     2  Paddy
     2  Paddy Horan
     2  Robert Gruener
     2  Teddy Choi
     2  Vanco Buca
     2  Venki Korukanti
     2  bomeng
     2  fjetter
     2  liurenjie1024
     2  songqing
     1  284km
     1  Adrian Dorr
     1  Albert Shieh
     1  Alessandro Andrioni
     1  Alok Singh
     1  Aneesh Karve
     1  Atul Dambalkar
     1  Ben Wolfson
     1  Brent Kerby
     1  Daniel Chalef
     1  Daniel Compton
     1  Florian Rathgeber
     1  Gatis Seja
     1  HE, Tao
     1  James Lamb
     1  Jeff Zhang
     1  Juan Paulo Gutierrez
     1  Kane
     1  Kee Chong Tan
     1  Kelsey Jordahl
     1  Kendall Willets
     1  Li Jin
     1  Licht-T
     1  Lizhou Gao
     1  Louis Potok
     1  Markus Klein
     1  Matt Topol
     1  Matthew Topol
     1  Michael Sarahan
     1  Paul Taylor
     1  Peter Schafhalter
     1  Philipp Hoch
     1  Renato Marroquin
     1  Richard Gowers
     1  Robbie Gruener

Patch Committers

The following Apache committers merged contributed patches to the repository.

$ git shortlog -csn apache-arrow-0.9.0..apache-arrow-0.10.0
   120  Wes McKinney
   119  Korn, Uwe
    63  Antoine Pitrou
    50  Uwe L. Korn
    28  Kouhei Sutou
    27  Philipp Moritz
    15  Bryan Cutler
    15  Phillip Cloud
     8  Robert Nishihara
     6  Sidd
     4  Brian Hulette
     2  GitHub
     1  Your Name Here
     1  ptaylor

Changelog

New Features and Improvements

  • ARROW-1018 - [C++] Add option to create FileOutputStream, ReadableFile from OS file descriptor
  • ARROW-1163 - [Plasma][Java] Java client for Plasma
  • ARROW-1388 - [Python] Add Table.drop method for removing columns
  • ARROW-1454 - [Python] More informative error message when attempting to write an unsupported Arrow type to Parquet format
  • ARROW-1715 - [Python] Implement pickling for Column, ChunkedArray, RecordBatch, Table
  • ARROW-1722 - [C++] Add linting script to look for C++/CLI issues
  • ARROW-1731 - [Python] Provide for selecting a subset of columns to convert in RecordBatch/Table.from_pandas
  • ARROW-1744 - [Plasma] Provide TensorFlow operator to read tensors from plasma
  • ARROW-1780 - [Java] JDBC Adapter for Apache Arrow
  • ARROW-1858 - [Python] Add documentation about parquet.write_to_dataset and related methods
  • ARROW-1868 - [Java] Change vector getMinorType to use MinorType instead of Types.MinorType
  • ARROW-1886 - [Python] Add function to “flatten” structs within tables
  • ARROW-1913 - [Java] Fix Javadoc generation bugs with JDK8
  • ARROW-1928 - [C++] Add benchmarks comparing performance of internal::BitmapReader/Writer with naive approaches
  • ARROW-1954 - [Python] Add metadata accessor to pyarrow.Field
  • ARROW-1964 - [Python] Expose Builder classes
  • ARROW-2014 - [Python] Document read_pandas method in pyarrow.parquet
  • ARROW-2055 - [Java] Upgrade to Java 8
  • ARROW-2060 - [Python] Documentation for creating StructArray using from_arrays or a sequence of dicts
  • ARROW-2061 - [C++] Run ASAN builds in Travis CI
  • ARROW-2074 - [Python] Allow type inference for struct arrays
  • ARROW-2097 - [Python] Suppress valgrind stdout/stderr in Travis CI builds when there are no errors
  • ARROW-2100 - [Python] Drop Python 3.4 support
  • ARROW-2140 - [Python] Conversion from Numpy float16 array unimplemented
  • ARROW-2141 - [Python] Conversion from Numpy object array to varsize binary unimplemented
  • ARROW-2147 - [Python] Type inference doesn't work on lists of Numpy arrays
  • ARROW-2207 - [GLib] Support decimal type
  • ARROW-2222 - [C++] Add option to validate Flatbuffers messages
  • ARROW-2224 - [C++] Get rid of boost regex usage
  • ARROW-2241 - [Python] Simple script for running all current ASV benchmarks at a commit or tag
  • ARROW-2264 - [Python] Efficiently serialize numpy arrays with dtype of unicode fixed length string
  • ARROW-2267 - Rust bindings
  • ARROW-2276 - [Python] Tensor could implement the buffer protocol
  • ARROW-2281 - [Python] Expose MakeArray to construct arrays from buffers
  • ARROW-2285 - [Python] Can't convert Numpy string arrays
  • ARROW-2286 - [Python] Allow subscripting pyarrow.lib.StructValue
  • ARROW-2287 - [Python] chunked array not iterable, not indexable
  • ARROW-2299 - [Go] Go language implementation
  • ARROW-2301 - [Python] Add source distribution publishing instructions to package / release management documentation
  • ARROW-2302 - [GLib] Run autotools and meson Linux builds in same Travis CI build entry
  • ARROW-2308 - Serialized tensor data should be 64-byte aligned.
  • ARROW-2315 - [C++/Python] Add method to flatten a struct array
  • ARROW-2319 - [C++] Add buffered output class implementing OutputStream interface
  • ARROW-2322 - Document requirements to run dev/release/01-perform.sh
  • ARROW-2325 - [Python] Update setup.py to use Markdown project description
  • ARROW-2330 - [C++] Optimize delta buffer creation with partially finishable array builders
  • ARROW-2332 - [Python] Provide API for reading multiple Feather files
  • ARROW-2334 - [C++] Update boost to 1.66.0
  • ARROW-2335 - [Go] Move Go README one directory higher
  • ARROW-2340 - [Website] Add blog post about Go codebase donation
  • ARROW-2341 - [Python] pa.union() mode argument unintuitive
  • ARROW-2343 - [Java/Packaging] Run mvn clean in API doc builds
  • ARROW-2344 - [Go] Run Go unit tests in Travis CI
  • ARROW-2345 - [Documentation] Fix bundle exec and set sphinx nosidebar to True
  • ARROW-2348 - [GLib] Remove Go example
  • ARROW-2350 - Shrink size of spark_integration Docker container
  • ARROW-2353 - Test correctness of built wheel on AppVeyor
  • ARROW-2361 - [Rust] Start native Rust Implementation
  • ARROW-2364 - [Plasma] PlasmaClient::Get() could take vector of object ids
  • ARROW-2376 - [Rust] Travis should run tests for Rust library
  • ARROW-2378 - [Rust] Use rustfmt to format source code
  • ARROW-2381 - [Rust] Buffer should have an Iterator
  • ARROW-2384 - Rust: Use Traits rather than defining methods directly
  • ARROW-2385 - [Rust] Implement to_json() for Field and DataType
  • ARROW-2388 - [C++] Arrow::StringBuilder::Append() uses null_bytes not valid_bytes
  • ARROW-2389 - [C++] Add StatusCode::OverflowError
  • ARROW-2390 - [C++/Python] CheckPyError() could inspect exception type
  • ARROW-2395 - [Python] Correct flake8 errors outside of pyarrow/ directory
  • ARROW-2396 - Unify Rust Errors
  • ARROW-2397 - Document changes in Tensor encoding in IPC.md.
  • ARROW-2398 - [Rust] Provide a zero-copy builder for type-safe Buffer
  • ARROW-2400 - [C++] Status destructor is expensive
  • ARROW-2401 - Support filters on Hive partitioned Parquet files
  • ARROW-2402 - [C++] FixedSizeBinaryBuilder::Append lacks “const char*” overload
  • ARROW-2404 - Fix declaration of ‘type_id’ hides class member warning in msvc build
  • ARROW-2407 - [GLib] Add garrow_string_array_builder_append_values()
  • ARROW-2408 - [Rust] It should be possible to get a &mut[T] from Builder
  • ARROW-2411 - [C++] Add method to append batches of null-terminated strings to StringBuilder
  • ARROW-2413 - [Rust] Remove useless use of `format!`
  • ARROW-2414 - [Documentation] Fix miscellaneous documentation typos
  • ARROW-2415 - [Rust] Fix using references in pattern matching
  • ARROW-2416 - [C++] Support system libprotobuf
  • ARROW-2417 - [Rust] Review APIs for safety
  • ARROW-2422 - [Python] Support more filter operators on Hive partitioned Parquet files
  • ARROW-2427 - [C++] ReadAt implementations suboptimal
  • ARROW-2430 - MVP for branch based packaging automation
  • ARROW-2433 - [Rust] Add Builder.push_slice(&[T])
  • ARROW-2434 - [Rust] Add windows support
  • ARROW-2435 - [Rust] Add memory pool abstraction.
  • ARROW-2436 - [Rust] Add windows CI
  • ARROW-2440 - [Rust] Implement ListBuilder
  • ARROW-2442 - [C++] Disambiguate Builder::Append overloads
  • ARROW-2445 - [Rust] Add documentation and make some fields private
  • ARROW-2448 - Segfault when plasma client goes out of scope before buffer.
  • ARROW-2451 - Handle more dtypes efficiently in custom numpy array serializer.
  • ARROW-2453 - [Python] Improve Table column access
  • ARROW-2458 - [Plasma] PlasmaClient uses global variable
  • ARROW-2463 - [C++] Update flatbuffers to 1.9.0
  • ARROW-2464 - [Python] Use a python_version marker instead of a condition
  • ARROW-2469 - Make out arguments last in ReadMessage API.
  • ARROW-2470 - [C++] FileGetSize() should not seek
  • ARROW-2472 - [Rust] The Schema and Fields types should not have public attributes
  • ARROW-2477 - [Rust] Set up code coverage in CI
  • ARROW-2478 - [C++] Introduce a checked_cast function that performs a dynamic_cast in debug mode
  • ARROW-2479 - [C++] Have a global thread pool
  • ARROW-2480 - [C++] Enable casting the value of a decimal to int32_t or int64_t
  • ARROW-2481 - [Rust] Move calls to free() into memory.rs
  • ARROW-2482 - [Rust] support nested types
  • ARROW-2484 - [C++] Document ABI compliance checking
  • ARROW-2485 - [C++] Output diff when run_clang_format.py reports a change
  • ARROW-2486 - [C++/Python] Provide a Docker image that contains all dependencies for development
  • ARROW-2488 - [C++] List Boost 1.67 as supported version
  • ARROW-2493 - [Python] Add support for pickling to buffers and arrays
  • ARROW-2494 - Return status codes from PlasmaClient::Seal
  • ARROW-2498 - [Java] Upgrade to JDK 1.8
  • ARROW-2499 - [C++] Add iterator facility for Python sequences
  • ARROW-2505 - [C++] Disable MSVC warning C4800
  • ARROW-2506 - [Plasma] Build error on macOS
  • ARROW-2507 - [Rust] Don't take a reference when not needed
  • ARROW-2508 - [Python] pytest API changes make tests fail
  • ARROW-2513 - [Python] DictionaryType should give access to index type and dictionary array
  • ARROW-2516 - AppVeyor Build Matrix should be specific to the changes made in a PR
  • ARROW-2521 - [Rust] Refactor Rust API to use traits and generics
  • ARROW-2522 - [C++] Version shared library files
  • ARROW-2525 - [GLib] Add garrow_struct_array_flatten()
  • ARROW-2526 - [GLib] Update .gitignore
  • ARROW-2527 - [GLib] Enable GPU document
  • ARROW-2529 - [C++] Update mention of clang-format to 5.0 in the docs
  • ARROW-2531 - [C++] Update clang bits to 6.0
  • ARROW-2533 - [CI] Fast finish failing AppVeyor builds
  • ARROW-2536 - [Rust] ListBuilder uses wrong initial size for offset builder
  • ARROW-2537 - [Ruby] Import
  • ARROW-2539 - [Plasma] Use unique_ptr instead of raw pointer
  • ARROW-2540 - [Plasma] add constructor/destructor to make sure dlfree is called automatically
  • ARROW-2541 - [Plasma] Clean up macro usage
  • ARROW-2543 - [Rust] CI should cache dependencies for faster builds
  • ARROW-2544 - [CI] Run C++ tests with two jobs on Travis-CI
  • ARROW-2547 - [Format] Fix off-by-one in List<List> example
  • ARROW-2548 - [Format] Clarify `List` Array example
  • ARROW-2549 - [GLib] Apply arrow::StatusCodes changes to GArrowError
  • ARROW-2550 - [C++] Add missing status codes into arrow::StatusCode::CodeAsString()
  • ARROW-2551 - [Plasma] Improve notification logic
  • ARROW-2553 - [Python] Set MACOSX_DEPLOYMENT_TARGET in wheel build
  • ARROW-2558 - [Plasma] avoid walk through all the objects when a client disconnects
  • ARROW-2562 - [C++] Upload coverage data to codecov.io
  • ARROW-2563 - [Rust] Poor caching in Travis-CI
  • ARROW-2566 - [CI] Add codecov.io badge to README
  • ARROW-2567 - [C++/Python] Unit is ignored on comparison of TimestampArrays
  • ARROW-2568 - [Python] Expose thread pool size setting to Python, and deprecate “nthreads”
  • ARROW-2569 - [C++] Improve thread pool size heuristic
  • ARROW-2574 - [CI] Collect and publish Python coverage
  • ARROW-2576 - [GLib] Add abs functions for Decimal128.
  • ARROW-2577 - [Plasma] Add ASV benchmarks
  • ARROW-2580 - [GLib] Fix abs functions for Decimal128
  • ARROW-2582 - [GLib] Add negate functions for Decimal128
  • ARROW-2585 - [C++] Add Decimal128::FromBigEndian
  • ARROW-2586 - [C++] Make child builders of ListBuilder and StructBuilder shared_ptr's
  • ARROW-2595 - [Plasma] operator[] creates entries in map
  • ARROW-2596 - [GLib] Use the default value of GTK-Doc
  • ARROW-2597 - [Plasma] remove UniqueIDHasher
  • ARROW-2604 - [Java] Add method overload for VarCharVector.set(int,String)
  • ARROW-2608 - [Java/Python] Add pyarrow.{Array,Field}.from_jvm / jvm_buffer
  • ARROW-2611 - [Python] Python 2 integer serialization
  • ARROW-2612 - [Plasma] Fix deprecated PLASMA_DEFAULT_RELEASE_DELAY
  • ARROW-2613 - [Docs] Update the gen_apidocs docker script
  • ARROW-2614 - [CI] Remove ‘group: deprecated’ in Travis
  • ARROW-2626 - [Python] pandas ArrowInvalid message should include failing column name
  • ARROW-2634 - [Go] Add LICENSE additions for Go subproject
  • ARROW-2635 - [Ruby] LICENSE.txt isn't suitable
  • ARROW-2636 - [Ruby] “Unofficial” package note is missing
  • ARROW-2638 - [Python] Prevent calling extension class constructors directly
  • ARROW-2639 - [Python] Remove unnecessary _check_nullptr methods
  • ARROW-2641 - [C++] Investigate spurious memset() calls
  • ARROW-2645 - [Java] ArrowStreamWriter accumulates DictionaryBatch ArrowBlocks
  • ARROW-2649 - [C++] Add std::generate()-like function for faster bitmap writing
  • ARROW-2656 - [Python] Improve ParquetManifest creation time
  • ARROW-2660 - [Python] Experiment with zero-copy pickling
  • ARROW-2661 - [Python/C++] Allow passing HDFS Config values via map/dict instead of needing an hdfs-site.xml file
  • ARROW-2662 - [Python] Add to_pandas / to_numpy to ChunkedArray
  • ARROW-2663 - [Python] Make dictionary_encode and unique accessible on Column / ChunkedArray
  • ARROW-2664 - [Python] Implement __getitem__ / slicing on Buffer
  • ARROW-2666 - [Python] numpy.asarray should trigger to_pandas on Array/ChunkedArray
  • ARROW-2672 - [Python] Build ORC extension in manylinux1 wheels
  • ARROW-2674 - [Packaging] Start building nightlies
  • ARROW-2676 - [Packaging] Deploy build artifacts to github releases
  • ARROW-2677 - [Python] Expose Parquet ZSTD compression
  • ARROW-2678 - [GLib] Add extra information to common build problems on macOS
  • ARROW-2680 - [Python] Add documentation about type inference in Table.from_pandas
  • ARROW-2682 - [CI] Notify in Slack about broken builds
  • ARROW-2689 - [Python] Remove references to timestamps_to_ms argument from documentation
  • ARROW-2692 - [Python] Add test for writing dictionary encoded columns to chunked Parquet files
  • ARROW-2695 - [Python] Prevent calling scalar constructors directly
  • ARROW-2696 - [Java] Enhance AllocationListener with an onFailedAllocation() call
  • ARROW-2699 - [C++/Python] Add Table method that replaces a column with a new supplied column
  • ARROW-2700 - [Python] Add simple examples to Array.cast docstring
  • ARROW-2701 - [C++] Make MemoryMappedFile resizable
  • ARROW-2704 - [Java] IPC stream handling should be more friendly to low level processing
  • ARROW-2713 - [Packaging] Fix linux package builds
  • ARROW-2717 - [Packaging] Postfix conda artifacts with target arch
  • ARROW-2718 - [Packaging] GPG sign downloaded artifacts
  • ARROW-2724 - [Packaging] Determine whether all the expected artifacts are uploaded
  • ARROW-2725 - [Java] Make Accountant.AllocationOutcome publicly visible
  • ARROW-2729 - [GLib] Add decimal128 array builder
  • ARROW-2731 - Allow usage of external ORC library
  • ARROW-2732 - Update brew packages for macOS
  • ARROW-2733 - [GLib] Cast garrow_decimal128 to gint64
  • ARROW-2738 - [GLib] Use Brewfile on installation process
  • ARROW-2739 - [GLib] Use G_DECLARE_DERIVABLE_TYPE for GArrowDecimalDataType and GArrowDecimal128ArrayBuilder
  • ARROW-2740 - [Python] Add address property to Buffer
  • ARROW-2742 - [Python] Allow Table.from_batches to use Iterator of ArrowRecordBatches
  • ARROW-2748 - [GLib] Add garrow_decimal_data_type_get_scale() (and _precision())
  • ARROW-2749 - [GLib] Rename *garrow_decimal128_array_get_value to *garrow_decimal128_array_format_value
  • ARROW-2751 - [GLib] Add garrow_table_replace_column()
  • ARROW-2752 - [GLib] Document garrow_decimal_data_type_new()
  • ARROW-2753 - [GLib] Add garrow_schema_*_field()
  • ARROW-2755 - [Python] Allow using Ninja to build extension
  • ARROW-2756 - [Python] Remove redundant imports and minor fixes in parquet tests
  • ARROW-2758 - [Plasma] Use Scope enum in Plasma
  • ARROW-2760 - [Python] Remove legacy property definition syntax from parquet module and test them
  • ARROW-2761 - Support set filter operators on Hive partitioned Parquet files
  • ARROW-2763 - [Python] Make parquet _metadata file accessible from ParquetDataset
  • ARROW-2780 - [Go] Run code coverage analysis
  • ARROW-2784 - [C++] MemoryMappedFile::WriteAt allow writing past the end
  • ARROW-2790 - [C++] Buffers contain uninitialized memory
  • ARROW-2791 - [Packaging] Build Ubuntu 18.04 packages
  • ARROW-2792 - [Packaging] Consider uploading tarballs to avoid naming conflicts
  • ARROW-2794 - [Plasma] Add Delete method for multiple objects
  • ARROW-2798 - [Plasma] Use hashing function that takes into account all UniqueID bytes
  • ARROW-2802 - [Docs] Move release management guide to project wiki
  • ARROW-2804 - [Website] Link to Developer wiki (Confluence) from front page
  • ARROW-2805 - [Python] TensorFlow import workaround not working with tensorflow-gpu if CUDA is not installed
  • ARROW-2809 - [C++] Decrease verbosity of lint checks in Travis CI
  • ARROW-2811 - [Python] Test serialization for determinism
  • ARROW-2815 - [CI] Suppress DEBUG logging when building Java library in C++ CI entries
  • ARROW-2816 - [Python] Add __iter__ method to NativeFile
  • ARROW-2821 - [C++] Only zero memory in BooleanBuilder in one place
  • ARROW-2822 - [C++] Zero padding bytes in PoolBuffer::Resize
  • ARROW-2824 - [GLib] Add garrow_decimal128_array_get_value()
  • ARROW-2825 - [C++] Need AllocateBuffer / AllocateResizableBuffer variant with default memory pool
  • ARROW-2826 - [C++] Clarification needed between ArrayBuilder::Init(), Resize() and Reserve()
  • ARROW-2827 - [C++] LZ4 and Zstd builds may fail in parallel builds
  • ARROW-2829 - [GLib] Add GArrowORCFileReader
  • ARROW-2830 - [Packaging] Enable parallel build for deb package build again
  • ARROW-2833 - [Python] Column.__repr__ will lock up Jupyter with large datasets
  • ARROW-2834 - [GLib] Remove “enable_” prefix from Meson options
  • ARROW-2836 - [Packaging] Expand build matrices to multiple tasks
  • ARROW-2837 - [C++] ArrayBuilder::null_bitmap returns PoolBuffer
  • ARROW-2838 - [Python] Speed up null testing with Pandas semantics
  • ARROW-2844 - [Packaging] Test OSX wheels after build
  • ARROW-2845 - [Packaging] Upload additional debian artifacts
  • ARROW-2846 - [Packaging] Update nightly build in crossbow as well as the sample configuration
  • ARROW-2847 - [Packaging] Fix artifact name matching for conda forge packages
  • ARROW-2848 - [Packaging] lib*.deb package name doesn't match so version
  • ARROW-2849 - [Ruby] Arrow::Table#load supports ORC
  • ARROW-2855 - [C++] Blog post that outlines the benefits of using jemalloc
  • ARROW-2859 - [Python] Handle objects exporting the buffer protocol in open_stream, open_file, and RecordBatch*Reader APIs
  • ARROW-2861 - [Python] Add extra tips about using Parquet to store index-less pandas data
  • ARROW-2864 - [Plasma] Add deletion cache to delete objects later
  • ARROW-2868 - [Packaging] Fix centos-7 build
  • ARROW-2869 - [Python] Add documentation for Array.to_numpy
  • ARROW-2875 - [Packaging] Don't attempt to download arrow archive in linux builds
  • ARROW-2881 - [Website] Add Community tab to website
  • ARROW-2884 - [Packaging] Options to build packages from apache source archive
  • ARROW-2886 - [Release] An unused variable exists
  • ARROW-2890 - [Plasma] Make Python PlasmaClient.release private
  • ARROW-2893 - [C++] Remove PoolBuffer class from public API and hide implementation details behind factory functions
  • ARROW-2897 - Organize supported Ubuntu versions
  • ARROW-2898 - [Packaging] Setuptools_scm just shipped a new version which fails to parse `apache-arrow-` tag
  • ARROW-2906 - [Website] Remove the link to slack channel
  • ARROW-2907 - [GitHub] Improve “How to contribute patches”
  • ARROW-2908 - [Rust] Update version to 0.10.0
  • ARROW-2914 - [Integration] Add WindowPandasUDFTests to Spark Integration
  • ARROW-2915 - [Packaging] Remove artifact from ubuntu-trusty build
  • ARROW-2918 - [C++] Improve formatting of Struct pretty prints
  • ARROW-2921 - [Release] Update .deb/.rpm changelogs in preparation
  • ARROW-2922 - [Release] Make python command name customizable
  • ARROW-2923 - [Doc] Add instructions for running Spark integration tests
  • ARROW-2924 - [Java] mvn release fails when an older maven javadoc plugin is installed
  • ARROW-2927 - [Packaging] AppVeyor wheel task is failing on initial checkout
  • ARROW-2928 - [Packaging] AppVeyor crossbow conda builds are picking up boost 1.63.0 instead of the installed version
  • ARROW-2929 - [C++] ARROW-2826 Breaks parquet-cpp 1.4.0 builds
  • ARROW-2934 - [Packaging] Add checksums creation to sign subcommand
  • ARROW-2935 - [Packaging] Add verify_binary_artifacts function to verify-release-candidate.sh
  • ARROW-2937 - [Java] Follow-up changes to ARROW-2704
  • ARROW-2943 - [C++] Implement BufferedOutputStream::Flush
  • ARROW-2944 - [Format] Arrow columnar format docs mentions VectorLayout that does not exist anymore
  • ARROW-2946 - [Packaging] Stop using PWD in debian/rules
  • ARROW-2947 - [Packaging] Remove Ubuntu Artful
  • ARROW-2949 - [CI] repo.continuum.io can be flaky in builds
  • ARROW-2951 - [CI] Changes in format/ should cause Appveyor builds to run
  • ARROW-2953 - [Plasma] Store memory usage
  • ARROW-2954 - [Plasma] Store object_id only once in object table
  • ARROW-2962 - [Packaging] Bintray descriptor files are no longer needed
  • ARROW-2977 - [Packaging] Release verification script should check rust too
  • ARROW-2985 - [Ruby] Run unit tests in verify-release-candidate.sh
  • ARROW-2988 - [Release] More automated release verification on Windows
  • ARROW-2990 - [GLib] Fail to build with rpath-ed Arrow C++ on macOS
  • ARROW-530 - C++/Python: Provide subpools for better memory allocation tracking
  • ARROW-564 - [Python] Add methods to return vanilla NumPy arrays (plus boolean mask array if there are nulls)
  • ARROW-889 - [C++] Implement arrow::PrettyPrint for ChunkedArray
  • ARROW-902 - [C++] Build C++ project including thirdparty dependencies from local tarballs
  • ARROW-906 - [C++] Serialize Field metadata to IPC metadata

Bug Fixes

  • ARROW-2059 - [Python] Possible performance regression in Feather read/write path
  • ARROW-2101 - [Python] from_pandas reads ‘str’ type as binary Arrow data with Python 2
  • ARROW-2122 - [Python] Pyarrow fails to serialize dataframe with timestamp.
  • ARROW-2182 - [Python] ASV benchmark setup does not account for C++ library changing
  • ARROW-2193 - [Plasma] plasma_store has runtime dependency on Boost shared libraries when ARROW_BOOST_USE_SHARED=on
  • ARROW-2195 - [Plasma] Segfault when retrieving RecordBatch from plasma store
  • ARROW-2247 - [Python] Statically-linking boost_regex in both libarrow and libparquet results in segfault
  • ARROW-2273 - Cannot deserialize pandas SparseDataFrame
  • ARROW-2300 - [Python] python/testing/test_hdfs.sh no longer works
  • ARROW-2305 - [Python] Cython 0.25.2 compilation failure
  • ARROW-2314 - [Python] Union array slicing is defective
  • ARROW-2326 - [Python] cannot import pip installed pyarrow on OS X (10.9)
  • ARROW-2328 - Writing a slice with feather ignores the offset
  • ARROW-2331 - [Python] Fix indexing implementations
  • ARROW-2333 - [Python] boost bundling fails in setup.py
  • ARROW-2342 - [Python] Aware timestamp type fails pickling
  • ARROW-2346 - [Python] PYARROW_CXXFLAGS doesn't accept multiple options
  • ARROW-2349 - [Python] Boost shared library bundling is broken for MSVC
  • ARROW-2351 - [C++] StringBuilder::append(vector...) not implemented
  • ARROW-2354 - [C++] PyDecimal_Check() is much too slow
  • ARROW-2355 - [Python] Unable to import pyarrow [0.9.0] OSX
  • ARROW-2357 - Benchmark PandasObjectIsNull
  • ARROW-2368 - DecimalVector#setBigEndian is not padding correctly for negative values
  • ARROW-2369 - Large (>~20 GB) files written to Parquet via PyArrow are corrupted
  • ARROW-2370 - [GLib] include path is wrong on Meson build
  • ARROW-2371 - [GLib] gio-2.0 isn't required on GNU Autotools build
  • ARROW-2372 - [Python] ArrowIOError: Invalid argument when reading Parquet file
  • ARROW-2375 - [Rust] Buffer should release memory when dropped
  • ARROW-2377 - [GLib] Travis-CI failures
  • ARROW-2380 - [Python] Correct issues in numpy_to_arrow conversion routines
  • ARROW-2382 - [Rust] List was not using memory safely
  • ARROW-2383 - [C++] Debian packages need to depend on libprotobuf
  • ARROW-2387 - [Python] negative decimal values get spurious rescaling error
  • ARROW-2391 - [Python] Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64
  • ARROW-2393 - [C++] arrow/status.h does not define ARROW_CHECK needed for ARROW_CHECK_OK
  • ARROW-2403 - [C++] arrow::CpuInfo::model_name_ destructed twice on exit
  • ARROW-2405 - [C++] is missing in plasma/client.h
  • ARROW-2418 - [Rust] List builder fails due to memory not being reserved correctly
  • ARROW-2419 - [Site] Website generation depends on local timezone
  • ARROW-2420 - [Rust] Memory is never released
  • ARROW-2423 - [Python] PyArrow datatypes raise ValueError on equality checks against non-PyArrow objects
  • ARROW-2424 - [Rust] Missing import causing broken build
  • ARROW-2425 - [Rust] Array::from missing mapping for u8 type
  • ARROW-2426 - [CI] glib build failure
  • ARROW-2432 - [Python] from_pandas fails when converting decimals if have None values
  • ARROW-2437 - [C++] Change of arrow::ipc::ReadMessage signature breaks ABI compatibility
  • ARROW-2441 - [Rust] Builder::slice_mut assertions are too strict
  • ARROW-2443 - [Python] Conversion from pandas of empty categorical fails with ArrowInvalid
  • ARROW-2450 - [Python] Saving to parquet fails for empty lists
  • ARROW-2452 - [TEST] Spark integration test fails with permission error
  • ARROW-2454 - [Python] Empty chunked array slice crashes
  • ARROW-2455 - [C++] The bytes_allocated_ in CudaContextImpl isn't initialized
  • ARROW-2457 - garrow_array_builder_append_values() won't work for large arrays
  • ARROW-2459 - pyarrow: Segfault with pyarrow.deserialize_pandas
  • ARROW-2462 - [C++] Segfault when writing a parquet table containing a dictionary column from Record Batch Stream
  • ARROW-2465 - [Plasma] plasma_store fails to find libarrow_gpu.so
  • ARROW-2466 - [C++] misleading “append” flag to FileOutputStream
  • ARROW-2468 - [Rust] Builder::slice_mut should take mut self
  • ARROW-2471 - [Rust] Assertion when pushing value to Builder/ListBuilder with zero capacity
  • ARROW-2473 - [Rust] List assertion error with list of zero length
  • ARROW-2474 - [Rust] Add windows support for memory pool abstraction
  • ARROW-2489 - [Plasma] test_plasma.py crashes
  • ARROW-2491 - [Python] Array.from_buffers does not work for ListArray
  • ARROW-2492 - [Python] Prevent segfault on accidental call of pyarrow.Array
  • ARROW-2500 - [Java] IPC Writers/readers are not always setting validity bits correctly
  • ARROW-2502 - [Rust] Restore Windows Compatibility
  • ARROW-2503 - [Python] Trailing space character in RowGroup statistics of pyarrow.parquet.ParquetFile
  • ARROW-2509 - [CI] Intermittent npm failures
  • ARROW-2511 - BaseVariableWidthVector.allocateNew is not throwing OOM when it can't allocate memory
  • ARROW-2514 - [Python] Inferring / converting nested Numpy array is very slow
  • ARROW-2515 - Errors with DictionaryArray inside of ListArray or other DictionaryArray
  • ARROW-2518 - [Java] Restore Java unit tests and javadoc test to CI matrix
  • ARROW-2530 - [GLib] Out-of-source build fails
  • ARROW-2534 - [C++] libarrow.so leaks zlib symbols
  • ARROW-2545 - [Python] Arrow fails linking against statically-compiled Python
  • ARROW-2554 - pa.array type inference bug when using NS-timestamp
  • ARROW-2557 - [Rust] Add badge for code coverage in README
  • ARROW-2561 - [C++] Crash in cuda-test shutdown with coverage enabled
  • ARROW-2564 - [C++] Rowwise Tutorial is out of date
  • ARROW-2565 - [Plasma] new subscriber cannot receive notifications about existing objects
  • ARROW-2570 - [Python] Add support for writing parquet files with LZ4 compression
  • ARROW-2571 - [C++] Lz4Codec doesn't properly handle empty data
  • ARROW-2575 - [Python] Exclude hidden files when reading Parquet dataset
  • ARROW-2578 - [Plasma] Valgrind errors related to std::random_device
  • ARROW-2589 - [Python] test_parquet.py regression with Pandas 0.23.0
  • ARROW-2593 - [Python] TypeError: data type “mixed-integer” not understood
  • ARROW-2594 - [Java] Vector reallocation does not properly clear reused buffers
  • ARROW-2601 - [Python] MemoryPool bytes_allocated causes seg
  • ARROW-2603 - [Python] from pandas raises ArrowInvalid for date(time) subclasses
  • ARROW-2615 - [Rust] Refactor introduced a bug around Arrays of String
  • ARROW-2629 - [Plasma] Iterator invalidation for pending_notifications_
  • ARROW-2630 - [Java] Typo in the document
  • ARROW-2632 - [Java] ArrowStreamWriter accumulates ArrowBlock but does not use them
  • ARROW-2640 - JS Writer should serialize schema metadata
  • ARROW-2643 - [C++] Travis-CI build failure with cpp toolchain enabled
  • ARROW-2644 - [Python] parquet binding fails building on AppVeyor
  • ARROW-2655 - [C++] Failure with -Werror=conversion on gcc 7.3.0
  • ARROW-2657 - Segfault when importing TensorFlow after Pyarrow
  • ARROW-2668 - [C++] -Wnull-pointer-arithmetic warning with dlmalloc.c on clang 6.0, Ubuntu 14.04
  • ARROW-2669 - [C++] EP_CXX_FLAGS not passed on when building gbenchmark
  • ARROW-2675 - Arrow build error with clang-10 (Apple Clang / LLVM)
  • ARROW-2683 - [Python] Resource Warning (Unclosed File) when using pyarrow.parquet.read_table()
  • ARROW-2690 - [C++] Plasma does not follow style conventions for variable and function names
  • ARROW-2691 - [Rust] Travis fails due to formatting diff
  • ARROW-2693 - [Python] pa.chunked_array causes a segmentation fault on empty input
  • ARROW-2694 - [Python] ArrayValue string conversion returns the representation instead of the converted python object string
  • ARROW-2698 - [Python] Exception when passing a string to Table.column
  • ARROW-2711 - [Python/C++] Pandas-Arrow doesn't roundtrip when column of lists has empty first element
  • ARROW-2716 - [Python] Make manylinux1 base image independent of Python patch releases
  • ARROW-2721 - [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7
  • ARROW-2722 - [Python] ndarray to arrow conversion fails when downcasted from pandas to_numeric
  • ARROW-2723 - [C++] arrow-orc.pc is missing
  • ARROW-2726 - [C++] The latest Boost version is wrong
  • ARROW-2727 - [Java] Unable to build java/adapters module
  • ARROW-2741 - [Python] pa.array from np.datetime[D] and type=pa.date64 produces invalid results
  • ARROW-2744 - [Python] Writing to parquet crashes when writing a ListArray of empty lists
  • ARROW-2745 - [C++] ORC ExternalProject needs to declare dependency on vendored protobuf
  • ARROW-2747 - [CI] [Plasma] huge tables test failure on Travis
  • ARROW-2754 - [Python] When installing pyarrow via pip, a debug build is created
  • ARROW-2770 - [Packaging] Account for conda-forge compiler migration in conda recipes
  • ARROW-2773 - [Python] Corrected parquet docs partition_cols parameter name
  • ARROW-2781 - [Python] Download boost using curl in manylinux1 image
  • ARROW-2787 - [Python] Memory Issue passing table from python to c++ via cython
  • ARROW-2795 - [Python] Run TensorFlow import workaround only on Linux
  • ARROW-2806 - [Python] Inconsistent handling of np.nan
  • ARROW-2810 - [Plasma] Plasma public headers leak flatbuffers.h
  • ARROW-2812 - [Ruby] StructArray#[] raises NoMethodError
  • ARROW-2820 - [Python] RecordBatch.from_arrays does not validate array lengths are all equal
  • ARROW-2823 - [C++] Search for flatbuffers in /lib64
  • ARROW-2841 - [Go] Fix recent Go build failures in Travis CI
  • ARROW-2850 - [C++/Python] PARQUET_RPATH_ORIGIN=ON missing in manylinux1 build
  • ARROW-2851 - [C++] Update RAT excludes for new install file names
  • ARROW-2852 - [Rust] Mark Array as Sync and Send
  • ARROW-2862 - [C++] Ensure thirdparty download directory has been created in thirdparty/download_thirdparty.sh
  • ARROW-2867 - [Python] Incorrect example for Cython usage
  • ARROW-2871 - [Python] Array.to_numpy is invalid for boolean arrays
  • ARROW-2872 - [Python] Add pytest mark to opt into TensorFlow-related unit tests
  • ARROW-2876 - [Packaging] Crossbow builds can hang if you cloned using SSH
  • ARROW-2877 - [Packaging] crossbow submit results in duplicate Travis CI build
  • ARROW-2878 - [Packaging] README.md does not mention setting GitHub API token in user's crossbow repo settings
  • ARROW-2883 - [Plasma] Compilation warnings
  • ARROW-2891 - Preserve schema in write_to_dataset
  • ARROW-2894 - [GLib] Format tests broken due to recent refactor
  • ARROW-2895 - [Ruby] CI isn't run when C++ is changed
  • ARROW-2896 - [GLib] export are missing
  • ARROW-2901 - [Java] Build is failing on Java9
  • ARROW-2902 - [Python] HDFS Docker integration tests leave around files created by root
  • ARROW-2911 - [Python] Parquet binary statistics that end in ‘\0’ truncate last byte
  • ARROW-2917 - [Python] Tensor requiring gradient cannot be serialized with pyarrow.serialize
  • ARROW-2920 - [Python] Segfault with pytorch 0.4
  • ARROW-2926 - [Python] ParquetWriter segfaults in example where passed schema and table schema do not match
  • ARROW-2930 - [C++] Trying to set target properties on not existing CMake target
  • ARROW-2940 - [Python] Import error with pytorch 0.3
  • ARROW-2945 - [Packaging] Update argument check for 02-source.sh
  • ARROW-2955 - [Python] Typo in pyarrow's HDFS API result
  • ARROW-2963 - [Python] Deadlock during fork-join and use_threads=True
  • ARROW-2978 - [Rust] Travis CI build is failing
  • ARROW-2982 - The “--show-progress” option is only supported in wget 1.16 and higher
  • ARROW-640 - [Python] Arrow scalar values should have a sensible __hash__ and comparison