commit | 3274d081e11e6716329828eafaf0d31c9a96e75f | |
---|---|---|
author | David Li <li.davidm96@gmail.com> | Tue Apr 06 08:00:48 2021 -0400 |
committer | David Li <li.davidm96@gmail.com> | Tue Apr 06 08:00:48 2021 -0400 |
tree | 0ffb71ce87c67375081e36c173fd62073df8daad | |
parent | 3e825a718c1ae25ca3e3a7ba397c096af0e1c0a5 | |

ARROW-10882: [Python] Allow writing dataset from iterator of batches

This binds InMemoryDataset to Python, allowing us to create and write back out datasets from iterables of record batches and various other objects.

Closes #9802 from lidavidm/arrow-10882

Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to process and move data fast.
Major components of the project include:
Arrow is an Apache Software Foundation project. Learn more at arrow.apache.org.
The reference Arrow libraries contain many distinct software components:
The official Arrow libraries in this repository are in different stages of implementing the Arrow format and related features. See our current feature matrix on git master.
Please read our latest project contribution guide.
Even if you do not plan to contribute to Apache Arrow itself or Arrow integrations in other projects, we'd be happy to have you involved: