d93c637895ca40d6ec5371c6399757dac7a6f6ea - arrow-cookbook

commit	d93c637895ca40d6ec5371c6399757dac7a6f6ea
author	Alessandro Molina <amol@turbogears.org>	Wed Jul 28 16:38:20 2021 +0200
committer	GitHub <noreply@github.com>	Wed Jul 28 09:38:20 2021 -0500
tree	efb4bb9e80788aa80a592688f20f6b5c9a952d6b
parent	a9352414df66e5387f478bee92d3de430d59cd47

Initial content for Arrow Cookbook for Python and R (#1)

* Initial Import

* R cookbook initial commit (#1)

* R Cookbook skeleton and initial chapter

* Move r test script to a separate directory

* Add Apache 2 license

* Add parquet section

* Delete files used to demonstrate failing tests in CI

* Licensing

* Add content for different formats and rearrange headings

* Small change to make the tests run on macOS

* Completed the IO section and added intersphinx with PyArrow

* Add workflow to deploy to GH pages

* Update path

* Rename chapters and fill in section titles

* Commit whitespace to trigger build

* Update bookdown job

* try new job config

* Install nightly Arrow

* Evaluate all relevant bits!

* Deploy to r dir

* Try new workflow

* update build path

* Add email and update paths

* Update job to build all cookbooks

* Delete whitespace to trigger build

* Swap order to see if this fixes build

* Install system dependencies

* Put it back on Mac so it's faster

* Separate steps to diagnose issue

* Brew not sudo

* Switching to ubuntu as I don't understand why python 2

* Don't put results in r directory

* Capitalise 'C'

* Update bookdown link so can click to fork/edit

* Add CI stage that runs tests

* Add examples of manually creating Arrow objects and writing to various formats

* Add S3 parquet

* Partitioned data

* Partitioned Data from S3

* Rename record_batch_create chunk

* CSV recipe requires pandas

* Filter parquet data on read

* Reading/Writing feather files

* remove duplicated chunk name

* tweak create

* Categorical data

* Speed up compiling

* Fix tests

* tests pass

* Data manipulation functions

* Link to compute functions

* Tweak naming

* Add contribution file

* landing page style tweak

* Improve contribution documentation

* Explicitly reference the contribution docs

* ignore build directory

* Change branch name

* Update contents

* Update CONTRIBUTING.md

* Suggestions from Grammarly

* Rename initial chapter

* Update Makefile to allow Arrow version to be specified

* Truncate license file to relevant part

* typo

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Add link to code of conduct

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Capitalise "Array"

* Update r/CONTRIBUTING.md

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/creating_arrow_objects.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Apply suggestions from code review

Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Mention dependencies

* Mention that this is not the documentation

* rewording

* Add -jauto by default and indent a print

* The Apache Software Foundation

* reword

* Correct ambiguous and incorrect phrasing

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Update r/content/reading_and_writing_data.Rmd

Co-authored-by: Weston Pace <weston.pace@gmail.com>

* Reorder sections

* Update r/content/manipulating_data.Rmd

Co-authored-by: Ian Cook <ianmcook@gmail.com>

* Remove redundant code snippet

* Update reading CSVs

* Add in section on converting from/to Arrow Tables and tibbles

* rephrase list of numbers

* rephrase list of numbers

* Add missing bracket

* Rephrase about parquet containing multiple cols

* rephrased

* Adapt to Arrow 5.0 output

Co-authored-by: Nic <thisisnic@gmail.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Ian Cook <ianmcook@gmail.com>

.github/.gitignore [Added]
.github/workflows/deploy_cookbooks.yml [Added]
.gitignore
CONTRIBUTING.md [Added]
LICENSE [Added]
Makefile [Added]
README.rst [Added]
build/arrow.png [Added]
build/index.html [Added]
python/CONTRIBUTING.rst [Added]
python/Makefile [Added]
python/make.bat [Added]
python/requirements.txt [Added]
python/source/conf.py [Added]
python/source/create.rst [Added]
python/source/data.rst [Added]
python/source/index.rst [Added]
python/source/io.rst [Added]
r/.Rbuildignore [Added]
r/CONTRIBUTING.md [Added]
r/content/_bookdown.yml [Added]
r/content/creating_arrow_objects.Rmd [Added]
r/content/index.Rmd [Added]
r/content/manipulating_data.Rmd [Added]
r/content/reading_and_writing_data.Rmd [Added]
r/content/unpublished/configure_arrow.Rmd [Added]
r/content/unpublished/create_arrow_objects_from_r.Rmd [Added]
r/content/unpublished/manipulate_data.Rmd [Added]
r/content/unpublished/specify_data_types_and_schemas.Rmd [Added]
r/content/unpublished/work_with_arrow_in_both_python_and_r.Rmd [Added]
r/content/unpublished/work_with_compressed_or_partitioned_data.Rmd [Added]
r/content/unpublished/work_with_data_in_different_formats.Rmd [Added]
r/scripts/install_dependencies.R [Added]
r/scripts/test.R [Added]

34 files changed
