Multi FS Support

 - Reviving PR 191, to create the FileSystem off the actual path (see the first sketch after this list)
 - Streamline all filesystem access through HoodieTableMetaClient
 - Serialize the Hadoop Conf from the Spark context and pass it to executor code as well (see the serialization sketch below)
 - Pick up env vars prefixed with HOODIE_ENV_ into the Configuration object (see the env-var sketch below)
 - Clean up usage of FSUtils.getFS, piggybacking on HoodieTableMetaClient.getFS
 - Add s3a to the supported schemes & support escaping "." in env var names
 - Tests use HoodieTestUtils.getDefaultHadoopConf
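The core of the PR 191 revival is deriving the FileSystem from each path's own scheme rather than from the cluster-wide fs.defaultFS, so a single job can span HDFS, S3, and the local filesystem. A minimal sketch of that idea; the class and method names here are illustrative, not necessarily the exact FSUtils.getFS contract:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PathAwareFs {

  // Resolve the FileSystem from the path's own scheme (hdfs://, s3a://, file://, ...)
  // instead of fs.defaultFS, so one job can touch multiple filesystems.
  public static FileSystem getFs(String pathStr, Configuration conf) {
    try {
      return new Path(pathStr).getFileSystem(conf);
    } catch (IOException e) {
      throw new RuntimeException("Could not resolve FileSystem for " + pathStr, e);
    }
  }
}
```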
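Hadoop's Configuration is not java.io.Serializable, so passing it from the Spark driver into executor closures requires a wrapper. A sketch of one such wrapper, built on Configuration's Writable methods (the class name is hypothetical; the wrapper actually used in this change may differ):

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

import org.apache.hadoop.conf.Configuration;

// Serializable holder so the driver can ship the Hadoop Configuration
// inside Spark closures to executor code. (Name is illustrative.)
public class SerializableConfiguration implements Serializable {

  private transient Configuration conf;

  public SerializableConfiguration(Configuration conf) {
    this.conf = conf;
  }

  public Configuration get() {
    return conf;
  }

  // Delegate to Configuration's Writable serialization.
  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    conf.write(out);
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    conf = new Configuration(false); // skip loading defaults; readFields restores the state
    conf.readFields(in);
  }
}
```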
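For the env-var pickup, a sketch of how HOODIE_ENV_-prefixed variables could be folded into the Configuration. The _DOT_ escape token shown here is an assumption about how "." (which is illegal in env var names) gets escaped:

```java
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

public class EnvConfExample {

  private static final String HOODIE_ENV_PROPS_PREFIX = "HOODIE_ENV_";

  // Copy HOODIE_ENV_-prefixed environment variables into the Hadoop Configuration,
  // translating the assumed "_DOT_" escape token back into ".".
  public static Configuration prepareHadoopConf(Configuration conf) {
    for (Map.Entry<String, String> entry : System.getenv().entrySet()) {
      if (entry.getKey().startsWith(HOODIE_ENV_PROPS_PREFIX)) {
        String key = entry.getKey()
            .replace(HOODIE_ENV_PROPS_PREFIX, "")
            .replace("_DOT_", ".");
        conf.set(key, entry.getValue());
      }
    }
    return conf;
  }
}
```

Under that assumption, exporting HOODIE_ENV_fs_DOT_s3a_DOT_impl=org.apache.hadoop.fs.s3a.S3AFileSystem would surface as fs.s3a.impl in the Configuration, which is one way the added s3a scheme support could be wired up.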
README.md

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages storage of large analytical datasets on HDFS and serves them out via two types of tables:

  • Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
  • Near-Real-time Table (WIP) - Provides queries on real-time data, using a combination of columnar & row-based storage (e.g. Parquet + Avro)

For more, head over here.