[HUDI-271] Create QuickstartUtils for simplifying quickstart guide

- This will be used in the Quickstart guide (doc changes to follow in a separate PR). The intention is to simplify the quickstart so that it showcases Hudi APIs by writing and reading through Spark datasources.
- This is intentionally located in the hudi-spark module so that all the necessary classes end up in hudi-spark-bundle.
1 file changed
tree: 32a5bec62f30ef930f3b2e555dcbd4e426559702
  1. deploy/
  2. docker/
  3. hudi-cli/
  4. hudi-client/
  5. hudi-common/
  6. hudi-hadoop-mr/
  7. hudi-hive/
  8. hudi-integ-test/
  9. hudi-spark/
  10. hudi-timeline-service/
  11. hudi-utilities/
  12. packaging/
  13. release/
  14. style/
  15. tools/
  16. .gitignore
  17. .mailmap
  18. .travis.yml
  19. _config.yml
  20. CHANGELOG.md
  21. DISCLAIMER
  22. KEYS
  23. LICENSE
  24. NOTICE
  25. pom.xml
  26. README.md
  27. RELEASE_NOTES.md
README.md

Hudi

Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages storage of large analytical datasets on DFS (cloud stores, HDFS, or any Hadoop FileSystem compatible storage) and provides the ability to query them via three types of views:

  • Read Optimized View - Provides excellent query performance via purely columnar storage (e.g. Parquet)
  • Incremental View - Provides a change stream with records inserted or updated after a point in time.
  • Real time View - Provides queries on real-time data, using a combination of columnar & row-based storage (e.g. Parquet + Avro)
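
To make the difference between the three views concrete, here is a toy in-memory sketch. It is not Hudi's API or its implementation: `ToyDataset`, `Record`, `upsert`, `compact`, and the integer commit times are all hypothetical names invented for illustration. A dict stands in for compacted columnar files and a list stands in for row-based delta files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: int
    commit_time: int  # hypothetical, monotonically increasing commit id


class ToyDataset:
    """Toy model: compacted base records plus a log of newer writes."""

    def __init__(self):
        self.base = {}  # stands in for columnar files (e.g. Parquet)
        self.log = []   # stands in for row-based delta files (e.g. Avro)

    def upsert(self, key, value, commit_time):
        # New writes land in the log first.
        self.log.append(Record(key, value, commit_time))

    def compact(self):
        # Fold the log into the base, keeping the latest record per key.
        for rec in self.log:
            self.base[rec.key] = rec
        self.log = []

    def _latest(self):
        # Merge base and log; later log entries win.
        merged = dict(self.base)
        for rec in self.log:
            merged[rec.key] = rec
        return merged

    def read_optimized_view(self):
        # Only compacted data: fast to scan, but may lag recent writes.
        return {k: r.value for k, r in self.base.items()}

    def real_time_view(self):
        # Base merged with the log: reflects the most recent writes.
        return {k: r.value for k, r in self._latest().items()}

    def incremental_view(self, after_commit):
        # Only records inserted or updated after the given commit.
        return {k: r.value for k, r in self._latest().items()
                if r.commit_time > after_commit}


ds = ToyDataset()
ds.upsert("a", 1, commit_time=1)
ds.upsert("b", 2, commit_time=1)
ds.compact()
ds.upsert("a", 10, commit_time=2)  # update, not yet compacted
ds.upsert("c", 3, commit_time=2)   # insert, not yet compacted

print(ds.read_optimized_view())
print(ds.real_time_view())
print(ds.incremental_view(after_commit=1))
```

In this sketch the read optimized view still shows the state as of the last compaction, the real time view also includes the uncompacted update and insert, and the incremental view returns only what changed after commit 1.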

For more, head over here