Apache Gobblin (Incubating) 0.12.0 RC0
[GOBBLIN-355] Add helper script to publish archives on Nexus, and wire gradle tasks
4 files changed
tree: dcf6262ce91359e4492e783f681786c7c8af3dee
  1. .github/
  2. .gitignore
  3. .travis.yml
  4. CHANGELOG.md
  5. DISCLAIMER
  6. HEADER
  7. LICENSE
  8. NOTICE
  9. README.md
  10. bin/
  11. build.gradle
  12. buildSrc/
  13. conf/
  14. config/
  15. defaultEnvironment.gradle
  16. dev/
  17. gobblin-admin/
  18. gobblin-api/
  19. gobblin-audit/
  20. gobblin-aws/
  21. gobblin-cluster/
  22. gobblin-compaction/
  23. gobblin-config-management/
  24. gobblin-core-base/
  25. gobblin-core/
  26. gobblin-data-management/
  27. gobblin-distribution/
  28. gobblin-docker/
  29. gobblin-docs/
  30. gobblin-example/
  31. gobblin-flavored-build.gradle
  32. gobblin-hive-registration/
  33. gobblin-metastore/
  34. gobblin-metrics-libs/
  35. gobblin-modules/
  36. gobblin-oozie/
  37. gobblin-rest-service/
  38. gobblin-restli/
  39. gobblin-runtime-hadoop/
  40. gobblin-runtime/
  41. gobblin-salesforce/
  42. gobblin-service/
  43. gobblin-test-harness/
  44. gobblin-test-utils/
  45. gobblin-test/
  46. gobblin-tunnel/
  47. gobblin-utility/
  48. gobblin-yarn/
  49. gradle.properties
  50. gradle/
  51. gradlew
  52. gradlew.bat
  53. ligradle/
  54. maven-nexus/
  55. maven-sonatype/
  56. mkdocs.yml
  57. query_github_issues.py
  58. readthedocs.yml
  59. settings.gradle
  60. travis/
README.md

Apache Gobblin

Apache Gobblin is a universal data ingestion framework for extracting, transforming, and loading large volumes of data from a variety of data sources (e.g., databases, REST APIs, FTP/SFTP servers, and filers) onto Hadoop. Apache Gobblin handles the routine tasks common to all data ingestion ETLs, including job/task scheduling, task partitioning, error handling, state management, data quality checking, and data publishing. Gobblin ingests data from different data sources in the same execution framework and manages the metadata of different sources in one place. This, combined with other features such as auto scalability, fault tolerance, data quality assurance, extensibility, and the ability to handle data model evolution, makes Gobblin an easy-to-use, self-service, and efficient data ingestion framework.

Quick Links

  • Documentation: Check out the Gobblin documentation for a complete description of Gobblin's features
  • Powered By: Check out the list of companies known to use Gobblin
  • Architecture: The Gobblin Architecture page has a full explanation of Gobblin's architecture
  • Getting Started with Gobblin: Refer to the Getting Started Guide for a first walkthrough of Gobblin
  • Building Gobblin: Refer to the page Building Gobblin for directions on how to build Gobblin
  • Javadocs: Full Javadocs are published for each released version of Gobblin
  • Gobblin chat room: A Gitter chat room is available for Gobblin developers and users