commit | 44b4884bce510735ff99b88ffc9a6ad272af9600 | |
---|---|---|
author | treff7es <treff7es@gmail.com> | Wed Nov 18 15:39:07 2020 -0800 |
committer | suvasude <suvasude@linkedin.biz> | Thu Nov 19 14:50:11 2020 -0800 |
tree | 2a75b260167ff31cb8536556dc2c027dd90615d7 | |
parent | dec03666692a2aa20307fd5266538023a85348bc | |
[GOBBLIN-1312][GOBBLIN-1318] Bumping parquet lib to 1.11.1 to remove hadoop-lzo dependency
Bumping parquet lib to 1.11.1 to remove hadoop-lzo dependency which caused build error as twitter's maven repo is unreliable.
Removing twitter parquet completely and using apache parquet everywhere
bumping gobblin-parquet module to use parquet 1.11.1
Disabling parquetOutputFormatTest test until https://issues.apache.org/jira/browse/GOBBLIN-1318 is fixed
Changing UTF8 to STRING in JsonIntermediateToParquetConverter test to support the latest parquet
Closes #3150 from treff7es/remove-lzo-dependency
Apache Gobblin is a universal data ingestion framework for extracting, transforming, and loading large volumes of data from a variety of data sources (databases, REST APIs, FTP/SFTP servers, filers, etc.) onto Hadoop.
Apache Gobblin handles the common routine tasks required for all data ingestion ETLs, including job/task scheduling, task partitioning, error handling, state management, data quality checking, data publishing, etc.
Gobblin ingests data from different data sources in the same execution framework, and manages the metadata of different sources all in one place. This, combined with other features such as auto-scalability, fault tolerance, data quality assurance, extensibility, and the ability to handle data model evolution, makes Gobblin an easy-to-use, self-service, and efficient data ingestion framework.
If building the distribution with tests turned on:
Run one of the following commands to download gradle-wrapper.jar from the Gobblin git repository into the gradle/wrapper directory.
wget --no-check-certificate -P gradle/wrapper https://github.com/apache/incubator-gobblin/raw/0.12.0/gradle/wrapper/gradle-wrapper.jar
(or)
curl --insecure -L https://github.com/apache/incubator-gobblin/raw/0.12.0/gradle/wrapper/gradle-wrapper.jar > gradle/wrapper/gradle-wrapper.jar
Alternatively, you can download it manually from: https://github.com/apache/incubator-gobblin/blob/0.12.0/gradle/wrapper/gradle-wrapper.jar
Make sure that you download it to the gradle/wrapper directory.
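The download step above could be wrapped so that it runs only when the jar is missing. The following is a minimal sketch, not part of Gobblin: the names GOBBLIN_TAG, WRAPPER_DIR, wrapper_url, and ensure_wrapper are illustrative, and it assumes curl is installed with a working certificate store, so it omits the --insecure flag used above.

```shell
#!/bin/sh
# Sketch only: fetch gradle-wrapper.jar when it is not already in place.
# GOBBLIN_TAG, WRAPPER_DIR, wrapper_url and ensure_wrapper are hypothetical
# names chosen for this example; they are not defined by Gobblin.
GOBBLIN_TAG="${GOBBLIN_TAG:-0.12.0}"
WRAPPER_DIR="gradle/wrapper"

wrapper_url() {
  # Print the download URL for the release tag passed as $1.
  printf 'https://github.com/apache/incubator-gobblin/raw/%s/gradle/wrapper/gradle-wrapper.jar\n' "$1"
}

ensure_wrapper() {
  # Skip the download if the jar already exists and is non-empty.
  if [ -s "$WRAPPER_DIR/gradle-wrapper.jar" ]; then
    echo "gradle-wrapper.jar already present"
    return 0
  fi
  mkdir -p "$WRAPPER_DIR"
  # -f fails on HTTP errors, -L follows redirects, -o names the output file.
  curl -fL "$(wrapper_url "$GOBBLIN_TAG")" -o "$WRAPPER_DIR/gradle-wrapper.jar"
}
```

After sourcing the file from the repository root, calling ensure_wrapper downloads the jar on the first run and does nothing on later runs.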
To run the Apache RAT license check:
./gradlew rat
The report will be generated under build/rat/rat-report.html.
To build the distribution while skipping tests, FindBugs, RAT, and Checkstyle:
./gradlew build -x findbugsMain -x test -x rat -x checkstyleMain
The distribution will be created in the build/gobblin-distribution/distributions directory.
(or) to run the full build:
./gradlew build
The distribution will be created in the build/gobblin-distribution/distributions directory.
apache-gobblin space on Slack
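The two Gradle build invocations listed above differ only in the tasks they exclude, so a small helper can select between them. This is a hedged sketch: the function name gobblin_build_args and the mode names quick and full are invented for this example and are not part of Gobblin or Gradle.

```shell
#!/bin/sh
# Sketch only: print the Gradle arguments for a quick or a full build.
# gobblin_build_args and the modes "quick"/"full" are hypothetical names.
gobblin_build_args() {
  case "$1" in
    # Quick build: exclude FindBugs, tests, the RAT check and Checkstyle.
    quick) echo "build -x findbugsMain -x test -x rat -x checkstyleMain" ;;
    # Full build: run every task attached to "build".
    full)  echo "build" ;;
    *)     echo "usage: gobblin_build_args quick|full" >&2; return 2 ;;
  esac
}

# Example invocation from the repository root (left commented out so that
# sourcing this file does not start a build):
# ./gradlew $(gobblin_build_args quick)
```

Keeping the argument lists in one place avoids retyping the -x exclusions and makes it harder to mistype one of them.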