commit | ecf9aee3e61815877f3954455158dda53c5c1bf7 | |
---|---|---|
author | Pracheer Agarwal <pracheer.agarwal@inmobi.com> | Tue Jan 10 11:14:51 2017 +0530 |
committer | Pallavi Rao <pallavi.rao@inmobi.com> | Tue Jan 10 11:14:51 2017 +0530 |
tree | 3cc2b751f544d989a0f646d96e9fdca59d51ebc2 | |
parent | 86086653134d9c0ccd854237f40a6a15276c0a41 | |
FALCON-2238 FALCON-2239 FALCON-2240 bug fixes

Author: Pracheer Agarwal <pracheer.agarwal@inmobi.com>
Author: sandeep <sandysmdl@gmail.com>
Author: Pracheer Agarwal <pracheeragarwal@gmail.com>
Author: Pracheer Agarwal <pr@im2216-x0.corp.inmobi.com>

Reviewers: @sandeepSamudrala, @pallavi-rao

Closes #338 from PracheerAgarwal/bugs and squashes the following commits:

0a4355e [sandeep] bug fixes
c7eaaed [sandeep] FALCON-2238,FALCON-2239,FALCON-2240 bug fixes
a93d71a [Pracheer Agarwal] Merge branch 'master' of https://github.com/PracheerAgarwal/falcon
e3728d5 [Pracheer Agarwal] Merge branch 'master' of https://github.com/apache/falcon
066c8e2 [Pracheer Agarwal] Merge branch 'master' of https://github.com/apache/falcon
b20f044 [Pracheer Agarwal] Merge branch 'master' of https://github.com/apache/falcon
7f572a1 [Pracheer Agarwal] Merge branch 'master' of https://github.com/apache/falcon
46042fd [Pracheer Agarwal] Merge branch 'master' of https://github.com/PracheerAgarwal/falcon
daa3ffc [Pracheer Agarwal] FALCON-2225 extension owner added for trusted extensions
622cae4 [Pracheer Agarwal] FALCON-2225 extension owner added for trusted extensions

Falcon is a feed processing and feed management system that makes it easier for end consumers to onboard their feed processing and feed management on Hadoop clusters.
Dependencies across various data processing pipelines are not easy to establish. Gaps here typically lead to either incorrect/partial processing or expensive reprocessing. Defining the same feed multiple times can lead to inconsistencies and other issues.
Input data may not always arrive on time, so processing must be able to start without waiting for all data to arrive and handle late data separately.
Feed management services such as feed retention, replication across clusters, and archival are tasks that burden individual pipeline owners and are better offered as a service for all customers.
It should be easy to onboard new workflows/pipelines
Smoother integration with metastore/catalog
Provide notifications to end customers based on the availability of feed groups (logical groups of related feeds that are likely to be used together)
You can find the documentation on the Apache Falcon website.
Before opening a pull request, please go through the Contributing to Apache Falcon wiki. It lists the steps required before creating a PR and the conventions that we follow. If you are looking for issues to pick up, you can look at the starter tasks or open tasks.
You can download release notes of previous releases from the following links.