---+ Falcon - Feed management and data processing platform
Falcon is a feed processing and feed management system that makes it
easier for end consumers to onboard their feed processing and feed
management on Hadoop clusters.
---++ Why?
* Establishes relationships between the various data and processing elements in a Hadoop environment
* Provides feed management services such as feed retention, replication across clusters, and archival
* Makes it easy to onboard new workflows/pipelines, with support for late data handling and retry policies
* Integrates with metastores/catalogs such as Hive/HCatalog
* Provides notifications to end customers based on the availability of feed groups
(logical groups of related feeds that are likely to be used together)
* Enables use cases for local processing in colo and global aggregations
* Captures lineage information for feeds and processes
---+ Getting Started
To install a Falcon instance, start with the steps in [[InstallationSteps][Simple setup]]. For the
Falcon architecture and further documentation, see [[FalconDocumentation][Documentation]]. [[OnBoarding][On boarding]]
describes the steps to on-board a pipeline to Falcon and gives a sample pipeline for reference.
[[EntitySpecification][Entity Specification]] gives complete details of all Falcon entities.
[[falconcli/FalconCLI][Falcon CLI]] describes the options of the command line utility provided by Falcon,
which implements [[restapi/ResourceList][Falcon's RESTful API]].
Falcon provides out-of-the-box [[HiveIntegration][lifecycle management for Tables in Hive (HCatalog)]],
such as table replication for BCP and table eviction. Falcon also enforces
[[Security][Security]] on protected resources and enables SSL.
#LicenseInfo
---+ Licensing Information
Falcon is distributed under the [[http://www.apache.org/licenses/LICENSE-2.0][Apache License 2.0]].