incubator-general mailing list archives

From Sharad Agarwal <>
Subject Re: [VOTE] Accept Falcon into the Apache Incubator (was originally named Ivory)
Date Thu, 21 Mar 2013 10:36:39 GMT
+1 (non-binding)

On Thu, Mar 21, 2013 at 10:24 AM, Srikanth Sundarrajan <> wrote:

> Hi,
> Thanks for participating in the proposal discussion on Falcon
> (formerly Ivory). I'd like to call a VOTE for acceptance of Apache
> Falcon into the Incubator. I'll let the vote run until Tue 3/26, 6pm IST.
> [ ]  +1 Accept Apache Falcon into the Incubator
> [ ]  +0 Don't care.
> [ ]  -1 Don't accept Apache Falcon into the Incubator because...
> Full proposal is pasted at the bottom of this email, and the
> corresponding wiki is
> Only VOTEs from Incubator PMC members are binding, but all are welcome
> to express their thoughts.
> Thanks,
> Srikanth Sundarrajan
> = Falcon Proposal =
> == Abstract ==
> Falcon is a data processing and management solution for Hadoop
> designed for data motion, coordination of data pipelines, lifecycle
> management, and data discovery. Falcon enables end consumers to
> quickly onboard their data and its associated processing and
> management tasks on Hadoop clusters.
> == Proposal ==
> Falcon will enable easy data management via a declarative mechanism for
> Hadoop. Users of the Falcon platform simply define infrastructure
> endpoints, data sets and processing rules declaratively. These
> declarative configurations are expressed in such a way that the
> dependencies between these configured entities are explicitly
> described. This information about inter-dependencies between various
> entities allows Falcon to orchestrate and manage various data
> management functions.
> The key use cases that Falcon addresses are:
>  * Data Motion
>  * Process orchestration and scheduling
>  * Policy-based Lifecycle Management
>  * Data Discovery
>  * Operability/Usability
> With these features it is possible for users to onboard their data
> sets with a comprehensive and holistic understanding of how, when and
> where their data is managed across its lifecycle. Complex functions
> such as retrying failures, identifying possible SLA breaches or
> automated handling of input data changes are now simple directives.
> All the administrative functions and user level functions are
> available via RESTful APIs. CLI is simply a wrapper over the RESTful
> APIs.
> == Background ==
> Hadoop and its ecosystem of products have made storing and processing
> massive amounts of data commonplace. This has enabled numerous
> organizations to gain valuable insights that they never could have
> achieved in the past. While it is easy to leverage Hadoop for
> crunching large volumes of data, organizing data, managing the life
> cycle of data and processing data are fairly involved. These problems
> are solved adequately in a classic data platform involving data
> warehouses and standard ETL (extract-transform-load) tools, but remain
> largely unsolved on Hadoop today. In addition to data processing
> complexities, Hadoop
> presents new sets of challenges and opportunities relating to
> management of data.
> Data Management on Hadoop encompasses data motion, process
> orchestration, lifecycle management, data discovery, and other
> concerns that are beyond ETL. Falcon is a new data processing and
> management platform for Hadoop that solves this problem and creates
> additional opportunities by building on existing components within the
> Hadoop ecosystem (e.g. Apache Oozie, Apache Hadoop DistCp, etc.) without
> reinventing the wheel. Falcon has been in production at InMobi, going
> on its second year and has been managing hundreds of feeds and
> processes.
> Falcon is being developed by engineers employed with InMobi and
> Hortonworks. This platform addition will increase the adoption of
> Apache Hadoop by making data management tractable for end users. We
> are therefore proposing to make Falcon an Apache open source project.
> == Rationale ==
> The Falcon project aims to improve the usability of Apache Hadoop. As
> a result Apache Hadoop will grow its community of users by increasing
> the places Hadoop can be utilized and the use cases it will solve. By
> developing Falcon in Apache we hope to gather a diverse community of
> contributors, helping to ensure that Falcon is deployable for a broad
> range of scenarios. Members of the Hadoop development community will
> be able to influence Falcon’s roadmap, and contribute to it. We
> believe having Falcon as part of the Apache Hadoop ecosystem will be a
> great benefit to all of Hadoop's users.
> == Current Status ==
> Falcon is widely deployed in production within InMobi and is moving
> into its second year. A version with a valuable set of features has
> been developed by the initial committers and is hosted on GitHub.
> === Meritocracy ===
> Our intent with this incubator proposal is to start building a diverse
> developer community around Falcon following the Apache meritocracy
> model. We have wanted to make the project open source and encourage
> contributors from multiple organizations from the start. We plan to
> provide plenty of support to new developers and to quickly recruit
> those who make solid contributions to committer status.
> === Community ===
> We are happy to report that the initial team already represents
> multiple organizations. We hope to extend the user and developer base
> further in the future and build a solid open source community around
> Falcon.
> === Core Developers ===
> Falcon is currently being developed by three engineers from InMobi
> (Srikanth Sundarrajan, Shwetha GS, and Shaik Idris) and two Hortonworks
> employees (Sanjay Radia and Venkatesh Seetharam). In addition, Rohini
> Palaniswamy and Thiruvel Thirumoolan were involved in the
> initial design discussions. Srikanth, Shwetha and Shaik are the
> original developers. All the engineers have built two generations of
> Data Management on Hadoop, have deep expertise in Hadoop and are
> quite familiar with the Hadoop ecosystem. Samarth Gupta and Rishu
> Mehrothra, both from InMobi, have built the QA automation for Falcon.
> === Alignment ===
> The ASF is a natural host for Falcon given that it is already the home
> of Hadoop, Pig, Knox, HCatalog, and other emerging “big data” software
> projects. Falcon has been designed to solve the data management
> challenges and opportunities of the Hadoop ecosystem family of
> products. Falcon fills a gap in the Hadoop ecosystem
> in the areas of data processing and data lifecycle management.
> == Known Risks ==
> === Orphaned Products ===
> The core developers plan to work full time on the project. There is
> very little risk of Falcon getting orphaned. Falcon is in use by
> companies we work for so the companies have an interest in its
> continued vitality.
> === Inexperience with Open Source ===
> All of the core developers are active users and followers of open
> source. Srikanth Sundarrajan has been contributing patches to Apache
> Hadoop and Apache Oozie, and Shwetha GS has been contributing patches
> to Apache Oozie. Venkatesh Seetharam is a committer on Apache Knox.
> Sharad Agarwal, Amareshwari SR (also an Apache Hive PMC member) and
> Sanjay Radia are PMC members on Apache Hadoop.
> === Homogeneous Developers ===
> The current core developers are from a diverse set of organizations such
> as InMobi and Hortonworks. We expect to quickly establish a developer
> community that includes contributors from several corporations post
> incubation.
> === Reliance on Salaried Developers ===
> Currently, most developers are paid to work on Falcon, but a few are
> contributing in their spare time. However, once the project has a
> community built around it post incubation, we expect to get committers
> and developers from outside the current core developers.
> === Relationships with Other Apache Products ===
> Falcon is going to be used by the users of Hadoop and the Hadoop
> ecosystem in general.
> === An Excessive Fascination with the Apache Brand ===
> While we respect the reputation of the Apache brand and have no doubts
> that it will attract contributors and users, our interest is primarily
> to give Falcon a solid home as an open source project following an
> established development model. We have also given reasons in the
> Rationale and Alignment sections.
> == Documentation ==
> == Initial Source ==
> The source is currently in a GitHub repository at:
> == Source and Intellectual Property Submission Plan ==
> The complete Falcon code is under the Apache License, Version 2.0.
> == External Dependencies ==
> The dependencies all have Apache-compatible licenses. These include
> BSD- and MIT-licensed dependencies.
> == Cryptography ==
> None
> == Required Resources ==
> === Mailing lists ===
>  * falcon-dev AT incubator DOT apache DOT org
>  * falcon-commits AT incubator DOT apache DOT org
>  * falcon-user AT incubator DOT apache DOT org
>  * falcon-private AT incubator DOT apache DOT org
> === Subversion Directory ===
> Git is the preferred source control system: git://
> === Issue Tracking ===
> == Initial Committers ==
>  * Srikanth Sundarrajan (Srikanth.Sundarrajan AT inmobi DOT com)
>  * Shwetha GS ( AT inmobi DOT com)
>  * Shaik Idris (shaik.idris AT inmobi DOT com)
>  * Venkatesh Seetharam (Venkatesh AT apache DOT org)
>  * Sanjay Radia (sanjay AT apache DOT org)
>  * Sharad Agarwal (sharad AT apache DOT org)
>  * Amareshwari SR (amareshwari AT apache DOT org)
>  * Samarth Gupta (samarth.gupta AT inmobi DOT com)
>  * Rishu Mehrothra (rishu.mehrothra AT inmobi DOT com)
> == Affiliations ==
>  * Srikanth Sundarrajan (InMobi)
>  * Shwetha GS (InMobi)
>  * Shaik Idris (InMobi)
>  * Venkatesh Seetharam (Hortonworks Inc.)
>  * Sanjay Radia (Hortonworks Inc.)
>  * Sharad Agarwal (InMobi)
>  * Amareshwari SR (InMobi)
>  * Samarth Gupta (InMobi)
>  * Rishu Mehrothra (InMobi)
> == Sponsors ==
> === Champion ===
>  * Arun C Murthy (acmurthy AT apache DOT org)
> === Nominated Mentors ===
>  * Alan Gates (gates AT apache DOT org)
>  * Chris Douglas (cdouglas AT apache DOT org)
>  * Devaraj Das (ddas AT apache DOT org)
>  * Owen O’Malley (omalley AT apache DOT org)
> === Sponsoring Entity ===
> Incubator PMC
