incubator-general mailing list archives

From "Debo Dutta (dedutta)" <dedu...@cisco.com>
Subject Re: [PROPOSAL] Heron
Date Thu, 15 Jun 2017 17:42:31 GMT
Am happy to help too!

Thx 
Debo 

Sent from my iPhone

> On Jun 14, 2017, at 8:05 PM, William Markito Oliveira <william.markito@gmail.com> wrote:
> 
> Howdy!
> 
> If Heron is looking for some help around the incubation process, I'd love to
> help while the Geode experience is still fresh in my mind, and given that
> it's a project/space I'm interested in. Since I'm not an ASF member, I don't
> think I can offer to be a mentor, but I can probably still help and
> participate in the process.
> 
> Thanks!
> 
>> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
>> 
>> Hi Bill/Supun,
>> 
>> Sorry for not being a little more clear. I was asking more about how the
>> Heron community would seek to engage with Storm community at the
>> *community* level as opposed to the technical level (i.e. “Community over
>> Code”).
>> 
>> I’ve been asked by many why this has never happened, and have always
>> struggled to answer. Maybe you could help answer that question as well as
>> if and how that might change if Heron were to incubate.
>> 
>> Another quick question: The proposal mentions Heron being used in
>> production at Google, but some Google employees I recently spoke to seemed
>> to contradict that. Could you explain? Note that’s nothing that would
>> preclude the project from incubating, I’m just curious.
>> 
>> -Taylor
>> 
>>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve <supun06@gmail.com> wrote:
>>> 
>>> Hi Taylor,
>>> 
>>> For me, one of the interesting differences between Heron and Storm is the
>>> execution model. Storm uses a shared memory model while Heron uses a
>>> process based model. It will be interesting to see how these two evolve.
>>> 
>>> Thanks,
>>> Supun..
>>> 
>>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham <billgraham@gmail.com> wrote:
>>> 
>>>> Hi Taylor,
>>>> 
>>>> Thanks for the mentor offer, we'd be glad to have your help.
>>>> 
>>>> I think the best place for collaboration would be around the evolution of
>>>> the API. In addition we plan to look more into DSL solutions which we could
>>>> potentially collaborate on. This could be Trident, or Beam or something
>>>> else, but there could be synergies for future development here.
>>>> 
>>>> thanks,
>>>> Bill
>>>> 
>>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
>>>> 
>>>>> Hi Bill,
>>>>> 
>>>>> Could you comment on how/if the Heron community would be willing to work
>>>>> with the Storm community? I've seen a number of new features in Storm being
>>>>> ported to Heron, but I have yet to see any attempt by the Heron community
>>>>> to engage with the Apache Storm community.
>>>>> 
>>>>> I don't think it would be too far off to say that the relationship between
>>>>> Heron and Apache Storm has been somewhat adversarial. The pre- and
>>>>> post-open sourcing marketing around Heron seemed, at least to me, somewhat
>>>>> aggressively negative toward Storm.
>>>>> 
>>>>> As a peer to Apache Storm, how would the proposed "Apache Heron" community
>>>>> work to collaborate with the Storm community? If Heron is adopting API
>>>>> changes in Storm, then it seems there is an opportunity for collaboration.
>>>>> 
>>>>> Don't take any of this as an objection to incubating the project. I would
>>>>> support it. I would also be willing to be a mentor, if you would consider
>>>>> taking on another.
>>>>> 
>>>>> -Taylor
>>>>> 
>>>>>> On Jun 8, 2017, at 1:23 PM, Bill Graham <billgraham@gmail.com> wrote:
>>>>>> 
>>>>>> Dear Apache Incubator Community,
>>>>>> 
>>>>>> We are excited to share our proposal for discussion and feedback
>>>>>> for entering Apache Incubation. Heron is a real-time, distributed,
>>>>>> fault-tolerant stream processing engine.
>>>>>> 
>>>>>> Our proposal can be found at
>>>>>> https://wiki.apache.org/incubator/HeronProposal
>>>>>> and is included below.
>>>>>> 
>>>>>> 
>>>>>> Thank you,
>>>>>> 
>>>>>> Bill Graham on behalf of the Heron developers
>>>>>> 
>>>>>> 
>>>>>> # Heron Proposal
>>>>>> 
>>>>>> ## Abstract
>>>>>> Heron is a real-time, distributed, fault-tolerant stream processing engine
>>>>>> initially developed by Twitter.
>>>>>> 
>>>>>> ## Proposal
>>>>>> 
>>>>>> Heron is a real-time stream processing engine built for high performance,
>>>>>> ease of manageability, performance predictability and developer
>>>>>> productivity[1]. We wish to develop a community around Heron to increase
>>>>>> contributions and see Heron thrive in an open forum.
>>>>>> 
>>>>>> ## Background
>>>>>> 
>>>>>> Heron provides the ability for developers to compose directed acyclic
>>>>>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>>>>>> submit the topology to execute on a pluggable job scheduling system (e.g.,
>>>>>> Apache Aurora, YARN, Marathon, etc). Users can employ either the native
>>>>>> Heron API or the Apache Storm API to develop the topology. Heron supports
>>>>>> the Storm API for ease of migration, but beyond that Heron’s architecture
>>>>>> differs considerably from Storm’s.
>>>>>> 
>>>>>> Users submit a topology to the scheduler using the Heron client, which uses
>>>>>> the Heron binary libraries to deploy all daemons required to run and manage
>>>>>> the topology. The topology therefore has no reliance on centrally managed
>>>>>> Heron services, only on a generic job scheduling system, which lends itself
>>>>>> well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
>>>>>> others).
>>>>>> 
>>>>>> The scheduler runs each topology as a job consisting of multiple
>>>>>> containers. One of the containers runs the topology master, responsible for
>>>>>> managing the topology. The remaining containers each run a stream manager
>>>>>> responsible for data routing, a metrics manager that collects and reports
>>>>>> various metrics and a number of processes called Heron instances which run
>>>>>> the user-defined logic on the stream of tuples. Parallelism is achieved via
>>>>>> process-based isolation of Heron instances, which provides predictable
>>>>>> performance while simplifying debugging. The containers are allocated and
>>>>>> managed by the scheduler framework based on resource availability of nodes
>>>>>> in the cluster. The metadata for the topology, such as the physical plan
>>>>>> and execution details, is stored in the pluggable Heron State Manager
>>>>>> (e.g. Apache ZooKeeper).
>>>>>> 
>>>>>> ## Rationale
>>>>>> 
>>>>>> Heron is a general-purpose, modular and extensible platform that can be
>>>>>> leveraged to support common, real-time analytics use cases. There is an
>>>>>> increasing demand for open-source, scalable real-time analytics systems. We
>>>>>> believe that Heron can be leveraged by other organizations to build
>>>>>> streaming applications that can benefit from its robustness, high
>>>>>> performance, adaptability to cloud environments and ease of use. Moreover,
>>>>>> we hope that open-sourcing Heron will help to further evolve the technology
>>>>>> as the project attracts contributors with diverse backgrounds and areas of
>>>>>> expertise.
>>>>>> 
>>>>>> We believe the Apache foundation is a great fit as the long-term home for
>>>>>> Heron, as it provides an established process for community-driven
>>>>>> development and decision making by consensus. This is exactly the model we
>>>>>> want for future Heron development.
>>>>>> 
>>>>>> ## Initial Goals
>>>>>> 
>>>>>> * Move the existing codebase, website, documentation, and mailing lists to
>>>>>> Apache-hosted infrastructure.
>>>>>> * Integrate with the Apache development process.
>>>>>> * Ensure all dependencies are compliant with Apache License version 2.0.
>>>>>> * Incrementally develop and release per Apache guidelines.
>>>>>> 
>>>>>> ## Current Status
>>>>>> 
>>>>>> Heron is a stable project used in production at Twitter since 2014 and open
>>>>>> sourced under the ASL v2 license in 2016. The Heron source code is
>>>>>> currently hosted at github.com (https://github.com/twitter/heron), which
>>>>>> will seed the Apache git repository.
>>>>>> 
>>>>>> ### Meritocracy
>>>>>> 
>>>>>> By submitting this incubator proposal, we’re expressing our intent to build
>>>>>> a diverse developer community around Heron that will conduct itself
>>>>>> according to The Apache Way and use a meritocratic means of building its
>>>>>> committer base. Several companies and universities have already expressed
>>>>>> interest in and contributed to Heron. Our goal is to grow the Heron
>>>>>> community by encouraging open communication, contribution and participation
>>>>>> of all types, and ensuring that contributors are recognized appropriately.
>>>>>> 
>>>>>> ### Community
>>>>>> 
>>>>>> Heron is currently being used by Twitter, Google, Machine Zone and
>>>>>> ndustrial.io and has received significant contributions by Microsoft and
>>>>>> Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
>>>>>> attract even more developers who are interested in creating real-time
>>>>>> systems to build the project's contributor base.
>>>>>> 
>>>>>> ### Core Developers
>>>>>> 
>>>>>> Current core developers are engineers from Twitter, Google, Microsoft and
>>>>>> Streamlio.
>>>>>> 
>>>>>> ### Alignment
>>>>>> 
>>>>>> Heron utilizes a number of Apache technologies. Heron leverages Apache
>>>>>> ZooKeeper for coordination and has scheduler implementations to integrate
>>>>>> with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
>>>>>> as well as spout implementations to integrate with Apache Kafka and metrics
>>>>>> implementations to integrate with Scribe. Heron also implements the Apache
>>>>>> Storm user-level API, which allows topologies written against Storm to run
>>>>>> in Heron. We believe that having Heron at Apache will help further the
>>>>>> growth of the streaming compute community, as well as encourage cooperation
>>>>>> and developer cross pollination with other Apache projects.
>>>>>> 
>>>>>> ## Known Risks
>>>>>> 
>>>>>> ### Orphaned Products
>>>>>> 
>>>>>> The risk of the Heron project being abandoned is minimal. It is used in
>>>>>> production at Twitter and Google and other companies are evaluating or
>>>>>> adopting it for production use.
>>>>>> 
>>>>>> ### Inexperience with Open Source
>>>>>> 
>>>>>> All of the core contributors to the project have considerable experience
>>>>>> with open source software development. Bill Graham[2], Ashvin Agrawal[3]
>>>>>> and Supun Kamburugamuve[4], committers on the project, are PMC members of
>>>>>> other Apache projects, and Bill and Ashvin have gone through the Apache
>>>>>> incubator process. Twitter has already donated numerous projects to the ASF
>>>>>> (e.g., Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be
>>>>>> mentored by experienced ASF members that can help with any roadblocks.
>>>>>> 
>>>>>> ### Homogenous Developers
>>>>>> 
>>>>>> Initial committers come from 5 separate organizations. Our intention is to
>>>>>> increase the diversity of contributing developers and their affiliations.
>>>>>> To date github contributions have come from approximately 50 contributors
>>>>>> from outside the Twitter team.
>>>>>> 
>>>>>> ### Reliance on Salaried Developers
>>>>>> 
>>>>>> It is expected that Heron development will occur on both salaried time and
>>>>>> on volunteer time. The majority of initial committers are paid by their
>>>>>> employers to contribute to this project. We are committed to recruiting
>>>>>> additional committers from other organizations as well as non-salaried
>>>>>> committers to join the project.
>>>>>> 
>>>>>> ### Relationships with Other Apache Products
>>>>>> 
>>>>>> As mentioned in the Alignment section, Heron implements the Apache Storm
>>>>>> API and integrates with multiple Apache schedulers (Apache Mesos, Apache
>>>>>> Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
>>>>>> Thrift.
>>>>>> 
>>>>>> ### An Excessive Fascination with the Apache Brand
>>>>>> 
>>>>>> Heron's popularity is growing in the streaming compute space and we are
>>>>>> long-time supporters of the Apache brand. This proposal is not for the
>>>>>> purpose of generating publicity. Rather, the primary benefits to
>>>>>> joining Apache are those of community building and open decision making
>>>>>> outlined in the Rationale section.
>>>>>> 
>>>>>> ## Documentation
>>>>>> 
>>>>>> This proposal exists online as
>>>>>> http://wiki.apache.org/incubator/HeronProposal. Extensive documentation
>>>>>> can be found on github at https://twitter.github.io/heron and the source
>>>>>> code is well documented.
>>>>>> 
>>>>>> ## Source and Intellectual Property Submission Plan
>>>>>> 
>>>>>> The Heron codebase is currently hosted on Github:
>>>>>> https://github.com/twitter/heron. During incubation, the codebase will be
>>>>>> migrated to Apache infrastructure. The source code is already licensed
>>>>>> under the Apache License 2.0.
>>>>>> 
>>>>>> ## External Dependencies
>>>>>> 
>>>>>> All external libraries have Apache License 2.0 compatible licenses except
>>>>>> for pylint. The pylint library is GPL licensed, but is only used for
>>>>>> pre-build Python style checks and is neither bundled with, nor relied upon
>>>>>> by, the Heron source or binary release artifacts.
>>>>>> 
>>>>>> ## Cryptography
>>>>>> 
>>>>>> Heron does not use any cryptography libraries.
>>>>>> 
>>>>>> ## Required Resources
>>>>>> 
>>>>>> ### Mailing lists
>>>>>> 
>>>>>> private@heron.incubator.apache.org (with moderated subscriptions)
>>>>>> dev@heron.incubator.apache.org
>>>>>> commits@heron.incubator.apache.org
>>>>>> user@heron.incubator.apache.org
>>>>>> 
>>>>>> ## Subversion Directory
>>>>>> 
>>>>>> Git is the preferred source control system: git://git.apache.org/heron
>>>>>> 
>>>>>> ## Issue Tracking
>>>>>> 
>>>>>> JIRA: Heron (HERON)
>>>>>> 
>>>>>> ## Initial Committers
>>>>>> 
>>>>>> * Andrew Jorgensen (andrew at andrewjorgensen dot com)
>>>>>> * Ashvin Agrawal (ashvin at apache dot org)*
>>>>>> * Avrilia Floratou (avrilia dot floratou at gmail dot com)
>>>>>> * Bill Graham (billgraham at apache dot org)*
>>>>>> * Brian Hatfield (bmhatfield at gmail dot com)
>>>>>> * Chris Kellogg (cckellogg at gmail dot com)
>>>>>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
>>>>>> * Karthik Ramasamy (karthik at gmail dot com)
>>>>>> * Maosong Fu (maosongfu at gmail dot com)
>>>>>> * Neng Lu (freeneng at gmail dot com)
>>>>>> * Runhang Li (obj dot runhang at gmail dot com)
>>>>>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
>>>>>> * Supun Kamburugamuve (supun at apache dot org)*
>>>>>> * Thomas Sun (tom dot ssf at gmail dot com)
>>>>>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
>>>>>> 
>>>>>> ## Affiliations
>>>>>> 
>>>>>> * Andrew Jorgensen (Google)
>>>>>> * Ashvin Agrawal (Microsoft)
>>>>>> * Avrilia Floratou (Microsoft)
>>>>>> * Bill Graham (Twitter)
>>>>>> * Brian Hatfield (Google)
>>>>>> * Chris Kellogg (Twitter)
>>>>>> * Huijun Wu (Twitter)
>>>>>> * Karthik Ramasamy (Streamlio)
>>>>>> * Maosong Fu (Twitter)
>>>>>> * Neng Lu (Twitter)
>>>>>> * Runhang Li (Twitter)
>>>>>> * Sanjeev Kulkarni (Streamlio)
>>>>>> * Supun Kamburugamuve (Indiana University)
>>>>>> * Thomas Sun (Twitter)
>>>>>> * Yaliang Wang (Twitter)
>>>>>> 
>>>>>> ## Sponsors
>>>>>> 
>>>>>> ### Champion
>>>>>> 
>>>>>> * Julien Le Dem (julien at apache dot org)
>>>>>> 
>>>>>> ### Nominated Mentors
>>>>>> 
>>>>>> * Jake Farrell (jfarrell at apache dot org)
>>>>>> * Jacques Nadeau (jacques at apache dot org)
>>>>>> * Julien Le Dem (julien at apache dot org)
>>>>>> 
>>>>>> ### Sponsoring Entity
>>>>>> 
>>>>>> The Apache Incubator
>>>>>> 
>>>>>> ### Footnotes
>>>>>> 
>>>>>> 1 - Papers detailing Heron are available at
>>>>>> http://dl.acm.org/citation.cfm?id=2742788 and
>>>>>> http://sites.computer.org/debull/A15dec/p15.pdf.
>>>>>> 2 - http://home.apache.org/phonebook.html?uid=billgraham
>>>>>> 3 - http://home.apache.org/phonebook.html?uid=ashvin
>>>>>> 4 - http://home.apache.org/phonebook.html?uid=supun
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Supun Kamburugamuve
>>> Member, Apache Software Foundation; http://www.apache.org
>>> E-mail: supun@apache.org; Mobile: +1 812 219 2563
>> 
>> 
> 
> 
> -- 
> ~/William


