incubator-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sandeep Deshmukh <sand...@datatorrent.com>
Subject Re: [VOTE] Accept PredictionIO into the Apache Incubator
Date Wed, 25 May 2016 11:52:10 GMT
+ 1 (non-binding)

Regards,
Sandeep

On Wed, May 25, 2016 at 10:38 AM, Paul Fremantle <pzfreo@gmail.com> wrote:

> +1 (binding)
> Paul
>
> On Wed, May 25, 2016 at 5:12 AM, Tsuyoshi Ozawa <ozawa@apache.org> wrote:
>
> > +1 (non-binding)
> > - Tsuyoshi
> >
> > On Wed, May 25, 2016 at 1:00 PM, Reynold Xin <rxin@databricks.com>
> wrote:
> > > +1 (binding)
> > >
> > >
> > > On Mon, May 23, 2016 at 3:22 PM, Andrew Purtell <apurtell@apache.org>
> > wrote:
> > >
> > >> Since discussion on the matter of PredictionIO has died down, I would
> > like
> > >> to call a VOTE
> > >> on accepting PredictionIO into the Apache Incubator.
> > >>
> > >> Proposal: https://wiki.apache.org/incubator/PredictionIO
> > >>
> > >> [ ] +1 Accept PredictionIO into the Apache Incubator
> > >> [ ] +0 Abstain
> > >> [ ] -1 Do not accept PredictionIO into the Apache Incubator, because
> ...
> > >>
> > >> This vote will be open for at least 72 hours.
> > >>
> > >> My vote is +1 (binding)
> > >>
> > >> --
> > >>
> > >> PredictionIO Proposal
> > >>
> > >> Abstract
> > >>
> > >> PredictionIO is an open source Machine Learning Server built on top of
> > >> state-of-the-art open source stack, that enables developers to manage
> > and
> > >> deploy production-ready predictive services for various kinds of
> machine
> > >> learning tasks.
> > >>
> > >> Proposal
> > >>
> > >> The PredictionIO platform consists of the following components:
> > >>
> > >>    * PredictionIO framework - provides the machine learning stack for
> > >>      building, evaluating and deploying engines with machine learning
> > >>      algorithms. It uses Apache Spark for processing.
> > >>
> > >>    * Event Server - the machine learning analytics layer for unifying
> > >> events
> > >>      from multiple platforms. It can use Apache HBase or any JDBC
> > backends
> > >>      as its data store.
> > >>
> > >> The PredictionIO community also maintains a Template Gallery, a place
> to
> > >> publish and download (free or proprietary) engine templates for
> > different
> > >> types of machine learning applications, and is a complemental part of
> > the
> > >> project. At this point we exclude the Template Gallery from the
> > proposal,
> > >> as it has a separate set of contributors and we’re not familiar with
> an
> > >> Apache approved mechanism to maintain such a gallery.
> > >>
> > >> Background
> > >>
> > >> PredictionIO was started with a mission to democratize and bring
> machine
> > >> learning to the masses.
> > >>
> > >> Machine learning has traditionally been a luxury for big companies
> like
> > >> Google, Facebook, and Netflix. There are ML libraries and tools lying
> > >> around the internet but the effort of putting them all together as a
> > >> production-ready infrastructure is a very resource-intensive task that
> > is
> > >> remotely reachable by individuals or small businesses.
> > >>
> > >> PredictionIO is a production-ready, full stack machine learning system
> > that
> > >> allows organizations of any scale to quickly deploy machine learning
> > >> capabilities. It comes with official and community-contributed machine
> > >> learning engine templates that are easy to customize.
> > >>
> > >> Rationale
> > >>
> > >> As usage and number of contributors to PredictionIO has grown bigger
> and
> > >> more diverse, we have sought for an independent framework for the
> > project
> > >> to keep thriving. We believe the Apache foundation is a great fit.
> > Joining
> > >> Apache would ensure that tried and true processes and procedures are
> in
> > >> place for the growing number of organizations interested in
> contributing
> > >> to PredictionIO. PredictionIO is also a good fit for the Apache
> > foundation.
> > >> PredictionIO was built on top of several Apache projects (HBase,
> Spark,
> > >> Hadoop). We are familiar with the Apache process and believe that the
> > >> democratic and meritocratic nature of the foundation aligns with the
> > >> project goals.
> > >>
> > >> Initial Goals
> > >>
> > >> The initial milestones will be to move the existing codebase to Apache
> > and
> > >> integrate with the Apache development process. Once this is
> > accomplished,
> > >> we plan for incremental development and releases that follow the
> Apache
> > >> guidelines, as well as growing our developer and user communities.
> > >>
> > >> Current Status
> > >>
> > >> PredictionIO has undergone nine minor releases and many patches.
> > >> PredictionIO is being used in production by Salesforce.com as well as
> > many
> > >> other organizations and apps. The PredictionIO codebase is currently
> > >> hosted at GitHub, which will form the basis of the Apache git
> > repository.
> > >>
> > >> Meritocracy
> > >>
> > >> We plan to invest in supporting a meritocracy. We will discuss the
> > >> requirements in an open forum. We intend to invite additional
> developers
> > >> to participate. We will encourage and monitor community participation
> so
> > >> that privileges can be extended to those that contribute.
> > >>
> > >> Community
> > >>
> > >> Acceptance into the Apache foundation would bolster the already strong
> > >> user and developer community around PredictionIO. That community
> > includes
> > >> many contributors from various other companies, and an active mailing
> > list
> > >> composed of hundreds of users.
> > >>
> > >> Core Developers
> > >>
> > >> The core developers of our project are listed in our contributors and
> > >> initial PPMC below. Though many are employed at Salesforce.com, there
> > are
> > >> also engineers from ActionML, and independent developers.
> > >>
> > >> Alignment
> > >>
> > >> The ASF is the natural choice to host the PredictionIO project as its
> > goal
> > >> is democratizing Machine Learning by making it more easily accessible
> to
> > >> every user/developer. PredictionIO is built on top of several top
> level
> > >> Apache projects as outlined above.
> > >>
> > >> Known Risks
> > >>
> > >> Orphaned Products
> > >>
> > >> PredictionIO has a solid and growing community. It is deployed on
> > >> production environments by companies of all sizes to run various kinds
> > of
> > >> predictive engines.
> > >>
> > >> In addition to the community contribution to PredictionIO framework,
> the
> > >> community is also actively contributing new engines to the Template
> > >> Gallery as well as SDKs and documentation for the project. Salesforce
> is
> > >> committed to utilize and advance the PredictionIO code base and
> support
> > >> its user community.
> > >>
> > >> Inexperience with Open Source
> > >>
> > >> PredictionIO has existed as a healthy open source project for almost
> two
> > >> years and is the most starred Scala project on GitHub. All of the
> > proposed
> > >> committers have contributed to ASF and Linux Foundation open source
> > >> projects. Several current committers on Apache projects and Apache
> > Members
> > >> are involved in this proposal and intend to provide mentorship.
> > >>
> > >> Homogeneous Developers
> > >>
> > >> The initial list of committers includes developers from several
> > >> institutions, including Salesforce, ActionML, Channel4, USC as well as
> > >> unaffiliated developers.
> > >>
> > >> Reliance on Salaried Developers
> > >>
> > >> Like most open source projects, PredictionIO receives substantial
> > support
> > >> from salaried developers. PredictionIO development is partially
> > supported
> > >> by Salesforce.com, but there are many contributors from various other
> > >> companies, and an active mailing list composed of hundreds of users.
> We
> > >> will continue our efforts to ensure stewardship of the project to be
> > >> independent of salaried developers by meritocratically promoting those
> > >> contributors to committers.
> > >>
> > >> Relationships with Other Apache Product
> > >>
> > >> PredictionIO relies heavily on top level Apache projects such as
> Apache
> > >> Spark, HBase and Hadoop. However it brings a distinguished
> > functionality,
> > >> rather than just an abstraction - Machine Learning in a plug-and-play
> > >> fashion.
> > >>
> > >> Compared to Apache Mahout, which focuses on the development of a wide
> > >> variety of algorithms, PredictionIO offers a platform to manage the
> > whole
> > >> machine learning workflow, including data collection, data
> preparation,
> > >> modeling, deployment and management of predictive services in
> production
> > >> environments.
> > >>
> > >> An Excessive Fascination with the Apache Brand
> > >>
> > >> PredictionIO is already a widely known open source project. This
> > proposal
> > >> is not for the purpose of generating publicity. Rather, the primary
> > >> benefits to joining Apache are those outlined in the Rationale
> section.
> > >>
> > >> Documentation
> > >>
> > >> PredictionIO boasts rich and live documentation, included in the code
> > repo
> > >> (docs/manual directory), is built with Middleman, and publicly hosted
> at
> > >> https://docs.prediction.io
> > >>
> > >> Initial Source and Intellectual Property Submission Plan
> > >>
> > >> Currently, the PredictionIO codebase is distributed under the Apache
> 2.0
> > >> License and hosted on GitHub:
> > https://github.com/PredictionIO/PredictionIO
> > >>
> > >> External Dependencies
> > >>
> > >> PredictionIO has the following external dependencies:
> > >>  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> > >> needed)
> > >>  * Apache Spark 1.3.0 for Hadoop 2.4
> > >>  * Java SE Development Kit 8
> > >>  * and one of the following sets:
> > >>    * PostgreSQL 9.1
> > >>  or
> > >>    * MySQL 5.1
> > >>  or
> > >>    * Apache HBase 0.98.6
> > >>    * Elasticsearch 1.4.0
> > >>
> > >> Upon acceptance to the incubator, we would begin a thorough analysis
> of
> > >> all transitive dependencies to verify this information and introduce
> > >> license checking into the build and release process by integrating
> with
> > >> Apache RAT.
> > >>
> > >> Cryptography
> > >>
> > >> PredictionIO does not include cryptographic code. We utilize standard
> > >> JCE and JSSE APIs provided by the Java Runtime Environment.
> > >>
> > >> Required Resources
> > >>
> > >> We request that following resources be created for the project to use
> > >>
> > >> Mailing lists
> > >>
> > >>   predictionio-private@incubator.apache.org (with moderated
> > subscriptions)
> > >>   predictionio-dev
> > >>   predictionio-user
> > >>   predictionio-commits
> > >>
> > >>   We will migrate the existing PredictionIO mailing lists.
> > >>
> > >> Git repository
> > >>
> > >>   The PredictionIO team would like to use Git for source control, due
> to
> > >> our
> > >>   current use of GitHub.
> > >>
> > >>   git://git.apache.org/incubator-predictionio
> > >>
> > >> Documentation
> > >>
> > >>   https://predictionio.incubator.apache.org/docs/
> > >>
> > >> JIRA instance
> > >>
> > >>   PredictionIO currently uses the GitHub issue tracking system
> > associated
> > >>   with its repository:
> > https://github.com/PredictionIO/PredictionIO/issues
> > >> .
> > >>   We will migrate to Apache JIRA.
> > >>
> > >>   JIRA PREDICTIONIO
> > >>   https://issues.apache.org/jira/browse/PREDICTIONIO
> > >>
> > >> Other Resources
> > >>
> > >>   TravisCI for builds and test running.
> > >>
> > >>   PredictionIO's documentation, included in the code repo (docs/manual
> > >>   directory), is built with Middleman and publicly hosted at
> > >>   https://docs.prediction.io
> > >>
> > >>   A blog to drive adoption and excitement at
> https://blog.prediction.io
> > >>
> > >> Initial Committers
> > >>
> > >>   Pat Ferrell
> > >>   Tamas Jambor
> > >>   Justin Yip
> > >>   Xusen Yin
> > >>   Lee Moon Soo
> > >>   Donald Szeto
> > >>   Kenneth Chan
> > >>   Tom Chan
> > >>   Simon Chan
> > >>   Marco Vivero
> > >>   Matthew Tovbin
> > >>   Yevgeny Khodorkovsky
> > >>   Felipe Oliveira
> > >>   Vitaly Gordon
> > >>   Alex Merritt
> > >>
> > >> Affiliations
> > >>
> > >>   Pat Ferrell - ActionML
> > >>   Tamas Jambor - Channel4
> > >>   Justin Yip - independent
> > >>   Xusen Yin - USC
> > >>   Lee Moon Soo - NFLabs
> > >>   Donald Szeto - Salesforce
> > >>   Kenneth Chan - Salesforce
> > >>   Tom Chan - Salesforce
> > >>   Simon Chan - Salesforce
> > >>   Marco Vivero - Salesforce
> > >>   Matthew Tovbin - Salesforce
> > >>   Yevgeny Khodorkovsky - Salesforce
> > >>   Felipe Oliveira - Salesforce
> > >>   Vitaly Gordon - Salesforce
> > >>   Alex Merritt - ActionML
> > >>
> > >> Sponsors
> > >>
> > >> Champion
> > >>
> > >>   Andrew Purtell <apurtell at apache dot org>
> > >>
> > >> Nominated Mentors
> > >>
> > >>   Andrew Purtell <apurtell at apache dot org>
> > >>   James Taylor <jtaylor at apache dot org>
> > >>   Lars Hofhansl <larsh at apache dot org>
> > >>   Suneel Marthi <smarthi at apache dot org>
> > >>   Xiangrui Meng <meng at apache dot org>
> > >>   Luciano Resende <lresende at apache dot org>
> > >>
> > >> Sponsoring Entity
> > >>
> > >>   Apache Incubator PMC
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >>
> > >>    - Andy
> > >>
> > >> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> > >> (via Tom White)
> > >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org
> >
> >
>
>
> --
> Paul Fremantle
> Co-Founder and CTO, WSO2
> Member of the Apache Software Foundation
> OASIS WS-RX TC Co-chair
>
> blog: http://pzf.fremantle.org
> twitter: @pzfreo
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message