incubator-general mailing list archives

From Simon Chan <si...@salesforce.com>
Subject Re: [DISCUSS] PredictionIO incubation proposal
Date Mon, 16 May 2016 02:02:32 GMT
Great question, Henry. This is the main issue we are not 100% sure how to
handle yet. We put this in the proposal:

"The PredictionIO community also maintains a Template Gallery, a place to
publish and download (free or proprietary) engine templates for different
types of machine learning applications, and is a complemental part of the
project. At this point we exclude the Template Gallery from the proposal,
as it has a separate set of contributors and we’re not familiar with an
Apache approved mechanism to maintain such a gallery."

Any suggestion?

Regards,
Simon

On Sun, May 15, 2016 at 5:26 PM, Henry Saputra <henry.saputra@gmail.com>
wrote:

> This is great news!
>
> One question, what would happen with the template gallery repository?
>
> Will it be moved under ASF too or will it be maintained as separate repo?
>
> - Henry
>
> On Friday, May 13, 2016, Andrew Purtell <apurtell@apache.org> wrote:
>
> > Greetings,
> >
> > It is my pleasure to propose the PredictionIO project for incubation at
> > the Apache Software Foundation. PredictionIO is a popular open source
> > Machine Learning Server built on top of a state-of-the-art open source
> > stack, including several Apache technologies, that enables developers to
> > manage and deploy production-ready predictive services for various kinds
> > of machine learning tasks, with more than 400 production deployments
> > around the world and a growing contributor community.
> >
> >
> > The text of the proposal is included below and is also available at
> > https://wiki.apache.org/incubator/PredictionIO
> >
> > Best regards,
> > Andrew Purtell
> >
> >
> > = PredictionIO Proposal =
> >
> > === Abstract ===
> > PredictionIO is an open source Machine Learning Server built on top of
> > a state-of-the-art open source stack that enables developers to manage
> > and deploy production-ready predictive services for various kinds of
> > machine learning tasks.
> >
> > === Proposal ===
> > The PredictionIO platform consists of the following components:
> >
> >  * PredictionIO framework - provides the machine learning stack for
> >  building, evaluating and deploying engines with machine learning
> >  algorithms. It uses Apache Spark for processing.
> >
> >  * Event Server - the machine learning analytics layer for unifying
> >  events from multiple platforms. It can use Apache HBase or any JDBC
> >  backend as its data store.
> >
> > The PredictionIO community also maintains a Template Gallery, a place to
> > publish and download (free or proprietary) engine templates for different
> > types of machine learning applications, and is a complemental part of the
> > project. At this point we exclude the Template Gallery from the proposal,
> > as it has a separate set of contributors and we’re not familiar with an
> > Apache approved mechanism to maintain such a gallery.
> >
> > You can find the Template Gallery at https://templates.prediction.io/
> >
> > === Background ===
> > PredictionIO was started with a mission to democratize and bring machine
> > learning to the masses.
> >
> > Machine learning has traditionally been a luxury for big companies like
> > Google, Facebook, and Netflix. There are ML libraries and tools scattered
> > around the internet, but putting them all together as a production-ready
> > infrastructure is a very resource-intensive task that is largely out of
> > reach for individuals or small businesses.
> >
> > PredictionIO is a production-ready, full stack machine learning system
> > that allows organizations of any scale to quickly deploy machine learning
> > capabilities. It comes with official and community-contributed machine
> > learning engine templates that are easy to customize.
> >
> > === Rationale ===
> > As the usage of PredictionIO and its number of contributors have grown
> > larger and more diverse, we have sought an independent framework for the
> > project to keep thriving. We believe the Apache foundation is a great fit.
> > Joining Apache would ensure that tried and true processes and procedures
> > are in place for the growing number of organizations interested in
> > contributing to PredictionIO. PredictionIO is also a good fit for the
> > Apache foundation. PredictionIO was built on top of several Apache
> > projects (HBase, Spark, Hadoop). We are familiar with the Apache process
> > and believe that the democratic and meritocratic nature of the foundation
> > aligns with the project goals.
> >
> > === Initial Goals ===
> > The initial milestones will be to move the existing codebase to Apache
> > and integrate with the Apache development process. Once this is
> > accomplished, we plan for incremental development and releases that follow
> > the Apache guidelines, as well as growth of our developer and user
> > communities.
> >
> > === Current Status ===
> > PredictionIO has undergone nine minor releases and many patches.
> > PredictionIO is being used in production by Salesforce.com as well as
> > many other organizations and apps. The PredictionIO codebase is currently
> > hosted at GitHub, which will form the basis of the Apache git repository.
> >
> > ==== Meritocracy ====
> > We plan to invest in supporting a meritocracy. We will discuss the
> > requirements in an open forum. We intend to invite additional developers
> > to participate. We will encourage and monitor community participation so
> > that privileges can be extended to those that contribute.
> >
> > ==== Community ====
> > Acceptance into the Apache foundation would bolster the already strong
> > user and developer community around PredictionIO. That community includes
> > many contributors from various other companies, and an active mailing
> > list composed of hundreds of users.
> >
> > ==== Core Developers ====
> > The core developers of our project are listed in our contributors and
> > initial PPMC below. Though many are employed at Salesforce.com, there are
> > also engineers from ActionML, and independent developers.
> >
> > === Alignment ===
> > The ASF is the natural choice to host the PredictionIO project as its
> > goal is democratizing Machine Learning by making it more easily
> > accessible to every user/developer. PredictionIO is built on top of
> > several top level Apache projects as outlined above.
> >
> > === Known Risks ===
> >
> > ==== Orphaned products ====
> > PredictionIO has a solid and growing community. It is deployed in
> > production environments by companies of all sizes to run various kinds of
> > predictive engines.
> >
> > In addition to contributing to the PredictionIO framework, the community
> > is also actively contributing new engines to the Template Gallery as well
> > as SDKs and documentation for the project. Salesforce is committed to
> > using and advancing the PredictionIO code base and supporting its user
> > community.
> >
> > ==== Inexperience with Open Source ====
> > PredictionIO has existed as a healthy open source project for almost two
> > years and is the most starred Scala project on GitHub. All of the
> > proposed committers have contributed to ASF and Linux Foundation open
> > source projects. Several current committers on Apache projects and Apache
> > Members are involved in this proposal and intend to provide mentorship.
> >
> > ==== Homogeneous Developers ====
> > The initial list of committers includes developers from several
> > institutions, including Salesforce, ActionML, Channel4, and USC, as well
> > as unaffiliated developers.
> >
> > ==== Reliance on Salaried Developers ====
> > Like most open source projects, PredictionIO receives substantial support
> > from salaried developers. PredictionIO development is partially supported
> > by Salesforce.com, but there are many contributors from various other
> > companies, and an active mailing list composed of hundreds of users. We
> > will continue our efforts to ensure that stewardship of the project is
> > independent of salaried developers by meritocratically promoting those
> > contributors to committers.
> >
> > ==== Relationships with Other Apache Products ====
> > PredictionIO relies heavily on top-level Apache projects such as Apache
> > Spark, HBase and Hadoop. However, it brings distinct functionality rather
> > than just an abstraction: Machine Learning in a plug-and-play fashion.
> >
> > Compared to Apache Mahout, which focuses on the development of a wide
> > variety of algorithms, PredictionIO offers a platform to manage the whole
> > machine learning workflow, including data collection, data preparation,
> > modeling, deployment and management of predictive services in production
> > environments.
> >
> > ==== An Excessive Fascination with the Apache Brand ====
> > PredictionIO is already a widely known open source project. This proposal
> > is not for the purpose of generating publicity. Rather, the primary
> > benefits to joining Apache are those outlined in the Rationale section.
> >
> > === Documentation ===
> > PredictionIO boasts rich and live documentation, which is included in the
> > code repo (docs/manual directory), built with Middleman, and publicly
> > hosted at https://docs.prediction.io
> >
> > === Initial Source and Intellectual Property Submission Plan ===
> > Currently, the PredictionIO codebase is distributed under the Apache 2.0
> > License and hosted on GitHub:
> > https://github.com/PredictionIO/PredictionIO
> >
> > === External Dependencies ===
> > PredictionIO has the following external dependencies:
> >  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> >  needed)
> >  * Apache Spark 1.3.0 for Hadoop 2.4
> >  * Java SE Development Kit 8
> >  * and one of the following sets:
> >    * PostgreSQL 9.1
> >    or
> >    * MySQL 5.1
> >    or
> >    * Apache HBase 0.98.6
> >    * Elasticsearch 1.4.0
> >
> > Upon acceptance to the incubator, we would begin a thorough analysis of
> > all transitive dependencies to verify this information and introduce
> > license checking into the build and release process by integrating with
> > Apache RAT.
> >
> > === Cryptography ===
> > PredictionIO does not include cryptographic code. We utilize standard
> > JCE and JSSE APIs provided by the Java Runtime Environment.
> >
> > === Required Resources ===
> > We request that the following resources be created for the project to use:
> >
> > ==== Mailing lists ====
> >
> > predictionio-private@incubator.apache.org (with moderated subscriptions)
> >
> > predictionio-dev
> >
> > predictionio-user
> >
> > predictionio-commits
> >
> > We will migrate the existing PredictionIO mailing lists.
> >
> > ==== Git repository ====
> > The PredictionIO team would like to use Git for source control, due to
> > our current use of GitHub.
> >
> > git://git.apache.org/incubator-predictionio
> >
> > ==== Documentation ====
> > https://predictionio.incubator.apache.org/docs/
> >
> > ==== JIRA instance ====
> > PredictionIO currently uses the GitHub issue tracking system associated
> > with its repository:
> > https://github.com/PredictionIO/PredictionIO/issues
> > We will migrate to Apache JIRA.
> >
> > JIRA PREDICTIONIO
> > https://issues.apache.org/jira/browse/PREDICTIONIO
> >
> > ==== Other Resources ====
> > * TravisCI for builds and test running.
> >
> > * PredictionIO's documentation, included in the code repo (docs/manual
> > directory), is built with Middleman and publicly hosted at
> > https://docs.prediction.io
> >
> > * A blog to drive adoption and excitement at https://blog.prediction.io
> >
> > === Initial Committers ===
> >
> > * Pat Ferrell
> >
> > * Tamas Jambor
> >
> > * Justin Yip
> >
> > * Xusen Yin
> >
> > * Lee Moon Soo
> >
> > * Donald Szeto
> >
> > * Kenneth Chan
> >
> > * Tom Chan
> >
> > * Simon Chan
> >
> > * Marco Vivero
> >
> > * Matthew Tovbin
> >
> > * Yevgeny Khodorkovsky
> >
> > * Felipe Oliveira
> >
> > * Vitaly Gordon
> >
> > === Affiliations ===
> >
> > * Pat Ferrell - ActionML
> >
> > * Tamas Jambor - Channel4
> >
> > * Justin Yip - independent
> >
> > * Xusen Yin - USC
> >
> > * Lee Moon Soo - NFLabs
> >
> > * Donald Szeto - Salesforce
> >
> > * Kenneth Chan - Salesforce
> >
> > * Tom Chan - Salesforce
> >
> > * Simon Chan - Salesforce
> >
> > * Marco Vivero - Salesforce
> >
> > * Matthew Tovbin - Salesforce
> >
> > * Yevgeny Khodorkovsky - Salesforce
> >
> > * Felipe Oliveira - Salesforce
> >
> > * Vitaly Gordon - Salesforce
> >
> > === Sponsors ===
> >
> > ==== Champion ====
> >
> > Andrew Purtell <apurtell at apache dot org>
> >
> > ==== Nominated Mentors ====
> >
> > * Andrew Purtell <apurtell at apache dot org>
> >
> > * James Taylor <jtaylor at apache dot org>
> >
> > * Lars Hofhansl <larsh at apache dot org>
> >
> > * Suneel Marthi <smarthi at apache dot org>
> >
> > * Xiangrui Meng <meng at apache dot org>
> >
> > * Luciano Resende <lresende at apache dot org>
> >
> > ==== Sponsoring Entity ====
> >
> > Apache Incubator PMC
> >
>
