incubator-general mailing list archives

From Matt Sicker <boa...@gmail.com>
Subject Re: [VOTE] Accept DLab into the Apache Incubator
Date Wed, 15 Aug 2018 18:39:19 GMT
+1 (binding)

On Wed, 15 Aug 2018 at 12:12, P. Taylor Goetz <ptgoetz@gmail.com> wrote:

> +1 (binding)
>
> -Taylor
>
> > On Aug 15, 2018, at 12:38 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> > +1
> >
> >
> >
> > On Wed, Aug 15, 2018 at 9:36 AM Dave Fisher <dave2wave@comcast.net>
> wrote:
> >
> >> +1 (binding)
> >>
> >> Sent from my iPhone
> >>
> >>> On Aug 15, 2018, at 9:27 AM, P. Taylor Goetz <ptgoetz@apache.org>
> wrote:
> >>>
> >>> After a brief discussion [1] I would like to call a VOTE to accept DLab
> >> into the Apache Incubator. The full proposal is available on the wiki[2]
> >> and is pasted below in text form as well.
> >>>
> >>> This vote will run at least 72 hours. Please VOTE as follows:
> >>>
> >>> [ ] +1 Accept DLab into the Apache Incubator
> >>> [ ] +0 No opinion
> >>> [ ] -1 Do not accept DLab into the Apache Incubator because…
> >>>
> >>> -Taylor
> >>>
> >>> [1]
> >>
> https://lists.apache.org/thread.html/9c96873d49f53da33260e21dc698f7c9b82eec256caf97a0e3f54943@%3Cgeneral.incubator.apache.org%3E
> >>> [2] https://wiki.apache.org/incubator/DLabProposal
> >>>
> >>>
> >>> = DLab Proposal =
> >>>
> >>> == Abstract ==
> >>> DLab is a platform for creating self-service, exploratory data science
> >> environments in the cloud using best-of-breed data science tools.
> >>>
> >>> DLab includes a self-service web console, used to create and manage
> >> exploratory environments. It allows teams to spin up analytical
> >> environments with just a single click of a mouse. Once established, the
> >> environment can be managed by an analytical team itself, leveraging
> a simple
> >> and easy-to-use web-based interface.
> >>>
> >>> == Proposal ==
> >>> In order to work effectively, data scientists rely on a varying suite
> of
> >> analytics tools that are readily available. However, many of those tools
> >> are non-trivial to set up in terms of hardware provisioning, software
> >> installation, configuration, and deployment. Setting up a collaborative,
> >> multi-tenant development environment for data scientists consumes
> >> substantial IT and DevOps resources, as well as time. These factors
> often
> >> combine to hinder the agility and effectiveness of data science teams
> >> within an organization. Current solutions are largely closed source
> and/or
> >> proprietary, and committing to a given solution introduces the potential
> >> for vendor lock-in.
> >>>
> >>> EPAM Systems developed DLab in response to the lack of open source,
> >> permissively licensed solutions to better enable data science workflows.
> The
> >> ALv2 was selected to encourage open development and user adoption. DLab
> was
> >> open sourced on Dec 29, 2016 and is under active development with
> support
> >> from EPAM Systems.
> >>>
> >>> We believe DLab is a unique solution with no current open source
> >> equivalent. Our primary goals of incubation are to grow and diversify
> the
> >> DLab community to ensure its long-term sustainability.
> >>>
> >>> == Rationale ==
> >>> DLab is a platform that provides data scientists with the ability to
> >> self-provision, without IT support, exploratory and production
> environments
> >> with their preferred set of tools installed and pre-configured. Tool
> >> options include, but are not limited to:
> >>>
> >>> * Apache Spark
> >>> * Apache Flink (planned)
> >>> * Apache Zeppelin
> >>> * Jupyter
> >>> * TensorFlow + Jupyter
> >>> * Deep Learning + Jupyter
> >>>
> >>> DLab leverages cloud computing providers for virtual hardware
> >> provisioning and currently supports the following:
> >>>
> >>> * Amazon Web Services (AWS)
> >>> * Microsoft Azure
> >>> * Google Cloud Platform (GCP) (under development)
> >>>
> >>> DLab offers git-based collaboration tools for data scientists and
> >> developers and integrates with the following git service providers:
> >>>
> >>> * GitHub
> >>> * GitLab
> >>> * Bitbucket
> >>>
> >>> Additionally, DLab includes the option to configure the UnGit tool in
> an
> >> environment to facilitate collaboration.
> >>> Finally, DLab integrates closely with many security and SSO offerings,
> >> including:
> >>>
> >>> * LDAP
> >>> * Microsoft Active Directory
> >>> * AWS Identity and Access Management (IAM) service
> >>>
> >>> DLab was designed from the ground up to be a highly configurable,
> >> flexible, and extensible platform. We believe these qualities will
> >> encourage community growth by enabling contributors to easily add new
> >> integrations and extensions.
> >>>
> >>> == Initial Goals ==
> >>> The initial goal will be to move the existing codebase to Apache and
> >> integrate with the Apache development process and infrastructure. A
> primary
> >> goal of incubation will be to grow and diversify the DLab PPMC. We are
> well
> >> aware that the project community is comprised of individuals from a
> single
> >> company. We aim to change that during incubation.
> >>>
> >>> == Current Status ==
> >>> As previously mentioned, DLab is under active development at EPAM
> >> Systems, and is being used in a number of production deployments:
> >>>
> >>> * [An investment company] is using DLab as an AWS-based analytics
> >> platform for their data scientists to provide a convenient way to
> perform
> >> multi-tenant data analytics. This enables data scientists to easily
> >> provision work environments with integrated data sources based on
> >> Elasticsearch, Apache HBase, and Neo4j, and utilizing Apache Spark. This
> >> enabled a “one-click”, self-service option for users to provision an
> >> environment with the necessary tools and data.
> >>>
> >>> * [An electronics manufacturing company] leverages DLab for data
> >> quality, data exploration, and analytics. The company’s data scientists
> >> leverage DLab to work with data sources that have been transferred to
> the
> >> cloud in order to find new insights on the data, and help the
> >> implementation team define requirements for data engineering. The main
> goal
> >> is to increase the utilization of various tools by decreasing time to
> >> deployment.
> >>>
> >>> * [A retail company] is using DLab as an image recognition framework,
> to
> >> enable automated restocking of inventory.
> >>>
> >>> * [A travel company] is using DLab to create a recommendation engine that
> >> will allow end users to find more relevant accommodations faster and at
> a
> >> lower cost.
> >>>
> >>> === Meritocracy ===
> >>> We value meritocracy and we understand that it is the basis for an open
> >> community that encourages multiple companies and individuals to
> contribute
> >> and be invested in the project’s future. We will encourage and monitor
> >> participation and make sure to extend privileges and responsibilities to
> >> all contributors.
> >>>
> >>> === Community ===
> >>> DLab is currently being used by developers at EPAM and a growing number
> >> of customers are actively using it in production environments. By
> bringing
> >> DLab to Apache we hope to broaden and diversify the user and developer
> >> community through open collaboration.
> >>>
> >>> === Core Developers ===
> >>> DLab was initially developed at EPAM Systems and is under active
> >> development. We believe DLab will be of interest to a broad range of
> users
> >> and developers and that incubating the project at the ASF will help us
> build
> >> a diverse, sustainable community.
> >>>
> >>> === Alignment ===
> >>> DLab utilizes other Apache projects such as Apache Spark, Apache Toree
> >> (incubating), and Apache Zeppelin, along with a number of other Apache
> >> libraries. We anticipate integration with additional Apache projects as
> the
> >> DLab community and interest in the project grow.
> >>>
> >>> == Known Risks ==
> >>>
> >>> === Orphaned products ===
> >>> EPAM Systems is committed to the future development of DLab and
> >> understands that graduation to a TLP, while preferable, is not the only
> >> positive outcome of incubation.
> >>>
> >>> Should the DLab project be accepted by the Incubator, the prospective
> >> PPMC would be willing to agree to a target incubation period of 2 years
> or
> >> less, knowing that every Incubator project incurs a certain cost in
> terms
> >> of ASF infrastructure and volunteer time.
> >>>
> >>> === Inexperience with Open Source ===
> >>> Many DLab contributors are already familiar with open source processes
> >> and several of them are committers on other Apache projects. We will be
> >> actively working with experienced Apache community members to improve
> our
> >> project.
> >>>
> >>> === Homogeneous Developers ===
> >>> The initial committers of DLab all come from EPAM Systems, though we
> >> are committed to recruiting and developing additional committers from a
> >> wide spectrum of industries and backgrounds.
> >>>
> >>> === Reliance on Salaried Developers ===
> >>> It is expected that DLab development will occur on both salaried time
> >> and on volunteer time, after hours. All of the initial committers are
> paid
> >> by EPAM Systems to contribute to this project. However, they are all
> >> passionate about the project, and we are both confident and hopeful that
> >> the project will continue even if no salaried developers contribute to
> the
> >> project.
> >>>
> >>> === Relationships with Other Apache Products ===
> >>> As mentioned in the Rationale section, DLab utilizes a number of
> >> existing Apache projects (Spark, Toree, Zeppelin, et al.), and we
> expect
> >> that list to expand as the community grows and diversifies. Any Apache
> >> project in the big data, data science, and/or analytics space would be
> >> potentially relevant.
> >>>
> >>> === An Excessive Fascination with the Apache Brand ===
> >>> We are applying to the Incubator process because we think it is the
> next
> >> logical step for the DLab project after open-sourcing the code. This
> >> proposal is not for the purpose of generating publicity. Rather, we
> want to
> >> make sure to create a very inclusive and meritocratic community, outside
> >> the umbrella of a single company. EPAM has a long history of
> contributing
> >> to Apache projects and the DLab developers and contributors understand
> the
> >> implications of making it an Apache project.
> >>>
> >>> == Required Resources ==
> >>>
> >>> === Mailing lists ===
> >>> * dev@dlab.incubator.apache.org
> >>> * commits@dlab.incubator.apache.org
> >>> * private@dlab.incubator.apache.org
> >>>
> >>> === Source control ===
> >>> * https://git-wip-us.apache.org/repos/asf/incubator-dlab
> >>>
> >>> === Issue tracking ===
> >>> * JIRA DLab (DLAB)
> >>>
> >>> == Documentation ==
> >>> * DLab Website: http://dlab.opensource.epam.com
> >>> * DLab code base: https://github.com/epam/DLab
> >>> * DLab Overview: https://github.com/epam/DLab/blob/master/README.md
> >>> * DLab User Guide:
> >> https://github.com/epam/DLab/blob/master/USER_GUIDE.md
> >>>
> >>> == Initial Source ==
> >>> The DLab codebase is currently hosted on GitHub:
> >> https://github.com/epam/DLab
> >>>
> >>> == Source and Intellectual Property Submission Plan ==
> >>> The DLab source code in GitHub is currently licensed under Apache
> >> License v2.0 and the copyright is assigned to EPAM Systems. If DLab
> becomes
> >> an Incubator project at the ASF, EPAM Systems will transfer the source
> code
> >> and trademark ownership to the Apache Software Foundation via a Software
> >> Grant Agreement.
> >>>
> >>> == External Dependencies ==
> >>> To the best of our knowledge, all of DLab's dependencies are distributed
> >> under Apache compatible licenses.
> >>>
> >>> DLab was designed to be highly extensible, and we expect and encourage
> >> the development of third-party extensions and plug-ins. We also
> understand
> >> that any such component, if it requires a dependency forbidden by Apache
> >> license policy, would not be eligible for inclusion in an Apache
> release,
> >> and would have to be hosted, supported, etc. outside of ASF
> infrastructure
> >> and labeled appropriately.
> >>>
> >>> === External dependencies licensed under Apache License 2.0: ===
> >>> MongoDB Java Driver - org.mongodb:mongo-java-driver (
> >> http://mongodb.github.io/mongo-java-driver/3.2/driver)
> >>>
> >>> Dropwizard (https://github.com/dropwizard/dropwizard)
> >>>
> >>> Dropwizard Template Config (
> >> https://github.com/tkrille/dropwizard-template-config)
> >>>
> >>> Apache Directory Server (https://github.com/apache/directory-server)
> >>>
> >>> Jackson (https://github.com/FasterXML/jackson)
> >>>
> >>> AWS Java SDK (https://github.com/aws/aws-sdk-java)
> >>>
> >>> Boto3 (https://github.com/boto/boto3)
> >>>
> >>> === External dependencies licensed under the MIT License: ===
> >>> angular2-app (https://www.npmjs.com/package/angular2-app)
> >>>
> >>> angular2-seed (https://www.npmjs.com/package/angular2-seed)
> >>>
> >>> angular2-seed-advanced (
> >> https://www.npmjs.org/package/angular2-seed-advanced)
> >>>
> >>> angular2-seed-n3UX (https://www.npmjs.com/package/angular2-seed-n3UX)
> >>>
> >>> http-status-enum (https://www.npmjs.com/package/http-status-enum)
> >>>
> >>> Mockito (https://github.com/mockito/mockito)
> >>>
> >>> ng2-translate (https://www.npmjs.com/package/ng2-translate)
> >>>
> >>> SLF4J (http://www.slf4j.org/)
> >>>
> >>> === External dependencies licensed under the CDDL License: ===
> >>> Jersey (https://github.com/jersey/jersey)
> >>>
> >>> === External dependencies licensed under the Python Software License
> >> Version 2: ===
> >>> jython (https://github.com/jythontools/jython)
> >>>
> >>> === ASF Projects: ===
> >>> Apache Spark, Apache Toree (incubating), Apache Zeppelin
> >>>
> >>> == Cryptography ==
> >>> Not applicable.
> >>>
> >>> == Initial Committers ==
> >>> * Dmytro Liaskovskyi dmytro_liaskovskyi@epam.com
> >>> * Volodymyr Veres Volodymyr_Veres@epam.com
> >>> * Oleh Hrynets Oleh_Hrynets@epam.com
> >>> * Oleh Hrynyk Oleh_Hrynyk@epam.com
> >>> * Oleh Martushevskyi Oleh_Martushevskyi@epam.com
> >>> * Oleh Moskovych Oleh_Moskovych@epam.com
> >>> * Vadym Kuznetsov Vadym_Kuznetsov@epam.com
> >>> * Usein Faradzhev Usein_Faradzhev@epam.com
> >>> * Bohdan Hliva Bohdan_Hliva@epam.com
> >>> * Oleksandr Melnychuk Oleksandr_Melnychuk1@epam.com
> >>> * Mikhail Teplitskiy Mikhail_Teplitskiy@epam.com
> >>> * Vira Vitanska Vira_Vitanska@epam.com
> >>> * Andriana Kovalyshyn Andriana_Kovalyshyn@epam.com
> >>> * Oleksandr Chaparin Oleksandr_Chaparin@epam.com
> >>> * Denys Shliakhov Denys_Shliakhov@epam.com
> >>> * Nazar Barabash Nazar_Barabash@epam.com
> >>> * Yuriy Holinko Yuriy_Holinko@epam.com
> >>> * Petro Kotsiuba Petro_Kotsiuba@epam.com
> >>> * Bogdan Rudyi Bogdan_Rudyi@epam.com
> >>> * Mikhail Teplitskyi Mikhail_Teplitskyi@epam.com
> >>>
> >>> == Sponsors ==
> >>>
> >>> === Champion ===
> >>> * P. Taylor Goetz ptgoetz@apache.org
> >>>
> >>> === Nominated Mentors ===
> >>> * P. Taylor Goetz ptgoetz@apache.org
> >>> * Henry Saputra hsaputra@apache.org
> >>>
> >>> === Interested Contributors ===
> >>> * Debo Dutta ddutta@apache.org
> >>>
> >>> === Sponsoring Entity ===
> >>> * The Apache Incubator
> >>>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >> For additional commands, e-mail: general-help@incubator.apache.org
> >>
> >>
>
>

-- 
Matt Sicker <boards@gmail.com>
