From: Romain Manni-Bucau
Date: Fri, 2 Feb 2018 09:28:24 +0100
Reply-To: general@incubator.apache.org
Mailing-List: contact general-help@incubator.apache.org; run by ezmlm
MIME-Version: 1.0
Subject: Re:
[VOTE] Accept Coral into the Apache Incubator
To: general@incubator.apache.org
Content-Type: multipart/alternative; boundary="94eb2c1a3b32230d0105643680c9"

--94eb2c1a3b32230d0105643680c9
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

+1

Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn | Book

2018-02-02 7:54 GMT+01:00 Jean-Baptiste Onofré:

> +1 (binding)
>
> Regards
> JB
>
> On 02/01/2018 03:07 PM, Byung-Gon Chun wrote:
> > Hi all,
> >
> > I would like to start a VOTE to propose the Coral project as a podling
> > into the Apache Incubator.
> >
> > The ASF voting rules are described at
> > https://www.apache.org/foundation/voting.html
> >
> > A vote for accepting a new Apache Incubator podling is a majority vote
> > for which only Incubator PMC member votes are binding.
> >
> > This vote will run for at least 72 hours. Please VOTE as follows.
> > [] +1 Accept Coral into the Apache Incubator
> > [] +0 Abstain
> > [] -1 Do not accept Coral into the Apache Incubator because ...
> >
> > The proposal is listed below, but you can also access it on the wiki:
> > https://wiki.apache.org/incubator/CoralProposal
> >
> > = CoralProposal =
> >
> > == Abstract ==
> > Coral is a data processing system for flexible employment with
> > different execution scenarios for various deployment characteristics
> > on clusters.
> >
> > == Proposal ==
> > Today, there is a wide variety of data processing systems with
> > different designs for better performance and datacenter efficiency.
> > They include processing data on specific resource environments and
> > running jobs with specific attributes. Although each system
> > successfully solves the problems it targets, most systems are designed
> > in such a way that runtime behaviors are built tightly inside the
> > system core to hide the complexity of distributed computing.
> > This makes it hard for a single system to support different deployment
> > characteristics with different runtime behaviors without substantial
> > effort.
> >
> > Coral is a data processing system that aims to flexibly control the
> > runtime behaviors of a job to adapt to varying deployment
> > characteristics. Moreover, it provides a means of extending the
> > system’s capabilities and incorporating the extensions into the
> > flexible job execution.
> >
> > In order to be able to easily modify runtime behaviors to adapt to
> > varying deployment characteristics, Coral exposes runtime behaviors to
> > be flexibly configured and modified at both compile-time and runtime
> > through a set of high-level graph pass interfaces.
> >
> > We hope to contribute to the big data processing community by enabling
> > more flexibility and extensibility in job executions. Furthermore, we
> > can all benefit more when we work together as a community to mature
> > the system with more use cases and understanding of diverse deployment
> > characteristics. The Apache Software Foundation is the perfect place
> > to achieve these aspirations.
> >
> > == Background ==
> > Many data processing systems have distinctive runtime behaviors
> > optimized and configured for specific deployment characteristics like
> > different resource environments and for handling special job
> > attributes.
> >
> > For example, much research has been conducted to overcome the
> > challenge of running data processing jobs on cheap, unreliable
> > transient resources. Likewise, techniques for disaggregating different
> > types of resources, like memory, CPU and GPU, are being actively
> > developed to use datacenter resources more efficiently. Many
> > researchers are also working to run data processing jobs in even more
> > diverse environments, such as across distant datacenters.
> > Similarly, for special job attributes, many works take different
> > approaches, such as runtime optimization, to solve problems like data
> > skew, and to optimize systems for data processing jobs with
> > small-scale input data.
> >
> > Although each of these systems performs well with the jobs and in the
> > environments it targets, it performs poorly in cases it does not
> > consider, and its design does not consider supporting multiple
> > deployment characteristics on a single system.
> >
> > For an application writer, optimizing an application to perform well
> > on a certain system engraved with its underlying behaviors requires a
> > deep understanding of the system itself, an overhead that often takes
> > a lot of time and effort. Moreover, for a developer, modifying such
> > system behaviors requires modifying the system core, which demands an
> > even deeper understanding of the system itself.
> >
> > With this background, Coral is designed to represent all of its jobs
> > as an Intermediate Representation (IR) DAG. In the Coral compiler,
> > user applications from various programming models (e.g., Apache Beam)
> > are submitted, transformed to an IR DAG, and optimized/customized for
> > the deployment characteristics. In the IR DAG optimization phase, the
> > DAG is modified through a series of compiler “passes” which reshape or
> > annotate the DAG with an expression of the underlying runtime
> > behaviors. The IR DAG is then submitted as an execution plan for the
> > Coral runtime. The runtime includes the unmodified parts of data
> > processing in the backbone, which is transparently integrated with
> > configurable components exposed for further extension.
> >
> > == Rationale ==
> > Coral’s vision lies in providing means for flexibly supporting a wide
> > variety of job execution scenarios for users while enabling system
> > developers to extend the execution framework with various
> > functionalities at the same time. The capabilities of the system can
> > be extended as it grows to meet a wider variety of execution
> > scenarios. We need input from users and developers from diverse
> > domains in order to make it a more thriving and useful project. The
> > Apache Software Foundation provides the best tools and community to
> > support this vision.
> >
> > == Initial Goals ==
> > Initial goals will be to move the existing codebase to Apache and
> > integrate with the Apache development process. We further plan to
> > develop our system to meet the needs of more execution scenarios for
> > a wider variety of deployment characteristics.
> >
> > == Current Status ==
> > The Coral codebase is currently hosted in a repository at github.com.
> > The current version has been developed by system developers at Seoul
> > National University, Viva Republica, Samsung, and LG.
> >
> > == Meritocracy ==
> > We plan to strongly support meritocracy. We will discuss the
> > requirements in an open forum, and those who continuously contribute
> > to Coral with the passion to strengthen the system will be invited as
> > committers. Contributors who enrich Coral by providing various use
> > cases and various implementations of the configurable components,
> > including ideas for optimization techniques, will be especially
> > welcome. Committers with a deep understanding of the system’s
> > technical aspects as a whole and its philosophy will definitely be
> > voted into the PMC. We will monitor community participation so that
> > privileges can be extended to those who contribute.
> >
> > == Community ==
> > We hope to expand our contribution community by becoming an Apache
> > incubator project. The contributions will come from both users and
> > system developers interested in the flexibility and extensibility of
> > job executions that Coral can support. We expect users to contribute
> > mainly by diversifying the use cases and deployment characteristics,
> > and developers to contribute by implementing them.
> >
> > == Alignment ==
> > Apache Spark is one of many popular data processing frameworks. The
> > system is designed towards optimizing jobs using RDDs in memory and
> > many other optimizations built tightly within the framework. In
> > contrast to Spark, Coral aims to provide more flexibility for job
> > execution in an easy manner.
> >
> > Apache Tez enables developers to build complex task DAGs with control
> > over the control plane of job execution. In Coral, a high-level
> > programming layer (e.g., Apache Beam) is automatically converted to a
> > basic IR DAG and can be converted to any IR DAG through a series of
> > easy, user-writable passes that can both reshape and modify the
> > annotation (of execution properties) of the DAG. Moreover, Coral
> > leaves more parts of the job execution configurable, such as the
> > scheduler and the data plane. As opposed to providing a set of
> > properties for solid optimization, Coral’s configurable parts can be
> > easily extended and explored by implementing the pre-defined
> > interfaces. For example, an arbitrary intermediate data store can be
> > added.
> >
> > Coral currently supports Apache Beam programs and we are working on
> > supporting Apache Spark programs as well. Coral also utilizes Apache
> > REEF for container management, which allows Coral to run in Apache
> > YARN and Apache Mesos clusters. If necessary, we plan to contribute to
> > and collaborate with these other Apache projects for the benefit of
> > all.
> > We plan to extend such integrations to more Apache software. The
> > Apache Software Foundation already hosts many major big-data systems,
> > and we expect to help further growth of the big-data community by
> > having Coral within the Apache Software Foundation.
> >
> > == Known Risks ==
> > === Orphaned Products ===
> > The risk of the Coral project being orphaned is minimal. There is
> > already plenty of work that arduously supports different deployment
> > characteristics, and we propose a general way to implement them with
> > flexible and extensible configuration knobs. The domain of data
> > processing is already of high interest, and this domain is expected to
> > evolve continuously with various other purposes, such as resource
> > disaggregation and using transient resources for better datacenter
> > resource utilization.
> >
> > === Inexperience with Open Source ===
> > The initial committers include PMC members and committers of other
> > Apache projects. They have experience with open source projects,
> > from incubation through to top-level status. They have been
> > involved in the open source development process, and are familiar with
> > releasing code under an open source license.
> >
> > === Homogeneous Developers ===
> > The initial set of committers is from a limited set of organizations,
> > but we expect to attract new contributors from diverse organizations
> > and will thus grow organically once approved for incubation. Our prior
> > experience with other open source projects will help various
> > contributors to actively participate in our project.
> >
> > === Reliance on Salaried Developers ===
> > Many developers are from Seoul National University. This is not
> > applicable.
> >
> > === Relationships with Other Apache Products ===
> > Coral positions itself among multiple Apache products. It runs on
> > Apache REEF for container management.
> > It also utilizes many useful development tools including Apache
> > Maven, Apache Log4J, and multiple Apache Commons components. Coral
> > supports the Apache Beam programming model for user applications. We
> > are currently working on supporting the Apache Spark programming APIs
> > as well.
> >
> > === An Excessive Fascination with the Apache Brand ===
> > We hope to make Coral a powerful system for data processing, meeting
> > various needs for different deployment characteristics, under a wider
> > variety of environments. We see the limitations of simply putting code
> > on GitHub, and we believe the Apache community will help Coral grow
> > into positively impactful and innovative open source software. We
> > believe Coral is a great fit for the Apache Software Foundation due to
> > the collaboration it aims to achieve from the big data processing
> > community.
> >
> > == Documentation ==
> > The current documentation for Coral is at
> > https://snuspl.github.io/coral/.
> >
> > == Initial Source ==
> > The Coral codebase is currently hosted at
> > https://github.com/snuspl/coral.
> >
> > == External Dependencies ==
> > To the best of our knowledge, all Coral dependencies are distributed
> > under Apache compatible licenses. Upon acceptance to the incubator, we
> > would begin a thorough analysis of all transitive dependencies to
> > verify this fact and further introduce license checking into the build
> > and release process.
> >
> > == Cryptography ==
> > Not applicable.
> >
> > == Required Resources ==
> > === Mailing Lists ===
> > We will operate two mailing lists as follows:
> >  * Coral PMC discussions: private@coral.incubator.apache.org
> >  * Coral developers: dev@coral.incubator.apache.org
> >
> > === Git Repositories ===
> > Upon incubation: https://github.com/apache/incubator-coral.
> > After the incubation, we would like to move the existing repo
> > https://github.com/snuspl/coral to the Apache infrastructure.
> >
> > === Issue Tracking ===
> > Coral currently tracks its issues using the GitHub issue tracker:
> > https://github.com/snuspl/coral/issues. We plan to migrate to Apache
> > JIRA.
> >
> > == Initial Committers ==
> >  * Byung-Gon Chun
> >  * Jeongyoon Eo
> >  * Geon-Woo Kim
> >  * Joo Yeon Kim
> >  * Gyewon Lee
> >  * Jung-Gil Lee
> >  * Sanha Lee
> >  * Wooyeon Lee
> >  * Yunseong Lee
> >  * JangHo Seo
> >  * Won Wook Song
> >  * Taegeon Um
> >  * Youngseok Yang
> >
> > == Affiliations ==
> >  * SNU (Seoul National University)
> >    * Byung-Gon Chun
> >    * Jeongyoon Eo
> >    * Geon-Woo Kim
> >    * Gyewon Lee
> >    * Sanha Lee
> >    * Wooyeon Lee
> >    * Yunseong Lee
> >    * JangHo Seo
> >    * Won Wook Song
> >    * Taegeon Um
> >    * Youngseok Yang
> >
> >  * LG
> >    * Jung-Gil Lee
> >
> >  * Samsung
> >    * Joo Yeon Kim
> >
> >  * Viva Republica
> >    * Geon-Woo Kim
> >
> > == Sponsors ==
> > === Champions ===
> > Byung-Gon Chun
> >
> > === Mentors ===
> >  * Hyunsik Choi
> >  * Byung-Gon Chun
> >  * Jean-Baptiste Onofré
> >  * Markus Weimer
> >  * Reynold Xin
> >
> > === Sponsoring Entity ===
> > The Apache Incubator
> >
> >
> > Thanks!
> > Byung-Gon Chun
> >
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>

--94eb2c1a3b32230d0105643680c9--