incubator-general mailing list archives

From 俊平堵 <junping...@apache.org>
Subject Re: [DISCUSS] Linkis Proposal (a Computation Middleware project)
Date Tue, 13 Jul 2021 01:32:25 GMT
+1.
Linkis is an interesting project that builds a unified layer to decouple
upper-layer computation/applications from the underlying data engines, such
as Spark, Presto, Flink, etc. It also looks like a community has already been
built around the project.
Good luck!


On Mon, Jul 12, 2021 at 11:13 AM, Shuai Di <shuaidi1024@163.com> wrote:

> Greetings!
>
>
> We would like to start an open discussion on bringing Linkis (
> https://github.com/WeBankFinTech/Linkis), a computation middleware
> project, to the Apache Incubator.
> The proposal can be found below and is also listed in the Incubator wiki:
> https://cwiki.apache.org/confluence/display/INCUBATOR/LinkisProposal,
> thanks @Junping Du for creating the page!
> We would appreciate anyone who can offer guidance or is willing to support
> us as an additional mentor.
> ======
> Linkis Proposal
>
>
> =Abstract=
> Linkis builds a computation middleware layer to decouple the upper
> applications from the underlying data engines. It provides standardized
> interfaces (REST, JDBC, WebSocket, etc.) for connecting to various
> underlying engines (Spark, Presto, Flink, etc.), while enabling cross-engine
> context sharing and unified job and engine governance and orchestration.
> Linkis codebase: https://github.com/WeBankFinTech/Linkis
>
>
> =Proposal=
> Linkis is designed to solve computation governance problems in complex
> distributed environments (typically in a big data platform), where you have
> to deal with different types, versions, or clusters of underlying data
> engines and hundreds of diversified engine clients at the upper application
> layer as well.
> Linkis acts as a proxy between the upper application layer and the
> underlying engine layer. By abstracting and implementing the three common
> phases of a job/request (submit, prepare and execute), Linkis facilitates
> connectivity, governance and orchestration across different kinds of
> engines, such as OLAP, OLTP (under development) and streaming, and handles
> all these "computation governance" affairs in a standardized, reusable way.
> We are actively operating the Linkis community and look forward to
> continuously increasing community activity. We propose to contribute the
> Linkis codebase to the Apache Software Foundation. We believe that bringing
> Linkis into the Apache Software Foundation and following the COMMUNITY-LED
> DEVELOPMENT "APACHE WAY" could continuously improve project quality and
> community vitality.
>
>
> =Background=
> In today's complex and distributed environments, mature solutions have
> emerged for the communication, coordination and governance of application
> services, from SOA to microservices, along with many practices, from ESB to
> Service Mesh, for decoupling different services.
> However, things are different when an application service needs to
> communicate with the underlying engines. Engines are isolated from each
> other, and the tightly coupled client-server pattern is everywhere. Each
> and every upper application has to directly connect to and access various
> underlying engines in a tightly coupled way, and solve the "computation
> governance" problems on its own, including maintaining different client
> environments, submitting jobs, monitoring job status, fetching the
> output, handling a large number of concurrent client instances, watching
> for bad jobs, adapting to engine version changes, etc.
> What is missing is a common layer of "computation middleware" between the
> numerous upper-layer applications and the countless underlying engines to
> handle all these "computation governance" affairs in a standardized,
> reusable way. That is why we started the Linkis project.
> Firstly, Linkis reduces the complexity of connectivity. Instead of
> maintaining a variety of engine client environments, users now only need to
> install the Linkis client, or even just an HTTP client when using the REST
> interface. Routing a query to the desired cluster can be done by simply
> providing a tag.
> Secondly, Linkis provides governance capabilities such as multi-tenancy,
> concurrency control, resource management, query validation, privilege
> enhancement and auditing.
> Meanwhile, Linkis enables orchestration strategies such as routing,
> load balancing, active-active and hybrid computation across engines (some
> still under development).
>
>
> =Rationale=
> Linkis is built on a distributed microservice architecture with great
> scalability and extensibility. Its enhancements for high concurrency and
> fault tolerance make it more stable and reliable. It has already supported
> many production environments with a large number of daily jobs over a long
> period.
> Linkis's microservices are divided into 3 groups: Computation Governance
> Services, Public Enhancement Services, and Microservice Governance Services.
> The Computation Governance Services (CGS) group is responsible for the core
> process of job/request submission, preparation and execution, lifecycle
> management, resource management, validation and orchestration.
> The Public Enhancement Services (PES) group provides basic public functions
> including job context sharing, material management and data source
> management, to serve other Linkis services and upper application systems.
> The Microservice Governance Services (MGS) group includes a customized
> Spring Cloud Gateway, Eureka and OpenFeign, to provide basic functions like
> routing, service registration and discovery, and an RPC framework.
> By providing multi-tenancy, high concurrency, job dispatching/management
> policies, unified resource control and orchestration, Linkis makes the
> submission, preparation and execution of computation jobs more flexible,
> reliable and controllable, and returns the results successfully. It could
> greatly reduce the overall development, operation and maintenance costs,
> and the architecture complexity.
> Building on Linkis as computation middleware, new upper-layer applications
> can be developed quickly by reusing the Linkis computation governance
> functions, as is done in the open source big data platform suite
> “WeDataSphere” (https://github.com/WeBankFinTech/WeDataSphere).
> Linkis currently mainly supports OLAP and Streaming engines, and we are
> planning to support OLTP engines better. Containerization is also one of
> the important development directions of Linkis.
>
>
> =Initial Goal=
> - Migrate the existing codebase, website, and documentation to
> Apache-hosted infrastructure.
> - Work with the infrastructure team to implement and approve our code
> review, build, and testing workflows in the context of the ASF.
> - Incremental development and release under Apache guidelines.
> - Grow and diversify the Linkis community in the Apache Way.
>
>
> =Current Status=
> ==Meritocracy==
> The Linkis project was started at WeBank and has been an open-source
> project on GitHub since July 2019. Linkis has been quickly adopted by many
> organizations. More than 500 organizations have tested Linkis, based on our
> sandbox application records, and dozens of them have introduced Linkis into
> production, based on users’ spontaneous feedback. These organizations are
> distributed across various industries including banking,
> telecommunications, insurance, manufacturing, education, internet, etc.
> Linkis already has contributors and users from different companies. We’ve
> set up the Committer team and we’re constantly seeking potential new
> committers. New contributors are always highly welcome and are guided by
> existing committers. Users can get timely support from community IM groups
> and GitHub.
>
>
> ==Community==
> Linkis now has 15 committers from 6 companies including WeBank, China
> Telecom, Kanzhun Ltd., iQIYI Inc., HONOR Mobile Phone, and Samoyed Digital.
> We have a developer IM group of more than 100 people from different
> organizations, and 9 user IM groups with more than 4,500 people.
>
>
> ==Core Developers==
> The core developers of Linkis work in the big data teams of different
> companies, mainly at WeBank since the project was initiated there.
> - Shuai Di (WeBank)
> - Qiang Yin (WeBank)
> - Heping Wang (WeBank)
> - Yongkun Yang (WeBank)
> - Zhiyue Yang (WeBank)
> - You Liu (WeBank)
> - XiaoGang Wang (China Telecom)
> - Hui Zhu (Kanzhun)
> - Zheng Wang (iQiyi)
> - Rong Zhang (Honor)
>
>
> ==Releases==
> Linkis has released multiple versions as listed here:
> https://github.com/WeBankFinTech/Linkis/releases
> We will follow the ASF guidelines more closely, and adopt the ASF source
> release process upon joining the incubator.
>
>
> ==Code Reviews==
> Linkis’s code reviews are currently public on GitHub:
> https://github.com/WeBankFinTech/Linkis/pulls .
>
>
> ==Alignment==
> As Linkis was built to address connectivity and other computation
> governance issues with various underlying engines, it depends on multiple
> ASF projects such as Spark, Flink, Hive and Hadoop. Linkis’s Engine
> Connector Manager service starts different Engine Connectors to connect
> to different underlying engines, providing computation governance abilities
> which benefit the usage and maintenance of these engines. Linkis will
> continue to expand the types of ASF-project engines it supports, such
> as HBase, Kylin, and more.
>
>
> =Known Risks=
> ==Orphaned Products==
> The risk of Linkis becoming an orphan product is very low, because it is
> already a core infrastructure component in the production
> environments of dozens of companies' big data platforms, including large
> companies like WeBank, China Telecom, Ping An Insurance Company, Hikvision,
> etc. Hundreds of thousands of computation jobs are performed through Linkis
> in these companies every day. Developers from these companies are
> increasingly joining the Linkis community as contributors.
> Linkis has had 12 major releases so far and has received 355 PRs from
> contributors, which indicates the activity and vitality of the Linkis
> community. Linkis is also the core component of the open source big data
> platform suite “WeDataSphere”; even more users and developers are already
> active in this larger community.
> We are looking forward to further expanding and diversifying the community
> by joining Apache. We are also further improving our adherence to the
> community-led development pattern, and the standardization and transparency
> of community governance.
>
>
> ==Inexperience with Open Source==
> Linkis’s core developers have been running Linkis as a community-oriented
> open source project for some time, and some of them already have
> experience working with other open source communities. The current Linkis
> user group of more than 4,500 people is also proof of our commitment
> to and passion for operating the open source community.
> Meanwhile, we’ve begun to refine our community governance efforts under
> the guidance of Apache mentors, and we’ll learn more about how to operate
> the open source community effectively and properly by following the Apache
> Way in our incubator journey.
>
>
> ==Homogenous Developers==
> Most of the current core developers work at WeBank, where the Linkis
> project started. Developers from China Telecom, Kanzhun, iQiyi and Honor
> Mobile Phone have also been elected to the committer group and have already
> led the release of several versions of Linkis. The most recently nominated
> committer is from Samoyed Digital, in recognition of solid contributions to
> the Linkis data source management module.
> Though the Linkis community may not be diverse enough yet, we are constantly
> looking for new contributors and potential committers to enhance the
> diversity of the community and the vitality of the project.
>
>
> ==An Excessive Fascination with the Apache Brand==
> We acknowledge that the Apache brand would add a lot of value and
> reputation to Linkis, and would benefit cooperation and promotion at a
> global scale. However, our primary purpose in submitting Linkis to Apache
> is to build a more diverse and viable community and to gain stability for
> long-term development. We will also strictly follow the ASF's rules
> and policies under the guidance of the Incubator PMC.
>
>
> =Documentation=
> Documentation about Linkis can be found at
> https://github.com/WeBankFinTech/Linkis-Doc . The following links provide
> more information:
> - Codebase at GitHub: https://github.com/WeBankFinTech/Linkis
> - Issue Tracking: https://github.com/WeBankFinTech/Linkis/issues
> - Releases: https://github.com/WeBankFinTech/Linkis/releases
>
>
> =Initial Source=
> https://github.com/WeBankFinTech/Linkis
>
>
> =External Dependencies=
> Back-end:
> | Dependencies | License | Comment |
> | caffeine | Apache 2.0 |
> | cglib | Apache 2.0 |
> | commons-beanutils | Apache 2.0 |
> | commons-codec | Apache 2.0 |
> | commons-collections | Apache 2.0 |
> | commons-dbcp | Apache 2.0 |
> | commons-exec | Apache 2.0 |
> | commons-io | Apache 2.0 |
> | commons-lang3 | Apache 2.0 |
> | commons-math3 | Apache 2.0 |
> | commons-net | Apache 2.0 |
> | commons-text | Apache 2.0 |
> | dozer-core | Apache 2.0 |
> | druid | Apache 2.0 |
> | fastjson | Apache 2.0 |
> | gson | Apache 2.0 |
> | guava | Apache 2.0 |
> | hadoop-auth | Apache 2.0 |
> | hadoop-client | Apache 2.0 |
> | hadoop-common | Apache 2.0 |
> | hadoop-hdfs | Apache 2.0 |
> | hadoop-yarn-client | Apache 2.0 |
> | hive-common | Apache 2.0 |
> | hive-exec | Apache 2.0 |
> | hive-jdbc | Apache 2.0 |
> | httpclient | Apache 2.0 |
> | httpmime | Apache 2.0 |
> | jackson-annotations | Apache 2.0 |
> | jackson-databind | Apache 2.0 |
> | jackson-module-scala | Apache 2.0 |
> | javacsv | LGPL |
> | jaxrs-ri | CDDL, GPL 1.1 | will remove |
> | jersey-container-servlet | CDDL, GPL 1.1 | will remove |
> | jersey-container-servlet-core | CDDL, GPL 1.1 | will remove |
> | jersey-entity-filtering | CDDL, GPL 1.1 | will remove |
> | jersey-json | CDDL, GPL 1.1 | will remove |
> | jersey-media-json-jackson | CDDL, GPL 1.1 | will remove |
> | jersey-media-multipart | CDDL, GPL 1.1 | will remove |
> | jersey-server | CDDL, GPL 1.1 | will remove |
> | jersey-servlet | CDDL, GPL 1.1 | will remove |
> | jersey-spring3 | CDDL, GPL 1.1 | will remove |
> | jetty-server | Apache 2.0, EPL 1.0 |
> | jetty-webapp | Apache 2.0, EPL 1.0 |
> | json4s-jackson | Apache 2.0 |
> | jsp-api | CDDL, GPL 2.0 | will remove |
> | junit | EPL 1.0 |
> | libthrift | Apache 2.0 |
> | log4j-1.2-api | Apache 2.0 |
> | log4j-api | Apache 2.0 |
> | log4j-core | Apache 2.0 |
> | log4j-slf4j-impl | Apache 2.0 |
> | mockito-all | MIT |
> | mybatis-plus-boot-starter | Apache 2.0 |
> | mysql-connector-java | GPL 2.0 | will remove |
> | netty-all | Apache 2.0 |
> | pagehelper | MIT |
> | poi-ooxml | Apache 2.0 |
> | protostuff-api | Apache 2.0 |
> | protostuff-core | Apache 2.0 |
> | protostuff-runtime | Apache 2.0 |
> | py4j | BSD 2-clause |
> | reactor-netty | Apache 2.0 |
> | reflections | BSD 2-clause |
> | scalacheck | BSD 3-clause |
> | scalacheck-shapeless | Apache 2.0 |
> | scala-compiler | Apache 2.0 |
> | scala-library | Apache 2.0 |
> | scalamock-scalatest-support | MIT |
> | scalap | Apache 2.0 |
> | scala-reflect | Apache 2.0 |
> | scalatest | Apache 2.0 |
> | slf4j-api | MIT |
> | spark-core | Apache 2.0 |
> | spark-hive | Apache 2.0 |
> | spark-repl | Apache 2.0 |
> | spark-sql | Apache 2.0 |
> | spark-testing-base | Apache 2.0 |
> | spoiwo | MIT |
> | spring-boot | Apache 2.0 |
> | spring-boot-actuator-autoconfigure | Apache 2.0 |
> | spring-boot-starter | Apache 2.0 |
> | spring-boot-starter-actuator | Apache 2.0 |
> | spring-boot-starter-aop | Apache 2.0 |
> | spring-boot-starter-cache | Apache 2.0 |
> | spring-boot-starter-jetty | Apache 2.0 |
> | spring-boot-starter-log4j2 | Apache 2.0 |
> | spring-boot-starter-quartz | Apache 2.0 |
> | spring-boot-starter-reactor-netty | Apache 2.0 |
> | spring-boot-starter-web | Apache 2.0 |
> | spring-cloud-commons | Apache 2.0 |
> | spring-cloud-config-client | Apache 2.0 |
> | spring-cloud-context | Apache 2.0 |
> | spring-cloud-gateway-core | Apache 2.0 |
> | spring-cloud-starter | Apache 2.0 |
> | spring-cloud-starter-config | Apache 2.0 |
> | spring-cloud-starter-gateway | Apache 2.0 |
> | spring-cloud-starter-netflix-eureka-client | Apache 2.0 |
> | spring-cloud-starter-netflix-eureka-server | Apache 2.0 |
> | spring-cloud-starter-openfeign | Apache 2.0 |
> | spring-core | Apache 2.0 |
> | spring-jdbc | Apache 2.0 |
> | spring-security-crypto | Apache 2.0 |
> | spring-test | Apache 2.0 |
> | spring-tx | Apache 2.0 |
> | spring-web | Apache 2.0 |
> | websocket-client | Apache 2.0, EPL 1.0 |
> | websocket-server | Apache 2.0, EPL 1.0 |
> | xlsx-streamer | Apache 2.0 |
> | xstream | BSD 3-clause |
>
>
> Front-end:
> | Dependencies | License | Comment |
> | axios | MIT |
> | highlight.js | BSD-3-Clause |
> | iview | MIT |
> | lodash | MIT |
> | moment | MIT |
> | monaco-editor | MIT |
> | sql-formatter | MIT |
> | svgo | MIT |
> | vue | MIT |
> | vue-i18n | MIT |
> | vue-router | MIT |
> | vuedraggable | MIT |
> | vuescroll | MIT |
>
> =Required Resources=
> ==Mailing List==
> Currently Linkis has no mailing list. The usual mailing lists are expected
> to be set up when entering incubation:
> - private@linkis.incubator.apache.org for PPMC discussions;
> - dev@linkis.incubator.apache.org for development discussions;
> - notification@linkis.incubator.apache.org for user notifications, and
> notifications from GitHub.
>
> ==Git Repositories==
> Upon entering incubation, we request to move the existing repository from
> https://github.com/WeBankFinTech/Linkis to Apache infrastructure like
> https://github.com/apache/Incubator-Linkis.
>
> ==Issue Tracking==
> The Linkis community would like to continue using GitHub Issues if
> possible.
>
> ==Other Resources==
> Apache Jenkins
>
> =Source and Intellectual Property Submission Plan=
> Most of the current code is Apache 2.0 licensed and the copyright is
> assigned to WeBank. If the project enters the incubator, WeBank will
> transfer the source code & trademark ownership to the ASF via a Software
> Grant Agreement.
>
>
>
> =Initial Committers=
> - Shuai Di (shuaidi1024@163.com)
> - Qiang Yin (690574002@qq.com)
> - Heping Wang (374126165@qq.com)
> - Yongkun Yang (wimkunkun@gmail.com)
> - Zhiyue Yang (904666286@qq.com)
> - You Liu (405240259@qq.com)
> - Deyi Hua (david_hua1996@hotmail.com)
> - Le Bai (120190695@qq.com)
> - Xiaogang Wang (913546481@qq.com)
> - Hui Zhu (46580583@qq.com)
> - Zhen Wang (643348094@qq.com)
> - Rong Zhang (693404752@qq.com)
> - Xiaohua Yi (405078363@qq.com)
> - Ke Zhou (zhouke309@vip.qq.com)
> - Jian Xie (xj2jx@163.com)
>
> =Affiliations=
> Shuai Di, Qiang Yin, Heping Wang, Yongkun Yang, Zhiyue Yang, You Liu, Deyi
> Hua, Le Bai, Ke Zhou and Jian Xie of the initial committers are employees
> of WeBank.
> Xiaogang Wang of the initial committers is an employee of China Telecom.
> Hui Zhu of the initial committers is an employee of Kanzhun.
> Zhen Wang of the initial committers is an employee of iQiyi.
> Rong Zhang of the initial committers is an employee of HONOR Mobile Phone.
> Xiaohua Yi of the initial committers is an employee of Samoyed Digital.
>
> =Sponsors=
> ==Champion==
> Junping_Du (ASF Member, IPMC Member), junping_du@apache.org
>
> ==Nominated Mentors==
> Shao Feng Shi (ASF Member, IPMC Member), shaofengshi@apache.org
> Duo Zhang (ASF Member, IPMC Member), zhangduo@apache.org
> Jerry Shao (ASF Member, IPMC Member), jshao@apache.org
> Lidong Dai (IPMC Member), lidongdai@apache.org
>
> =Sponsoring Entity=
> We request the Apache Incubator to sponsor this project.
> ======
> Best Regards,
> Shuai Di
