incubator-general mailing list archives

From Julian Feinauer <>
Subject Re: [DISCUSS] StreamPipes proposal
Date Sun, 03 Nov 2019 10:17:32 GMT

it would of course be awesome to have JB on board.
And indeed, what JB suggests is the way I was also thinking of StreamPipes, and I have had several
discussions with Dominik already.

Currently, StreamPipes is some kind of mediator between several "external" engines running
in different processes or even on different nodes.
But especially for edge applications, I would highly welcome something that is more "edge"-centered
and designed to run in one process (which then brings us back to OSGi or Karaf or
something in the long run).

My idea would be to have this as some sort of subproject which shares things like a data model
and other abstractions, so ideally we could peel these out into a "shared" core (although Dominik
already assumed it would be tons of work...).


On 02.11.19, 18:14, "Jean-Baptiste Onofré" <> wrote:

    Thanks guys !
    I'm cloning the existing codebase to dig into it a little ;)
    On 02/11/2019 17:39, Christofer Dutz wrote:
    > Hi all,
    > I added him to the list.
    > Chris
    >     On 02.11.19, 11:53, "Dominik Riemer" <> wrote:
    >     Yes, it would be super cool to have you as a mentor, thanks!
    >     We'll update the list in the wiki.
    >     Dominik
    >     -----Original Message-----
    >     From: Jean-Baptiste Onofré <> 
    >     Sent: Friday, November 1, 2019 6:49 PM
    >     To:
    >     Subject: Re: [DISCUSS] StreamPipes proposal
    >     Hi Dominik,
    >     it's an interesting proposal !
    >     It sounds like a kind of integration platform for IoT protocols (a specialized platform
compared to frameworks like Apache Camel or NiFi).
    >     I would be happy to be a mentor on the podling if you want !
    >     Regards
    >     JB
    >     On 01/11/2019 16:51, Dominik Riemer wrote:
    >     > Hi all,
    >     > 
    >     > following up my previous mail, we would now like to start an open discussion
on bringing StreamPipes to the Apache Incubator. StreamPipes is an open source self-service
toolbox for analyzing (Industrial) IoT data streams. We are aware that one of our main challenges
will be to diversify the developer base and we are willing (and look forward!) to work on
    >     > 
    >     > The proposal can be found below and is also listed in the Incubator wiki:, thanks @Chris Dutz
for creating the page!
    >     > 
    >     > We appreciate anyone who would be willing to support us as an additional
    >     > 
    >     > 
    >     > Dominik
    >     > 
    >     > 
    >     > ----
    >     > StreamPipes Proposal
    >     > 
    >     > == Abstract ==
    >     > StreamPipes is a self-service (Industrial) IoT toolbox to enable non-technical
users to connect, analyze and explore (Industrial) IoT data streams.
    >     > 
    >     > == Proposal ==
    >     > 
    >     > The goal of StreamPipes is to provide an easy-to-use
toolbox for non-technical users, e.g., domain experts, to exploit data streams coming from
(Industrial) IoT devices. Such users are provided with an intuitive graphical user interface
with the Pipeline Editor at its core. Users are able to graphically model processing pipelines
based on data sources (streams), data processors and data sinks. Data processors and sinks
are self-contained microservices, which implement either stateful or stateless processing
logic (e.g., a trend detection or image classifier). Their processing logic is implemented
using one of several provided wrappers (we currently have wrappers for standalone/Edge-based
processing, Apache Flink, Siddhi and working wrapper prototypes for Apache Kafka Streams and
Spark; in the future we also plan to integrate with Apache Beam). An SDK makes it easy to
create new pipeline elements. Pipeline elements can be installed at runtime. To support users
in creating pipelines, an underlying semantics-based data model enables pipeline elements
to express requirements on incoming data streams that need to be fulfilled, thus reducing
modeling errors.
    >     > Data streams are integrated by using StreamPipes Connect, which allows users to
connect data sources (based on standard protocols, such as MQTT, Kafka, Pulsar, OPC-UA and
further PLC4X-supported protocols) without further programming using a graphical wizard. Additional
user-facing modules of StreamPipes are a live dashboard to quickly explore IoT data streams,
a wizard that generates code templates for new pipeline elements, and a Pipeline Element Installer
used to extend the algorithm feature set at runtime.
    >     > 
    >     > === Background ===
    >     > StreamPipes was started in 2014 by researchers from FZI Research Center
for Information Technology in Karlsruhe, Germany. The original prototype was funded by an
EU project centered around predictive analytics for the manufacturing domain. Since then,
StreamPipes has been continuously improved and extended with public funding, mainly from German federal
ministries. In early 2018, the source code was officially released under the Apache License
2.0. At the same time, while we focused on bringing the research prototype to a production-grade
tool, the first companies started to use StreamPipes. Currently, the primary goal is to widen
the user and developer base. At ApacheCon NA 2019, after having talked to many people from
the Apache Community, we finally decided that we would like to bring StreamPipes to the Apache
    >     > 
    >     > === Rationale ===
    >     > The (Industrial) IoT domain is a highly relevant and emerging sector. Currently,
IoT platforms are offered by many vendors ranging from SMEs up to large enterprises. We believe
that open source alternatives are an important cornerstone for manufacturing companies to
easily adopt data-driven decision making. From our point of view, StreamPipes fits very well
into the existing (I)IoT ecosystem within the ASF, with projects such as Apache PLC4X focusing
on connecting machine data from PLCs, or other tools we are also using either in the core
of StreamPipes or with integrations (Apache Kafka, Apache IoTDB, Apache Pulsar). StreamPipes
itself focuses on enabling self-service IoT data analytics for non-technical users.
    >     > The whole StreamPipes code is currently on GitHub. To get a rough estimate
of the project size:
    >     > * streampipes: Backend and core modules, ~3300 commits
    >     > * streampipes-ui: User Interface, ~1300 commits
    >     > * streampipes-pipeline-elements: ~100 Pipeline Elements (data
    >     > processors/algorithms and sinks), ~500 commits
    >     > * streampipes-connect-adapters: ~20 Adapters to connect data, ~100 commits
    >     >
    >     > To achieve our goal to further extend the code base with new features, new connectors and
new algorithms and to grow both the user and developer community, we believe that a community-driven
development process is the best way to further develop StreamPipes. Finally, after having
talked to committers from various Apache IoT-related projects, participated in spontaneous
hacking sessions and been impressed by the collaboration among individual projects, we decided
that (from our point of view) the ASF is the ideal place to be the future home of StreamPipes.
    >     > 
    >     > === Initial Goals ===
    >     > * Move the existing codebase to Apache
    >     > * Fully align with Apache development and release processes
    >     > * Perform name search and do a thorough review of existing licenses
    >     > * First Apache release
    >     > 
    >     > === Current Status ===
    >     > ** Meritocracy **
    >     > We are absolutely committed to strengthening StreamPipes as a real community-driven
open source project. The existing committer base is highly motivated to foster the open source
way in the industrial IoT sector and, together with existing Apache communities focused on
this domain, to provide open source tooling for Industrial IoT projects in the same way Apache
does in the Big Data space, for instance.
    >     > The development philosophy behind StreamPipes has always followed the principles
of meritocracy - although most committers are still active in the project, we managed to onboard
new, committed developers regularly. Two people who are now part of the core developer team
joined during the past year. We therefore aim to continuously expand the PMC and committer
base based on merit.
    >     > 
    >     > ** Community **
    >     > Since being open-sourced in 2018, the public interest in StreamPipes has
steadily grown. Several companies, mainly from the manufacturing domain, have tested StreamPipes
in the form of proof-of-concept projects. The first companies have started to use StreamPipes in production.
This was due to the large number of events, from meetups, research conferences and demo sessions
up to hackathons, that we participated in or organized during the past two years. After having generated
a general interest in StreamPipes, our next focus will be to find more committers to diversify
the contributor base.
    >     > 
    >     > ** Core Developers **
    >     > The core developers of the system are Dominik Riemer, Philipp Zehnder, Patrick
Wiener and Johannes Tex. All core developers are initial committers in the current proposal.
Some former students who recently started to work at companies, and who have also worked on
the project with great commitment, will be asked to further contribute to the project.
    >     > 
    >     > ** Alignment **
    >     > StreamPipes depends on many existing Apache projects - this
is one reason why we think that the ASF is the best future home for StreamPipes. The messaging
layer is based on Apache Kafka (and also Apache Pulsar as a future option), and runtime wrappers
exist for Apache Flink, Apache Spark and Apache Kafka Streams. StreamPipes Connect already
includes adapters for several Apache projects. Most importantly, we integrate (and plan to
deepen the integration) with IIoT-focused projects such as Apache PLC4X. Also, several data
sinks exist to send messages to tools from other Apache projects (e.g., Apache Kafka, Apache
Pulsar, and Apache IoTDB). Together with these tools (and also after having talked to the
core developers after this year's ApacheCon) we are absolutely convinced that a tight integration
between these tools will strengthen the open source IoT ecosystem.
    >     > 
    >     > === Known Risks ===
    >     > ** Orphaned Products **
    >     > We don't expect the risk of an orphaned product. The initial committers
have worked on the project for years and are absolutely committed to making this open source
tool a great success. All initial committers are committed to working on StreamPipes in their
free time.
    >     > 
    >     > ** Inexperience with Open Source **
    >     > All initial committers have years of expertise related to open source development
and understand what open source means. However, none of the initial committers are currently
committers to Apache projects, although some have already contributed to some projects. From
a variety of events and from intensively studying Apache mailing lists, we are sure that the
Apache Way is the way we'd like the project to move into the future. We expect to benefit
from the experiences from the ASF in building successful open source projects.
    >     > 
    >     > ** Length of Incubation **
    >     > We are aware that incubation is a process that is focused on building the
community, learning the Apache Way and other important things such as learning the release
process and handling licensing and trademark issues. We are also aware that, although there
is a steadily increasing interest in StreamPipes, a major challenge we would need (and are
willing) to work on during the incubation phase is widening the committer base. We look forward
to that as a large developer base is exactly what we are striving for.
    >     > 
    >     > ** Homogeneous Developers **
    >     > Most current developers work for the same institution (FZI). The motivation
of all developers goes beyond their commitment to work and all current committers work on
StreamPipes in their free time. Recently, we have received our first pull requests from external
contributors and a growing interest from users and companies outside of FZI. The first manufacturing
companies have already evaluated and adopted StreamPipes. To attract external developers,
we've created extensive documentation, have a Slack channel to quickly answer questions,
and provide help via mail. Therefore, we believe that making the developer community more
heterogeneous is not only mandatory, but something that can be achieved during the next months.
    >     > 
    >     > ** Reliance on salaried developers **
    >     > Currently, StreamPipes receives support from salaried developers, mainly
research scientists from FZI. However, all core developers substantially work on StreamPipes
in their spare time. As this has been the case from the beginning in early 2014, it can be
expected that a substantial portion of volunteers will continue to be working on the project
and we aim at strengthening the base of non-paid committers and paid committers of other companies.
At the same time, funding of the initial StreamPipes team is secured by public funding for
the next few years, making sure that there will also be enough commitment from developers
during their work time.
    >     > 
    >     > ** Relationships with other Apache products **
    >     > StreamPipes is often compared to tools such as Node-RED and Apache NiFi. This is mainly based
on a similar UI concept (dataflow approach). Beyond some technological differences (e.g.,
the microservice analytics approach vs. single-host runtime of Node-RED, the wrapper architecture
and the underlying semantics-based model), we believe the target audience differs. We aim
to collaborate with the Apache NiFi community in terms of exchanging best practices and also
integrating both projects (e.g., by building connectors).
    >     > As mentioned above, quite a few adapters and data sinks are already available
that link to existing Apache projects.
    >     > 
    >     > ** An excessive fascination with the Apache Brand **
    >     > Although we recognize the Apache brand as the most visible brand in the open source
domain, the primary goal of this proposal is not to create publicity, but to widen the developer
base. We believe that successful projects have broad and diverse communities. We expect that
an Apache project, with a clear and proven way to develop open source software, helps in finding
new committers. As the core development team has already worked on StreamPipes for the past
few years and is fully committed to the software and its benefit for the industrial IoT domain,
we would also continue development without being an Apache project.
    >     > 
    >     > === Documentation ===
    >     > Currently, we host a website at More technical
info (user + developer guide) can be found in the documentation:,
where users can find tutorials and manuals on how to extend StreamPipes using the SDK.
    >     > 
    >     > === Initial Source ===
    >     > Currently, the following GitHub repositories exist, all licensed under the
Apache Software License 2.0:
    >     > * streampipes (, the 
    >     > backend & pipeline management module)
    >     > * streampipes-ui (, 
    >     > the UI module)
    >     > * streampipes-pipeline-elements 
    >     > (, 
    >     > library of data processors and sinks)
    >     > * streampipes-connect-adapters 
    >     > (, 
    >     > StreamPipes connect adapters)
    >     > * streampipes-docs 
    >     > (, the 
    >     > abovementioned documentation)
    >     > 
    >     > === Source and intellectual property submission plan ===
    >     > All initial committers will sign an ICLA with the ASF. FZI, as the organizational
body that has employed the main
contributors of StreamPipes, will sign a CCLA and donate the codebase to the ASF (both subject
to formal approval). All major contributors are still active in the project.
    >     > 
    >     > === External Dependencies ===
    >     > We did an initial review of all dependencies used in the various projects.
No critical libraries that depend on category X licenses were found; some minor issues have
already been resolved (e.g., removing dependencies on org.json libraries). Most external dependencies
used by the Java-based (backend, pipeline-elements and connect) modules are licensed under
the Apache License 2.0, whereas some licenses are Cat B (e.g., CDDL). Most external dependencies
the UI relies on are licensed under the MIT license.
    >     > Once we move to the Incubator, we will do a complete check of all
transitive dependencies. We don't expect any surprises here.
    >     > 
    >     > === Cryptography ===
    >     > (not applicable)
    >     > 
    >     > === Required Resources ===
    >     > 
    >     > ** Mailing Lists **
    >     > We plan to use the following mailing lists:
    >     > *
    >     > *
    >     > *
    >     > *
    >     > As StreamPipes is targeted to a non-technical audience, we see a dedicated
user mailing list as an important requirement to help users.
    >     > 
    >     > ** Subversion directory **
    >     > (not applicable)
    >     > 
    >     > ** Git repositories **
    >     > We would like to use Git for source code management and enable GitHub mirroring.
    >     > 
    >     > As we plan to merge some of the repos described above to simplify the release
process, we ask for the following source repositories to be created:
    >     > * streampipes (containing backend + UI)
    >     > * streampipes-extensions (containing modules that can be dynamically 
    >     > installed at runtime: pipeline elements and connect adapters)
    >     > * streampipes-website (containing docs + website)
    >     > 
    >     > ** Issue tracking **
    >     > JIRA ID: StreamPipes
    >     > 
    >     > === Initial Committers ===
    >     > List of initial committers in alphabetical order:
    >     > * Christofer Dutz (christofer.dutz at c-ware dot de)
    >     > * Dominik Riemer (dominik dot riemer at gmail dot com)
    >     > * Johannes Tex (tex at fzi dot de)
    >     > * Patrick Wiener (wiener at fzi dot de)
    >     > * Philipp Zehnder (zehnder at fzi dot de)
    >     > 
    >     > === Sponsors ===
    >     > 
    >     > ** Champion **
    >     > * Christofer Dutz (christofer.dutz at c-ware dot de)
    >     > 
    >     > ** Mentors **
    >     > * Christofer Dutz (christofer.dutz at c-ware dot de)
    >     > * Julian Feinauer (Jfeinauer at apache dot org)
    >     > * Kenneth Knowles (kenn at apache dot org)
    >     > * Justin Mclean (justin at classsoftware dot com)
    >     > 
    >     > ** Sponsoring Entity **
    >     > The Apache Incubator
    >     > 
    >     > 
    >     > ---------------------------------------------------------------------
    >     > To unsubscribe, e-mail:
    >     > For additional commands, e-mail:
    >     > 
    >     --
    >     Jean-Baptiste Onofré
    >     Talend -
    Jean-Baptiste Onofré
    Talend -
