incubator-general mailing list archives

From Julian Hyde <>
Subject Re: Hello World / CRUNCH Framework
Date Sun, 16 Dec 2018 07:20:36 GMT
Hi Julian,

Regarding whether to do this as a streaming engine (with its own query language) or as a framework
above a streaming engine, I’d say that’s a false choice. If there is relational algebra
inside your system, you can provide a high-level query language that can be translated to
a lower-level query language in a streaming engine.

This approach of “layered” databases has worked well for me for several projects, and
is ever more applicable these days as data is becoming federated.

You and I have discussed SQL’s MATCH_RECOGNIZE clause as a way to build complex time-based
logic. You have probably noticed that it is now in Flink, I am working on it in Calcite, and
Beam will probably get it at some point. Even if MATCH_RECOGNIZE doesn’t solve your problem,
let’s follow the same approach - convert your problem to a DSL that maps to or extends relational
algebra, and then figure out how to translate that to SQL in an underlying engine. Calcite
is a very good platform for building new “data languages”, so let’s carry on talking.
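
To make the layered idea concrete, here is a rough sketch, not anything from CRUNCH or Calcite: a tiny made-up DSL node that is translated into a SQL MATCH_RECOGNIZE query string. The StateRun class, the to_match_recognize function, and the table and column names are all hypothetical, and whether a given engine accepts exactly this query depends on how much of MATCH_RECOGNIZE it supports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateRun:
    """Hypothetical DSL node: find each unbroken run of rows in a given state."""
    table: str         # assumed table of machine readings
    partition_by: str  # assumed column identifying the machine
    order_by: str      # assumed timestamp column
    state_col: str     # assumed column holding the machine state
    state: str         # state value to look for


def to_match_recognize(q: StateRun) -> str:
    """Translate the DSL node into a SQL MATCH_RECOGNIZE query string."""
    # Escape single quotes so the state value is a valid SQL string literal.
    # Identifiers are inserted as-is; a real translator would validate them.
    state_literal = q.state.replace("'", "''")
    return "\n".join([
        "SELECT *",
        f"FROM {q.table}",
        "MATCH_RECOGNIZE (",
        f"  PARTITION BY {q.partition_by}",
        f"  ORDER BY {q.order_by}",
        "  MEASURES",
        f"    FIRST(A.{q.order_by}) AS run_start,",
        f"    LAST(A.{q.order_by}) AS run_end",
        "  ONE ROW PER MATCH",
        "  AFTER MATCH SKIP PAST LAST ROW",
        "  PATTERN (A+)",
        f"  DEFINE A AS A.{q.state_col} = '{state_literal}'",
        ")",
    ])


# "How long did the machine spend in this state": one row per run, with its
# first and last timestamp, from which a duration can be computed downstream.
query = StateRun("machine_readings", "machine_id", "ts", "state", "RUNNING")
sql = to_match_recognize(query)
print(sql)
```

The point of the sketch is only the shape: the high-level question lives in a small data structure, and the translation to the engine's SQL is a separate step that could target a different engine later.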


> On Dec 14, 2018, at 2:11 AM, Julian Feinauer <> wrote:
> Hi all,
> I just joined the incubator ML and wanted to present myself and possibly also start a
discussion about a software project we developed in the past.
> But first things first. My name is Julian Feinauer and I come from Germany where I run
two “start-up” companies where we work a lot on the “industrial IoT” topics, data
science and processing of “larger amounts of data”. We love open source and so we love
the ASF. Most notably, I closely follow the Apache Calcite project and hopefully find some
time soon to contribute a bit more than in the last months. Furthermore, I am engaged in the
(incubating) PLC4X project as (P)PMC and in the (incubating) Edgent project where I try to
“revive” the community as new (P)PMC together with Christopher Dutz.
> Now to the real topic. Over the last 3 years I have been developing a “Framework/Library”
(currently a set of jars) to facilitate processing of timeseries data. The focus is mostly
on processing of data from test stands, e.g., automotive tests, driving profiles and so on.
Furthermore, in the past year we added a lot of functionality for processing of “industrial
data”. This means that we want to make it easy to analyze things like “how long did the
machine spend in this state”, “when are the following set of bits set” or “notify
when the following condition is true for the first time”.
> It is a bit technical and I don’t want to go too deep into it, but generally speaking
we try to introduce the “right” semantics to answer the typical questions when analyzing
machine or test data. This project is called “CRUNCH” and we are in the process of making
it open source (will be moved to a public GitHub repo this year) under the Apache 2.0 License.
> As there is a close relationship to other (incubating or TLP) projects, we are
wondering whether this project could fit into the incubator. Some examples of Apache projects
that we see as “related” are Apache Flink (which we can use as the Streaming Engine to
process the stream), (incubating) Edgent which we also can support as Streaming Engine and
where we try to find a suitable project goal and community currently as some of the (P)PMC
members retired or went inactive. Finally, CRUNCH has a very natural fit with PLC4X because
it can directly process the data gathered from PLCs (and in fact we are already using it in
some of our projects that way). I had several discussions with some of the (P)PMCs of PLC4X,
namely Sebastian Rühl and Christopher Dutz, who encouraged me to introduce the project to the
incubator because they also see some potential for the project to enrich the OSS ecosystem
with regards to edge / stream processing of (I)IoT data.
> So please feel free to ask questions or discuss your view on this topic as I would like
to find out if this project could fit in the Apache Ecosystem and the Incubator or not.
> Thank you already!
> Julian
