incubator-general mailing list archives

From Jim Apple <jbap...@cloudera.com.INVALID>
Subject Re: Looking for Champion
Date Mon, 18 Jun 2018 21:38:11 GMT
Let me respond specifically to a few of these as a way to, I hope,
inspire the Palo community to reconsider contributing to Impala. It
could be a great opportunity for us to produce value by keeping the
query engine working smoothly while the Palo community can focus more
of their efforts on the storage system. There is some analogy here
with how Impala works on other storage systems.

> Firstly, as a query engine for Hadoop, Impala deeply depend on HDFS and
> HBase (At least several years ago it was like this)

Impala can run on other storage systems. See, for instance,
http://impala.apache.org/docs/build/html/topics/impala_kudu.html and
http://impala.apache.org/docs/build/html/topics/impala_isilon.html

> Secondly, due to introduced Mesa data model. The Catalog is different from
> Impala. We developed a In-Memory Catalog and also support Rollup,
> aggregation data model. As a consequence, we have to change sql grammar
> based on Impala.

Impala supports catalog data cached in memory, and adding new features
to Impala's SQL grammar is not forbidden. I think one of my first
largish contributions changed the grammar.

> Thirdly, it is a big difference in Cluster manager and node deployment.
> Contrast Impala, Query compiling, query execution coordination and catalog
> management of storage engine are integrated to be frontend daemon.
> Query execution and data storage are integrated to be backend daemon.

I'm not sure I understand - how is Palo different here?
