phoenix-user mailing list archives

From Josh Elser <>
Subject Re: Phoenix Performances & Uses Cases
Date Mon, 29 Oct 2018 14:47:42 GMT
Specifically to your last two points about windowing, transforming, 
grouping, etc: my current opinion is that Hive does certain analytical 
style operations much better than Phoenix. Personally, I don't think it 
makes sense for Phoenix to try to "catch up". It would take years for us 
to build such capabilities on par with what they have.

Some of us have been working to ease data access between Hive and 
Phoenix via the PhoenixStorageHandler for Hive. The goal is to make it 
easier for you to use the correct tool for the job: use Hive where Hive 
does things well, and use Phoenix where Phoenix does things well.

(Again, this is my opinion. It is not meant to be some declaration of 
direction by the entire Apache Phoenix community)

On 10/27/18 7:50 AM, Nicolas Paris wrote:
> Hi
> I am benchmarking Phoenix to better understand its strengths and
> weaknesses. My baseline is PostgreSQL for OLTP workloads and Hive LLAP
> for OLAP workloads. I am testing a 10-machine cluster running Hive
> (2.1) and Phoenix (4.8) with 220 GB RAM/32 CPUs against a PostgreSQL
> (9.6) instance with 128 GB RAM/32 CPUs.
> Right now, my opinion is:
> - when getting a subset of a large table, Phoenix performs the
>    best
> - when getting a subset from multiple large tables, PostgreSQL performs
>    the best
> - when getting a subset of a large table joined with one or more small
>    tables, Phoenix performs the best
> - when ingesting high-frequency data, Phoenix performs the best
> - for GROUP BY queries, Hive > PostgreSQL > Phoenix
> - when windowing, transforming, and grouping, Hive performs the best
>    and Phoenix the worst
> Finally, my conclusion is that Phoenix is not intended for analytical
> queries such as grouping, windowing, and joining large tables. It is
> well suited to very specific use cases, like maintaining a very large
> table with possibly some small tables to join with (such as time-series
> data, or binary storage data with HBase MOB enabled).
> Am I missing something?
> Thanks,
