phoenix-user mailing list archives

From Marcell Ortutay <mortu...@23andme.com>
Subject Direct HBase vs. Phoenix query performance
Date Thu, 08 Mar 2018 21:03:25 GMT
Hi,

I am using Phoenix at my company for a large query that is meant to be run
in real time as part of our application. The query involves several
aggregations, anti-joins, and an inner query. Here is the (anonymized)
query plan:
https://gist.github.com/ortutay23andme/1da620472cc469ed2d8a6fdd0cc7eb01

The query performance on this is not great: it takes about 5sec to execute,
and it performs badly under load. If we run ~4qps of this query, Phoenix
starts to time out and slows down a lot (queries take >30sec).

For comparison, I wrote a simple Go script that runs a similar query by
talking directly to HBase. Its performance is substantially better: it
executes in ~1.5sec and can handle loads of ~50-100qps on the same cluster.

I'm wondering if anyone has ideas on what might be causing this difference
in performance? Are there configs / optimizations we can do in Phoenix to
bring the performance closer to direct HBase queries?

I can provide context on the table sizes etc. if needed.

Thanks,
Marcell
