phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mujtaba Chohan <mujt...@apache.org>
Subject Re: Speeding Up Group By Queries
Date Fri, 25 Mar 2016 17:37:01 GMT
That seems excessively slow for 10M rows which should be in order of few
seconds at most without index. 1. How wide is your table 2. How many region
servers is your data distributed on and what's the heap size? 3. Do you see
lots of disk I/O on region servers during aggregation? 4. Can you try your
query after major compacting your table?

Can you also replace log4j.properties with the attached one and reply back
with phoenix.log created by executing your query in sqlline?

Thanks,
Mujtaba


On Fri, Mar 25, 2016 at 6:56 AM, Amit Shah <amits.84@gmail.com> wrote:

> Hi,
>
> I am trying to evaluate apache hbase (version 1.0.0) and phoenix (version
> 4.6) deployed through cloudera for our OLAP workfload. I have a table
> that has 10 mil rows. I try to execute the below roll up query and it takes
> around 2 mins to return 1,850 rows.
>
> SELECT SUM(UNIT_CNT_SOLD), SUM(TOTAL_SALES) FROM TRANSACTIONS GROUP BY
> T_COUNTRY;
>
> I tried applying the "joining with indices" example given on the website
> <https://phoenix.apache.org/joins.html> on the TRANSACTIONS table by
> creating an index on the grouped by column as below but that doesn't help.
>
> CREATE INDEX TRANSACTIONS_COUNTRY_INDEX ON TRANSACTIONS (T_COUNTRY)
> INCLUDE (UNIT_CNT_SOLD, TOTAL_SALES);
>
> This index is not getting used when the query is executed. The query plan
> is as below
>
> +------------------------------------------+
> |                   PLAN                   |
> +------------------------------------------+
> | CLIENT 31-CHUNK PARALLEL 31-WAY FULL SCAN OVER TRANSACTIONS |
> |     SERVER AGGREGATE INTO DISTINCT ROWS BY [T_COUNTRY] |
> | CLIENT MERGE SORT                        |
> +------------------------------------------+
>
> Theoretically can secondary indexes help improve the performance of group
> by queries?
>
> Any suggestions on what are different options in phoenix I could try out
> to speed up GROUP BY queries?
>
> Thanks,
> Amit.
>

Mime
View raw message