phoenix-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Row count
Date Thu, 11 Jan 2018 20:07:40 GMT
In Phoenix 4.13 the COUNT is very fast. Was some optimization made in this
area since 4.7?

On Wed, Sep 13, 2017 at 12:53 PM, Ankit Singhal <ankitsinghal59@gmail.com>
wrote:

> The best option is to run "SELECT COUNT(*) FROM MYTABLE" with an index. The
> index table holds less data, so it can be read faster.
> If you have time-series data, or your data is always incremental with some
> ID, you can do an incremental count with row_timestamp or ID filters.
>
>
> bq. however the result could be non-deterministic if HBase has just been
> restarted..
>
>        Results are expected to be deterministic in normal scenarios. Can
> you elaborate on the difference you see after HBase is restarted?
>
> bq. SELECT SUM(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE
> PHYSICAL_NAME = 'MYTABLE';
>
>         We calculate the row count only up to each guidepost found in the
> region. No count is stored for a region too small to reach the guidepost
> width, or for the part of a region after its last guidepost. So this
> row_count should not be used in place of the actual count.
>
>
> On Wed, Sep 13, 2017 at 4:04 PM, Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>
>> Hi to all,
>> I'm trying to find the best way to get the row count of a table.
>>
>> I've tried the following:
>>
>>
>>    1. SELECT COUNT(*) FROM MYTABLE
>>       1. very slow without an index, very quick with an index
>>       2. however the result could be non-deterministic if HBase has just
>>       been restarted..
>>    2. SELECT SUM(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE
>>    PHYSICAL_NAME = 'MYTABLE';
>>       1. the result here is completely different from the first
>>       one: 323329772 vs 13376168. How is that possible?
>>
>> Best,
>> Flavio
>>
>
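
The guidepost explanation quoted above can be illustrated with a small toy model. This is only a sketch of the behaviour as described in the reply, not Phoenix's actual statistics code: the function names, the row sizes, and the guidepost width of 1000 bytes are all hypothetical. It shows how a sum of per-guidepost row counts can fall short of the real row count when rows sit after the last guidepost of a region, or when a region is too small to contain a guidepost at all.

```python
# Toy model (NOT Phoenix's implementation) of the behaviour described in the
# quoted reply: a row count is recorded only when a guidepost is reached, so
# rows after the last guidepost of a region are never counted.

def guidepost_row_counts(row_sizes, guidepost_width):
    """Return the row counts recorded at each guidepost of one region.

    row_sizes: size in bytes of each row in the region (hypothetical data).
    guidepost_width: bytes between guideposts (hypothetical value).
    """
    counts = []
    bytes_seen = 0
    rows_seen = 0
    for size in row_sizes:
        bytes_seen += size
        rows_seen += 1
        if bytes_seen >= guidepost_width:
            # A guidepost is placed here; record the rows since the last one.
            counts.append(rows_seen)
            bytes_seen = 0
            rows_seen = 0
    # Any rows left in rows_seen lie after the last guidepost and are dropped.
    return counts


def estimated_vs_actual(regions, guidepost_width):
    """Compare the summed guidepost counts with the true number of rows."""
    estimated = sum(
        sum(guidepost_row_counts(region, guidepost_width)) for region in regions
    )
    actual = sum(len(region) for region in regions)
    return estimated, actual


regions = [
    [100] * 25,  # 2500 bytes: two guideposts, 5 trailing rows uncounted
    [100] * 7,   # 700 bytes: below the guidepost width, nothing recorded
]
estimated, actual = estimated_vs_actual(regions, guidepost_width=1000)
print(estimated, actual)
```

The toy only demonstrates that the summed guidepost counts can differ from the real count; it does not account for the specific figures reported in this thread.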
