phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas D'Silva" <tdsi...@salesforce.com>
Subject Re: Disable NO_CACHE hint on query for LIMIT OFFSET paging queries
Date Tue, 21 Aug 2018 18:37:00 GMT
When you do an OFFSET Phoenix will scan rows and filter them out until it
reaches the offset count which can end up being very costly for large
offsets.
If you can use a RVC where the order matches the PK of the data table or
index the start key of the scan will be set based on the RVC, which is much
more efficient
(see http://phoenix.apache.org/paged.html).

On Tue, Aug 21, 2018 at 8:06 AM, Abhishek Gupta <abhilater@gmail.com> wrote:

> Hi,
>
> Could you help me understand how LIMIT OFFSET queries work under the hood
> in Phoenix, is the filtering out of rows done in heap or is there some sort
> of optimisation where it can skip at disk level.
> My idea about posting this question was to understand if the rows from
> paste pages of the query in block cache can optimize the subsequent page
> call that would use the cache hits for the filter out rows and not seek at
> disk.
>
> Thanks,
> Abhishek
>
> On Sat, Aug 18, 2018 at 4:16 AM Thomas D'Silva <tdsilva@salesforce.com>
> wrote:
>
>> Shouldn't you pass the NO_CACHE hint for the LIMIT-OFFSET queries, since
>> you will be reading and filtering out lots of rows on the server?
>> I guess using the block cache for RVC queries might help depending on how
>> many rows you read per query, you should be able to easily test this out.
>>
>> On Fri, Aug 17, 2018 at 4:25 AM, Abhishek Gupta <abhilater@gmail.com>
>> wrote:
>>
>>> Hi Team,
>>>
>>> I am working on a use case where SQL aggregated queries are made such
>>> that RVC cannot be used (aggregation on truncated primary key columns)
>>> instead LIMIT-OFFSET has to be used. RVC is used for some user user cases
>>>
>>> Currently I have disabled BLOCKCACHE for the table. I wanted to check if
>>> it would be more performant to instead enable BLOCKCACHE on the table
>>> and pass NO_CACHE hint for RVC queries because it uses non-LIMIT-OFFSET
>>> scans and not pass NO_CACHE for the LIMIT-OFFSET queries so that for
>>> the subsequent page calls can leverage prior page data in block cache.
>>>
>>> Thanks,
>>> Abhishek
>>>
>>
>>

Mime
View raw message