phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Phoenix and SQL In clauses
Date Thu, 29 May 2014 23:41:32 GMT
FWIW, another point of reference is a benchmark we put together for a use
case at Bloomberg described here:
http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html

In this test, we put 250K item in the IN clause and had very good
performance.

Thanks,
James


On Thu, May 29, 2014 at 4:36 PM, alex kamil <alex.kamil@gmail.com> wrote:

> * maximum *number *of elements
>
>
> On Thu, May 29, 2014 at 7:34 PM, alex kamil <alex.kamil@gmail.com> wrote:
>
>> forgot to mention, to determine maximum value of elements in an IN clause it's
>> important to know how many elements each of the M queries actually returns
>> and the size of the resultsets for each; while the number of elements in IN
>> clause can be  up to X , the number of  rows retrieved will be up to Y,
>>  and the MAX for X will depend on Y
>>
>>
>> On Wed, May 28, 2014 at 1:53 PM, alex kamil <alex.kamil@gmail.com> wrote:
>>
>>> Dmitry,
>>> we've tried something similar, on a table with a few hundreds of VARCHAR
>>> columns getting   >30k items in IN clause was starting to exceed 60 sec, if
>>> I remember correctly,
>>> it basically becomes an IO bottleneck getting the huge resultsets back
>>> to the client
>>>
>>> this was a workaround we used before JOINs became available in phoenix -
>>> pulled a subset of data to the client  using a batch of M queries with N
>>> elements each, and did a join on client side in HSQLDB, which was messy and
>>> inefficient
>>> with phoenix 3.0 we switched to using joins on server side and avoid
>>> this scenario completely
>>>
>>> Alex
>>>
>>>
>>>
>>>
>>> On Wed, May 28, 2014 at 1:27 PM, Dmitry Goldenberg <
>>> dgoldenberg@kmwllc.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I was wondering that the maximum value of elements in an IN clause is,
>>>> in Phoenix.  Also, from the performance standpoint, if I were to batch up
>>>> the items into M statements of N elements per In clause and execute
>>>> multiple statements to avoid ever hitting the max, what may be an optimal
>>>> value for N, considering that I only have a few relatively small column
>>>> values (int, varchar, timestamp) per statement?
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Phoenix HBase User" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to phoenix-hbase-user+unsubscribe@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "Phoenix HBase User" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to phoenix-hbase-user+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

Mime
View raw message