phoenix-user mailing list archives

From Li Gao <g...@marinsoftware.com>
Subject Re: Query by region splits
Date Mon, 18 Apr 2016 22:17:43 GMT
Hi James.

I see, [2] might work for my use case.

Thanks,
Li


On Mon, Apr 18, 2016 at 2:54 PM, James Taylor <jamestaylor@apache.org>
wrote:

> Thanks for the clarification, Li. Are you essentially trying to make
> Phoenix multi-client node? Our idea for that is Drillix [1]. Short term, if
> you know the split points, you could use our row value constructor syntax
> [2] to do the above.
>
> Thanks,
> James
>
>
> [1]
> https://apurtell.s3.amazonaws.com/phoenix/Drillix+Combined+Operational+%26+Analytical+SQL+at+Scale.pdf
> [2] https://phoenix.apache.org/paged.html
>
> On Mon, Apr 18, 2016 at 2:18 PM, Li Gao <gaol@marinsoftware.com> wrote:
>
>> Hi James,
>>
>> Thanks for the quick reply. It is helpful, but I am not sure it solves the
>> issue we have. Let me restate the use case to make it clearer.
>>
>> Say Table A has 10 regions spread across 10 HBase nodes. In addition, I
>> have 10 data processor machines (separate from the HBase cluster) that
>> can each independently issue a query to Phoenix to retrieve part of the
>> table.
>>
>> Ideally I am looking for something like:
>>
>> SELECT col1,col2,... FROM TABLE_A WHERE (i.e. region=1,2,3,4...)
>>
>> So each processor machine can issue a region-specific query and retrieve
>> a non-overlapping piece of the table projection. I am not sure how such
>> a Phoenix query can be constructed.
>>
>> Hope this clarifies the question.
>>
>> Thanks,
>> Li
>>
>> On Mon, Apr 18, 2016 at 2:09 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Phoenix already does this (and to a finer, configurable granularity).
>>> See https://phoenix.apache.org/update_statistics.html
>>>
>>> Thanks,
>>> James
>>>
>>> On Mon, Apr 18, 2016 at 2:08 PM, Li Gao <gaol@marinsoftware.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> In Phoenix, is it possible to query the data by region splits? For
>>>> example, if Table A has 10 regions on the cluster, how can I issue 10
>>>> concurrent queries to Table A so that each query covers exactly one
>>>> region of the table? This would help us split the queries across
>>>> multiple processor machines and build an MPP query connector for Phoenix.
>>>>
>>>> Any hints would be appreciated.
>>>>
>>>> Thanks,
>>>> Li
>>>>
>>>
>>>
>>
>
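
The short-term suggestion in the thread (known split points plus row value
constructor syntax [2]) can be sketched roughly as below. This is only an
illustration, not code from the thread: the helper name region_queries, the
key columns pk1 and pk2, and the sample split points are all hypothetical,
and it assumes the split points have already been obtained and decoded into
primary key column values. It builds one parameterized query per key range,
so each processor machine can run one query and the ranges do not overlap.

```python
from typing import List, Optional, Sequence, Tuple

Key = Tuple[object, ...]


def region_queries(
    table: str,
    columns: Sequence[str],
    pk_columns: Sequence[str],
    split_points: Sequence[Key],
) -> List[Tuple[str, Tuple[object, ...]]]:
    """Build one (sql, params) pair per key range delimited by split_points.

    split_points must be sorted and expressed as primary key column values
    (an assumption; raw HBase split keys are byte arrays).
    """
    pk = "(" + ", ".join(pk_columns) + ")"
    marks = "(" + ", ".join("?" for _ in pk_columns) + ")"
    # None marks the open lower bound of the first range and the open
    # upper bound of the last range.
    bounds: List[Optional[Key]] = [None, *split_points, None]
    queries = []
    for lower, upper in zip(bounds, bounds[1:]):
        clauses: List[str] = []
        params: List[object] = []
        if lower is not None:
            # Row value constructor comparison: inclusive lower bound.
            clauses.append(f"{pk} >= {marks}")
            params.extend(lower)
        if upper is not None:
            # Exclusive upper bound keeps adjacent ranges disjoint.
            clauses.append(f"{pk} < {marks}")
            params.extend(upper)
        sql = f"SELECT {', '.join(columns)} FROM {table}"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        queries.append((sql, tuple(params)))
    return queries


# Two hypothetical split points yield three non-overlapping range queries.
example = region_queries(
    "TABLE_A", ["col1", "col2"], ["pk1", "pk2"], [("g", 1), ("p", 1)]
)
```

Each (sql, params) pair would then be handed to a different processor
machine and executed through whatever Phoenix client it uses. Salted tables
and split points that fall in the middle of a composite key are not handled
by this sketch.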
