phoenix-user mailing list archives

From Li Gao <g...@marinsoftware.com>
Subject Re: Query by region splits
Date Tue, 19 Apr 2016 21:04:46 GMT
Hi James,

Thanks for the hints on the paged query. I have one additional question
about range scans based on a region's start and stop rowkeys. Is it possible
to do such a range scan in Phoenix, given the start/stop rowkey bytes of the
HBase region?

For example:

SELECT col1, col2, ... FROM TABLE_A WHERE RK BETWEEN startRowKey AND
stopRowKey

Here the rowkey in HBase is composed of several columns in Phoenix (i.e.
bigint, int, boolean, and varchar).
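
To make the question concrete, here is a minimal sketch of how per-region
queries might be built with the row value constructor syntax from the paged
queries page. It assumes the region split points have already been decoded
from rowkey bytes into one value per primary key column, and that the table
is not salted. The function name and the column names in the usage note are
hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def region_queries(
    table: str,
    select_cols: Sequence[str],
    pk_cols: Sequence[str],
    split_points: Sequence[Tuple],
) -> List[Tuple[str, Tuple]]:
    """Build one (sql, params) pair per key range delimited by split_points.

    Assumptions (not confirmed in this thread): split_points are sorted and
    hold one decoded value per primary key column; the table is unsalted.
    """
    pk = "(" + ", ".join(pk_cols) + ")"
    marks = "(" + ", ".join("?" for _ in pk_cols) + ")"
    # None marks the open start of the first range and open end of the last.
    bounds: List[Optional[Tuple]] = [None, *split_points, None]

    queries = []
    for lo, hi in zip(bounds, bounds[1:]):
        conds, params = [], []
        if lo is not None:
            conds.append(f"{pk} >= {marks}")  # start key is inclusive
            params.extend(lo)
        if hi is not None:
            conds.append(f"{pk} < {marks}")   # stop key is exclusive
            params.extend(hi)
        sql = f"SELECT {', '.join(select_cols)} FROM {table}"
        if conds:
            sql += " WHERE " + " AND ".join(conds)
        queries.append((sql, tuple(params)))
    return queries
```

With two split points this yields three non-overlapping queries, e.g.
region_queries("TABLE_A", ["col1", "col2"], ("K1", "K2"), [(10, 1), (20, 2)]),
each of which could be handed to a different processor machine along with
its bind parameters.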

Also, if the table is salted, will the paged query syntax still yield a
correct range scan for a given region?

Thanks,
Li


On Mon, Apr 18, 2016 at 3:17 PM, Li Gao <gaol@marinsoftware.com> wrote:

> Hi James.
>
> I see, [2] might work for my use case.
>
> Thanks,
> Li
>
>
> On Mon, Apr 18, 2016 at 2:54 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>
>> Thanks for the clarification, Li. Are you essentially trying to make
>> Phoenix multi-client node? Our idea for that is Drillix [1]. Short term, if
>> you know the split points, you could use our row value constructor syntax
>> [2] to do the above.
>>
>> Thanks,
>> James
>>
>>
>> [1]
>> https://apurtell.s3.amazonaws.com/phoenix/Drillix+Combined+Operational+%26+Analytical+SQL+at+Scale.pdf
>> [2] https://phoenix.apache.org/paged.html
>>
>> On Mon, Apr 18, 2016 at 2:18 PM, Li Gao <gaol@marinsoftware.com> wrote:
>>
>>> Hi James,
>>>
>>> Thanks for the quick reply. It is helpful, but I am not sure it can solve
>>> the issue we have. Let me state the use case another way to make it
>>> clearer.
>>>
>>> Say Table A has 10 regions spread across 10 HBase nodes. In addition, I
>>> have 10 data processor machines (separate from the HBase cluster) that
>>> can each independently issue a query to Phoenix to retrieve part of the
>>> table.
>>>
>>> Ideally I am looking for something like:
>>>
>>> SELECT col1,col2,... FROM TABLE_A WHERE (i.e. region=1,2,3,4...)
>>>
>>> So each processor machine can issue a region-specific query and retrieve
>>> a non-overlapping piece of the table projection. I am not sure how such
>>> a Phoenix query can be constructed.
>>>
>>> Hope this clarifies the question.
>>>
>>> Thanks,
>>> Li
>>>
>>> On Mon, Apr 18, 2016 at 2:09 PM, James Taylor <jamestaylor@apache.org>
>>> wrote:
>>>
>>>> Phoenix already does this (and to a finer, configurable granularity).
>>>> See https://phoenix.apache.org/update_statistics.html
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>> On Mon, Apr 18, 2016 at 2:08 PM, Li Gao <gaol@marinsoftware.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In Phoenix, is it possible to query the data by region splits? E.g. if
>>>>> Table A has 10 regions on the cluster, how can I issue 10 concurrent
>>>>> queries to Table A so that each query covers exactly 1 region of the
>>>>> table? This would let us split the queries across multiple processor
>>>>> machines and help us build an MPP query connector for Phoenix.
>>>>>
>>>>> Any hints would be appreciated.
>>>>>
>>>>> Thanks,
>>>>> Li
>>>>>
>>>>
>>>>
>>>
>>
>
