phoenix-user mailing list archives

From: Simon Wang
Subject: Re: Get region for row key
Date: Tue, 12 Jul 2016 05:14:24 GMT

After reading more of the Phoenix code, I think I should do the following:

1. Use `PhoenixRuntime.getTable` to get a `PTable`
2. Use `table.getPKColumns` to get a list of `PColumn`s
3. For each column, get its type with `column.getDataType`, then serialize the value with `dataType.toBytes(value, column.getSortOrder)`
4. Then create a new `ImmutableBytesPtr` and call `table.newKey(ptr, pksByteArray)`
5. Finally, get the salted key with `SaltingUtil.getSaltedKey(ptr, table.getBucketNum())`
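
Once the serialized (and, if applicable, salted) key bytes are available, the remaining step is to find which region they fall into. Below is a minimal sketch of that comparison in plain Java, not taken from Phoenix or HBase. It assumes the region start keys are already known and sorted, with the first region starting at the empty key (the HBase client's `RegionLocator.getStartKeys()` is one possible source). The class name `RegionLookup`, the methods `compareUnsigned` and `regionIndex`, and the sample split points are all hypothetical.

```java
final class RegionLookup {

    // Unsigned lexicographic comparison of two byte arrays,
    // which is the ordering HBase applies to row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // A shorter key that is a prefix of the other sorts first.
        return a.length - b.length;
    }

    // Returns the index of the last start key that is <= rowKey.
    // Assumption: startKeys is sorted ascending and startKeys[0] is empty.
    static int regionIndex(byte[][] startKeys, byte[] rowKey) {
        int lo = 0;
        int hi = startKeys.length - 1;
        int found = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compareUnsigned(startKeys[mid], rowKey) <= 0) {
                found = mid;      // candidate region; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical split points: four regions keyed on the leading byte.
        byte[][] startKeys = { {}, {0x01}, {0x02}, {(byte) 0x80} };
        byte[] rowKey = {0x01, 'a', 'b', 'c'};
        System.out.println(regionIndex(startKeys, rowKey));
    }
}
```

The returned index could then serve as the Spark partition id, so that all rows in one partition go to the same region.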

I would appreciate it if anyone could check whether this is correct. :)

Thanks a lot!


> On Jul 10, 2016, at 4:24 PM, Simon Wang wrote:
> About the use case:
> We want to run JDBC queries for each row in a Hive partition. Currently, we use Spark to partition the Hive DataFrame, then run batch queries in foreachPartition. Since each partition accesses multiple region servers, there is a lot of overhead. So we are thinking about partitioning the DataFrame according to the HBase region.
> Any help is appreciated!
> Best,
> Simon
>> On Jul 10, 2016, at 2:01 PM, Simon Wang wrote:
>> Hi all,
>> Happy weekend!
>> I am writing to ask whether there is a way to get the region number for any given row key.
>> For the case where salting is applied, I discovered the `SaltingUtil.getSaltedKey` method, but I am not sure how to serialize the key as an `ImmutableBytesWritable`.
>> In general, how should the client get the region number, assuming that it has no prior knowledge of the table? The client would need to read the metadata (salted or not, SPLIT ON or not), serialize the key, compare it with the splits, etc.
>> Thanks in advance!
>> Best,
>> Simon