phoenix-user mailing list archives

From Simon Wang <simon.w...@airbnb.com>
Subject Re: Get region for row key
Date Sun, 10 Jul 2016 23:24:23 GMT
About the use case:

We want to run a JDBC query for each row in a Hive partition. Currently, we use Spark to partition
the Hive DataFrame, then run batched queries in foreachPartition. Since each partition accesses
multiple region servers, there is a lot of overhead. So we are thinking about partitioning
the DataFrame according to the HBase regions.
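
To make the idea concrete, here is a minimal sketch of the lookup step, assuming the table's split points are already available as a sorted array of byte arrays (excluding the empty start key of the first region) and that row keys are already serialized. The class and method names (`RegionGrouping`, `regionIndex`, `groupByRegion`) are hypothetical and not part of Phoenix or HBase; how the split points are fetched from the cluster is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RegionGrouping {

    // Lexicographic comparison treating each byte as unsigned,
    // which is the ordering HBase uses for row keys.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // ASSUMPTION: splitPoints is sorted ascending and does not contain the
    // empty start key of the first region. Region 0 covers keys below
    // splitPoints[0]; region i covers [splitPoints[i-1], splitPoints[i]).
    // Returns the number of split points <= rowKey, found by binary search.
    public static int regionIndex(byte[][] splitPoints, byte[] rowKey) {
        int lo = 0;
        int hi = splitPoints.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareUnsigned(splitPoints[mid], rowKey) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Buckets row keys by region index so that each batch of queries
    // would touch a single region.
    public static Map<Integer, List<byte[]>> groupByRegion(byte[][] splitPoints,
                                                           List<byte[]> rowKeys) {
        Map<Integer, List<byte[]>> groups = new TreeMap<>();
        for (byte[] key : rowKeys) {
            int idx = regionIndex(splitPoints, key);
            groups.computeIfAbsent(idx, k -> new ArrayList<>()).add(key);
        }
        return groups;
    }
}
```

In a Spark job, the same `regionIndex` function could serve as the partitioning function, so that each partition's batch query lands on one region.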

Any help is appreciated!

Best,
Simon

> On Jul 10, 2016, at 2:01 PM, Simon Wang <simon.wang@airbnb.com> wrote:
> 
> Hi all,
> 
> Happy weekend!
> 
> I am writing to ask whether there is a way to get the region number for any given row key.
> 
> For the case where salting is applied, I discovered the `SaltingUtil.getSaltedKey` method, but I am not sure how to serialize the key as `ImmutableBytesWritable`.
> 
> In general, how should the client get the region number, assuming that the client has no prior knowledge of the table? The client would need to read the table metadata (salted or not, SPLIT ON or not), serialize the key, compare it with the split points, etc.
> 
> Thanks in advance!
> 
> 
> Best,
> Simon
> 
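
For the salted case raised in the quoted message, the general shape of the computation can be sketched as follows: hash the serialized key, reduce the hash modulo the number of salt buckets, and prepend the result as a single leading byte. This is only an illustration under stated assumptions. The class `SaltSketch` and its methods are hypothetical, and the hash below is a placeholder that is not guaranteed to match the one Phoenix uses, so a real client should call Phoenix's own salting utility to get matching keys.

```java
public class SaltSketch {

    // Placeholder hash: a simple 31-based polynomial over the key bytes.
    // ASSUMPTION: stands in for whatever hash Phoenix applies internally;
    // it is not guaranteed to produce the same salt bytes as Phoenix.
    public static int placeholderHash(byte[] key) {
        int h = 1;
        for (byte b : key) {
            h = 31 * h + b;
        }
        return h;
    }

    // Maps the hash onto [0, buckets). The bucket count is limited to 256
    // here so that the result fits in one byte.
    public static byte saltByte(byte[] key, int buckets) {
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("buckets must be in 1..256");
        }
        // |hash % buckets| is always in [0, buckets - 1], even for negative hashes.
        return (byte) Math.abs(placeholderHash(key) % buckets);
    }

    // Returns a new array: one salt byte followed by the original key bytes.
    public static byte[] saltedKey(byte[] key, int buckets) {
        byte[] out = new byte[key.length + 1];
        out[0] = saltByte(key, buckets);
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }
}
```

Under this scheme the leading byte identifies the bucket, so for a salted table the bucket number can act as the coarse partitioning key, with the salted key then compared against any further split points.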

