phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Help with salting
Date Wed, 04 Nov 2015 02:49:37 GMT
Hi Vijay,
Have you considered generating your IDs in a way that prevents hotspotting?
One way might be to reverse the bits you get back from the sequence
generator. You could write a simple UDF that does that:
https://phoenix.apache.org/udf.html
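
As a rough sketch of the bit-reversal idea (plain Java only; the class and
method names are hypothetical, and wiring this into a Phoenix UDF is not
shown here):

```java
// Hypothetical helper: reverse the bits of a sequence value so that
// consecutive IDs land far apart in the key space.
final class BitReversedId {
    private BitReversedId() {}

    /** Reverses all 64 bits; sequence values 1, 2, 3... become widely separated. */
    static long reverse(long sequenceValue) {
        return Long.reverse(sequenceValue);
    }

    /** Bit reversal is its own inverse, so the original sequence value is recoverable. */
    static long original(long reversedId) {
        return Long.reverse(reversedId);
    }
}
```

Since the low-order bits of a sequence change fastest, moving them to the
high-order end is what spreads consecutive writes across regions.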

See inline for answers to your questions.

Thanks,
James

On Tue, Nov 3, 2015 at 3:44 PM, Vijay Vangapandu <
VijayVangapandu@eharmony.com> wrote:

> Hi,
>
> I integrated one of the online services in my company with hbase using
> apache phoenix. After loading a few million records I noticed that we
> have a hotspot problem. All the records are going to one region as the keys
> are generated using a sequence.
> Usecase is: each user has 1000’s of records with combination of userid and
> second record id as rowkey (primary key uid, XXX). When user logs in we
> fetch all records by using userid and render the results to user. But
> updates will always be with combination (userid + XXX). Below are my
> questions.
>
>  1.  If I salt the table using apache phoenix, is there any performance
> impact on reads, as the reads have to query all regions?
>
Yes - for range scans, Phoenix needs to run N scans to find all the data,
where N is the number of salt buckets. Worst case, that's N times more load
on your cluster, but in reality the impact would likely be lower. A good
way to think of it is that you're loading N blocks where in the non-salted
case you might only be loading 1 block.
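
To make the N scans concrete, here is a minimal sketch of the fan-out,
assuming the salt is a single byte prepended to the row key. The class and
method names are made up for illustration and this is not Phoenix's own
code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: one scan prefix per salt bucket for a
// "fetch everything for this userid" query.
final class SaltedPrefixes {
    private SaltedPrefixes() {}

    /** Returns saltBuckets prefixes, each the salt byte followed by keyPrefix. */
    static List<byte[]> forAllBuckets(byte[] keyPrefix, int saltBuckets) {
        List<byte[]> result = new ArrayList<>(saltBuckets);
        for (int bucket = 0; bucket < saltBuckets; bucket++) {
            byte[] salted = new byte[keyPrefix.length + 1];
            salted[0] = (byte) bucket; // assumed: salt byte leads the row key
            System.arraycopy(keyPrefix, 0, salted, 1, keyPrefix.length);
            result.add(salted);
        }
        return result;
    }
}
```

Each prefix corresponds to one scan, and the results then have to be
merged on the client.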


>  2.  If I have to salt the table, how many buckets should I use for 8
> region servers with 272 regions, roughly 33 regions per region server?
>
Have you seen the Tuning presentation on our Presentations page?
https://phoenix.apache.org/resources.html. Maybe start with 10 or 11 salt
buckets. Looks like your region size is pretty small, so not sure how this
will impact things. Try using Pherf (https://phoenix.apache.org/pherf.html)
with different salt buckets to get an idea.


>  3.  If I salt the table using phoenix, what is the effort to move away
> from phoenix and use the hbase client directly at a later time (not that I
> want to, but just checking the options)?
>
Impossible. :-) The salt byte value calculation is just a few lines of code
(see SaltingUtil.getSaltingByte()), and you'd need to run scans against all
salt buckets and merge the results. But assuming you're using where clauses
and other features, that's going to be a lot of work.
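
For a sense of scale, a hash-then-modulo salt byte could look roughly like
the sketch below. This is an approximation with hypothetical names, not a
copy of SaltingUtil, so check the source of your Phoenix version for the
exact calculation before relying on it:

```java
// Hypothetical sketch of a salt byte: hash the row key bytes, then take
// the result modulo the number of salt buckets.
final class SaltByteSketch {
    private SaltByteSketch() {}

    /** Simple 31-based rolling hash over the key bytes (an assumption). */
    static int hash(byte[] key) {
        int result = 1;
        for (byte b : key) {
            result = 31 * result + b;
        }
        return result;
    }

    /** Maps a row key to a bucket in the range [0, saltBuckets). */
    static byte saltByte(byte[] key, int saltBuckets) {
        return (byte) Math.abs(hash(key) % saltBuckets);
    }
}
```

A direct HBase client would need to prepend this byte on every write and
point lookup, and use the same bucket count the table was created with.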


>
> Thanks for your help.
>
> --
> Vijay Vangapandu
> eHarmony, Platform
> Principal Software Engineer
>
>
