phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: Best split policy for wide distribution of table sizes
Date Thu, 03 Aug 2017 19:32:20 GMT
Hi Michael,

This is more of an HBase question than Phoenix specific, and you may get
better feedback from the hbase-users list, but...

At $DAYJOB, we've run into similar "lumpy" data distribution issues,
which are particularly noticeable in smaller / under-provisioned
environments. It's not necessarily recommended, and there are likely more
elegant solutions [1], but we've found that setting
'hbase.master.loadbalance.bytable' to true in hbase-site.xml, so that the
balancer evens out each table's regions across region servers rather than
only the total region count, has been effective in some environments. [2]
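
For reference, a minimal sketch of what that entry might look like in
hbase-site.xml on the master. The property name is the one mentioned above;
the value of true is an assumption about the intended setting (the default
is false), so please check it against the documentation for your HBase
version before relying on it:

```xml
<!-- hbase-site.xml: ask the master's balancer to balance per table.
     Assumed value: true (not stated explicitly in the original message). -->
<property>
  <name>hbase.master.loadbalance.bytable</name>
  <value>true</value>
</property>
```

The change most likely needs a master restart to take effect, and the
existing regions will only move on the next balancer run.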

Good luck!

Josh

[1]
http://apache-hbase.679495.n3.nabble.com/balance-the-tables-across-region-servers-td4066249.html
[2]
https://community.hortonworks.com/questions/65208/hbase-balancer-at-a-table-level.html

On Thu, Aug 3, 2017 at 2:09 PM, Michael Young <yomaiquin@gmail.com> wrote:

> We have several phoenix tables which vary quite a bit in size. Namely, we
> have around 10-15 tables which contain perhaps 6-10x more data than the
> other 50 tables.
>
> The default split policy is currently used, and the count of regions
> across the clusters is uniform.  However, we noticed some tables have more
> regions concentrated on some nodes, presumably to keep the total count of
> regions constant.  This seems to negatively impact query performance for
> our largest data tables.
>
> We tested using the ConstantSizeSplitPolicy, to have the region data sizes
> be better balanced, and the queries seem to behave somewhat better.
>
> Is this a good approach or does anyone have a more appropriate solution?
> We don't want to implement a custom split policy but are more than willing
> to try other available split policies, or other config tuning.
>
> Thanks,
> Michael Young
>
>
