We have a table in which multiple customers can have rows.
Some customers are large and some are very small in terms of number of rows.
Our row key is based on customerid+type+orderid. If the table is not salted, all the rows of a large customer will end up in just a few regions, leading to hot spotting (a large customer has more rows and is accessed more frequently).
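
To illustrate the concern, here is a rough Python sketch of how a salt byte derived from a hash of the row key, modulo the bucket count, would spread one customer's rows. The customer id, the "|" separator, the bucket count of 20, and the CRC32 hash are all assumptions for illustration only; Phoenix computes the salt byte with its own internal hash function.

```python
import zlib
from collections import Counter

SALT_BUCKETS = 20  # hypothetical value for illustration, not a recommendation


def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    # Stand-in hash (CRC32). Phoenix uses its own hash internally;
    # only the "hash of row key modulo bucket count" idea is shown here.
    return zlib.crc32(row_key) % buckets


def salted_key(row_key: bytes, buckets: int = SALT_BUCKETS) -> bytes:
    # Prepend the one-byte salt to the original row key.
    return bytes([salt_byte(row_key, buckets)]) + row_key


# One hypothetical large customer with 10,000 orders (customerid|type|orderid).
keys = [f"cust0001|A|{order_id:08d}".encode() for order_id in range(10_000)]

# Unsalted: every key shares the same leading bytes, so the rows sort together.
unsalted_prefixes = Counter(k[:8] for k in keys)

# Salted: the leading byte varies, so the rows are spread over the buckets.
salted_buckets = Counter(salted_key(k)[0] for k in keys)

print("distinct unsalted prefixes:", len(unsalted_prefixes))
print("distinct salt buckets used:", len(salted_buckets))
```

Without the salt byte, all of this customer's keys are contiguous in sort order, which is why they would land in the same few regions.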
From: James Taylor <email@example.com>
Sent: Friday, September 8, 2017 12:56:31 PM
Subject: Re: Salt Number

Hi Pradheep,

Would you be able to describe your use case and why you're salting? We really only recommend salting if you have write hotspotting. Otherwise, it increases the overall load on your cluster.

Thanks,
James
On Fri, Sep 8, 2017 at 9:13 AM, Pradheep Shanmugam <Pradheep.Shanmugam@infor.com> wrote:
As the salt number cannot be changed later, what is the best number to choose in the different cases below, for a cluster with 10 region servers with, say, 6 cores each?
Should we consider the number of cores when deciding the number?
In some places I see that the number can be in the range 1-256, and in other places I see that it should equal the number of region servers. Can the number be a multiple of the number of region servers (say 20, 30, etc.)?
read-heavy large table (several hundred million rows) with range scans
write-heavy large table with less frequent range scans
large table with hybrid load and range scans