phoenix-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Add automatic/default SALT
Date Tue, 12 Dec 2017 00:45:12 GMT
I'm a little hesitant about this because of a few things I've noticed 
across many different installations:

* Salted tables are *not* always more efficient. In fact, I've found 
myself advising against salted tables a bit more often than expected. 
Certain kinds of queries require much more work with salting than 
without it.

* Considering salt buckets as a measure of parallelism for a table, it's 
impossible for the system to correctly judge what the parallelism of the 
cluster should be. For example, with 10 RS and 1 Phoenix table, you 
would want to start with 10 salt buckets. However, with 10 RS and 100 
Phoenix tables, you'd *maybe* want to do 3 salt buckets. It's hard to 
make system-wide decisions correctly without a global view of the entire 
system.

I think James was trying to capture some of this in his use of "relatively 
conservative defaults", but I'd take that even a bit further and say I 
consider it harmful for Phoenix to do that out of the box.

However, I would flip the question upside down instead: what kind of 
suggestions can Phoenix, as a database, make to _recommend_ that users 
enable salting on a table given its schema and important queries?

On 12/8/17 12:34 PM, James Taylor wrote:
> Hi Flavio,
> I like the idea of “adaptable configuration” where you specify a config 
> value as a % of some cluster resource (with relatively conservative 
> defaults). Salting is somewhat of a gray area though as it’s not config 
> based, but driven by your DDL. One solution you could implement on top 
> of Phoenix is scripting for DDL that fills in the salt bucket parameter 
> based on cluster size.
> Thanks,
> James
> 
> On Tue, Dec 5, 2017 at 12:50 AM Flavio Pompermaier 
> <pompermaier@okkam.it> wrote:
> 
>     Hi to all,
>     as stated in the documentation[1] "for optimal performance,
>     number of salt buckets should match number of region servers".
>     So, why not add an AUTO/DEFAULT option for salting that defaults
>     this parameter to the number of region servers?
>     Otherwise I have to manually connect to HBase, retrieve that number
>     and pass it to Phoenix...
>     What do you think?
> 
>     [1] https://phoenix.apache.org/performance.html#Salting
> 
>     Best,
>     Flavio
> 
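James's suggestion quoted above, scripting the DDL so that the salt bucket parameter is filled in from the cluster size, could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (salt_buckets_for, create_table_ddl) and the cap parameter are hypothetical, and the region server count is supplied by the caller rather than looked up from HBase. The optional cap exists because matching the region server count is not always the right choice, for example when many tables share the cluster.

```python
# Phoenix documents SALT_BUCKETS as taking a value from 1 to 256.
MAX_SALT_BUCKETS = 256


def salt_buckets_for(region_servers, cap=None):
    """Return a salt bucket count derived from the region server count.

    Hypothetical helper: `cap` lets the operator choose fewer buckets
    than region servers, e.g. on a cluster hosting many tables.
    """
    if region_servers < 1:
        raise ValueError("region_servers must be at least 1")
    buckets = region_servers if cap is None else min(region_servers, cap)
    # Clamp to the range Phoenix accepts.
    return max(1, min(buckets, MAX_SALT_BUCKETS))


def create_table_ddl(table, column_defs, region_servers, cap=None):
    """Build a CREATE TABLE statement with SALT_BUCKETS filled in.

    `table` and `column_defs` are inserted verbatim, so they must come
    from a trusted source (this is a DDL templating sketch, not a
    general-purpose query builder).
    """
    buckets = salt_buckets_for(region_servers, cap)
    return f"CREATE TABLE {table} ({column_defs}) SALT_BUCKETS = {buckets}"


if __name__ == "__main__":
    columns = "ID BIGINT NOT NULL PRIMARY KEY, PAYLOAD VARCHAR"
    # One table on a 10 region server cluster: match the server count.
    print(create_table_ddl("EVENTS", columns, region_servers=10))
    # Many tables on the same cluster: the operator caps the bucket count.
    print(create_table_ddl("EVENTS", columns, region_servers=10, cap=3))
```

The judgment call stays with the operator: the script only saves the manual step of copying the region server count into the DDL, and the cap is the place to encode knowledge about the rest of the cluster that Phoenix itself does not have.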
