phoenix-user mailing list archives

From Josh Elser <>
Subject Re: Any reason for so small phoenix.mutate.batchSize by default?
Date Tue, 03 Sep 2019 14:19:34 GMT
Hey Alexander,

Was just poking at the code for this: it looks like this is really just 
determining the number of mutations that get "processed together" (as 
opposed to a hard limit).
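In other words, the setting controls the chunk size at which pending mutations get flushed, not a ceiling on how many can be written. A minimal sketch of that behavior (the names here are illustrative, not Phoenix's actual internals):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Illustrative only: count how many times a chunk of mutations
    // is "processed together". In Phoenix the flush would issue RPCs.
    static int flushCount = 0;

    static void flush(List<String> batch) {
        flushCount++;
        batch.clear();
    }

    public static void main(String[] args) {
        int batchSize = 100; // the actual default per this thread, not 1000 as the docs said
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            batch.add("mutation-" + i);
            if (batch.size() >= batchSize) {
                flush(batch); // full chunk: process it together
            }
        }
        if (!batch.isEmpty()) {
            flush(batch); // trailing partial chunk
        }
        // 250 mutations with batchSize=100 -> 3 flushes (100 + 100 + 50)
        System.out.println(flushCount);
    }
}
```

All 250 mutations are still written; the batch size only changes how many flushes it takes.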

Since you have done some work, I'm curious if you could generate some 
data to help back up your suggestion:

* What does your table DDL look like?
* How large is one mutation you're writing (in bytes)?
* How much data ends up being sent to a RegionServer in one RPC?

You're right that we want to send an adequate amount of data to a 
RegionServer per RPC, but this is tricky to balance across all cases 
(thus, a smaller default is the safer choice, since it avoids sending 
batches that are too large).
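For anyone who wants to experiment, Phoenix client-side properties like this one can be passed to the JDBC connection. A sketch (the ZooKeeper quorum in the commented URL is a placeholder):

```java
import java.util.Properties;

public class BatchSizeConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Override the client-side default (100) discussed in this thread.
        props.setProperty("phoenix.mutate.batchSize", "1000");

        // With the Phoenix driver on the classpath, the connection would
        // then pick up the override:
        //   Connection conn =
        //       DriverManager.getConnection("jdbc:phoenix:zk-host", props);

        System.out.println(props.getProperty("phoenix.mutate.batchSize"));
    }
}
```

Measuring write throughput at a few values (100, 500, 1000) against your own schema is the best way to back up a proposed new default.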

On 9/3/19 8:03 AM, Alexander Batyrshin wrote:
>   Hello all,
> 1) There is a bug in the documentation -
> phoenix.mutate.batchSize is not 1000, but only 100 by default
> Changed for
> 2) I want to discuss this default value. From PHOENIX-541 
> <> I read about an issue 
> with MR and wide rows (2MB per row), and it looks like a rare case. But in 
> most common cases we can get much better write performance with batchSize 
> = 1000, especially if it is used with a SALT table
