phoenix-user mailing list archives

From Pedro Boado <>
Subject Re: Upsert is EXTREMELY slow
Date Thu, 12 Jul 2018 22:31:08 GMT
One performance tip is to reuse the same PreparedStatement: call
clearParameters(), set the values, and call executeUpdate(), over and over
again. Don't close the statement or the connection after each upsert. Also,
I haven't seen any noticeable benefit from using JDBC batches, as Phoenix
controls batching by when commit() is called.

Make sure you are not calling commit() after every executeUpdate() (that's a
real performance killer). Commit in batches of roughly 1k upserts.
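
As a rough sketch of the two tips above (one reused PreparedStatement, one
commit per ~1k upserts), something like the following could work. The table
EVENTS and its columns ID and PAYLOAD are made up for illustration, and the
Connection is assumed to come from the Phoenix JDBC driver; adapt both to
your schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchedUpsert {
    // Hypothetical table and columns, for illustration only.
    static final String UPSERT_SQL =
        "UPSERT INTO EVENTS (ID, PAYLOAD) VALUES (?, ?)";
    // Roughly the ~1k upserts per commit suggested above.
    static final int COMMIT_EVERY = 1000;

    /**
     * Writes all rows through a single PreparedStatement and commits once
     * per COMMIT_EVERY rows. Returns the number of commits issued.
     */
    static int upsertAll(Connection conn, List<String[]> rows) throws SQLException {
        conn.setAutoCommit(false); // let commit() decide when writes are flushed
        int commits = 0;
        int pending = 0;
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            for (String[] row : rows) {
                ps.clearParameters();
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.executeUpdate();
                if (++pending == COMMIT_EVERY) {
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final partial batch
                conn.commit();
                commits++;
            }
        }
        return commits;
    }
}
```

The statement is only closed once, when the whole load is done, and the
connection is left open for the caller to reuse.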

Also, that attempt at asynchronous code is probably another performance
killer. Are you creating a new Runnable per database write, and opening and
closing DB connections per write? Just spawn a few threads (5 to 10; if the
client CPU is not maxed out, keep increasing the count) and send upserts in
a for loop, reusing the PreparedStatement and connections.
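
A minimal sketch of that threading layout, under some assumptions: a small
fixed pool where each writer thread opens one connection, prepares one
statement, and loops over its share of the rows. The connectionFactory
parameter is a hypothetical hook; in a real client it would typically wrap
DriverManager.getConnection() with your Phoenix JDBC URL. The table and
columns are again made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelUpsert {
    // Hypothetical table and columns, for illustration only.
    static final String UPSERT_SQL =
        "UPSERT INTO EVENTS (ID, PAYLOAD) VALUES (?, ?)";
    static final int COMMIT_EVERY = 1000;

    /**
     * Splits rows across `writers` threads. Each thread opens exactly one
     * connection and reuses it for its whole slice. Returns rows written.
     */
    static long writeAll(Callable<Connection> connectionFactory,
                         List<String[]> rows, int writers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int w = 0; w < writers; w++) {
                final int offset = w;
                Callable<Long> task =
                    () -> writeSlice(connectionFactory, rows, offset, writers);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // rethrows any writer failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    // Writer `offset` handles rows offset, offset + stride, offset + 2 * stride, ...
    private static long writeSlice(Callable<Connection> connectionFactory,
                                   List<String[]> rows, int offset, int stride)
            throws Exception {
        long written = 0;
        try (Connection conn = connectionFactory.call()) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
                for (int i = offset; i < rows.size(); i += stride) {
                    ps.clearParameters();
                    ps.setString(1, rows.get(i)[0]);
                    ps.setString(2, rows.get(i)[1]);
                    ps.executeUpdate();
                    if (++written % COMMIT_EVERY == 0) {
                        conn.commit();
                    }
                }
                conn.commit(); // flush whatever is left in the last batch
            }
        }
        return written;
    }
}
```

Start with something like writeAll(factory, rows, 5) and raise the writer
count while the client CPU still has headroom.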

With a cluster that size I would expect to see tens of thousands of writes
per second.

Finally, have you checked that all region servers (RS) receive the same
traffic?

On Thu, 12 Jul 2018, 23:10 Pedro Boado, <> wrote:

> I believe it's related to your client code. In our use case we easily do
> 15k writes/sec on a cluster with lower specs than yours.
> Check that your JDBC connection has autocommit off so Phoenix can batch
> writes, and that the table has a reasonable UPDATE_CACHE_FREQUENCY (more
> than 60000 ms).
> On Thu, 12 Jul 2018, 21:54 alchemist, <>
> wrote:
>> Thanks a lot for your help.
>> Our test inserts new rows individually. For our use case, we are
>> benchmarking whether we can insert 10,000 new rows per minute, using
>> a cluster of writers if needed.
>> When executing the inserts with Phoenix API (UPSERT) we have been able to
>> get up to 6,000 new rows per minute.
>> We changed our test to perform the inserts individually using the HBase
>> API (Put) rather than the Phoenix API (UPSERT) and got an improvement of
>> more than 10x (up to 60,000 rows per minute).
>> What would explain this difference? I assume that in both cases HBase must
>> grab the locks individually in the same way.
>> --
>> Sent from:
