phoenix-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Upsert is EXTREMELY slow
Date Fri, 13 Jul 2018 18:14:32 GMT
Sorry, I was brief and didn't get my point across. I meant to say the 
same thing you did.

Someone manually submitting two updates to an index is naively faster 
than what Phoenix goes through to automatically (and safely) do this.

On 7/13/18 12:07 PM, James Taylor wrote:
> Phoenix won’t be slower to update secondary indexes than a hand-rolled 
> use case would be. Both have to do the writes to a second table to keep it in sync.
> 
> On Fri, Jul 13, 2018 at 8:39 AM Josh Elser <elserj@apache.org> wrote:
> 
>     Also, they're relying on Phoenix to do secondary index updates for them.
> 
>     Obviously, you can do this faster than Phoenix can if you know the
>     exact
>     use-case.
> 
>     On 7/12/18 6:31 PM, Pedro Boado wrote:
>      > A tip for performance: reuse the same PreparedStatement, just
>      > clearParameters(), set values and executeUpdate() over and over
>      > again.
>      > Don't close the statement or connections after each upsert. Also, I
>      > haven't seen any noticeable benefit from using JDBC batches, as Phoenix
>      > controls batching by when commit() is called.
>      >
>      > Keep an eye on not calling commit() after every executeUpdate()
>      > (that's a real performance killer). Batch commits every ~1k upserts.
>      >
>      > Also, that attempt at asynchronous code is probably another
>      > performance killer. Are you creating a new Runnable per database
>      > write, and opening and closing DB connections per write? Just spawn
>      > a few threads (5 to 10; if the client CPU is not maxed, keep
>      > increasing it) and send upserts in a for loop, reusing the
>      > PreparedStatement and connections.
>      >
>      > With a cluster that size I would expect seeing tens of thousands of
>      > writes per second.
>      >
>      > Finally, have you checked that all RegionServers receive the same traffic?
>      >
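[Editor's sketch] The loop Pedro describes above (one connection, one reused PreparedStatement, autocommit off, a commit every ~1k upserts) might look like the following. The JDBC URL, table name, and column names are assumptions, not from the thread; the shouldCommit helper just factors out the batch-boundary check.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class UpsertLoop {
    // "~1k" batch size taken from the advice above; tune for your row width.
    static final int COMMIT_EVERY = 1000;

    // True when the i-th upsert (1-based) should be followed by a commit.
    static boolean shouldCommit(int i) {
        return i % COMMIT_EVERY == 0;
    }

    // Sketch of the write loop; MY_TABLE and its columns are hypothetical.
    static void runUpserts(Connection conn, int rows) throws SQLException {
        conn.setAutoCommit(false);               // let Phoenix batch mutations
        String sql = "UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 1; i <= rows; i++) {
                ps.clearParameters();            // reuse the same statement
                ps.setInt(1, i);
                ps.setString(2, "value-" + i);
                ps.executeUpdate();              // buffered client-side until commit
                if (shouldCommit(i)) {
                    conn.commit();               // flush ~1k mutations at once
                }
            }
            conn.commit();                       // flush any tail smaller than a batch
        }
    }
}
```

Following the threading advice, each of the 5-10 writer threads would own its own Connection and run this loop over its share of the rows.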
>      > On Thu, 12 Jul 2018, 23:10 Pedro Boado, <pedro.boado@gmail.com> wrote:
>      >
>      >     I believe it's related to your client code - in our use case we
>      >     easily do 15k writes/sec in a cluster lower-specced than yours.
>      >
>      >     Check that your JDBC connection has autocommit off, so Phoenix
>      >     can batch writes, and that the table has a reasonable
>      >     UPDATE_CACHE_FREQUENCY (more than 60000).
>      >
>      >
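[Editor's sketch] One way to set that property on an existing table; the table name MY_TABLE is hypothetical, and the value is the number of milliseconds the client waits between re-checks of the table metadata (so 60000 means at most one metadata RPC per minute instead of one per statement).

```sql
-- Assumed table name; value is milliseconds between metadata refreshes.
ALTER TABLE MY_TABLE SET UPDATE_CACHE_FREQUENCY = 60000;
```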
>      >         On Thu, 12 Jul 2018, 21:54 alchemist, <alchemistsrivastava@gmail.com> wrote:
>      >
>      >         Thanks a lot for your help.
>      >         Our test is inserting new rows individually. For our use
>     case,
>      >         we are
>      >         benchmarking that we could be able to get 10,000 new rows
>     in a
>      >         minute, using
>      >         a cluster of writers if needed.
>      >         When executing the inserts with Phoenix API (UPSERT) we have
>      >         been able to
>      >         get up to 6,000 new rows per minute.
>      >
>      >         We changed our test to perform the inserts individually using
>      >         the HBase API
>      >         (Put) rather than Phoenix API (UPSERT) and got an
>     improvement of
>      >         more than
>      >         10x. (up to 60,000 rows per minute).
>      >
>      >         What would explain this difference? I assume that in both
>     cases
>      >         HBase must
>      >         grab the locks individually in the same way.
>      >
>      >
>      >
>      >         --
>      >         Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
>      >
> 
