phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Upsert is EXTREMELY slow
Date Fri, 13 Jul 2018 16:07:35 GMT
Phoenix won't be slower at updating secondary indexes than hand-written
code for a specific use case would be. Both have to write to a second
table to keep it in sync.

On Fri, Jul 13, 2018 at 8:39 AM Josh Elser <elserj@apache.org> wrote:

> Also, they're relying on Phoenix to do secondary index updates for them.
>
> Obviously, you can do this faster than Phoenix can if you know the exact
> use-case.
>
> On 7/12/18 6:31 PM, Pedro Boado wrote:
> > A tip for performance is to reuse the same PreparedStatement: just call
> > clearParameters(), set the values and call executeUpdate() over and over
> > again. Don't close the statement or connection after each upsert. Also, I
> > haven't seen any noticeable benefit from using JDBC batches, as Phoenix
> > controls batching by when commit() is called.
> >
> > Take care not to call commit() after every executeUpdate() (that's a
> > real performance killer). Commit in batches of roughly 1k upserts.
> >
> > Also, that attempt at asynchronous code is probably another performance
> > killer. Are you creating a new Runnable per database write and opening
> > and closing DB connections per write? Just spawn a few threads (5 to 10;
> > if the client CPU is not maxed out, keep increasing the number) and send
> > upserts in a for loop, reusing the PreparedStatement and connections.
> >
> > With a cluster that size I would expect to see tens of thousands of
> > writes per second.
> >
> > Finally, have you checked that all region servers (RS) receive the same
> > traffic?
> >
> > On Thu, 12 Jul 2018, 23:10 Pedro Boado, <pedro.boado@gmail.com
> > <mailto:pedro.boado@gmail.com>> wrote:
> >
> >     I believe it's related to your client code. In our use case we
> >     easily do 15k writes/sec on a cluster with lower specs than yours.
> >
> >     Check that your JDBC connection has autocommit off so Phoenix can
> >     batch writes, and that the table has a reasonable
> >     UPDATE_CACHE_FREQUENCY (more than 60000).
> >
> >
> >     On Thu, 12 Jul 2018, 21:54 alchemist, <alchemistsrivastava@gmail.com
> >     <mailto:alchemistsrivastava@gmail.com>> wrote:
> >
> >         Thanks a lot for your help.
> >         Our test inserts new rows individually. For our use case, we
> >         are benchmarking whether we can reach 10,000 new rows per
> >         minute, using a cluster of writers if needed.
> >         When executing the inserts with the Phoenix API (UPSERT), we
> >         have been able to get up to 6,000 new rows per minute.
> >
> >         We changed our test to perform the inserts individually using
> >         the HBase API (Put) rather than the Phoenix API (UPSERT) and got
> >         an improvement of more than 10x (up to 60,000 rows per minute).
> >
> >         What would explain this difference? I assume that in both cases
> >         HBase must grab the locks individually in the same way.
> >
> >
> >
> >         --
> >         Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
> >
>
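
The client-side tips quoted above (autocommit off, one reused PreparedStatement, a commit roughly every 1k upserts) can be sketched in plain JDBC as follows. This is only an illustration, not code from the thread: the table EVENTS(ID, PAYLOAD), the class name BatchedUpsert and the helper upsertAll are hypothetical names chosen for the example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchedUpsert {

    /**
     * Upserts every row over a single connection and a single reused
     * PreparedStatement, committing once per commitEvery rows.
     * Returns the number of commits issued.
     */
    static int upsertAll(Connection conn, List<String[]> rows, int commitEvery)
            throws SQLException {
        if (commitEvery <= 0) {
            throw new IllegalArgumentException("commitEvery must be positive");
        }
        // With autocommit off, writes are only sent when commit() is called.
        conn.setAutoCommit(false);
        int commits = 0;
        int pending = 0;
        // Hypothetical table and columns, used only for illustration.
        String sql = "UPSERT INTO EVENTS (ID, PAYLOAD) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : rows) {
                ps.clearParameters();      // reuse the statement, do not re-prepare
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.executeUpdate();
                pending++;
                if (pending == commitEvery) {
                    conn.commit();         // one commit per batch of rows
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                conn.commit();             // flush the final partial batch
                commits++;
            }
        }
        return commits;
    }
}
```

In a real client the Connection would come from DriverManager.getConnection with a Phoenix JDBC URL, and each of the few writer threads suggested above would keep its own connection and statement and call a loop like this one, rather than opening a connection per write.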
