phoenix-user mailing list archives

From Samarth Jain <sama...@apache.org>
Subject Re: How to do true batch updates in Phoenix
Date Fri, 21 Aug 2015 17:51:50 GMT
Yes.

On Fri, Aug 21, 2015 at 10:50 AM, jeremy p <athomewithagroovebox@gmail.com>
wrote:

> Thank you!  Samarth's solution looks like it'll work for me.  One question:
> you mentioned that the Phoenix client keeps uncommitted rows in memory
> until they're sent over to HBase.  When we call conn.commit() does that
> send the rows over to HBase immediately?
>
> --Jeremy
>
> On Wed, Aug 19, 2015 at 7:19 PM, ALEX K <alex.kamil@gmail.com> wrote:
>
>> I'm using the same solution Samarth suggested (commit batching); it brings
>> the latency per single-row upsert down from 50ms to 5ms (averaged over the
>> batch).
>>
>> On Wed, Aug 19, 2015 at 7:11 PM, Samarth Jain <samarth.jain@gmail.com>
>> wrote:
>>
>>> You can do this with Phoenix by doing something like this:
>>>
>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>     conn.setAutoCommit(false);
>>>     int batchSize = 0;
>>>     int commitSize = 1000; // number of rows to commit per batch; tune this to your needs
>>>     while (/* there are records to upsert */) {
>>>         stmt.executeUpdate(); // stmt is a PreparedStatement for your UPSERT
>>>         batchSize++;
>>>         if (batchSize % commitSize == 0) {
>>>             conn.commit();
>>>         }
>>>     }
>>>     conn.commit(); // commit the last batch of records
>>> }
>>>
>>> You don't want commitSize to be too large, since the Phoenix client keeps
>>> the uncommitted rows in memory until they are sent over to HBase.
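>>>
>>> For completeness, a fuller sketch of the same pattern as runnable Java
>>> (the table name, columns, and record iteration below are placeholder
>>> assumptions, not something from this thread):
>>>
>>> // uses java.sql.Connection, DriverManager, PreparedStatement
>>> String upsert = "UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)";
>>> try (Connection conn = DriverManager.getConnection(url);
>>>      PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>     conn.setAutoCommit(false);
>>>     int commitSize = 1000; // rows buffered client-side before each commit
>>>     int batchSize = 0;
>>>     for (Record r : records) {       // 'records' is a placeholder data source
>>>         stmt.setLong(1, r.getId());
>>>         stmt.setString(2, r.getValue());
>>>         stmt.executeUpdate();        // buffered by the Phoenix client, not yet sent
>>>         if (++batchSize % commitSize == 0) {
>>>             conn.commit();           // sends this batch of mutations to HBase
>>>         }
>>>     }
>>>     conn.commit();                   // flush the final partial batch
>>> }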
>>>
>>>
>>>
>>> On Wed, Aug 19, 2015 at 3:05 PM, Serega Sheypak <
>>> serega.sheypak@gmail.com> wrote:
>>>
>>>> I would suggest using
>>>>
>>>> https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html
>>>> instead of a list of puts, and share the BufferedMutator across threads (it's
>>>> thread-safe). I reduced my response time from 30-40 ms to 4 ms using the
>>>> BufferedMutator. It also sends mutations asynchronously. :)
>>>>
>>>> I ran into the same problem: I couldn't force Phoenix to buffer upserts on the
>>>> client side and then send them to HBase in small batches.
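>>>>
>>>> A minimal sketch of the BufferedMutator approach, assuming the plain HBase
>>>> 1.x client API (the table name, row key, and column below are placeholders):
>>>>
>>>> // uses org.apache.hadoop.hbase.client.{ConnectionFactory, Connection, BufferedMutator, Put}
>>>> Configuration conf = HBaseConfiguration.create();
>>>> try (Connection conn = ConnectionFactory.createConnection(conf);
>>>>      BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("my_table"))) {
>>>>     Put put = new Put(Bytes.toBytes("row-key"));
>>>>     put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
>>>>     mutator.mutate(put);   // buffered client-side; flushed to HBase in batches
>>>>     mutator.flush();       // push any remaining buffered mutations
>>>> }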
>>>>
>>>> 2015-08-19 19:40 GMT+02:00 jeremy p <athomewithagroovebox@gmail.com>:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I need to do true batch updates to a Phoenix table.  By this, I mean
>>>>> sending a bunch of updates to HBase as part of a single request.  The
>>>>> HBase API offers this behavior with the Table.put(List<Put> puts) method.
>>>>> I noticed PhoenixStatement exposes an executeBatch() method; however, it
>>>>> just executes the batched statements one-by-one.  This will not deliver
>>>>> the performance that the HBase API offers through its batch put method.
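>>>>>
>>>>> For reference, the HBase-side batching I mean looks roughly like this (the
>>>>> table handle, row keys, and column are placeholders):
>>>>>
>>>>> // plain HBase client batch put: all puts travel in one request
>>>>> List<Put> puts = new ArrayList<>();
>>>>> for (byte[] rowKey : rowKeys) {
>>>>>     Put put = new Put(rowKey);
>>>>>     put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
>>>>>     puts.add(put);
>>>>> }
>>>>> table.put(puts);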
>>>>>
>>>>> What is the best way for me to do true batch updates to a Phoenix
>>>>> table?  I need to do this programmatically, so I cannot use the command
>>>>> line bulk insert utility.
>>>>>
>>>>> --Jeremy
>>>>>
>>>>
>>>>
>>>
>>
>
