phoenix-user mailing list archives

From ALEX K <alex.ka...@gmail.com>
Subject Re: Phoenix JDBC in web-app, what is the right pattern?
Date Mon, 07 Sep 2015 04:43:00 GMT
Serega,

I haven't seen any issues so far with this approach (keeping connections
open in a thread-local, one connection per thread).

We send a stream of Kafka messages to HBase in the following way:
- each "saver" thread initializes two connections - one for upserts and
one for ALTER statements (for multitenancy, each thread keeps a local
Map<String, Connection> with one connection per tenant)
- get a message from the inbound Kafka queue, deserialize it (Avro), and
save it to HBase with a java.sql.PreparedStatement and a Phoenix upsert or
create/alter*
- connection.commit() every 1000 or so upserts, or per "transaction", with
connection autocommit set to false (this brings latency per saved row down
to single-digit msec)
- in case of exceptions, retry the save; if it still fails, check/refresh
the connections

*We pause upserts in all threads (Thread.yield on detecting the lock) if
there is an incoming message with a metadata change (CREATE/ALTER table);
otherwise Phoenix would throw an exception.
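The per-thread, per-tenant bookkeeping described above could be sketched
roughly as follows. The class and factory names (TenantConnections,
ConnectionFactory) are made up for illustration, and error handling is
simplified; treat it as a sketch of the pattern, not the actual saver code:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: one Map<tenant, Connection> per thread, never
// shared across threads, with a refresh hook for failed connections.
public class TenantConnections {

    /** Factory abstraction so callers decide how a connection is opened. */
    public interface ConnectionFactory {
        Connection open(String tenant) throws SQLException;
    }

    // Each thread gets its own map; no cross-thread sharing of connections.
    private final ThreadLocal<Map<String, Connection>> perThread =
            ThreadLocal.withInitial(HashMap::new);

    /** Returns this thread's connection for the tenant, opening it on first use. */
    public Connection get(String tenant, ConnectionFactory factory) throws SQLException {
        Map<String, Connection> map = perThread.get();
        Connection conn = map.get(tenant);
        if (conn == null) {
            conn = factory.open(tenant);
            map.put(tenant, conn);
        }
        return conn;
    }

    /** On persistent save failures: drop the cached connection so it is reopened. */
    public void refresh(String tenant) {
        Connection stale = perThread.get().remove(tenant);
        if (stale != null) {
            try { stale.close(); } catch (SQLException ignored) { }
        }
    }
}
```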

Alex

On Sun, Sep 6, 2015 at 12:36 PM, Serega Sheypak <serega.sheypak@gmail.com>
wrote:

> Hi, the approach above doesn't fit a web app: multiple simultaneous
> upserts come from different threads.
> So the only option is to put the connection into a thread-local and use
> one connection per thread.
>
> try (Connection conn = DriverManager.getConnection(url)) {
>     conn.setAutoCommit(false);
>     int batchSize = 0;
>     int commitSize = 1000; // number of rows to commit per batch; tune to your needs
>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>         while (there are records to upsert) {
>             // bind the row's values: stmt.setString(1, ...), etc.
>             stmt.executeUpdate();
>             batchSize++;
>             if (batchSize % commitSize == 0) {
>                 conn.commit();
>             }
>         }
>         conn.commit(); // commit the last batch of records
>     }
> }
>
>
> 2015-09-03 22:28 GMT+02:00 Jaime Solano <jdjsolano@gmail.com>:
>
>> Thanks for the explanation, Samarth!
>>
>> On Thu, Sep 3, 2015 at 4:11 PM, Samarth Jain <samarth@apache.org> wrote:
>>
>>> Hi Jaime,
>>>
>>> Phoenix is an embedded JDBC driver, i.e. the connections are not meant
>>> to be pooled. Creating Phoenix connections is a lightweight operation,
>>> because all Phoenix connections to a cluster from the Phoenix JDBC
>>> driver use the same underlying HConnection.
>>>
>>> If you end up pooling Phoenix connections, you will likely run into
>>> unexpected issues. It is likely that a request will leave a Phoenix
>>> connection in a state that is not kosher for the next request to use.
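The open-per-request pattern Samarth describes might look like the sketch
below. UserDao, ConnectionSupplier, and the query are hypothetical names;
in production the supplier would simply call DriverManager.getConnection
with your Phoenix JDBC URL (e.g. "jdbc:phoenix:zk-host:2181"):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical DAO: a fresh connection per request, closed by
// try-with-resources instead of being returned to a pool.
public class UserDao {

    /** How to open a connection; in production something like
        () -> DriverManager.getConnection("jdbc:phoenix:zk-host:2181"). */
    public interface ConnectionSupplier {
        Connection open() throws SQLException;
    }

    private final ConnectionSupplier connections;

    public UserDao(ConnectionSupplier connections) {
        this.connections = connections;
    }

    public String findName(long id) throws SQLException {
        String sql = "SELECT name FROM users WHERE id = ?"; // placeholder query
        // A fresh connection per request, closed on exit -- cheap in
        // Phoenix because the underlying HConnection is shared.
        try (Connection conn = connections.open();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, id);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```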
>>>
>>> On Thu, Sep 3, 2015 at 12:58 PM, Jaime Solano <jdjsolano@gmail.com>
>>> wrote:
>>>
>>>> Hi Samarth,
>>>> In our project we were thinking about using the Tomcat JDBC Connection
>>>> Pool to handle Phoenix connections. You're saying this type of approach
>>>> should be avoided? What other approach should be followed if, in the
>>>> scenario Serega described, you experience a heavy load of users all
>>>> trying to upsert at the same time (and therefore a lot of connections)?
>>>> Also, can you expand a little bit more on the implications of having a
>>>> pooling mechanism for Phoenix connections?
>>>> Thanks in advance!
>>>> -Jaime
>>>>
>>>> On Thu, Sep 3, 2015 at 3:35 PM, Samarth Jain <samarth.jain@gmail.com>
>>>> wrote:
>>>>
>>>>> Yes. PhoenixConnection implements java.sql.Connection.
>>>>>
>>>>> On Thu, Sep 3, 2015 at 12:34 PM, Serega Sheypak <
>>>>> serega.sheypak@gmail.com> wrote:
>>>>>
>>>>>> > Phoenix doesn't cache connections. You shouldn't pool them and
>>>>>> > you shouldn't share them with multiple threads.
>>>>>> We are talking about java.sql.Connection, right?
>>>>>>
>>>>>> 2015-09-03 21:26 GMT+02:00 Samarth Jain <samarth@apache.org>:
>>>>>>
>>>>>>> Your pattern is correct.
>>>>>>>
>>>>>>> Phoenix doesn't cache connections. You shouldn't pool them and you
>>>>>>> shouldn't share them with multiple threads.
>>>>>>>
>>>>>>> For batching upserts, you could do something like this:
>>>>>>>
>>>>>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>>>>>     conn.setAutoCommit(false);
>>>>>>>     int batchSize = 0;
>>>>>>>     int commitSize = 1000; // number of rows to commit per batch; tune to your needs
>>>>>>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>>>>>         while (there are records to upsert) {
>>>>>>>             // bind the row's values: stmt.setString(1, ...), etc.
>>>>>>>             stmt.executeUpdate();
>>>>>>>             batchSize++;
>>>>>>>             if (batchSize % commitSize == 0) {
>>>>>>>                 conn.commit();
>>>>>>>             }
>>>>>>>         }
>>>>>>>         conn.commit(); // commit the last batch of records
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>> You don't want commitSize to be too large, since the Phoenix client
>>>>>>> keeps the uncommitted rows in memory till they are sent over to HBase.
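A compilable version of the batching loop quoted above might look like the
following. The table and column names are placeholders, and the method is
written against a passed-in Connection (returning a commit count) purely so
the batching behavior is easy to see and verify; it is a sketch, not the
original code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Sketch of the batched-upsert loop; each row here is a (id, value) pair.
public class BatchUpserter {

    /** Upserts all rows on the given connection, committing every
        commitSize rows. Returns the number of commits issued. */
    public static int upsertAll(Connection conn, List<long[]> rows, int commitSize)
            throws SQLException {
        String upsert = "UPSERT INTO metrics (id, val) VALUES (?, ?)"; // placeholder table
        conn.setAutoCommit(false); // we commit in batches ourselves
        int batchSize = 0;
        int commits = 0;
        try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
            for (long[] row : rows) {
                stmt.setLong(1, row[0]);
                stmt.setLong(2, row[1]);
                stmt.executeUpdate();
                batchSize++;
                if (batchSize % commitSize == 0) { // a full batch is buffered
                    conn.commit();
                    commits++;
                }
            }
            conn.commit(); // commit the final partial batch
            commits++;
        }
        return commits;
    }
}
```

Keeping commitSize modest bounds the client-side memory used by the
uncommitted rows, per the note above.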
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 3, 2015 at 12:19 PM, Serega Sheypak <
>>>>>>> serega.sheypak@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, I'm using Phoenix in a Java web application. The app does
>>>>>>>> upsert or select by primary key.
>>>>>>>> What is the right pattern to do it?
>>>>>>>> - I create new connection for each request
>>>>>>>> - prepare and execute statement
>>>>>>>> - close stmt
>>>>>>>> - close connection
>>>>>>>>
>>>>>>>> Does Phoenix cache connections internally? What is the right way
>>>>>>>> to batch upserts in this case?
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
