phoenix-user mailing list archives

From Vijay Kukkala <acces...@gmail.com>
Subject Re: Scan performance using JDBC result impacted by limit
Date Thu, 22 Jan 2015 14:20:04 GMT
Samarth,

Thank you for your response.

To clarify, I am not calling PhoenixResultSet.toString().

While calling resultSet.next() to iterate through the rows retrieved, I
call resultSet.getString("pc"), where "pc" is the column name and the
column is of type VARCHAR.

The resultSet.getString("pc") call is where retrieving the data takes a
lot of time compared to resultSet.getBytes("pc") (at least 3-5 times more).
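
A minimal sketch of the getBytes()-based workaround, for illustration
only. It assumes the VARCHAR column's raw bytes are plain UTF-8 text,
which should be verified for the Phoenix version and column in use. The
class name VarcharDecoder and its methods are hypothetical helpers, not
part of Phoenix or JDBC.

```java
import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper: reads a VARCHAR column through getBytes() and
// decodes it on the client, instead of calling getString().
final class VarcharDecoder {
    private VarcharDecoder() {}

    // Decodes raw column bytes as UTF-8 (assumption: the column stores
    // UTF-8 text). A null array is treated as SQL NULL.
    static String decode(byte[] raw) {
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    // Reads the named column via the standard JDBC getBytes() call.
    static String readVarchar(ResultSet rs, String column) throws SQLException {
        return decode(rs.getBytes(column));
    }
}
```

Inside the resultSet.next() loop, VarcharDecoder.readVarchar(resultSet, "pc")
would then stand in for resultSet.getString("pc").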

I hope I was able to make clear what my issue is.

thanks,
Vijay

On Wed Jan 21 2015 at 1:14:23 PM Samarth Jain <samarth.jain@gmail.com>
wrote:

> Vijay,
>
> Is there a reason why you are doing PhoenixResultSet.string()? Is it for
> logging purposes?
>
> Regarding your question about the increase in object creation time, that
> doesn't seem to be Phoenix related. Are you seeing an increase in time
> for resultset.next() or are you seeing an increase in time for
> resultset.getObject()?
>
> -Samarth
> On Wednesday, January 21, 2015, Vijay Kukkala <vijay.kukkala@gmail.com>
> wrote:
>
>> Just to add more info on the issue we were facing and the workaround we
>> applied: PhoenixResultSet.getString() takes much more time than
>> PhoenixResultSet.getBytes().
>>
>> The time spent on formatting and other logic in getString() increases
>> with the number of items to be processed.
>>
>> Somebody might want to take a look at this.
>>
>>
>>
>> On Fri Jan 16 2015 at 3:03:18 PM Vijay Kukkala <vijay.kukkala@gmail.com>
>> wrote:
>>
>>> I am using plain JDBC code to execute a query against a Phoenix cluster
>>> running 4.0.0 with HBase 0.98.x.
>>>
>>> My query, against a table with a single column family, is as follows:
>>> select cid,ts,id,pc,un,ug,ui,s,inf,sm,mst,se  from wcs_re where cid = ?
>>> and ts >= ? and ts <= ? limit ?
>>>
>>> <cid, ts, id> is the primary key of the table.
>>>
>>> On the client side, the retrieved values are iterated over and each row
>>> is converted to a domain object. Since the query was taking long, I
>>> started measuring the time taken to do the conversion for each object.
>>>
>>> The issue I see is that, as I increase the limit clause value in the
>>> query from 100 to 1000, 2000 and so on, my domain conversion time
>>> increases gradually from <1 ms to 9 ms to 17 ms for each record
>>> retrieved from the result set. Ideally, I would have thought that the
>>> per-record conversion time would be constant.
>>>
>>> Can somebody help shed some light on this?
>>>
>>> thanks
>>> Vijay
>>>
>>
