phoenix-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: count on large table
Date Fri, 10 Oct 2014 00:36:25 GMT
Hi Abe,


this is interesting.


How big are your rows (i.e. how much data is in the table? You can tell with du in HDFS.) And
how many columns do you have? Any column families?

How many regions are in this table? (you can tell that through the HBase HMaster UI page)

When you execute the query, are all HBase region servers busy? Do you see IO, or just high
CPU?


Client batching won't help with an aggregate (such as count) where not much data is transferred
back to the client.


Thanks.

-- Lars



________________________________
 From: Abe Weinograd <abe@flonet.com>
To: user <user@phoenix.apache.org> 
Sent: Wednesday, October 8, 2014 9:15 AM
Subject: Re: count on large table
 


Good point.  I have to figure out how to do that in a SQL Tool like Squirrel or workbench.

Is there any obvious thing I can do to help tune this?  I know that's a loaded question.
My client scanner batches are 1000 (I also tried 10000 with no luck).

Thanks,
Abe




On Tue, Oct 7, 2014 at 9:09 PM, sunfl@certusnet.com.cn <sunfl@certusnet.com.cn> wrote:

Hi, Abe
>Maybe setting the following property would help...
><property> 
>    <name>phoenix.query.timeoutMs</name> 
>    <value>3600000</value> 
></property>
>
>
>Thanks,
>Sun
>
>
>________________________________
>From: Abe Weinograd
>>Date: 2014-10-08 04:34
>>To: user
>>Subject: count on large table
>>I have a table with 1B rows.  I know this is very specific to my environment, but a simple SELECT COUNT(1) on the table never finishes.
>>
>>
>>We have a 10-node cluster with the RS heap size at 26GiB and skewed towards the block cache.  In the RS logs, I see a lot of these:
>>
>>
>>2014-10-07 16:27:04,942 WARN org.apache.hadoop.ipc.RpcServer: (responseTooSlow): {"processingtimems":22770,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.10.0.10:44791","starttimems":1412713602172,"queuetimems":0,"class":"HRegionServer","responsesize":8,"method":"Scan"}
>>
>>
>>
>>They stop eventually, but the query times out and the query tool reports: org.apache.phoenix.exception.PhoenixIOException: 187541ms passed since the last invocation, timeout is currently set to 60000
>>
>>
>>Any ideas of where I can start in order to figure this out?
>>
>>
>>Using Phoenix 4.1 on CDH 5.1 (HBase 0.98.1)
>>
>>
>>Thanks,
>>Abe
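
The responseTooSlow entries quoted in the original message carry a JSON payload after the "(responseTooSlow):" tag, so they can be summarized with a small script when checking which scans are slow and by how much. The following is only a minimal sketch, assuming the log lines have the same shape as the one quoted above; the function names and the 10000 ms threshold are hypothetical and not part of HBase or Phoenix.

```python
import json
import re

# Captures the JSON object that follows the "(responseTooSlow):" tag.
# Assumption: the payload is the last thing on the line, as in the quoted log entry.
SLOW_RE = re.compile(r"\(responseTooSlow\):\s*(\{.*\})\s*$")


def parse_slow_response(line):
    """Return the JSON payload of a responseTooSlow log line as a dict, or None."""
    match = SLOW_RE.search(line)
    if match is None:
        return None
    return json.loads(match.group(1))


def slow_scans(lines, threshold_ms=10000):
    """Yield (client, processingtimems) for Scan calls at or above threshold_ms."""
    for line in lines:
        payload = parse_slow_response(line)
        if payload is None or payload.get("method") != "Scan":
            continue
        if payload["processingtimems"] >= threshold_ms:
            yield payload["client"], payload["processingtimems"]


# The log line from the original message, used here as sample input.
sample = (
    '2014-10-07 16:27:04,942 WARN org.apache.hadoop.ipc.RpcServer: '
    '(responseTooSlow): {"processingtimems":22770,'
    '"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)",'
    '"client":"10.10.0.10:44791","starttimems":1412713602172,"queuetimems":0,'
    '"class":"HRegionServer","responsesize":8,"method":"Scan"}'
)

for client, elapsed_ms in slow_scans([sample]):
    print(client, elapsed_ms)
```

Running this over each region server's log would show whether the slow scans are spread evenly across servers or concentrated on a few, which bears on the question of whether all region servers are busy during the query.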