phoenix-user mailing list archives

From Pedro Boado <>
Subject Re: HBase Timeout on queries
Date Mon, 05 Feb 2018 18:10:20 GMT
Flavio, I get the same behaviour: a count(*) over 180M records needs a couple
of minutes to complete for a table with 10 regions and 4 RS serving it.

Why are you evaluating robustness in terms of full scans? As Anil said, I
wouldn't expect a NoSQL database to run quick counts over hundreds of
millions or even billions of records.

In terms of usage, we have a production Phoenix cluster with 12 RS serving
a table with ~100 billion records (6TB). Queries always scan by the first
column of our primary key, meaning no more than a few thousand records are
pulled, with response times well under a second.

On 1 Feb 2018 16:38, "James Taylor" <> wrote:

I don’t think the HBase row_counter job is going to be faster than a
count(*) query. Both require a full table scan, so neither will be
particularly fast.

A couple of alternatives if you’re ok with an approximate count: 1) enable
stats collection (you can leave its use for parallelizing queries turned
off) and then do a SUM over the size column for the table using the stats
table directly, or 2) do a count(*) using the TABLESAMPLE clause (again
enabling stats as described above) to prevent a full scan.
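
To make the second alternative concrete: a count taken over a sample has to be scaled back up by the sampling rate. The sketch below is plain Python with no Phoenix dependency. The table name MY_TABLE, the 10 percent rate, the sample result, and both helper names are hypothetical, and the generated statement assumes Phoenix's TABLESAMPLE clause takes a percentage. The scaled figure is only an estimate; its accuracy depends on how evenly the collected stats divide the table.

```python
def sampled_count_sql(table: str, percent: float) -> str:
    """Build a COUNT(*) statement that reads roughly `percent` % of the table."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Assumption: TABLESAMPLE takes a percentage (10 means 10 %), not a fraction.
    return f"SELECT COUNT(*) FROM {table} TABLESAMPLE ({percent:g})"


def estimate_total(sampled_count: int, percent: float) -> int:
    """Scale a count obtained from a `percent` % sample up to the full table."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return round(sampled_count * 100 / percent)


if __name__ == "__main__":
    print(sampled_count_sql("MY_TABLE", 10))
    # Hypothetical result: the sampled query returned 17,000,000 rows.
    print(estimate_total(17_000_000, 10))
```

A smaller percentage reads less data but makes the scaled estimate noisier, so the rate is a trade-off between speed and accuracy.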

On Thu, Feb 1, 2018 at 8:11 AM Flavio Pompermaier <> wrote:

> Hi Anil,
> Obviously I'm not using HBase just for the count query. Most of the time I
> do INSERTs and selective queries; I was just trying to figure out whether
> my HBase + Phoenix installation is robust enough to deal with a huge amount
> of data.
> On Thu, Feb 1, 2018 at 5:07 PM, anil gupta <> wrote:
>> Hey Flavio,
>> IMHO, if most of your app is just doing full table scans, then I am not
>> really sure HBase (or any other NoSQL store) will be a good fit for your
>> solution (are you building an OLAP system?). If you have point lookups and
>> short range scans, then HBase/Phoenix will work well.
>> Also, if you want to do a select count(*), the HBase row_counter job will
>> be much faster than Phoenix queries.
>> Thanks,
>> Anil Gupta
>> On Thu, Feb 1, 2018 at 7:35 AM, Flavio Pompermaier <>
>> wrote:
>>> I was able to make it work by changing the following params (on both the
>>> server and client side, then restarting HBase), and now the query answers
>>> in about 6 minutes:
>>> hbase.rpc.timeout (to 600000)
>>> phoenix.query.timeoutMs (to 600000)
>>> hbase.client.scanner.timeout.period (from 1 m to 10m)
>>> (from 1 m to 10m)
>>> However, I'd like to know whether that performance could be easily
>>> improved or not. Any ideas?
>>> On Thu, Feb 1, 2018 at 4:30 PM, Vaghawan Ojha <>
>>> wrote:
>>>> I have the same problem: even after I increased hbase.rpc.timeout, the
>>>> result is the same. The difference is that I use 4.12.
>>>> On Thu, Feb 1, 2018 at 8:23 PM, Flavio Pompermaier <>
>>>> wrote:
>>>>> Hi to all,
>>>>> I'm trying to use the brand-new Phoenix 4.13.2-cdh5.11.2 over HBase,
>>>>> and everything was fine as long as the data was quite small (a few
>>>>> million rows). After I inserted 170M rows into my table, I can no longer
>>>>> get the row count (using SELECT COUNT) because of
>>>>> org.apache.hadoop.hbase.ipc.CallTimeoutException
>>>>> (operationTimeout 60000 expired).
>>>>> How can I fix this problem? I could increase the hbase.rpc.timeout
>>>>> parameter, but I suspect I could improve the HBase performance a little
>>>>> first. The problem is that I don't know how.
>>>>> Thanks in advance,
>>>>> Flavio
>> --
>> Thanks & Regards,
>> Anil Gupta
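
For reference, the three timeout settings that Flavio names in the thread can be written as hbase-site.xml property entries. This is a minimal Python sketch that renders them, assuming the usual Hadoop-style property/name/value layout and that all three are set to 600000 ms (10 minutes), as the thread suggests. The helper name render_properties is hypothetical, and where the file lives or which services need restarting depends on the installation.

```python
from xml.sax.saxutils import escape

# Timeouts named in the thread, all raised to 10 minutes (in milliseconds).
TIMEOUTS_MS = {
    "hbase.rpc.timeout": 600_000,
    "phoenix.query.timeoutMs": 600_000,
    "hbase.client.scanner.timeout.period": 600_000,
}


def render_properties(settings: dict) -> str:
    """Render settings as <property> entries inside a <configuration> root."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_properties(TIMEOUTS_MS))
```

Raising these values only gives the full scan more time to finish; it does not make the scan itself any faster.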
