phoenix-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: HBase Timeout on queries
Date Tue, 06 Feb 2018 09:06:49 GMT
Hi Pedro,
I was running the COUNT just as a first dumb query to test if everything was
ok... indeed I had to increase 4 timeouts in order to answer that query
without errors.
By the way, I think that the row count is something very useful to know about a
table and, IMHO, it should always be available as table metadata.
I don't know why HBase doesn't care that much about that info...

Best,
Flavio

On Mon, Feb 5, 2018 at 7:10 PM, Pedro Boado <pedro.boado@gmail.com> wrote:

> Flavio, I get the same behaviour: a count(*) over 180M records needs a couple
> of minutes to complete for a table with 10 regions and 4 RS serving it.
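
For a rough sense of scale, the two timings reported in this thread can be turned into scan throughput. This is only a back-of-the-envelope sketch: "a couple of minutes" is assumed to mean 120 seconds, and the 6-minute figure is the one reported for the 170M-row table elsewhere in the thread.

```python
def rows_per_second(rows, seconds):
    """Average scan throughput for a full-table count."""
    return rows / seconds

# Assumption: "a couple of minutes" is taken as 120 seconds.
count_180m = rows_per_second(180_000_000, 2 * 60)

# 170M rows counted in about 6 minutes, as reported in this thread.
count_170m = rows_per_second(170_000_000, 6 * 60)

# Throughput per region for the 10-region table.
per_region = count_180m / 10

print(f"180M rows / 2 min: {count_180m:,.0f} rows/s ({per_region:,.0f} per region)")
print(f"170M rows / 6 min: {count_170m:,.0f} rows/s")
```

Either way, a full count is bounded by how fast the region servers can scan every row, which is why it takes minutes rather than seconds.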
>
> Why are you evaluating robustness in terms of full scans? As Anil said I
> wouldn't expect a NoSQL database to run quick counts on hundreds of
> millions or even billions of records.
>
> In terms of usage, we have a production Phoenix cluster with 12 RS serving
> a table with ~100 billion records (6TB). Queries always scan by the first
> column of our primary key, meaning no more than a few thousand records are
> pulled, with response times well under a second.
>
>
> On 1 Feb 2018 16:38, "James Taylor" <jamestaylor@apache.org> wrote:
>
> I don’t think the HBase row_counter job is going to be faster than a
> count(*) query. Both require a full table scan, so neither will be
> particularly fast.
>
> A couple of alternatives if you’re ok with an approximate count: 1) enable
> stats collection (you can leave off its use for parallelizing queries) and
> then do a SUM over the size column for the table using the stats table
> directly, or 2) do a count(*) using the TABLESAMPLE clause (again enabling
> stats as described above) to prevent a full scan.
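
The second alternative above could be sketched roughly as follows. This is an illustration, not a tested recipe: it assumes Phoenix accepts a sampling percentage in the form `TABLESAMPLE (n)`, the table name `MY_TABLE` and the sampled result are hypothetical, and scaling the sampled count by `100 / n` is only a rough estimate because sampling depends on the collected stats.

```python
def sampled_count_sql(table, sample_percent):
    """Build a COUNT(*) query that reads only a sample of the table."""
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in (0, 100]")
    return f"SELECT COUNT(*) FROM {table} TABLESAMPLE ({sample_percent})"

def estimate_total(sampled_count, sample_percent):
    """Scale a sampled count up to an approximate full-table count."""
    return round(sampled_count * 100 / sample_percent)

# Hypothetical table name and sampled result, for illustration only.
query = sampled_count_sql("MY_TABLE", 1)
approx_rows = estimate_total(1_700_000, 1)

print(query)
print(approx_rows)
```

The query string would be run through whatever Phoenix client is already in use; only the arithmetic on the result happens here.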
>
> On Thu, Feb 1, 2018 at 8:11 AM Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>
>> Hi Anil,
>> Obviously I'm not using HBase just for the count query. Most of the time
>> I do INSERT and selective queries; I was just trying to figure out if my
>> HBase + Phoenix installation is robust enough to deal with a huge amount of
>> data.
>>
>> On Thu, Feb 1, 2018 at 5:07 PM, anil gupta <anilgupta84@gmail.com> wrote:
>>
>>> Hey Flavio,
>>>
>>> IMHO, if most of your app is just doing full table scans then I am not
>>> really sure HBase (or any other NoSQL) will be a good fit for your
>>> solution (building an OLAP system?). If you have point lookups and short
>>> range scans then HBase/Phoenix will work well.
>>> Also, if you want to do select count(*), the HBase row_counter job will be
>>> much faster than Phoenix queries.
>>>
>>> Thanks,
>>> Anil Gupta
>>>
>>> On Thu, Feb 1, 2018 at 7:35 AM, Flavio Pompermaier <pompermaier@okkam.it>
>>> wrote:
>>>
>>>> I was able to make it work by changing the following params (on both
>>>> server and client side, and restarting HBase) and now the query answers in
>>>> about 6 minutes:
>>>>
>>>> hbase.rpc.timeout (to 600000)
>>>> phoenix.query.timeoutMs (to 600000)
>>>> hbase.client.scanner.timeout.period (from 1 m to 10m)
>>>> hbase.regionserver.lease.period (from 1 m to 10m)
>>>>
>>>> However, I'd like to know if that performance could be easily improved
>>>> or not. Any ideas?
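
The four settings listed above can be written out as `hbase-site.xml` properties. The sketch below only generates the XML; the 600000 ms (10 minute) values are the ones from this message, while the helper name `render_hbase_site` is made up for illustration. Where the file lives and how services are restarted depends on the distribution.

```python
import xml.etree.ElementTree as ET

TEN_MINUTES_MS = 10 * 60 * 1000  # 600000, the value used in this thread

# The four properties named in the message, all raised to 10 minutes.
TIMEOUTS = {
    "hbase.rpc.timeout": TEN_MINUTES_MS,
    "phoenix.query.timeoutMs": TEN_MINUTES_MS,
    "hbase.client.scanner.timeout.period": TEN_MINUTES_MS,
    "hbase.regionserver.lease.period": TEN_MINUTES_MS,
}

def render_hbase_site(props):
    """Return an hbase-site.xml style document for the given properties."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(render_hbase_site(TIMEOUTS))
```

As noted in the message, the same values were applied on both the server and the client side.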
>>>>
>>>> On Thu, Feb 1, 2018 at 4:30 PM, Vaghawan Ojha <vaghawan781@gmail.com>
>>>> wrote:
>>>>
>>>>> I have the same problem: even after I increased hbase.rpc.timeout
>>>>> the result is the same. The difference is that I use 4.12.
>>>>>
>>>>>
>>>>> On Thu, Feb 1, 2018 at 8:23 PM, Flavio Pompermaier <
>>>>> pompermaier@okkam.it> wrote:
>>>>>
>>>>>> Hi to all,
>>>>>> I'm trying to use the brand-new Phoenix 4.13.2-cdh5.11.2 over HBase
>>>>>> and everything was fine as long as the data was quite small (a few
>>>>>> million rows). Now that I have inserted 170 M rows in my table I cannot
>>>>>> get the row count anymore (using SELECT COUNT) because of
>>>>>> org.apache.hbase.ipc.CallTimeoutException (operationTimeout 60000 expired).
>>>>>> How can I fix this problem? I could increase the hbase.rpc.timeout
>>>>>> parameter but I suspect I could first improve the HBase performance
>>>>>> a little bit. The problem is that I don't know how.
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Flavio
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Anil Gupta
>>>
>>
>
