phoenix-user mailing list archives

From Salou Guillaume <g.sa...@gmail.com>
Subject Re: huge query result miss some fields
Date Thu, 24 Nov 2016 22:48:30 GMT
Hi,

I have found a way to decrease the number of lines with missing
fields: I set the parameter hbase.client.scanner.max.result.size
<https://issues.apache.org/jira/browse/PHOENIX-3118>

Each time I increase hbase.client.scanner.max.result.size, the number of
failed lines decreases.
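For reference, here is roughly how such a client-side setting goes into hbase-site.xml on the Phoenix/Pig client classpath (a sketch; the 6MB value is the one I mention below):

```xml
<!-- hbase-site.xml on the Phoenix client classpath (sketch) -->
<property>
  <!-- max cumulative size, in bytes, of cells returned per scanner RPC -->
  <name>hbase.client.scanner.max.result.size</name>
  <value>6291456</value> <!-- 6 MB -->
</property>
```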

But if I set a high value like 10MB, I get a Java heap space error:
java.lang.RuntimeException:
org.apache.phoenix.exception.PhoenixIOException: Failed after retry of
OutOfOrderScannerNextException: was there a rpc timeout?
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException):
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected
nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id:
796 number_of_rows: 2147483647 close_scanner: false next_call_seq: 0
client_handles_partials: true client_handles_heartbeats: true
track_scan_metrics: false renew: false
Error: Java heap space

I am currently blocked if I want to use Pig, MR, or Flink without a
LIMIT clause.

Any idea?

My parameters: Phoenix 4.8.1 / HBase 1.2 / Pig 0.16,
hbase.client.scanner.max.result.size
<https://issues.apache.org/jira/browse/PHOENIX-3118> = 6MB; 125,422,608
lines dumped, 5,672 of which have missing data.

Regards,

Guillaume

2016-11-24 16:01 GMT+01:00 Salou Guillaume <g.salou@gmail.com>:

> MR jobs work with a limit!
>
> Thanks Jinzhuan
>
> I hope migrating to 4.8.2 will solve this issue, because I prefer using Pig
> for dumps.
>
> Guillaume
>
> 2016-11-24 14:53 GMT+01:00 Salou Guillaume <g.salou@gmail.com>:
>
>> My rows are bigger than 1KB.
>>
>> I've tried to put a LIMIT in the Pig query, but it's not possible; I'm
>> trying to do it with an MR job.
>>
>> Jinzhuan, what framework do you use to do the dump?
>>
>> The problem doesn't occur when dumping through Flink.
>>
>> *Regards,*
>>
>>
>> 2016-11-24 14:19 GMT+01:00 金砖 <jinzhuan@wacai.com>:
>>
>>> Thanks Ankit.
>>>
>>> PHOENIX-3112 is not my case.
>>>
>>> Each row is less than 1KB.
>>>
>>> ----------------------------------------------------------------
>>> *Jin Zhuan (金砖)*
>>> Wacai Network Technology Co., Ltd.
>>> Address: 12F, Zhejiang Technology Industry Building, 80 Gucui Road,
>>> Xihu District, Hangzhou
>>> Mobile: 15558015995
>>>
>>>  Original message
>>> *From:* Ankit Singhal<ankitsinghal59@gmail.com>
>>> *To:* user<user@phoenix.apache.org>
>>> *Sent:* Thursday, 24 Nov 2016, 20:52
>>> *Subject:* Re: huge query result miss some fields
>>>
>>> Do you have bigger rows? If yes, it may be similar to
>>> https://issues.apache.org/jira/browse/PHOENIX-3112 and
>>> increasing hbase.client.scanner.max.result.size can help.
>>>
>>>
>>>
>>> On Thu, Nov 24, 2016 at 6:00 PM, 金砖 <jinzhuan@wacai.com> wrote:
>>>
>>>> Thanks Abel.
>>>>
>>>>
>>>> I tried updating statistics; it did not work.
>>>>
>>>>
>>>> But after some retries, I found something interesting:
>>>>
>>>> I added 'limit 100000000' after my SQL.
>>>>
>>>> Even though the actual size of the result is the same (since there are
>>>> only 100,000 rows in the table), the missing-fields problem is solved.
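>>>> As a concrete sketch, the workaround looks like this (the LIMIT is far
>>>> larger than the row count, so the result set is unchanged):
>>>>
>>>> ```sql
>>>> -- LIMIT exceeds the 100,000-row table size; same rows come back,
>>>> -- but the missing-fields problem goes away
>>>> SELECT * FROM test LIMIT 100000000;
>>>> ```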
>>>>
>>>> ----------------------------------------------------------------
>>>> *金砖*
>>>> 挖财网络技术有限公司
>>>> 地址:杭州市西湖区古翠路80号浙江科技产业大厦12楼
>>>> 手机:15558015995
>>>>
>>>>  Original message
>>>> *From:* Abel Fernández<mevsmyself@gmail.com>
>>>> *To:* user<user@phoenix.apache.org>
>>>> *Sent:* Thursday, 24 Nov 2016, 20:14
>>>> *Subject:* Re: huge query result miss some fields
>>>>
>>>> Hi Jinzhuan,
>>>>
>>>> Have you tried to update the statistics of your table?
>>>>
>>>> https://phoenix.apache.org/update_statistics.html
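>>>> For example (a sketch, using the `test` table from the original report):
>>>>
>>>> ```sql
>>>> -- recollect guideposts/statistics for the table
>>>> UPDATE STATISTICS test;
>>>> ```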
>>>>
>>>>
>>>> On Thu, 24 Nov 2016 at 11:46 金砖 <jinzhuan@wacai.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm using phoenix-4.8.0-hbase-1.1 with HBase 1.1.3.
>>>>>
>>>>> When querying a lot of rows (e.g. 100,000), some fields of the rows
>>>>> are missing from the result set.
>>>>>
>>>>>
>>>>> Steps:
>>>>>
>>>>> 1. I created a table test(pk varchar primary key, id bigint, name
>>>>> varchar, age bigint).
>>>>>
>>>>> 2. I then populated it with 100,000 rows with key prefix 'prefix', so
>>>>> the keys are 'prefix1' - 'prefix100000'.
>>>>>
>>>>> 3. I then queried with select * from test;
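>>>>>
>>>>> The steps above, sketched in Phoenix SQL (the UPSERT shows one of the
>>>>> 100,000 rows; in practice they were generated in a loop):
>>>>>
>>>>> ```sql
>>>>> CREATE TABLE test (pk VARCHAR PRIMARY KEY, id BIGINT,
>>>>>                    name VARCHAR, age BIGINT);
>>>>> -- one of the rows 'prefix1' .. 'prefix100000'
>>>>> UPSERT INTO test VALUES ('prefix1', 1, 'name1', 20);
>>>>> SELECT * FROM test;
>>>>> ```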
>>>>>
>>>>>
>>>>> Occasionally some fields are lost in the rows:
>>>>>
>>>>> sometimes 2 rows are missing id and age,
>>>>>
>>>>> sometimes 3 rows are missing name.
>>>>>
>>>>>
>>>>> Can anyone help? Are there settings that should be changed?
>>>>>
>>>>> ----------------------------------------------------------------
>>>>> *jinzhuan*
>>>>>
>>>> --
>>>> Un saludo - Best Regards.
>>>> Abel
>>>>
>>>
>>>
>>
>
