Hi,

I have found a way to decrease the number of lines with missing fields: I set the parameter hbase.client.scanner.max.result.size.

Each time I increase hbase.client.scanner.max.result.size, the number of failed lines decreases.

But if I set a high value like 10MB, I get a Java heap space error:
java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 796 number_of_rows: 2147483647 close_scanner: false next_call_seq: 0 client_handles_partials: true client_handles_heartbeats: true track_scan_metrics: false renew: false
Error: Java heap space

So I am currently blocked if I want to use Pig, MR, or Flink without the LIMIT workaround.

Any idea?

My params: Phoenix 4.8.1 / HBase 1.2 / Pig 0.16, hbase.client.scanner.max.result.size = 6MB; 125,422,608 lines dumped, 5,672 of which have missing data.
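For reference, this is roughly how I pass the setting to the dump client (a minimal sketch; it assumes the Properties handed to the Phoenix JDBC driver are merged into the client-side HBase configuration, and "zkhost" is a placeholder for the ZooKeeper quorum):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class ScannerSizeExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // 6 MB in bytes: client-side cap on data returned per scanner RPC
            props.setProperty("hbase.client.scanner.max.result.size",
                    String.valueOf(6L * 1024 * 1024));
            try (Connection conn =
                         DriverManager.getConnection("jdbc:phoenix:zkhost", props)) {
                // run the dump query over this connection
            }
        }
    }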

Regards, 

Guillaume

2016-11-24 16:01 GMT+01:00 Salou Guillaume <g.salou@gmail.com>:
MR jobs work with a LIMIT!

Thanks, jinzhuan.

I hope migrating to 4.8.2 will solve this issue, because I prefer using Pig for dumps.
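In case it helps anyone, this is roughly how the MR input is wired (a minimal sketch based on the phoenix-mapreduce PhoenixMapReduceUtil API as I understand it; the TestRecord writable, column names, and table are placeholders for my actual dump):

    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.db.DBWritable;
    import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;

    public class PhoenixDumpJob {
        // Writable mirroring the selected columns
        public static class TestRecord implements DBWritable {
            String pk;
            long id;
            public void readFields(ResultSet rs) throws SQLException {
                pk = rs.getString("PK");
                id = rs.getLong("ID");
            }
            public void write(PreparedStatement ps) throws SQLException {
                ps.setString(1, pk);
                ps.setLong(2, id);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "phoenix-dump");
            job.setJarByClass(PhoenixDumpJob.class);
            // The LIMIT is what makes the dump complete without missing fields
            PhoenixMapReduceUtil.setInput(job, TestRecord.class, "TEST",
                    "SELECT PK, ID FROM TEST LIMIT 100000000");
            // set mapper / reducer / output format as usual, then:
            // System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }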

Guillaume

2016-11-24 14:53 GMT+01:00 Salou Guillaume <g.salou@gmail.com>:
My rows are bigger than 1KB.

I've tried to put a LIMIT in the Pig query, but it's not possible, so I'm trying to do it with an MR job.

jinzhuan, what framework do you use for the dump?

The problem doesn't exist when dumping through Flink.

Regards,


2016-11-24 14:19 GMT+01:00 金砖 <jinzhuan@wacai.com>:

Thanks, Ankit.

PHOENIX-3112 is not my case.

Each row is less than 1KB.


----------------------------------------------------------------
jinzhuan (金砖)
Wacai Network Technology Co., Ltd.
Address: 12F, Zhejiang Sci-Tech Industry Building, 80 Gucui Road, Xihu District, Hangzhou
Mobile: 15558015995

Original Message
From: Ankit Singhal <ankitsinghal59@gmail.com>
To: user <user@phoenix.apache.org>
Sent: Thursday, November 24, 2016, 20:52
Subject: Re: huge query result miss some fields

Do you have bigger rows? If yes, it may be similar to https://issues.apache.org/jira/browse/PHOENIX-3112, and increasing hbase.client.scanner.max.result.size can help.



On Thu, Nov 24, 2016 at 6:00 PM, 金砖 <jinzhuan@wacai.com> wrote:

Thanks, Abel.


I tried UPDATE STATISTICS; it did not work.

But after some retries, I found something interesting:

I added 'LIMIT 100000000' after my SQL.

Even though the actual size of the result is the same (since there are only 100,000 rows in the table), the missing-fields problem is solved.
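For completeness, this is the shape of the working query over plain Phoenix JDBC (a minimal sketch; "zkhost" is a placeholder for the ZooKeeper quorum):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class LimitWorkaround {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                         DriverManager.getConnection("jdbc:phoenix:zkhost");
                 Statement stmt = conn.createStatement();
                 // LIMIT is far above the real row count (100,000), so the
                 // result size is unchanged, but no fields go missing
                 ResultSet rs = stmt.executeQuery(
                         "SELECT * FROM test LIMIT 100000000")) {
                while (rs.next()) {
                    // read pk, id, name, age ...
                }
            }
        }
    }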


----------------------------------------------------------------
jinzhuan (金砖)
Wacai Network Technology Co., Ltd.
Address: 12F, Zhejiang Sci-Tech Industry Building, 80 Gucui Road, Xihu District, Hangzhou
Mobile: 15558015995

Original Message
From: Abel Fernández <mevsmyself@gmail.com>
To: user <user@phoenix.apache.org>
Sent: Thursday, November 24, 2016, 20:14
Subject: Re: huge query result miss some fields

Hi Jinzhuan,

Have you tried to update the statistics of your table?
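Something along these lines from any Phoenix client should do it (a sketch assuming the standard UPDATE STATISTICS statement; "test" is the table from your mail and "zkhost" is a placeholder quorum):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class UpdateStats {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                         DriverManager.getConnection("jdbc:phoenix:zkhost");
                 Statement stmt = conn.createStatement()) {
                // Recollect guideposts for the table and its indexes
                stmt.execute("UPDATE STATISTICS test ALL");
            }
        }
    }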



On Thu, 24 Nov 2016 at 11:46 金砖 <jinzhuan@wacai.com> wrote:

Hi all,

I'm using phoenix-4.8.0-hbase-1.1 with HBase 1.1.3.

When querying a lot of rows (e.g. 100,000), some fields of rows do not exist in the result set.


Steps:

1. I created a table test(pk varchar primary key, id bigint, name varchar, age bigint).

2. Then populated it with 100,000 rows with key prefix 'prefix', so the row keys are 'prefix1' - 'prefix100000'.

3. Then queried with select * from test;


Occasionally some fields will be lost in the rows:

sometimes 2 rows are missing id and age,

sometimes 3 rows are missing name.
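Here is a minimal sketch of the repro over plain JDBC (the populated values and the "zkhost" quorum are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class MissingFieldsRepro {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                         DriverManager.getConnection("jdbc:phoenix:zkhost")) {
                conn.createStatement().execute(
                        "CREATE TABLE IF NOT EXISTS test (pk VARCHAR PRIMARY KEY, "
                                + "id BIGINT, name VARCHAR, age BIGINT)");
                try (PreparedStatement ps = conn.prepareStatement(
                        "UPSERT INTO test VALUES (?, ?, ?, ?)")) {
                    for (int i = 1; i <= 100000; i++) {
                        ps.setString(1, "prefix" + i);
                        ps.setLong(2, i);
                        ps.setString(3, "name" + i);
                        ps.setLong(4, i % 100);
                        ps.execute();
                        if (i % 1000 == 0) {
                            conn.commit(); // flush upserts in batches
                        }
                    }
                    conn.commit();
                }
                // Full scan; occasionally id/age or name come back null
                try (ResultSet rs = conn.createStatement()
                        .executeQuery("SELECT * FROM test")) {
                    while (rs.next()) {
                        if (rs.getString("NAME") == null) {
                            System.out.println("missing name at " + rs.getString("PK"));
                        }
                    }
                }
            }
        }
    }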


Can anyone help? Are there some settings that should be changed?


----------------------------------------------------------------
jinzhuan
--
Un saludo - Best Regards.
Abel