phoenix-user mailing list archives

From Pedro Gandola <pedro.gand...@gmail.com>
Subject Re: Can phoenix local indexes create a deadlock after an HBase full restart?
Date Wed, 06 Jan 2016 15:16:19 GMT
Hi Guys,

The issue is a deadlock, but it's not related to Phoenix; it can be
resolved by increasing the number of threads responsible for opening
regions:

<property>
  <name>hbase.regionserver.executor.openregion.threads</name>
  <value>100</value>
</property>


I got help from here:
<https://community.hortonworks.com/questions/8757/phoenix-local-indexes.html>.
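
In case it helps anyone, here is a minimal sketch (my own, not from the
linked thread) of how to confirm that the override is actually being picked
up from hbase-site.xml on a node; the default of 3 threads is my assumption
based on the HBase defaults I have seen, so check your version.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CheckOpenRegionThreads {
    public static void main(String[] args) {
        // Loads hbase-default.xml and hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        // Read the open-region handler pool size; 3 as the fallback default
        // is an assumption on my part.
        int threads = conf.getInt("hbase.regionserver.executor.openregion.threads", 3);
        System.out.println("hbase.regionserver.executor.openregion.threads = " + threads);
    }
}

Running this on a region server host after deploying the new hbase-site.xml
should print 100 if the override is in place.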

Thanks
Cheers
Pedro

On Tue, Jan 5, 2016 at 10:18 PM, Pedro Gandola <pedro.gandola@gmail.com>
wrote:

> Hi Guys,
>
> I have been testing out Phoenix local indexes and I'm facing an issue
> after restarting the entire HBase cluster.
>
> *Scenario:* I'm using Phoenix 4.4 and HBase 1.1.1. My test cluster
> contains 10 machines, and the main table contains 300 pre-split regions,
> which implies 300 regions in the local index table as well. To configure
> Phoenix I followed this tutorial
> <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/configuring-hbase-for-phoenix.html>.
>
> When I start a fresh cluster everything is fine: the local index is
> created, and I can insert data and query it using the proper indexes. The
> problem comes when I perform a full restart of the cluster to update some
> configuration; at that point I'm no longer able to bring the cluster back
> up. I should be doing a proper rolling restart, but it looks like Ambari
> does not do one in some situations.
>
> Most of the servers are throwing exceptions like:
>
> INFO  [htable-pool7-t1] client.AsyncProcess: #5,
>> table=_LOCAL_IDX_BIDDING_EVENTS, attempt=27/350 failed=1ops, last
>> exception: org.apache.hadoop.hbase.NotServingRegionException:
>> org.apache.hadoop.hbase.NotServingRegionException: Region
>> _LOCAL_IDX_BIDDING_EVENTS,57e4b17e4b17e4ac,1451943466164.253bdee3695b566545329fa3ac86d05e.
>> is not online on ip-10-5-4-24.ec2.internal,16020,1451996088952
>> at
>> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2898)
>> at
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:947)
>> at
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1991)
>> at
>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213)
>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>> at
>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>> at java.lang.Thread.run(Thread.java:745)
>>  on ip-10-5-4-24.ec2.internal,16020,1451942002174, tracking started null,
>> retrying after=20001ms, replay=1ops
>> INFO
>>  [ip-10-5-4-26.ec2.internal,16020,1451996087089-recovery-writer--pool5-t1]
>> client.AsyncProcess: #3, waiting for 2  actions to finish
>> INFO
>>  [ip-10-5-4-26.ec2.internal,16020,1451996087089-recovery-writer--pool5-t2]
>> client.AsyncProcess: #4, waiting for 2  actions to finish
>
>
> It looks like they are getting into a state where some region servers are
> waiting for regions that are not yet available on other servers.
>
> In the HBase UI I can see servers stuck on messages like this:
>
> *Description:* Replaying edits from
>> hdfs://.../recovered.edits/0000000000000464197
>> *Status:* Running pre-WAL-restore hook in coprocessors (since 48mins,
>> 45sec ago)
>
>
> Another interesting thing that I noticed is the *empty coprocessor list* for
> the servers that are stuck with 0 regions assigned.
>
> The HBase master goes down after logging some of these messages:
>
> GeneralBulkAssigner: Failed bulking assigning N regions
>
>
> I was able to perform full restarts before I started using local indexes,
> and everything worked fine. This is probably a misconfiguration on my
> side, but I have tried different properties and approaches to restarting
> the cluster and I'm still unable to do it.
>
> My understanding of local indexes in Phoenix (please correct me if I'm
> wrong) is that they are normal HBase tables and Phoenix places their
> regions to ensure data locality. Is data locality fully maintained when
> we lose N region servers and/or when regions are moved?
>
> Any insights would be very helpful.
>
> Thank you
> Cheers
> Pedro
>
