phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chabot, Jerry" <jerry.p.cha...@hpe.com>
Subject How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others
Date Thu, 05 May 2016 14:42:13 GMT
We are running HBase 0.98.5 with Phoenix 4.2.2. All was well for a period of time (a year or
more). But now, some users can no longer search. Others do not have problems for a similar
search. Granted, the searched data is different.

The 'asset' table was created and maintained using an HBase client. Search uses a Phoenix
view on this table. The row key is L,C,A,R.

A search that is failing is what we call a multiple locality search. A single locality search
is working. In multi-locality search, there is an IN L=(...) clause.

Note, the L is the first column in the row key. The query is shown below.

    SELECT * from "asset" WHERE R=(SELECT MAX(R) from "asset" WHERE L=T1.L and C=T1.C AND
A=T1.A AND C=? AND L IN (?))
                  AND (T1.L in (?) AND T1.T=? AND T1.N ILIKE ? order by T1.N asc nulls first
limit 500

The subquery selects the latest revision of each 'asset'. The additional conditions (C=? and
L IN (?)) was added to make the subquery more efficient. I did notice there is no corresponding
C=? in the primary WHERE clause.

The root cause is "Could not find hash cache for joinId: The cache might have expired and
have been removed." One recommendation on the Phoenix site is to increase the phoenix.coprocessor.maxSearchCacheTimeToLiveMs.
We use the default. This recommendation does not seem applicable because the query fails almost
immediately - not after 30 seconds. Some bugs have reported a similar problem. But, they all
seem to be multi-tenant. We do not use the multi-tenant property. We enforce via the WHERE
clause. We do use a connection pool. Once again, the problem with using a connection pool
appears to be if you use multi-tenant. Are there other reasons not to use a connection pool?
We are also experiencing an unbalanced workload across our region servers. I have not confirmed,
at this time, whether the localities for the users that have failing searches reside on the
overloaded region server.

Unfortunately, the problem only exists in the HBase clustered environment with controlled
access, so it is difficult to enable debug logging. Does anyone have suggestions on how to
troubleshoot or theories as to what could be the cause?

Other considerations are
- rework the query.
- upgrade to Phoenix 4.7. But, it has a dependency on HBase 0.98.17. I had all kinds of problems
upgrading so I've taken that off the table.
- provide an alternative to connection pooling. Not sure this would fix the problem. If it
was the cause, why would search consistently work for some users?
- replace Phoenix (obviously not a short-term option). Given the search table is maintained
by HBase we can't leverage secondary indexes. Therefore, future search enhancements may be
limited.

<ERROR> 1 2016-05-05T13:07:49.624Z localhost - - - [logger="AssetsResource"] [hpSDID@11
deviceId="-" reqId="33214548506
4175617" threadId="330108"] Error searching assets: query=%7Bname%3Asynthetic%7D,sort=%7Bname%7D,limit=100,skip=0,
... Internal database server error.
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been
removed.
       at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:536)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556)
~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException: Could no
t find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:1.7.0_51]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:532)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556)
~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{.
The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
                                      at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:68)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. Th
e cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
       at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:1.7.0_51]
        at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:1.7.0_51]
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g
^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
                                      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
~[hbase-client-0.98.5-hadoop2.jar:
0.98.5-hadoop2]
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
~[hbase-protocol-0.98.5-hadoop2.
jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-had
oop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more

-Jerry

Mime
View raw message