Posted to user@phoenix.apache.org by "Chabot, Jerry" <je...@hpe.com> on 2016/05/05 14:42:13 UTC

How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

We are running HBase 0.98.5 with Phoenix 4.2.2. All was well for a period of time (a year or more). But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.

The 'asset' table was created and maintained using an HBase client. Search uses a Phoenix view on this table. The row key is L,C,A,R.

A search that is failing is what we call a multiple-locality search. A single-locality search works. In a multi-locality search, there is an L IN (...) clause.

Note that L is the first column in the row key. The query is shown below.

    SELECT * from "asset" WHERE R=(SELECT MAX(R) from "asset" WHERE L=T1.L and C=T1.C AND A=T1.A AND C=? AND L IN (?))
                  AND (T1.L in (?) AND T1.T=? AND T1.N ILIKE ? order by T1.N asc nulls first limit 500

The subquery selects the latest revision of each 'asset'. The additional conditions (C = ? and L IN (?)) were added to make the subquery more efficient. I did notice there is no corresponding C = ? in the primary WHERE clause.
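
For what it is worth, here is a sketch of the reworked query with that outer condition added (an assumption on my part that C should be bound to the same value in both places):

    SELECT * FROM "asset" T1
    WHERE R = (SELECT MAX(R) FROM "asset" WHERE L = T1.L AND C = T1.C AND A = T1.A AND C = ? AND L IN (?))
      AND T1.C = ?   -- added to mirror the C = ? in the subquery (my assumption)
      AND T1.L IN (?) AND T1.T = ? AND T1.N ILIKE ?
    ORDER BY T1.N ASC NULLS FIRST LIMIT 500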

The root cause is "Could not find hash cache for joinId: The cache might have expired and have been removed." One recommendation on the Phoenix site is to increase phoenix.coprocessor.maxServerCacheTimeToLiveMs. We use the default. That recommendation does not seem applicable, because the query fails almost immediately, not after 30 seconds. Several reported bugs describe a similar problem, but they all seem to involve multi-tenancy. We do not use the multi-tenant property; we enforce isolation via the WHERE clause. We do use a connection pool, but again, the documented problem with connection pooling appears to apply only when multi-tenancy is used. Are there other reasons not to use a connection pool? We are also seeing an unbalanced workload across our region servers. I have not yet confirmed whether the localities for the users with failing searches reside on the overloaded region server.
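
For reference, if we did want to raise that timeout, my understanding is it is set in hbase-site.xml on each region server (the value below is purely illustrative; we currently run the default of 30000 ms, and I believe a region server restart is needed for it to take effect):

    <!-- hbase-site.xml, region servers; illustrative value only (5 minutes) -->
    <property>
      <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
      <value>300000</value>
    </property>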

Unfortunately, the problem only exists in the HBase clustered environment with controlled access, so it is difficult to enable debug logging. Does anyone have suggestions on how to troubleshoot or theories as to what could be the cause?

Other considerations are:
- Rework the query.
- Upgrade to Phoenix 4.7. It depends on HBase 0.98.17, though, and I had all kinds of problems upgrading, so I've taken that off the table.
- Provide an alternative to connection pooling. I'm not sure this would fix the problem; if it were the cause, why would search consistently work for some users?
- Replace Phoenix (obviously not a short-term option). Given the search table is maintained through an HBase client, we can't leverage secondary indexes, so future search enhancements may be limited.

<ERROR> 1 2016-05-05T13:07:49.624Z localhost - - - [logger="AssetsResource"] [hpSDID@11 deviceId="-" reqId="332145485064175617" threadId="330108"] Error searching assets: query=%7Bname%3Asynthetic%7D,sort=%7Bname%7D,limit=100,skip=0,
... Internal database server error.
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
       at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:536) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:1.7.0_51]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:532) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{.
The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:68) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
       at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:1.7.0_51]
        at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:1.7.0_51]
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990) ~[hbase-protocol-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more

-Jerry

RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
I’ve captured the execution plan by running a standalone tool.

[hpSDID@11 deviceId="-" reqId="-" threadId="1"] executing SQL:

    select /*+ USE_SORT_MERGE_JOIN */ T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M,T1.CW,T1.T,T1.Y,T1.U
    FROM "asset_metadata" T1
    WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
      AND T1.L in (?) AND T1.T=? AND T1.N ILIKE ?
    order by T1.N asc nulls first limit 300

[hpSDID@11 deviceId="-" reqId="-" threadId="1"] Execution plan for the same statement:

    CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER asset_metadata [[135,175,226,230,143,212,79,165,161,24,97,85,113,180,224,47]]
        SERVER FILTER BY (s.T = 1 AND s.N LIKE '%sai2016%')
        SERVER TOP 300 ROWS SORTED BY [T1.N]
    CLIENT MERGE SORT
        PARALLEL INNER-JOIN TABLE 0
            CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER asset_metadata [[135,175,226,230,143,212,79,165,161,24,97,85,113,180,224,47],251381657381109760]
                SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [L, C, A]
            CLIENT MERGE SORT
        DYNAMIC SERVER FILTER BY (L, C, A) IN (($1.$2, $1.$3, $1.$4))
        AFTER-JOIN SERVER FILTER BY T1.R = $1.$5

Jerry P. Chabot | Master Software Designer 7 | Big Data Solutions, Hewlett Packard Enterprise | 508.467.1236 | jerry.p.chabot@hpe.com

From: Chabot, Jerry
Sent: Monday, August 15, 2016 8:59 AM
To: user@phoenix.apache.org
Subject: RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

I’ve added the hint to the SELECT. Does anyone see anything wrong with this statement?

select /*+ USE_SORT_MERGE_JOIN */T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U
   FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
      AND T1.L in (?) AND T1.N ILIKE ? order by T1.N asc nulls first limit 100

I ask because the query still fails. The stack trace in the region server is the same as before. It appears to still be using a hash-join.

Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: \^V???0^M^G. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)

How can I confirm SORT_MERGE is being used?
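
As I understand it, the join strategy is visible in the EXPLAIN output, so one way to check is something like the sketch below (the plan step names are from my reading of the Phoenix documentation; the hash-join step name also appears in the plan captured above):

    EXPLAIN select /*+ USE_SORT_MERGE_JOIN */ T1.L, T1.C, T1.A, T1.R
      FROM "asset_metadata" T1
      WHERE R = (SELECT MAX(R) FROM "asset_metadata" WHERE L = T1.L AND C = T1.C AND A = T1.A AND C = ? AND L IN (?))
        AND T1.L IN (?) AND T1.N ILIKE ?;
    -- a hash join shows a step such as:  PARALLEL INNER-JOIN TABLE 0
    -- a sort-merge join shows a SORT-MERGE-JOIN step instead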

Jerry P. Chabot | Master Software Designer 7 | Big Data Solutions, Hewlett Packard Enterprise | 508.467.1236 | jerry.p.chabot@hpe.com

From: Kathiresan S [mailto:kathiresanselvaraj@gmail.com]
Sent: Thursday, May 05, 2016 11:14 AM
To: user@phoenix.apache.org
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

We had a similar problem while using an explicit JOIN in the query (we didn't have sub-queries). We worked around it by switching from a hash join to a sort-merge join (that got it to run, though it will be a bit slower than a hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN*/ in the SELECT query.

Thanks,
Kathir



RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
Here is another interesting tidbit.  One type of query works while the other does not. Again, only for some users. Let me elaborate.

The view definition over an existing HBase table is as follows.

CREATE VIEW "asset_metadata" (
    L BINARY (16) NOT NULL,
    C unsigned_long NOT NULL,
    A BINARY(16) NOT NULL,
    ReverseTS unsigned_long NOT NULL, -- reverse timestamp
    "s".R unsigned_long,
    "s".W unsigned_long,
    "s".P BINARY(16),
    "s".N VARCHAR,
    "s".E VARCHAR,
    "s".S unsigned_long,
    "s".M unsigned_long,
    "s".CW unsigned_long,
    "s".T unsigned_int,
    "s".Y VARCHAR,
    "s".U unsigned_long,
        CONSTRAINT pk PRIMARY KEY (L, C, A, ReverseTS)
);

The L, A and P columns are actually GUIDs, each stored as two longs (16 bytes).

There are two types of queries: one that uses L IN (?, ?) and another that uses L = ?. The former does not work for some people, but the latter does work for them, both using the same single L value.
select T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
   AND T1.L in (?) AND T1.N ILIKE ? order by T1.N asc nulls first limit 100
versus
select T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L =?)
   AND T1.L = ? AND T1.N ILIKE ? order by T1.N asc nulls first limit 100
BTW, the subquery is used to select the most recent revision for each L,C,A combo.
I thought the data may be the issue, for example one region server having a problem. But if it were data related, why would one query work and the other not?
Here is the execution plan.
CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER asset_metadata [[135,175,226,230,143,212,79,165,161,24,97,85,113,180,224,47]]
    SERVER FILTER BY (s.T = 1 AND s.N LIKE '%sai2016%')
    SERVER TOP 300 ROWS SORTED BY [T1.N]
CLIENT MERGE SORT
    PARALLEL INNER-JOIN TABLE 0
        CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER asset_metadata [[135,175,226,230,143,212,79,165,161,24,97,85,113,180,224,47],251381657381109760]
            SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [L, C, A]
        CLIENT MERGE SORT
    DYNAMIC SERVER FILTER BY (L, C, A) IN (($1.$2, $1.$3, $1.$4))
    AFTER-JOIN SERVER FILTER BY T1.R = $1.$5
I noticed there should be another C=? in the primary WHERE clause so that it is reflected in the 2nd range scan. However, L is actually unique today, so no additional rows will be returned. What this does tell me is that the first range scan is the subquery, which, without the hint to use sort-merge, would be cached in memory. Correct?
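
If I am reading the plan correctly, the join table that gets cached is essentially the result of the standalone aggregate sketched below (the alias latest_R is mine, for illustration only; the grouping matches the SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [L, C, A] step in the plan):

    SELECT L, C, A, MAX(R) AS latest_R   -- latest_R is an illustrative alias, not in our code
    FROM "asset_metadata"
    WHERE C = ? AND L IN (?)
    GROUP BY L, C, A
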
From: Chabot, Jerry
Sent: Tuesday, August 16, 2016 4:24 PM
To: user@phoenix.apache.org
Subject: RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

I ran the command along with ‘update statistics <table>’. However, the hash join error remains. The plan is to upgrade to Phoenix 4.8, but that will take time, so I still need to pursue other options.

Given this is an HBase table, I am wondering if recreating the view would help.
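
If it comes to that, the minimal sketch I have in mind is below; my understanding is that dropping a Phoenix view does not delete the underlying HBase data, but I would verify that before trying it.

    DROP VIEW "asset_metadata";
    -- then re-run, unchanged, the CREATE VIEW "asset_metadata" (...) statement shown earlier in this thread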

Any suggestions are greatly appreciated.

-Jerry

From: Chabot, Jerry
Sent: Tuesday, August 16, 2016 11:23 AM
To: user@phoenix.apache.org
Subject: RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

I was looking at the 4.2.2 source code. I could not find the “USE_SORT_MERGE_JOIN” anywhere. Thank you for clarifying.

In the past, the stats were manually updated using the “UPDATE STATISTICS asset_metadata” command. What is the side effect of deleting the guide posts? Will new stats be generated automatically, or should I execute the UPDATE STATISTICS command after deleting the stats?

Jerry P. Chabot | Master Software Designer 7 | Big Data Solutions, Hewlett Packard Enterprise | 508.467.1236 | jerry.p.chabot@hpe.com

From: Ankit Singhal [mailto:ankitsinghal59@gmail.com]
Sent: Tuesday, August 16, 2016 12:55 AM
To: user <user@phoenix.apache.org>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Hi Chabot,

"USE_SORT_MERGE_JOIN" hint is supported from v4.3.0 onwards. As your version is 4.2.2 ,so it is not getting used in your query.

It seems your stats are corrupted because of a region merge or split, resulting in some region servers not receiving the hash join cache. Stats collection is improved in later versions to handle such cases, but for now you can try deleting the guideposts from SYSTEM.STATS:

DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME='<tablename>';
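
For completeness, a possible sequence around that DELETE, sketched under the assumption that 'asset_metadata' is the physical table name behind the view: inspect the existing guideposts first, run the DELETE above, then regenerate the statistics so fresh guideposts are written.

    -- inspect the current guideposts before deleting them
    SELECT * FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'asset_metadata';

    -- after the DELETE above, rebuild the statistics so fresh guideposts are written
    UPDATE STATISTICS "asset_metadata";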

bq. But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.
Yes, that is possible, because some users hit only a certain key range, depending on the first column (the prefix) of the row key.

Regards,
Ankit Singhal






        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:1.7.0_51]
        at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:1.7.0_51]
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g
^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
                                      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) ~[hbase-client-0.98.5-hadoop2.jar:
0.98.5-hadoop2]
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990) ~[hbase-protocol-0.98.5-hadoop2.
jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-had
oop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more

-Jerry



RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
I ran the command along with ‘update statistics <table>’. However, the hash join error remains. The plan is to upgrade to Phoenix 4.8, but that will take time, so I still need to pursue other options.
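
For reference, the statements in question look roughly like this (the quoting and case of the table name are my assumption; the PHYSICAL_NAME value should match whatever SYSTEM.STATS actually contains):

    -- drop the possibly stale guide posts for the table, per Ankit's suggestion
    DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'asset_metadata';
    -- then rebuild the statistics
    UPDATE STATISTICS "asset_metadata";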

Given this is an HBase table, I am wondering if recreating the view would help.
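
If I do try that, my understanding is that dropping and recreating the view leaves the underlying HBase data untouched. A rough sketch only, since the column family name and the data types below are placeholders rather than our actual schema:

    DROP VIEW "asset_metadata";
    CREATE VIEW "asset_metadata" (
        L VARCHAR NOT NULL,            -- row key components (types assumed)
        C VARCHAR NOT NULL,
        A VARCHAR NOT NULL,
        R VARCHAR NOT NULL,
        "cf"."N" VARCHAR,              -- example non-key column mapped from an HBase column family
        CONSTRAINT pk PRIMARY KEY (L, C, A, R)
    );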

Any suggestions are greatly appreciated.

-Jerry

From: Chabot, Jerry
Sent: Tuesday, August 16, 2016 11:23 AM
To: user@phoenix.apache.org
Subject: RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

I was looking at the 4.2.2 source code. I could not find the “USE_SORT_MERGE_JOIN” anywhere. Thank you for clarifying.

In the past, the stats were manually updated using the “UPDATE STATISTICS asset_metadata” command. What is the side effect of deleting the guide posts? Will new stats get generated automatically or should I execute the update statistics command after deleting the stats?

Jerry P. Chabot  | Master Software Designer 7  |  Big Data Solutions,  Hewlett Packard Enterprise |  508.467.1236  |  jerry.p.chabot@hpe.com<ma...@hpe.com>

From: Ankit Singhal [mailto:ankitsinghal59@gmail.com]
Sent: Tuesday, August 16, 2016 12:55 AM
To: user <us...@phoenix.apache.org>>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Hi Chabot,

"USE_SORT_MERGE_JOIN" hint is supported from v4.3.0 onwards. As your version is 4.2.2 ,so it is not getting used in your query.

It seems your stats are corrupted because of a region merge or split, resulting in some region servers not receiving the hash join cache. Stats collection is improved in later versions to handle such cases, but for now you can try deleting the guide posts from SYSTEM.STATS:

DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME='<tablename>';

bq. But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.
Yes, that is possible, because some users hit only a certain key range, depending on the first column (prefix) of the row key.

Regards,
Ankit Singhal






On Mon, Aug 15, 2016 at 6:29 PM, Chabot, Jerry <je...@hpe.com>> wrote:
I’ve added the hint to the SELECT. Does anyone see anything wrong with this statement?

select /*+ USE_SORT_MERGE_JOIN */T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U
   FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
      AND T1.L in (?) AND T1.N ILIKE ? order by T1.N asc nulls first limit 100

I ask because the query still fails. The stack trace in the region server is the same as before. It appears to still be using a hash-join.

Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: \^V???0^M^G. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)

How can I confirm SORT_MERGE is being used?

Jerry P. Chabot  | Master Software Designer 7  |  Big Data Solutions,  Hewlett Packard Enterprise |  508.467.1236  |  jerry.p.chabot@hpe.com<ma...@hpe.com>

From: Kathiresan S [mailto:kathiresanselvaraj@gmail.com<ma...@gmail.com>]
Sent: Thursday, May 05, 2016 11:14 AM
To: user@phoenix.apache.org<ma...@phoenix.apache.org>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

We had a similar problem while using an explicit JOIN in the query (we didn't have sub-queries). We worked around it by switching from hash join to sort merge join (to get it to run, though it will be a bit slower than hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN*/ in the SELECT query.
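
To confirm which strategy is actually chosen, you can EXPLAIN the statement and look for a sort-merge step versus a hash-join step in the plan. A sketch only: literal values stand in for the bind parameters, and the exact plan wording varies by Phoenix version.

    EXPLAIN SELECT /*+ USE_SORT_MERGE_JOIN */ *
      FROM "asset" T1
      WHERE R = (SELECT MAX(R) FROM "asset"
                 WHERE L = T1.L AND C = T1.C AND A = T1.A AND C = 'c1' AND L IN ('loc1'))
        AND T1.L IN ('loc1') AND T1.T = 't1' AND T1.N ILIKE '%name%'
      ORDER BY T1.N ASC NULLS FIRST LIMIT 500;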

Thanks,
Kathir




RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
I was looking at the 4.2.2 source code. I could not find the “USE_SORT_MERGE_JOIN” anywhere. Thank you for clarifying.

In the past, the stats were manually updated using the “UPDATE STATISTICS asset_metadata” command. What is the side effect of deleting the guide posts? Will new stats get generated automatically or should I execute the update statistics command after deleting the stats?

Jerry P. Chabot  | Master Software Designer 7  |  Big Data Solutions,  Hewlett Packard Enterprise |  508.467.1236  |  jerry.p.chabot@hpe.com<ma...@hpe.com>

From: Ankit Singhal [mailto:ankitsinghal59@gmail.com]
Sent: Tuesday, August 16, 2016 12:55 AM
To: user <us...@phoenix.apache.org>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Hi Chabot,

"USE_SORT_MERGE_JOIN" hint is supported from v4.3.0 onwards. As your version is 4.2.2 ,so it is not getting used in your query.

It seems your stats are corrupted because of a region merge or split, resulting in some region servers not receiving the hash join cache. Stats collection is improved in later versions to handle such cases, but for now you can try deleting the guide posts from SYSTEM.STATS:

DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME='<tablename>';

bq. But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.
Yes, that is possible, because some users hit only a certain key range, depending on the first column (prefix) of the row key.

Regards,
Ankit Singhal






On Mon, Aug 15, 2016 at 6:29 PM, Chabot, Jerry <je...@hpe.com>> wrote:
I’ve added the hint to the SELECT. Does anyone see anything wrong with this statement?

select /*+ USE_SORT_MERGE_JOIN */T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U
   FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
      AND T1.L in (?) AND T1.N ILIKE ? order by T1.N asc nulls first limit 100

I ask because the query still fails. The stack trace in the region server is the same as before. It appears to still be using a hash-join.

Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: \^V???0^M^G. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)

How can I confirm SORT_MERGE is being used?

Jerry P. Chabot  | Master Software Designer 7  |  Big Data Solutions,  Hewlett Packard Enterprise |  508.467.1236  |  jerry.p.chabot@hpe.com<ma...@hpe.com>

From: Kathiresan S [mailto:kathiresanselvaraj@gmail.com<ma...@gmail.com>]
Sent: Thursday, May 05, 2016 11:14 AM
To: user@phoenix.apache.org<ma...@phoenix.apache.org>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

We had a similar problem while using an explicit JOIN in the query (we didn't have sub-queries). We worked around it by switching from hash join to sort merge join (to get it to run, though it will be a bit slower than hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN*/ in the SELECT query.

Thanks,
Kathir




Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by Ankit Singhal <an...@gmail.com>.
Hi Chabot,

"USE_SORT_MERGE_JOIN" hint is supported from v4.3.0 onwards. As your
version is 4.2.2 ,so it is not getting used in your query.
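
You can check which join strategy the planner actually chose by running
EXPLAIN on the statement from sqlline or any other JDBC client. This is only
a rough sketch (the literal values below stand in for your bind parameters,
and the exact plan wording varies between versions), but a hash join plan
lists the subquery as a separate PARALLEL INNER-JOIN TABLE step that gets
broadcast to the region servers, whereas a sort merge join plan shows a
SORT-MERGE-JOIN step instead:

    EXPLAIN
    SELECT /*+ USE_SORT_MERGE_JOIN */ T1.L, T1.C, T1.A, T1.R
      FROM "asset_metadata" T1
     WHERE R = (SELECT MAX(R) FROM "asset_metadata"
                WHERE L = T1.L AND C = T1.C AND A = T1.A
                  AND C = 'c1' AND L IN ('loc1', 'loc2'))
       AND T1.L IN ('loc1', 'loc2') AND T1.N ILIKE 'synthetic%'
     ORDER BY T1.N ASC NULLS FIRST LIMIT 100;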

It seems your stats are corrupted because of a region merge or split,
which results in some region servers not receiving the hash join cache.
Stats collection is improved in later versions to handle such cases, but
for now you can try deleting the guideposts from SYSTEM.STATS:

DELETE FROM SYSTEM.STATS WHERE PHYSICAL_NAME='<tablename>';
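
As a quick sanity check before and after the delete, and to rebuild the
guideposts once they have been cleared, something along these lines should
work. This is only a sketch: 'asset_metadata' is assumed to be the physical
HBase table name behind the view, and the UPDATE STATISTICS command may not
be available on 4.2.2, in which case a major compaction of the table
re-collects the stats:

    -- inspect the guideposts currently recorded for the physical table
    SELECT * FROM SYSTEM.STATS WHERE PHYSICAL_NAME = 'asset_metadata' LIMIT 20;

    -- after the DELETE above, regenerate statistics where the command exists
    UPDATE STATISTICS "asset_metadata";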

bq. But now, some users can no longer search. Others do not have problems
for a similar search. Granted, the searched data is different.
Yes, that is possible: since the first column (prefix) of the row key
determines which key range a query hits, some users' searches will always
land on the affected regions while others never do.

Regards,
Ankit Singhal







RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
I’ve added the hint to the SELECT. Does anyone see anything wrong with this statement?

select /*+ USE_SORT_MERGE_JOIN */T1.L,T1.C,T1.A,T1.R,T1.W,T1.P,T1.N,T1.E,T1.S,T1.M, T1.CW, T1.T,T1.Y,T1.U
   FROM "asset_metadata" T1
   WHERE R=(SELECT MAX(R) from "asset_metadata" WHERE L=T1.L AND C=T1.C AND A=T1.A AND C=? AND L IN (?))
      AND T1.L in (?) AND T1.N ILIKE ? order by T1.N asc nulls first limit 100

I ask because the query still fails. The stack trace in the region server is the same as before. It appears to still be using a hash-join.

Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: \^V???0^M^G. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)

How can I confirm SORT_MERGE is being used?

Jerry P. Chabot  | Master Software Designer 7  |  Big Data Solutions,  Hewlett Packard Enterprise |  508.467.1236  |  jerry.p.chabot@hpe.com

From: Kathiresan S [mailto:kathiresanselvaraj@gmail.com]
Sent: Thursday, May 05, 2016 11:14 AM
To: user@phoenix.apache.org
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

We had a similar problem while using an explicit JOIN in the query (we didn't have sub-queries). We worked around it by switching from a hash join to a sort merge join (to get it to run, though it will be a bit slower than a hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN*/ in the SELECT query.

Thanks,
Kathir

On Thu, May 5, 2016 at 10:42 AM, Chabot, Jerry <je...@hpe.com>> wrote:
We are running HBase 0.98.5 with Phoenix 4.2.2. All was well for a period of time (a year or more). But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.

The 'asset' table was created and maintained using an HBase client. Search uses a Phoenix view on this table. The row key is L,C,A,R.

A search that is failing is what we call a multiple locality search. A single locality search is working. In multi-locality search, there is an IN L=(...) clause.

Note, the L is the first column in the row key. The query is shown below.

    SELECT * from "asset" WHERE R=(SELECT MAX(R) from "asset" WHERE L=T1.L and C=T1.C AND A=T1.A AND C=? AND L IN (?))
                  AND (T1.L in (?) AND T1.T=? AND T1.N ILIKE ? order by T1.N asc nulls first limit 500

The subquery selects the latest revision of each 'asset'. The additional conditions (C=? and L IN (?)) was added to make the subquery more efficient. I did notice there is no corresponding C=? in the primary WHERE clause.

The root cause is "Could not find hash cache for joinId: The cache might have expired and have been removed." One recommendation on the Phoenix site is to increase the phoenix.coprocessor.maxSearchCacheTimeToLiveMs. We use the default. This recommendation does not seem applicable because the query fails almost immediately - not after 30 seconds. Some bugs have reported a similar problem. But, they all seem to be multi-tenant. We do not use the multi-tenant property. We enforce via the WHERE clause. We do use a connection pool. Once again, the problem with using a connection pool appears to be if you use multi-tenant. Are there other reasons not to use a connection pool? We are also experiencing an unbalanced workload across our region servers. I have not confirmed, at this time, whether the localities for the users that have failing searches reside on the overloaded region server.

Unfortunately, the problem only exists in the HBase clustered environment with controlled access, so it is difficult to enable debug logging. Does anyone have suggestions on how to troubleshoot or theories as to what could be the cause?

Other considerations are
- rework the query.
- upgrade to Phoenix 4.7. But, it has a dependency on HBase 0.98.17. I had all kinds of problems upgrading so I've taken that off the table.
- provide an alternative to connection pooling. Not sure this would fix the problem. If it was the cause, why would search consistently work for some users?
- replace Phoenix (obviously not a short-term option). Given the search table is maintained by HBase we can't leverage secondary indexes. Therefore, future search enhancements may be limited.

<ERROR> 1 2016-05-05T13:07:49.624Z localhost - - - [logger="AssetsResource"] [hpSDID@11 deviceId="-" reqId="33214548506
4175617" threadId="330108"] Error searching assets: query=%7Bname%3Asynthetic%7D,sort=%7Bname%7D,limit=100,skip=0,
... Internal database server error.
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
       at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:536) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could no
t find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:1.7.0_51]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:532) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{.
The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
                                      at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:68) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. Th
e cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
       at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:1.7.0_51]
        at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:1.7.0_51]
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g
^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
                                      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) ~[hbase-client-0.98.5-hadoop2.jar:
0.98.5-hadoop2]
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990) ~[hbase-protocol-0.98.5-hadoop2.
jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-had
oop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more

-Jerry


RE: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by "Chabot, Jerry" <je...@hpe.com>.
Thank you for the responses. This is a production environment that had not exhibited any problems until now, so we have not had a need to patch HBase. Unfortunately, upgrading Phoenix without patching HBase does not appear to be an option, given the upgrade issues I encountered.

I will investigate adding the hint.

Can I rule out using connection pools as a potential cause?

-Jerry

From: James Taylor [mailto:jamestaylor@apache.org]
Sent: Thursday, May 05, 2016 11:18 AM
To: user <us...@phoenix.apache.org>
Subject: Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

It's quite possible you're hitting a bug that has already been fixed, since you're 5 or 6 releases behind on Phoenix and 14 patch releases behind on HBase.

Thanks,
James

On Thu, May 5, 2016 at 8:14 AM, Kathiresan S <ka...@gmail.com>> wrote:
We had a similar problem while using an explicit JOIN in the query (we didn't have sub-queries). We worked around it by switching from a hash join to a sort merge join (to get it to run, though it will be a bit slower than a hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN*/ in the SELECT query.

Thanks,
Kathir

On Thu, May 5, 2016 at 10:42 AM, Chabot, Jerry <je...@hpe.com>> wrote:
We are running HBase 0.98.5 with Phoenix 4.2.2. All was well for a period of time (a year or more). But now, some users can no longer search. Others do not have problems for a similar search. Granted, the searched data is different.

The 'asset' table was created and maintained using an HBase client. Search uses a Phoenix view on this table. The row key is L,C,A,R.

A search that is failing is what we call a multiple locality search. A single locality search is working. In multi-locality search, there is an IN L=(...) clause.

Note, the L is the first column in the row key. The query is shown below.

    SELECT * from "asset" WHERE R=(SELECT MAX(R) from "asset" WHERE L=T1.L and C=T1.C AND A=T1.A AND C=? AND L IN (?))
                  AND (T1.L in (?) AND T1.T=? AND T1.N ILIKE ? order by T1.N asc nulls first limit 500

The subquery selects the latest revision of each 'asset'. The additional conditions (C=? and L IN (?)) was added to make the subquery more efficient. I did notice there is no corresponding C=? in the primary WHERE clause.

The root cause is "Could not find hash cache for joinId: The cache might have expired and have been removed." One recommendation on the Phoenix site is to increase the phoenix.coprocessor.maxSearchCacheTimeToLiveMs. We use the default. This recommendation does not seem applicable because the query fails almost immediately - not after 30 seconds. Some bugs have reported a similar problem. But, they all seem to be multi-tenant. We do not use the multi-tenant property. We enforce via the WHERE clause. We do use a connection pool. Once again, the problem with using a connection pool appears to be if you use multi-tenant. Are there other reasons not to use a connection pool? We are also experiencing an unbalanced workload across our region servers. I have not confirmed, at this time, whether the localities for the users that have failing searches reside on the overloaded region server.

Unfortunately, the problem only exists in the HBase clustered environment with controlled access, so it is difficult to enable debug logging. Does anyone have suggestions on how to troubleshoot or theories as to what could be the cause?

Other considerations are
- rework the query.
- upgrade to Phoenix 4.7. But, it has a dependency on HBase 0.98.17. I had all kinds of problems upgrading so I've taken that off the table.
- provide an alternative to connection pooling. Not sure this would fix the problem. If it was the cause, why would search consistently work for some users?
- replace Phoenix (obviously not a short-term option). Given the search table is maintained by HBase we can't leverage secondary indexes. Therefore, future search enhancements may be limited.

<ERROR> 1 2016-05-05T13:07:49.624Z localhost - - - [logger="AssetsResource"] [hpSDID@11 deviceId="-" reqId="33214548506
4175617" threadId="330108"] Error searching assets: query=%7Bname%3Asynthetic%7D,sort=%7Bname%7D,limit=100,skip=0,
... Internal database server error.
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
       at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:536) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could no
t find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:1.7.0_51]
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:532) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739) ~[phoenix-core-4.2.2.jar:4.2.2]
        at com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556) ~[cmx-core-metadata-core-3.2.0.27.jar:?]
        ... 53 more
Caused by: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{.
The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
                                      at java.lang.Thread.run(Unknown Source)
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:68) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)
        at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:1.7.0_51]
        at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:1.7.0_51]
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) ~[hadoop-common-2.7.1.jar:?]
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have been removed.
        at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
        at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
        at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Unknown Source)

        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990) ~[hbase-protocol-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738) ~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
        at org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88) ~[phoenix-core-4.2.2.jar:4.2.2]
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83) ~[phoenix-core-4.2.2.jar:4.2.2]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:1.7.0_51]
        ... 1 more

-Jerry



Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by James Taylor <ja...@apache.org>.
It's quite possible you're hitting a bug that has already been fixed, since
you're 5 or 6 releases behind on Phoenix and 14 patch releases behind on HBase.

Thanks,
James

On Thu, May 5, 2016 at 8:14 AM, Kathiresan S <ka...@gmail.com>
wrote:

> We had a similar problem while using an explicit JOIN in the query (we
> didn't have sub-queries). We worked around it by switching from a hash
> join to a sort-merge join (it runs that way, though it will be a bit
> slower than a hash join, per the documentation).
>
> Try adding the hint /*+ USE_SORT_MERGE_JOIN */ in the SELECT query.
>
> Thanks,
> Kathir

Re: How to troubleshoot 'Could not find hash cache for joinId' which is failing always for some users and never for others

Posted by Kathiresan S <ka...@gmail.com>.
We had a similar problem while using an explicit JOIN in the query (we
didn't have sub-queries). We worked around it by switching from a hash
join to a sort-merge join (it runs that way, though it will be a bit
slower than a hash join, per the documentation).

Try adding the hint /*+ USE_SORT_MERGE_JOIN */ in the SELECT query.
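
For reference, a rough sketch of where the hint goes (a simplified stand-in
for the query from the original post; the columns and bind parameters are
placeholders, not the exact statement):

    SELECT /*+ USE_SORT_MERGE_JOIN */ *
    FROM "asset" T1
    WHERE T1.R = (SELECT MAX(R) FROM "asset" WHERE L = T1.L AND C = T1.C AND A = T1.A)
      AND T1.L IN (?) AND T1.N ILIKE ?
    ORDER BY T1.N ASC NULLS FIRST
    LIMIT 500

The hint goes immediately after the SELECT keyword. It makes Phoenix run the
join as a sort-merge join instead of a broadcast hash join, so the region
servers no longer need the server-side hash cache that the joinId lookup in
the stack trace above fails to find.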

Thanks,
Kathir
