phoenix-user mailing list archives

From Aaron Bossert <maboss...@gmail.com>
Subject Re: yet another question...perhaps dumb...JOIN with two conditions
Date Fri, 11 Sep 2015 18:03:07 GMT
Don't have a quantum computer...but am on a small supercomputer ;). 1500 cores, 6TB of memory,
40TB of SSD, and a few hundred TB of spinning disks...

Sent from my iPhone

> On Sep 11, 2015, at 1:23 PM, James Heather <james.heather@mendeley.com> wrote:
> 
> With your query as it stands, you're trying to construct 250K*270M pairs before filtering
them. That's 67.5 trillion. You will need a quantum computer.
> 
> I think you will be better off restructuring...
> 
> James
> 
>> On 11 Sep 2015 5:34 pm, "M. Aaron Bossert" <mabossert@gmail.com> wrote:
>> AH!  Now I get it...I am running on a pretty beefy cluster...I would have thought
this would work, even if a bit slower.  Do you know which timeout settings I would need to
alter to get this to work?
>> 
>>> On Fri, Sep 11, 2015 at 12:26 PM, Maryann Xue <maryann.xue@gmail.com> wrote:
>>> Yes, I know. That timeout was because Phoenix was doing a CROSS JOIN, which made
progress on each row very slow.
>>> Even if it could succeed, it would take a long time to complete.
>>> 
>>> Thanks,
>>> Maryann
>>> 
>>>> On Fri, Sep 11, 2015 at 11:58 AM, M. Aaron Bossert <mabossert@gmail.com>
wrote:
>>>> So, I've tried it both ways.  The IPV4RANGES table is small at around 250k
rows, while the other table is around 270M rows.  I did a bit of googling and see that the
error I am seeing is related to HBase scanner timeouts...Here is the description:
>>>> 
>>>> "Thrown if a region server is passed an unknown scanner id. Usually means
the client has take too long between checkins and so the scanner lease on the serverside has
expired OR the serverside is closing down and has cancelled all leases."
>>>> 
>>>> Has anyone had experience with this before?  Is there perhaps a timeout setting
somewhere to bump up?
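>>>> 
>>>> (For what it's worth, the candidates my googling turned up are hbase.client.scanner.timeout.period
(formerly hbase.regionserver.lease.period) for the scanner lease, and phoenix.query.timeoutMs
on the Phoenix side, e.g. in hbase-site.xml:
>>>> 
>>>> <property><name>hbase.client.scanner.timeout.period</name><value>600000</value></property>
>>>> <property><name>phoenix.query.timeoutMs</name><value>600000</value></property>
>>>> 
>>>> ...but I'm not certain which of these actually applies here.)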
>>>> 
>>>>> On Fri, Sep 11, 2015 at 10:45 AM, Maryann Xue <maryann.xue@gmail.com>
wrote:
>>>>> Hi Aaron,
>>>>> 
>>>>> As Jaime pointed out, it is a non-equi join. And unfortunately it is
handled as a CROSS join in Phoenix and thus is not very efficient. For each row from the left
side, it will be joined with all of the rows from the right side before the condition is
applied to filter the joined result. Try switching the left table and the right table in your
query to see if it will work a little better?
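>>>>> 
>>>>> For example, that would just be the same query from this thread with the tables swapped:
>>>>> 
>>>>> SELECT * FROM NG.IPV4RANGES AS IPV4, NG.AKAMAI_FORCEFIELD AS FORC
WHERE FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;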
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Maryann
>>>>> 
>>>>>> On Fri, Sep 11, 2015 at 10:06 AM, M. Aaron Bossert <mabossert@gmail.com>
wrote:
>>>>>> Not sure where the problem is, but when I run the suggested query,
I get the following error...and when I try is with the sort/merge join hint, I get yet a different
error:
>>>>>> 
>>>>>> java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException:
org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: C}^U. The cache might have expired and have been removed.
>>>>>> 	at org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:95)
>>>>>> 	at org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:212)
>>>>>> 	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:178)
>>>>>> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1845)
>>>>>> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3173)
>>>>>> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>>>>>> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>>> 	at java.lang.Thread.run(Thread.java:745)
>>>>>> 
>>>>>> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>>>>>> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:33)
>>>>>> 	at sqlline.SqlLine.print(SqlLine.java:1653)
>>>>>> 	at sqlline.Commands.execute(Commands.java:833)
>>>>>>  	at sqlline.Commands.sql(Commands.java:732)
>>>>>> 	at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>> 	at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>> 	at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>> 	at sqlline.SqlLine.main(SqlLine.java:292)
>>>>>> 
>>>>>> and then the following is with the sort/merge join hint:
>>>>>> 
>>>>>> 15/09/11 08:39:56 WARN client.ScannerCallable: Ignore, probably already
closed
>>>>>> org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException:
Name: 658, already closed?
>>>>>> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3145)
>>>>>> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>>>>>> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>>> 	at java.lang.Thread.run(Thread.java:745)
>>>>>> 
>>>>>> 	at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown
Source)
>>>>>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>> 	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>>>>>> 	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>>>>>> 	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:287)
>>>>>> 	at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:303)
>>>>>> 	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:159)
>>>>>> 	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
>>>>>> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
>>>>>> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
>>>>>> 	at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:481)
>>>>>> 	at org.apache.phoenix.iterate.ScanningResultIterator.close(ScanningResultIterator.java:49)
>>>>>> 	at org.apache.phoenix.iterate.TableResultIterator.close(TableResultIterator.java:95)
>>>>>> 	at org.apache.phoenix.iterate.LookAheadResultIterator$1.close(LookAheadResultIterator.java:42)
>>>>>> 	at org.apache.phoenix.iterate.ConcatResultIterator.close(ConcatResultIterator.java:70)
>>>>>> 	at org.apache.phoenix.iterate.RoundRobinResultIterator$RoundRobinIterator.close(RoundRobinResultIterator.java:298)
>>>>>> 	at org.apache.phoenix.iterate.RoundRobinResultIterator.close(RoundRobinResultIterator.java:134)
>>>>>> 	at org.apache.phoenix.iterate.RoundRobinResultIterator.fetchNextBatch(RoundRobinResultIterator.java:260)
>>>>>> 	at org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:174)
>>>>>> 	at org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91)
>>>>>> 	at org.apache.phoenix.execute.SortMergeJoinPlan$BasicJoinIterator.advance(SortMergeJoinPlan.java:346)
>>>>>> 	at org.apache.phoenix.execute.SortMergeJoinPlan$BasicJoinIterator.next(SortMergeJoinPlan.java:273)
>>>>>> 	at org.apache.phoenix.iterate.FilterResultIterator.advance(FilterResultIterator.java:61)
>>>>>> 	at org.apache.phoenix.iterate.LookAheadResultIterator.init(LookAheadResultIterator.java:59)
>>>>>> 	at org.apache.phoenix.iterate.LookAheadResultIterator.next(LookAheadResultIterator.java:65)
>>>>>> 	at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:773)
>>>>>> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:62)
>>>>>> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:33)
>>>>>> 	at sqlline.SqlLine.print(SqlLine.java:1653)
>>>>>> 	at sqlline.Commands.execute(Commands.java:833)
>>>>>> 	at sqlline.Commands.sql(Commands.java:732)
>>>>>> 	at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>> 	at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>> 	at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>> 	at sqlline.SqlLine.main(SqlLine.java:292)
>>>>>> 
>>>>>>> On Fri, Sep 11, 2015 at 12:26 AM, Jaime Solano <jdjsolano@gmail.com>
wrote:
>>>>>>> Hi Aaron,
>>>>>>> 
>>>>>>> The JOIN you're trying to run is a non-equi join, meaning that
the ON condition is not an equality ('>=' and '<=' in your case). This type of join
is not supported in Phoenix versions prior to 4.3.
>>>>>>> 
>>>>>>> In Phoenix 4.3+, you can do something like this:
>>>>>>> 
>>>>>>> SELECT * FROM NG.AKAMAI_FORCEFIELD AS FORC, NG.IPV4RANGES AS
IPV4 WHERE FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;
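>>>>>>> 
>>>>>>> If the default hash join struggles on tables this size, there is also a sort-merge
join hint you could try (I believe the syntax is as below, but check the Phoenix docs for your
version):
>>>>>>> 
>>>>>>> SELECT /*+ USE_SORT_MERGE_JOIN */ * FROM NG.AKAMAI_FORCEFIELD AS FORC, NG.IPV4RANGES AS
IPV4 WHERE FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;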
>>>>>>> 
>>>>>>> Best,
>>>>>>> -Jaime
>>>>>>> 
>>>>>>>> On Thu, Sep 10, 2015 at 10:59 PM, M. Aaron Bossert <mabossert@gmail.com>
wrote:
>>>>>>>> I am trying to execute the following query, but get an error...is
there another way to achieve the same result by restructuring the query?
>>>>>>>> 
>>>>>>>> QUERY:
>>>>>>>> SELECT * FROM NG.AKAMAI_FORCEFIELD AS FORC INNER JOIN NG.IPV4RANGES
AS IPV4 ON FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;
>>>>>>>> 
>>>>>>>> ERROR:
>>>>>>>> Error: ERROR 217 (22017): Amibiguous or non-equi join condition
specified. Consider using table list with where clause. (state=22017,code=217)
>>>>>>>> java.sql.SQLException: ERROR 217 (22017): Amibiguous or non-equi
join condition specified. Consider using table list with where clause.
>>>>>>>> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
>>>>>>>> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.throwAmbiguousJoinConditionException(JoinCompiler.java:961)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.leaveBooleanNode(JoinCompiler.java:899)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.visitLeave(JoinCompiler.java:927)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.visitLeave(JoinCompiler.java:871)
>>>>>>>> 	at org.apache.phoenix.parse.ComparisonParseNode.accept(ComparisonParseNode.java:47)
>>>>>>>> 	at org.apache.phoenix.parse.CompoundParseNode.acceptChildren(CompoundParseNode.java:64)
>>>>>>>> 	at org.apache.phoenix.parse.AndParseNode.accept(AndParseNode.java:47)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$JoinSpec.<init>(JoinCompiler.java:459)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$JoinSpec.<init>(JoinCompiler.java:442)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$JoinTableConstructor.visit(JoinCompiler.java:197)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler$JoinTableConstructor.visit(JoinCompiler.java:171)
>>>>>>>> 	at org.apache.phoenix.parse.JoinTableNode.accept(JoinTableNode.java:81)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler.compile(JoinCompiler.java:127)
>>>>>>>> 	at org.apache.phoenix.compile.JoinCompiler.optimize(JoinCompiler.java:1158)
>>>>>>>> 	at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:193)
>>>>>>>> 	at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:158)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:375)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:349)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:255)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:250)
>>>>>>>> 	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:249)
>>>>>>>> 	at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1377)
>>>>>>>> 	at sqlline.Commands.execute(Commands.java:822)
>>>>>>>> 	at sqlline.Commands.sql(Commands.java:732)
>>>>>>>> 	at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>>>> 	at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>>>> 	at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>>>> 	at sqlline.SqlLine.main(SqlLine.java:292)
