phoenix-user mailing list archives

From James Heather <james.heat...@mendeley.com>
Subject Re: yet another question...perhaps dumb...JOIN with two conditions
Date Fri, 11 Sep 2015 17:23:35 GMT
With your query as it stands, you're trying to construct 250K*270M pairs
before filtering them. That's 67.5 trillion. You will need a quantum
computer.

I think you will be better off restructuring...
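
One common restructuring for this kind of IP-range lookup (assuming the ranges in IPV4RANGES don't overlap) is to skip the join entirely and resolve each address against a sorted copy of the small table with a binary search, for example client-side or in application code. A minimal Python sketch of the idea, with made-up addresses and labels:

```python
import bisect

# Hypothetical stand-in for NG.IPV4RANGES: non-overlapping ranges,
# sorted by start address (IPs shown as plain integers).
ranges = [
    (16777216, 16777471, "AU"),   # (ipstart, ipend, label)
    (16777472, 16778239, "CN"),
    (16778240, 16779263, "AU"),
]
starts = [r[0] for r in ranges]

def lookup(ip):
    """Find the range containing ip via binary search: O(log n) per probe
    over the 250K ranges, instead of pairing each of the 270M rows with
    every range as a cross join would."""
    i = bisect.bisect_right(starts, ip) - 1
    if i >= 0 and ranges[i][0] <= ip <= ranges[i][1]:
        return ranges[i][2]
    return None

print(lookup(16777600))  # inside the second range -> "CN"
print(lookup(42))        # below all ranges -> None
```

The same effect can be had inside the database by bucketing both tables on a shared IP prefix and equi-joining on the bucket, which Phoenix can hash-join efficiently.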

James
On 11 Sep 2015 5:34 pm, "M. Aaron Bossert" <mabossert@gmail.com> wrote:

> AH!  Now I get it...I am running on a pretty beefy cluster...I would have
> thought this would work, even if a bit slower.  Do you know which timeout
> settings I would need to alter to get this to work?
>
> On Fri, Sep 11, 2015 at 12:26 PM, Maryann Xue <maryann.xue@gmail.com>
> wrote:
>
>> Yes, I know. That timeout was because Phoenix was doing a CROSS JOIN, which
>> made progress on each row very slow.
>> Even if it could succeed, it would take a long time to complete.
>>
>> Thanks,
>> Maryann
>>
>> On Fri, Sep 11, 2015 at 11:58 AM, M. Aaron Bossert <mabossert@gmail.com>
>> wrote:
>>
>>> So, I've tried it both ways.  The IPV4RANGES table is small at around
>>> 250K rows, while the other table is around 270M rows.  I did a bit of
>>> googling and see that the error I am seeing is related to HBase scanner
>>> timeouts...Here is the description:
>>>
>>> "Thrown if a region server is passed an unknown scanner id. Usually
>>> means the client has take too long between checkins and so the scanner
>>> lease on the serverside has expired OR the serverside is closing down and
>>> has cancelled all leases."
>>>
>>> Has anyone had experience with this before?  Is there perhaps a timeout
>>> setting somewhere to bump up?
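
For reference, the scanner-lease expiry quoted above is usually governed by a few HBase/Phoenix properties. A hedged sketch for hbase-site.xml (property names as of HBase 0.98/1.x and Phoenix 4.x; the values are illustrative, not recommendations):

```xml
<!-- Illustrative values only; tune to your workload.
     hbase.client.scanner.timeout.period governs the scanner lease
     (it replaced hbase.regionserver.lease.period in later HBase versions). -->
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>600000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>600000</value> <!-- keep >= the scanner timeout -->
</property>
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>600000</value> <!-- overall Phoenix query timeout, ms -->
</property>
<property>
  <!-- relevant to the "Could not find hash cache for joinId" error
       seen elsewhere in this thread -->
  <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
  <value>300000</value>
</property>
```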
>>>
>>> On Fri, Sep 11, 2015 at 10:45 AM, Maryann Xue <maryann.xue@gmail.com>
>>> wrote:
>>>
>>>> Hi Aaron,
>>>>
>>>> As Jaime pointed out, it is a non-equi join. And unfortunately it is
>>>> handled as a CROSS join in Phoenix and thus is not very efficient. For each
>>>> row from the left side, it will be joined with all of the rows from the
>>>> right side before the condition is applied to filter the joined result.
>>>> Try switching the left table and the right table in your query to see if it
>>>> will work a little better?
>>>>
>>>>
>>>> Thanks,
>>>> Maryann
>>>>
>>>> On Fri, Sep 11, 2015 at 10:06 AM, M. Aaron Bossert <mabossert@gmail.com
>>>> > wrote:
>>>>
>>>>> Not sure where the problem is, but when I run the suggested query, I
>>>>> get the following error...and when I try it with the sort/merge join hint,
>>>>> I get yet a different error:
>>>>>
>>>>> java.lang.RuntimeException:
>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>> org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache
>>>>> for joinId: C}^U. The cache might have expired and have been removed.
>>>>>
>>>>> at
>>>>> org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:95)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:212)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:178)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1845)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3173)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>>
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>>
>>>>> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>>>>>
>>>>> at sqlline.TableOutputFormat.print(TableOutputFormat.java:33)
>>>>>
>>>>> at sqlline.SqlLine.print(SqlLine.java:1653)
>>>>>
>>>>> at sqlline.Commands.execute(Commands.java:833)
>>>>>
>>>>> at sqlline.Commands.sql(Commands.java:732)
>>>>>
>>>>> at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>
>>>>> at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>
>>>>> at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>
>>>>> at sqlline.SqlLine.main(SqlLine.java:292)
>>>>>
>>>>>
>>>>> and then the following is with the sort/merge join hint:
>>>>>
>>>>>
>>>>> 15/09/11 08:39:56 WARN client.ScannerCallable: Ignore, probably
>>>>> already closed
>>>>>
>>>>> org.apache.hadoop.hbase.UnknownScannerException:
>>>>> org.apache.hadoop.hbase.UnknownScannerException: Name: 658, already closed?
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3145)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29994)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>
>>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>>>
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>>
>>>>> at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown
>>>>> Source)
>>>>>
>>>>> at
>>>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>
>>>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:287)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:303)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:159)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
>>>>>
>>>>> at
>>>>> org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:481)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.ScanningResultIterator.close(ScanningResultIterator.java:49)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.TableResultIterator.close(TableResultIterator.java:95)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.LookAheadResultIterator$1.close(LookAheadResultIterator.java:42)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.ConcatResultIterator.close(ConcatResultIterator.java:70)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.RoundRobinResultIterator$RoundRobinIterator.close(RoundRobinResultIterator.java:298)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.RoundRobinResultIterator.close(RoundRobinResultIterator.java:134)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.RoundRobinResultIterator.fetchNextBatch(RoundRobinResultIterator.java:260)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.RoundRobinResultIterator.getIterators(RoundRobinResultIterator.java:174)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.RoundRobinResultIterator.next(RoundRobinResultIterator.java:91)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.execute.SortMergeJoinPlan$BasicJoinIterator.advance(SortMergeJoinPlan.java:346)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.execute.SortMergeJoinPlan$BasicJoinIterator.next(SortMergeJoinPlan.java:273)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.FilterResultIterator.advance(FilterResultIterator.java:61)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.LookAheadResultIterator.init(LookAheadResultIterator.java:59)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.iterate.LookAheadResultIterator.next(LookAheadResultIterator.java:65)
>>>>>
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:773)
>>>>>
>>>>> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:62)
>>>>>
>>>>> at sqlline.TableOutputFormat.print(TableOutputFormat.java:33)
>>>>>
>>>>> at sqlline.SqlLine.print(SqlLine.java:1653)
>>>>>
>>>>> at sqlline.Commands.execute(Commands.java:833)
>>>>>
>>>>> at sqlline.Commands.sql(Commands.java:732)
>>>>>
>>>>> at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>
>>>>> at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>
>>>>> at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>
>>>>> at sqlline.SqlLine.main(SqlLine.java:292)
>>>>>
>>>>> On Fri, Sep 11, 2015 at 12:26 AM, Jaime Solano <jdjsolano@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Aaron,
>>>>>>
>>>>>> The JOIN you're trying to run is a non-equi join, meaning that the ON
>>>>>> condition is not an equality ('>=' and '<=' in your case). This type of
>>>>>> join is not supported in Phoenix versions prior to 4.3.
>>>>>>
>>>>>> In Phoenix 4.3+, you can do something like this:
>>>>>>
>>>>>> SELECT * FROM NG.AKAMAI_FORCEFIELD AS FORC, NG.IPV4RANGES AS IPV4
>>>>>> WHERE FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;
>>>>>>
>>>>>> Best,
>>>>>> -Jaime
>>>>>>
>>>>>> On Thu, Sep 10, 2015 at 10:59 PM, M. Aaron Bossert <
>>>>>> mabossert@gmail.com> wrote:
>>>>>>
>>>>>>> I am trying to execute the following query, but get an error...is
>>>>>>> there another way to achieve the same result by restructuring the query?
>>>>>>>
>>>>>>> QUERY:
>>>>>>>
>>>>>>> SELECT * FROM NG.AKAMAI_FORCEFIELD AS FORC INNER JOIN NG.IPV4RANGES
>>>>>>> AS IPV4 ON FORC.SOURCE_IP >= IPV4.IPSTART AND FORC.SOURCE_IP <= IPV4.IPEND;
>>>>>>>
>>>>>>>
>>>>>>> ERROR:
>>>>>>>
>>>>>>> *Error: ERROR 217 (22017): Amibiguous or non-equi join condition
>>>>>>> specified. Consider using table list with where clause.
>>>>>>> (state=22017,code=217)*
>>>>>>>
>>>>>>> java.sql.SQLException: ERROR 217 (22017): Amibiguous or non-equi
>>>>>>> join condition specified. Consider using table list with where clause.
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.throwAmbiguousJoinConditionException(JoinCompiler.java:961)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.leaveBooleanNode(JoinCompiler.java:899)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.visitLeave(JoinCompiler.java:927)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$OnNodeVisitor.visitLeave(JoinCompiler.java:871)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.parse.ComparisonParseNode.accept(ComparisonParseNode.java:47)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.parse.CompoundParseNode.acceptChildren(CompoundParseNode.java:64)
>>>>>>>
>>>>>>> at org.apache.phoenix.parse.AndParseNode.accept(AndParseNode.java:47)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$JoinSpec.<init>(JoinCompiler.java:459)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$JoinSpec.<init>(JoinCompiler.java:442)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$JoinTableConstructor.visit(JoinCompiler.java:197)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler$JoinTableConstructor.visit(JoinCompiler.java:171)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.parse.JoinTableNode.accept(JoinTableNode.java:81)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler.compile(JoinCompiler.java:127)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.JoinCompiler.optimize(JoinCompiler.java:1158)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:193)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:158)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:375)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:349)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:255)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:250)
>>>>>>>
>>>>>>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:249)
>>>>>>>
>>>>>>> at
>>>>>>> org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1377)
>>>>>>>
>>>>>>> at sqlline.Commands.execute(Commands.java:822)
>>>>>>>
>>>>>>> at sqlline.Commands.sql(Commands.java:732)
>>>>>>>
>>>>>>> at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>>>>>>
>>>>>>> at sqlline.SqlLine.begin(SqlLine.java:681)
>>>>>>>
>>>>>>> at sqlline.SqlLine.start(SqlLine.java:398)
>>>>>>>
>>>>>>> at sqlline.SqlLine.main(SqlLine.java:292)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
