phoenix-user mailing list archives

From David Wang <davidwang...@gmail.com>
Subject Re: Cannot run query containing inner join on Phoenix 3.0.0 when data size increased from 10 MB to 1 GB
Date Tue, 25 Feb 2014 00:48:45 GMT
Hi Maryann,

I tried moving the lineitem table (the largest table) to the first place in
the query below:
select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
from lineitem
inner join orders on l_orderkey = o_orderkey
inner join supplier on l_suppkey = s_suppkey
inner join customer on c_nationkey = s_nationkey and c_custkey = o_custkey
inner join nation on s_nationkey = n_nationkey
inner join region on n_regionkey = r_regionkey
where r_name = 'AMERICA'
  and o_orderdate >= '1993-01-01'
  and o_orderdate < '1994-01-01'
group by n_name
order by revenue desc


But when I execute it, I get the following error:

java.lang.RuntimeException:
com.salesforce.phoenix.exception.PhoenixIOException:
com.salesforce.phoenix.exception.PhoenixIOException: Failed after
attempts=14, exceptions:
Mon Feb 24 19:36:50 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:36:51 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:36:52 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:36:54 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:36:56 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:37:00 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:37:04 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:37:12 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:37:28 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: �@2[]
Mon Feb 24 19:38:00 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: i?�0��
Mon Feb 24 19:39:05 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: i?�0��
Mon Feb 24 19:40:09 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: i?�0��
Mon Feb 24 19:41:13 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: i?�0��
Mon Feb 24 19:42:18 EST 2014,
org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
java.io.IOException: java.io.IOException: Could not find hash cache for
joinId: i?�0��

        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
        at sqlline.SqlLine.print(SqlLine.java:1735)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
        at sqlline.SqlLine.dispatch(SqlLine.java:821)
        at sqlline.SqlLine.begin(SqlLine.java:699)
        at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
        at sqlline.SqlLine.main(SqlLine.java:424)


Do you know of any way I can fix this?

Thank you so much,

David


On Mon, Feb 24, 2014 at 7:02 PM, Maryann Xue <maryann.xue@gmail.com> wrote:

> Hi David,
>
> Since joins are implemented as hash joins, and table stats are not yet
> available to help decide the join order (and which tables are small
> enough to go into memory), users may sometimes have to be careful about
> the table order in the query.
> In your case, you can try moving the lineitem table (the largest one
> here) to the first position in the query. I think that should work.
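> As an illustrative sketch (the table names big_t and small_t below are
> hypothetical, not from this thread), the pattern is to list the largest
> table first, so that only the smaller right-hand tables are built into
> server-side hash caches:
>
>     -- big_t: largest table, streamed over; small_t: small enough to cache in memory
>     select b.id, s.name
>     from big_t b
>     inner join small_t s on b.small_id = s.id;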
>
>
> Thanks,
> Maryann
>
>
>
> On Mon, Feb 24, 2014 at 6:08 PM, David Wang <davidwang400@gmail.com> wrote:
>
>> Hi Maryann,
>>
>> The sizes of each table in my query are as follows:
>> customer - 24 MB
>> orders - 164 MB
>> lineitem - 725 MB
>> supplier - 1.4 MB
>> nation - 2.2 KB
>> region - 400 B
>>
>> The heap size of my region servers is 4 GB.
>>
>> Thank you,
>> David
>>
>>
>> On Mon, Feb 24, 2014 at 5:58 PM, Maryann Xue <maryann.xue@gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> What are the sizes of the tables in your query? And what is the heap
>>> size of your region server?
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>>
>>> On Mon, Feb 24, 2014 at 5:50 PM, David Wang <davidwang400@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a question about running queries containing inner joins on
>>>> Phoenix, but wasn't sure where to post it.
>>>> I downloaded phoenix-master (3.0.0) and inserted 10 MB of data
>>>> through psql.sh. I found I was able to run a query containing an
>>>> inner join in sqlline.
>>>>
>>>> The query I tried:
>>>>
>>>> select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
>>>> from customer
>>>> inner join orders on c_custkey = o_custkey
>>>> inner join lineitem on l_orderkey = o_orderkey
>>>> inner join supplier on l_suppkey = s_suppkey and c_nationkey = s_nationkey
>>>> inner join nation on s_nationkey = n_nationkey
>>>> inner join region on n_regionkey = r_regionkey
>>>> where r_name = 'AMERICA'
>>>>   and o_orderdate >= '1993-01-01'
>>>>   and o_orderdate < '1994-01-01'
>>>> group by n_name
>>>> order by revenue desc;
>>>>
>>>> But when I increased the data size from 10 MB to 1 GB and tried to run
>>>> the same query, I got the following error:
>>>> Error: Encountered exception in hash plan execution. (state=,code=0)
>>>>
>>>> I have included the path to snappy-java-1.0.3.2.jar in sqlline.sh as
>>>> follows:
>>>>
>>>> java -cp ".:$phoenix_client_jar:/usr/local/hbase-0.94.8/lib/snappy-java-1.0.3.2.jar" \
>>>>   -Dlog4j.configuration=file:$current_dir/log4j.properties \
>>>>   sqlline.SqlLine \
>>>>   -d com.salesforce.phoenix.jdbc.PhoenixDriver -u jdbc:phoenix:$1 \
>>>>   -n none -p none --color=true --fastConnect=false --verbose=true \
>>>>   --isolation=TRANSACTION_READ_COMMITTED $sqlfile
>>>>
>>>> Do you have any idea where the problem might be?
>>>>
>>>> Thank you so much,
>>>>
>>>> David
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Maryann
>>>
>>
>>
>
>
> --
> Thanks,
> Maryann
>
