phoenix-user mailing list archives

From David Wang <davidwang...@gmail.com>
Subject Re: Cannot run query containing inner join on Phoenix 3.0.0 when data size increased from 10 MB to 1 GB
Date Tue, 25 Feb 2014 00:58:10 GMT
Hi Maryann,

Could you please explain what the complete test case is?

I just tried to run query #5 from TPC-H, which I provided in an
earlier email.  I just wanted to make sure inner joins work on Phoenix.

Thank you so much,

David
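
[Editor's note: for readers who hit the "Could not find hash cache for
joinId" error quoted below, Phoenix keeps the hash-join table in a
server-side cache with a limited time-to-live; if a scan outlives the
cache, this error appears. A commonly suggested mitigation is raising the
cache TTL in hbase-site.xml. This is only a hedged sketch: the property
name is the one documented for Phoenix's coprocessor cache, and the value
shown is an illustrative guess, not a tested recommendation.]

```xml
<!-- hbase-site.xml: keep the hash-join server cache alive longer so
     long-running scans can still find it. The default is 30000 ms;
     300000 ms below is an illustrative value, not a recommendation. -->
<property>
  <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
  <value>300000</value>
</property>
```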


On Mon, Feb 24, 2014 at 7:52 PM, Maryann Xue <maryann.xue@gmail.com> wrote:

> Could you please give the complete test case so that I can repro the issue?
>
>
> On Mon, Feb 24, 2014 at 7:48 PM, David Wang <davidwang400@gmail.com> wrote:
>
>> Hi Maryann,
>>
>> I tried moving the lineitem table (the largest table) to the first place
>> in the query below:
>> select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
>> from lineitem
>> inner join orders on l_orderkey = o_orderkey
>> inner join supplier on l_suppkey = s_suppkey
>> inner join customer on c_nationkey = s_nationkey and c_custkey = o_custkey
>> inner join nation on s_nationkey = n_nationkey
>> inner join region on n_regionkey = r_regionkey
>> where r_name = 'AMERICA'
>> and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
>> group by n_name
>> order by revenue desc
>>
>>
>> But when I execute it, I get the following error:
>>
>> java.lang.RuntimeException:
>> com.salesforce.phoenix.exception.PhoenixIOException:
>> com.salesforce.phoenix.exception.PhoenixIOException: Failed after
>> attempts=14, exceptions:
>> Mon Feb 24 19:36:50 EST 2014,
>> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
>> java.io.IOException: java.io.IOException: Could not find hash cache for
>> joinId: �@2[]
>> [... the same "Could not find hash cache" exception repeated for all 14
>> retry attempts, through Mon Feb 24 19:42:18 EST 2014; the final five
>> attempts report joinId: i?�0�� ...]
>>
>>         at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
>>         at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>>         at sqlline.SqlLine.print(SqlLine.java:1735)
>>         at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>>         at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>>         at sqlline.SqlLine.dispatch(SqlLine.java:821)
>>         at sqlline.SqlLine.begin(SqlLine.java:699)
>>         at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>>         at sqlline.SqlLine.main(SqlLine.java:424)
>>
>>
>> Do you know of any way I can fix this?
>>
>> Thank you so much,
>>
>> David
>>
>>
>> On Mon, Feb 24, 2014 at 7:02 PM, Maryann Xue <maryann.xue@gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> Since joins are implemented as hash joins, and table statistics are not
>>> yet available to help decide the join order (and which tables are small
>>> enough to fit in memory), users may sometimes have to be careful about
>>> the table order in the query.
>>> In your case, you can try moving the lineitem table (the largest here)
>>> to the first place in the query. I think that should work.
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>>
>>>
>>> On Mon, Feb 24, 2014 at 6:08 PM, David Wang <davidwang400@gmail.com> wrote:
>>>
>>>> Hi Maryann,
>>>>
>>>> The sizes of each table in my query are as follows:
>>>> customer - 24 MB
>>>> orders - 164 MB
>>>> lineitem - 725 MB
>>>> supplier - 1.4 MB
>>>> nation - 2.2 KB
>>>> region - 400 B
>>>>
>>>> The heap size of my region servers is 4 GB.
>>>>
>>>> Thank you,
>>>> David
>>>>
>>>>
>>>> On Mon, Feb 24, 2014 at 5:58 PM, Maryann Xue <maryann.xue@gmail.com> wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> What are the sizes of the tables in your query? And what is the heap
>>>>> size of your region server?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>>
>>>>> On Mon, Feb 24, 2014 at 5:50 PM, David Wang <davidwang400@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I had a question about running queries containing inner joins on
>>>>>> Phoenix, but wasn't sure where to post.
>>>>>> I downloaded phoenix-master (3.0.0) and inserted 10 MB of data
>>>>>> through psql.sh.  I found I was able to run a query containing an
>>>>>> inner join in sqlline.
>>>>>>
>>>>>> The query I tried:
>>>>>>
>>>>>> select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
>>>>>> from customer
>>>>>> inner join orders on c_custkey = o_custkey
>>>>>> inner join lineitem on l_orderkey = o_orderkey
>>>>>> inner join supplier on l_suppkey = s_suppkey and c_nationkey = s_nationkey
>>>>>> inner join nation on s_nationkey = n_nationkey
>>>>>> inner join region on n_regionkey = r_regionkey
>>>>>> where r_name = 'AMERICA'
>>>>>> and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
>>>>>> group by n_name order by revenue desc;
>>>>>>
>>>>>> But when I increased the data size from 10 MB to 1 GB and tried to
>>>>>> run the same query, I got the following error:
>>>>>> Error: Encountered exception in hash plan execution. (state=,code=0)
>>>>>>
>>>>>> I have included the path to snappy-java-1.0.3.2.jar in sqlline.sh
>>>>>> like the following:
>>>>>> java -cp
>>>>>> ".:$phoenix_client_jar:/usr/local/hbase-0.94.8/lib/snappy-java-1.0.3.2.jar"
>>>>>>  -Dlog4j.configuration=file:$current_dir/log4j.properties
>>>>>> sqlline.SqlLine
>>>>>> -d com.salesforce.phoenix.jdbc.PhoenixDriver -u jdbc:phoenix:$1 -n
>>>>>> none -p
>>>>>> none --color=true --fastConnect=false --verbose=true
>>>>>> --isolation=TRANSACTION_READ_COMMITTED $sqlfile
>>>>>>
>>>>>> Do you have any idea where the problem might be?
>>>>>>
>>>>>> Thank you so much,
>>>>>>
>>>>>> David
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Maryann
>>>
>>
>>
>
>
> --
> Thanks,
> Maryann
>
