phoenix-user mailing list archives

From Maryann Xue <maryann....@gmail.com>
Subject Re: Cannot run query containing inner join on Phoenix 3.0.0 when data size increased from 10 MB to 1 GB
Date Sat, 01 Mar 2014 14:51:19 GMT
Hi David,

What do you mean by "first time"? Could you please share your server logs?

Could you also please try with the latest master branch, and check if
there are any warnings in your CLIENT logs?
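
If the hash cache is expiring on the region servers before every scanner
has picked it up, the server cache TTL would also be worth checking. Below
is a minimal sketch of the relevant hbase-site.xml settings; these are the
property names I believe apply, but please verify them and the defaults
against your Phoenix version:

  <!-- hbase-site.xml on each region server (assumed property names) -->
  <property>
    <!-- How long a broadcast hash cache is kept alive; default 30000 ms -->
    <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
    <value>60000</value>
  </property>
  <property>
    <!-- Client-side cap on the serialized size of a single hash cache -->
    <name>phoenix.query.maxServerCacheBytes</name>
    <value>209715200</value>
  </property>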


Thanks,
Maryann



On Sat, Mar 1, 2014 at 12:56 PM, David Wang <davidwang400@gmail.com> wrote:

> Hi,
>
>
> I successfully ran a query containing an inner join in Phoenix 3.0 on a 10
> MB data set.
>
> But when I increased the data size from 10 MB to 1 GB and tried to run
> the same query, I got an error.  I have summarized each step of what I
> did and the error I encountered.  I would appreciate any help or advice.
>
>
> 1).  Below is the original TPC-H Query 5, before I translated it into
> Phoenix style:
>
> select
>    n_name,
>    sum(l_extendedprice * (1 - l_discount)) as revenue
> from
>    customer,
>    orders,
>    lineitem,
>    supplier,
>    nation,
>    region
> where
>    c_custkey = o_custkey
>    and l_orderkey = o_orderkey
>    and l_suppkey = s_suppkey
>    and c_nationkey = s_nationkey
>    and s_nationkey = n_nationkey
>    and n_regionkey = r_regionkey
>    and r_name = '[REGION]'
>    and o_orderdate >= date '[DATE]'
>    and o_orderdate < date '[DATE]' + interval '1' year
> group by
>    n_name
> order by
>    revenue desc;
>
>
> 2). The sizes of the tables in my query are as follows:
>
> lineitem - 725 MB
> orders - 164 MB
> customer - 24 MB
> supplier - 1.4 MB
> nation - 2.2 KB
> region - 400 B
> The heap size of my region servers is 4 GB.
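>
> As a rough sanity check (my own arithmetic, assuming lineitem drives the
> scan and the other five tables are broadcast as hash caches, which the
> join order suggests): orders + customer + supplier + nation + region is
> about 164 + 24 + 1.4 MB plus a few KB, i.e. roughly 190 MB of raw data
> per region server, which should fit in a 4 GB heap even if the serialized
> caches are several times larger than the raw tables.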
>
> 3). I modified this statement to the following, according to Maryann's
> suggestion (which was to place the largest table first):
>
> select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
> from lineitem
>     inner join orders on l_orderkey = o_orderkey
>     inner join supplier on l_suppkey = s_suppkey
>     inner join customer on c_nationkey = s_nationkey and c_custkey = o_custkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join region on n_regionkey = r_regionkey
> where r_name = 'AMERICA'
>     and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
> group by n_name
> order by revenue desc
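>
> (For reference, the join plan can be inspected from sqlline by prefixing
> the same statement with EXPLAIN; this is just the statement as I would
> run it, not output I have captured:
>
> explain
> select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
> from lineitem
>     inner join orders on l_orderkey = o_orderkey
>     inner join supplier on l_suppkey = s_suppkey
>     inner join customer on c_nationkey = s_nationkey and c_custkey = o_custkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join region on n_regionkey = r_regionkey
> where r_name = 'AMERICA'
>     and o_orderdate >= '1993-01-01' and o_orderdate < '1994-01-01'
> group by n_name
> order by revenue desc;
>
> The plan should list each table that gets built into a server-side hash
> cache, which is where the joinId in the error below comes from.)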
>
> 4). When I execute this the very first time, I get the following error:
>
> java.lang.RuntimeException:
> com.salesforce.phoenix.exception.PhoenixIOException:
> com.salesforce.phoenix.exception.PhoenixIOException: Failed after
> attempts=14, exceptions:
> Mon Feb 24 19:36:50 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:36:51 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:36:52 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:36:54 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:36:56 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:37:00 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:37:04 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:37:12 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:37:28 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId:  @2[]
> Mon Feb 24 19:38:00 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId: i? 0
> Mon Feb 24 19:39:05 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId: i? 0
> Mon Feb 24 19:40:09 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId: i? 0
> Mon Feb 24 19:41:13 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId: i? 0
> Mon Feb 24 19:42:18 EST 2014,
> org.apache.hadoop.hbase.client.ScannerCallable@53f5fcb6,
> java.io.IOException: java.io.IOException: Could not find hash cache for
> joinId: i? 0
>
>         at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
>         at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>         at sqlline.SqlLine.print(SqlLine.java:1735)
>         at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>         at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>         at sqlline.SqlLine.dispatch(SqlLine.java:821)
>         at sqlline.SqlLine.begin(SqlLine.java:699)
>         at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>         at sqlline.SqlLine.main(SqlLine.java:424)
>
> 5). When I re-execute it a second time, I get the correct result.
>
> My cluster setting is one master and three slaves. Each machine has 8
> cores and 8 GB of RAM. The 1 GB of data was distributed across the three
> slaves, and the query ran on all three machines (monitored with the top
> command on each machine).
>
> Thank you so much,
>
> David
>



-- 
Thanks,
Maryann
