phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maryann Xue <maryann....@gmail.com>
Subject Re: Inner Join not returning any results in Phoenix
Date Fri, 20 Feb 2015 16:28:11 GMT
Hi Matt,

The error you got with "Limit Rows" off might be related to insufficient
memory on region servers for one of your tables. Which is the larger table
between table1 and table2? You might want to try putting the larger table
as the first table in your join query and see if it works.

And I will quickly check if the LIMIT problem is a Phoenix bug and will
keep you posted.


Thanks,
Maryann


On Fri, Feb 20, 2015 at 11:14 AM, Matthew Johnson <matt.johnson@algomi.com>
wrote:

> Hi Abe,
>
>
>
> Glad to hear I’m not alone! Will try and figure out exactly what’s
> happening and maybe raise a Jira :-)
>
>
>
>
>
> @Constantin – I have tried with and without the “Limit Rows” – but without
> it, and without any indexes, the query runs for a while (about 10 minutes?)
> and then throws an error:
>
>
>
> *Error: Encountered exception in sub plan [0] execution.*
>
>
>
> Which I’m guessing is either HBase or Zookeeper timeout. The weird thing
> is that in standard SQL databases (eg Oracle, MySQL etc) then the “Limit
> Rows” does not affect any aggregate functions like ‘count’, because the
> actual number of result rows for a count is just 1 row (the count itself).
> But in HBase it seems that the Row Limit, as Abe mentioned, is applied to
> one of the table BEFORE it does the join, so it affects the results of the
> ‘count’ function.
>
>
>
> When I try to create my indexes so I am able to do the join without Row
> Limit, I get the following error:
>
>
>
> ERROR 1029 (42Y88): Mutable secondary indexes must have the
> hbase.regionserver.wal.codec property set to
> org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec in the
> hbase-sites.xml of every region server
>
>
>
> Which I am happy to do (will have to wait until outside of business hours
> though), but I am curious, will this have any impact on the rest of my
> cluster and could it have any unforeseen consequences?
>
>
>
> Thanks again for the input!
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Ciureanu, Constantin (GfK) [mailto:Constantin.Ciureanu@gfk.com]
> *Sent:* 20 February 2015 15:48
>
> *To:* user@phoenix.apache.org
> *Subject:* RE: Inner Join not returning any results in Phoenix
>
>
>
> Hello Matt,
>
>
>
> http://codingclues.eu/2008/the-squirrel-100-rows-problem/
>
>
>
> Can you please test again after unchecking “Contents- Limit rows” and “SQL
> – Limit rows”?
>
>
>
> [image: SQL tab]
>
>
>
> P.S. Off-topic – it’s as funny as this “problem” (not possible to send an
> email for more than 500 miles away J
> http://www.ibiblio.org/harris/500milemail.html )
>
>
>
> Regards,
>
>   Constantin
>
>
>
> *From:* Abe Weinograd [mailto:abe@flonet.com <abe@flonet.com>]
> *Sent:* Friday, February 20, 2015 4:18 PM
> *To:* user
> *Subject:* Re: Inner Join not returning any results in Phoenix
>
>
>
> Matt,
>
>
>
> I have seen this same issue.  When passing a LIMIT to a query with joins
> (most query tools do it implicitly), Phoenix seems to apply that to the
> table on the right of the join I believe.  I hadn't had a chance to play
> with it more and file a JIRA, but what you are describing is consistent
> with what I have seen.
>
>
>
> Abe
>
>
>
> On Fri, Feb 20, 2015 at 10:04 AM, Matthew Johnson <matt.johnson@algomi.com>
> wrote:
>
> Hi Constantin,
>
>
>
> Many thanks for your reply – the quotes were both of the same type (double
> quotes for table and column names, single quotes for string literals), it
> is just my email client that formatted them weirdly, sorry!
>
>
>
> I have discovered what I believe is an important piece of the puzzle to my
> problem. I am using Squirrel SQL as my JDBC client for Phoenix, and it has
> a “Limit Rows” feature. When I try and count the number of rows in a single
> table:
>
>
>
> *select count(*) from “mytable1”*
>
>
>
> I get the expected number of results (eg 20,000). But when I join two
> tables together, it seems that the “Limit Rows” from Squirrel is somehow
> being applied before the join is performed, and if “Limit Rows” is set to
> 100 I get 100 results or less. If the inner join is quite sparse (eg 20,000
> rows in a table but only 100 of these will join with a second table) then I
> believe it tries to join the first 100 it finds and returns no results. In
> my experience of Oracle or MySQL, joins are done entirely on server side
> and then you just get back the number of rows you limited, rather than what
> appears to be happening which is the row limit is applied to the first
> table before the join is attempted with the second table. Is that how
> Phoenix works?
>
>
>
> I have also discovered that I get different results (with “Limit Rows”
> turned on) depending on which order I join the tables:
>
>
>
> *SELECT count(*) FROM “mytable1” hc*
>
> *INNER JOIN “mytable2” bs*
>
> *On hc.”myId” = bs.”myId”*
>
>
>
> Gives me a very different number of results than:
>
>
>
> *SELECT count(*) FROM “mytable2” bs*
>
> *INNER JOIN “mytable1” hc*
>
> *On hc.”myId” = bs.”myId”*
>
>
>
>
>
> Unfortunately I cannot test whether I get the same number of results with
> “Limit Rows” turned off because my query times out! So I am now looking at
> creating secondary indexes on the “myId” column in both tables to see if I
> am able to do this join quicker. Does a join like this use a lot of memory
> on server side? Is something likely to be running out of resources?
>
>
>
> Many thanks again for your time.
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Ciureanu, Constantin (GfK) [mailto:Constantin.Ciureanu@gfk.com]
> *Sent:* 20 February 2015 14:40
> *To:* user@phoenix.apache.org
> *Subject:* RE: Inner Join not returning any results in Phoenix
>
>
>
> Hi Matthew,
>
>
>
> Is it working without the quotes “ / *"*  ?   (I see you are using 2
> types of quotes, weird)
>
> I guess that’s not needed, and probably causing troubles.   I don’t have
> to use quotes anyway.
>
>
>
> Alternatively check the types of data in those 2 tables (if the field
> types are not the same in both tables, the join will not work).
>
>
>
> Good luck,
>
>   Constantin
>
>
>
> *From:* Matthew Johnson [mailto:matt.johnson@algomi.com
> <matt.johnson@algomi.com>]
> *Sent:* Friday, February 20, 2015 12:54 PM
> *To:* user@phoenix.apache.org
> *Subject:* Inner Join not returning any results in Phoenix
>
>
>
> Hi guys,
>
>
>
> I’m a little bit stuck with doing an Inner Join with Phoenix. I set up one
> environment, created tables in HBase, and then created views (rather than
> tables) in Phoenix, and am able to query as expected (when I join my two
> tables I see results). I’ve just promoted to another environment, with the
> exact same setup, but my Inner Join returns no results!
>
>
>
> I run the following two individual queries:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *where hc."myId" = 'XS0'*
>
>
>
> *SELECT * FROM "mytable2" bs*
>
> *where bs."myId" = 'XS0'*
>
>
>
> And both of these queries give results. But when I run:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *INNER JOIN “mytable2” bs*
>
> *On hc."myId" = bs.”myId”*
>
>
>
> I get no results. I also get no results if I try:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *where hc."myId" in (select distinct “myId” from “mytable2”)*
>
>
>
> I have checked in HBase shell and can see the “myId” value is as expected
> (XS0 in both tables). I am not sure if there are any logs that I can look
> at to get some insight?
>
>
>
> Many thanks in advance for any suggestions!
>
>
>
> Cheers,
>
> Matt
>
>
>
> *PS* Something that may or may not be of note: In the environments I am
> using:
>
> WORKING: *hbase-0.98.8-hadoop2 / phoenix-4.2.2-server.jar*
>
> FAILING:     *hbase-0.98.9-hadoop2* / *phoenix-4.2.2-server.jar*
>
>
>

Mime
View raw message