phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Abe Weinograd <...@flonet.com>
Subject Re: Inner Join not returning any results in Phoenix
Date Fri, 20 Feb 2015 15:18:28 GMT
Matt,

I have seen this same issue.  When passing a LIMIT to a query with joins
(most query tools do it implicitly), Phoenix seems to apply that to the
table on the right of the join I believe.  I hadn't had a chance to play
with it more and file a JIRA, but what you are describing is consistent
with what I have seen.

Abe

On Fri, Feb 20, 2015 at 10:04 AM, Matthew Johnson <matt.johnson@algomi.com>
wrote:

> Hi Constantin,
>
>
>
> Many thanks for your reply – the quotes were both of the same type (double
> quotes for table and column names, single quotes for string literals), it
> is just my email client that formatted them weirdly, sorry!
>
>
>
> I have discovered what I believe is an important piece of the puzzle to my
> problem. I am using Squirrel SQL as my JDBC client for Phoenix, and it has
> a “Limit Rows” feature. When I try and count the number of rows in a single
> table:
>
>
>
> *select count(*) from “mytable1”*
>
>
>
> I get the expected number of results (eg 20,000). But when I join two
> tables together, it seems that the “Limit Rows” from Squirrel is somehow
> being applied before the join is performed, and if “Limit Rows” is set to
> 100 I get 100 results or less. If the inner join is quite sparse (eg 20,000
> rows in a table but only 100 of these will join with a second table) then I
> believe it tries to join the first 100 it finds and returns no results. In
> my experience of Oracle or MySQL, joins are done entirely on server side
> and then you just get back the number of rows you limited, rather than what
> appears to be happening which is the row limit is applied to the first
> table before the join is attempted with the second table. Is that how
> Phoenix works?
>
>
>
> I have also discovered that I get different results (with “Limit Rows”
> turned on) depending on which order I join the tables:
>
>
>
> *SELECT count(*) FROM “mytable1” hc*
>
> *INNER JOIN “mytable2” bs*
>
> *On hc.”myId” = bs.”myId”*
>
>
>
> Gives me a very different number of results than:
>
>
>
> *SELECT count(*) FROM “mytable2” bs*
>
> *INNER JOIN “mytable1” hc*
>
> *On hc.”myId” = bs.”myId”*
>
>
>
>
>
> Unfortunately I cannot test whether I get the same number of results with
> “Limit Rows” turned off because my query times out! So I am now looking at
> creating secondary indexes on the “myId” column in both tables to see if I
> am able to do this join quicker. Does a join like this use a lot of memory
> on server side? Is something likely to be running out of resources?
>
>
>
> Many thanks again for your time.
>
>
>
> Cheers,
>
> Matt
>
>
>
>
>
> *From:* Ciureanu, Constantin (GfK) [mailto:Constantin.Ciureanu@gfk.com]
> *Sent:* 20 February 2015 14:40
> *To:* user@phoenix.apache.org
> *Subject:* RE: Inner Join not returning any results in Phoenix
>
>
>
> Hi Matthew,
>
>
>
> Is it working without the quotes “ / *"*  ?   (I see you are using 2
> types of quotes, weird)
>
> I guess that’s not needed, and probably causing troubles.   I don’t have
> to use quotes anyway.
>
>
>
> Alternatively check the types of data in those 2 tables (if the field
> types are not the same in both tables, the join will not work).
>
>
>
> Good luck,
>
>   Constantin
>
>
>
> *From:* Matthew Johnson [mailto:matt.johnson@algomi.com
> <matt.johnson@algomi.com>]
> *Sent:* Friday, February 20, 2015 12:54 PM
> *To:* user@phoenix.apache.org
> *Subject:* Inner Join not returning any results in Phoenix
>
>
>
> Hi guys,
>
>
>
> I’m a little bit stuck with doing an Inner Join with Phoenix. I set up one
> environment, created tables in HBase, and then created views (rather than
> tables) in Phoenix, and am able to query as expected (when I join my two
> tables I see results). I’ve just promoted to another environment, with the
> exact same setup, but my Inner Join returns no results!
>
>
>
> I run the following two individual queries:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *where hc."myId" = 'XS0'*
>
>
>
> *SELECT * FROM "mytable2" bs*
>
> *where bs."myId" = 'XS0'*
>
>
>
> And both of these queries give results. But when I run:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *INNER JOIN “mytable2” bs*
>
> *On hc."myId" = bs.”myId”*
>
>
>
> I get no results. I also get no results if I try:
>
>
>
> *SELECT * FROM "mytable1" hc*
>
> *where hc."myId" in (select distinct “myId” from “mytable2”)*
>
>
>
> I have checked in HBase shell and can see the “myId” value is as expected
> (XS0 in both tables). I am not sure if there are any logs that I can look
> at to get some insight?
>
>
>
> Many thanks in advance for any suggestions!
>
>
>
> Cheers,
>
> Matt
>
>
>
> *PS* Something that may or may not be of note: In the environments I am
> using:
>
> WORKING: *hbase-0.98.8-hadoop2 / phoenix-4.2.2-server.jar*
>
> FAILING:     *hbase-0.98.9-hadoop2* / *phoenix-4.2.2-server.jar*
>

Mime
View raw message