Hi Gerald,

I have verified against latest Phoenix code that this problem has been fixed. I have also checked Phoenix 4.13 release tags. Looks like all versions of 4.13 packages now include that fix. Would you mind getting the latest Phoenix-4.13 package and testing it again? Thank you!


Thanks,
Maryann

On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi <gsangudi@23andme.com> wrote:
Hello,

I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN plan:

Table:

CREATE TABLE salted (

      keyA BIGINT NOT NULL,

      keyB BIGINT NOT NULL,

      val SMALLINT,

      CONSTRAINT pk PRIMARY KEY (keyA, keyB)

)

SALT_BUCKETS = 64;


EXPLAIN:

EXPLAIN

SELECT /*+ USE_SORT_MERGE_JOIN */

COUNT(*) c

FROM salted t1 JOIN salted t2

ON (t1.keyB = t2.keyB)

WHERE t1.keyA = 10

AND t2.keyA = 20;


+------------------------------------------------------------------------------+-----------------+-+

|                                     PLAN   | EST_BYTES_READ | |

+------------------------------------------------------------------------------+-----------------+-+

| SORT-MERGE-JOIN (INNER) TABLES                                               | null | |

|     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null | |

|         SERVER FILTER BY FIRST KEY ONLY                                      | null | |

|     CLIENT MERGE SORT                                                        | null | |

| AND (SKIP MERGE)                                                             | null | |

|     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20] - [63,20]  | null | |

|         SERVER FILTER BY FIRST KEY ONLY                                      | null | |

|         SERVER SORTED BY [T2.KEYB]                                           | null | |

|     CLIENT MERGE SORT                                                        | null | |

| CLIENT AGGREGATE INTO SINGLE ROW                                             | null | |

+------------------------------------------------------------------------------+-----------------+-+



In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort necessary? For both JOIN terms T1 and T2, the value of keyA, the leading part of the primary key, is fixed. Furthermore, there is no corresponding sort of T1.KEYB.

When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there an open ticket? I would be happy to file a ticket and to contribute to a fix. I would appreciate any guidance.

Thanks,
Gerald