phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gerald Sangudi <gsang...@23andme.com>
Subject Re: SORT_MERGE_JOIN on non-leading key: server-side sorting
Date Fri, 11 May 2018 17:52:10 GMT
Hi Maryann,

Thanks for verifying against latest. However, I did not detect the fix in
Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
EXPLAIN plan still shows the same issue.

Thanks,
Gerald

On Wed, May 9, 2018 at 11:51 AM, Maryann Xue <maryann.xue@gmail.com> wrote:

> Hi Gerald,
>
> I have verified against latest Phoenix code that this problem has been
> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
> versions of 4.13 packages now include that fix. Would you mind getting the
> latest Phoenix-4.13 package and testing it again? Thank you!
>
>
> Thanks,
> Maryann
>
> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi <gsangudi@23andme.com>
> wrote:
>
>> Hello,
>>
>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>> plan:
>>
>> Table:
>>
>> *CREATE TABLE salted (       keyA BIGINT NOT NULL,       keyB BIGINT NOT
>> NULL,       val SMALLINT,       CONSTRAINT pk PRIMARY KEY (keyA,
>> keyB))SALT_BUCKETS = 64;*
>> EXPLAIN:
>>
>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
>> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>>
>> *+------------------------------------------------------------------------------+-----------------+-+|
>>                                     PLAN   | EST_BYTES_READ |
>> |+------------------------------------------------------------------------------+-----------------+-+|
>> SORT-MERGE-JOIN (INNER) TABLES
>>                                               | null | ||     CLIENT
>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
>> ||         SERVER FILTER BY FIRST KEY ONLY
>>                                      | null | ||     CLIENT MERGE SORT
>>                                                        | null | || AND
>> (SKIP MERGE)                                                             |
>> null | ||     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
>> - [63,20]  | null | ||         SERVER FILTER BY FIRST KEY ONLY
>>                                      | null | ||         SERVER SORTED BY
>> [T2.KEYB]                                           | null | ||     CLIENT
>> MERGE SORT                                                        | null |
>> || CLIENT AGGREGATE INTO SINGLE ROW
>>                                             | null |
>> |+------------------------------------------------------------------------------+-----------------+-+*
>>
>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
>> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
>> part of the primary key, is fixed. Furthermore, there is no corresponding
>> sort of T1.KEYB.
>>
>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
>> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
>> an open ticket? I would be happy to file a ticket and to contribute to a
>> fix. I would appreciate any guidance.
>>
>> Thanks,
>> Gerald
>>
>>
>

Mime
View raw message