phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maryann Xue <maryann....@gmail.com>
Subject Re: SORT_MERGE_JOIN on non-leading key: server-side sorting
Date Tue, 08 May 2018 21:58:28 GMT
Hi Gerald,

Thank you for finding this issue! I think it is similar to PHOENIX-4508.
I'll verify your case on the latest Phoenix branch and see if it has been
fixed.


Thanks,
Maryann

On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi <gsangudi@23andme.com>
wrote:

> Hello,
>
> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN plan:
>
> Table:
>
> *CREATE TABLE salted (       keyA BIGINT NOT NULL,       keyB BIGINT NOT
> NULL,       val SMALLINT,       CONSTRAINT pk PRIMARY KEY (keyA,
> keyB))SALT_BUCKETS = 64;*
> EXPLAIN:
>
> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>
> *+------------------------------------------------------------------------------+-----------------+-+|
>                                     PLAN   | EST_BYTES_READ |
> |+------------------------------------------------------------------------------+-----------------+-+|
> SORT-MERGE-JOIN (INNER) TABLES
>                                               | null | ||     CLIENT
> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null |
> ||         SERVER FILTER BY FIRST KEY ONLY
>                                      | null | ||     CLIENT MERGE SORT
>                                                        | null | || AND
> (SKIP MERGE)                                                             |
> null | ||     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
> - [63,20]  | null | ||         SERVER FILTER BY FIRST KEY ONLY
>                                      | null | ||         SERVER SORTED BY
> [T2.KEYB]                                           | null | ||     CLIENT
> MERGE SORT                                                        | null |
> || CLIENT AGGREGATE INTO SINGLE ROW
>                                             | null |
> |+------------------------------------------------------------------------------+-----------------+-+*
>
> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
> part of the primary key, is fixed. Furthermore, there is no corresponding
> sort of T1.KEYB.
>
> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB nor
> T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is there
> an open ticket? I would be happy to file a ticket and to contribute to a
> fix. I would appreciate any guidance.
>
> Thanks,
> Gerald
>
>

Mime
View raw message