phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maryann Xue <maryann....@gmail.com>
Subject Re: SORT_MERGE_JOIN on non-leading key: server-side sorting
Date Wed, 16 May 2018 16:41:08 GMT
Sorry for the late response. Yes, sure, I will try it right away.
On Wed, May 16, 2018 at 9:40 AM Gerald Sangudi <gsangudi@23andme.com> wrote:

> Hi Maryann,
>
> Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior, a
> sort on the RHS. Is it possible for you to try it on 4.13.1?
>
> Thanks,
> Gerald
>
> On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi <gsangudi@23andme.com>
> wrote:
>
>> Hi Maryann,
>>
>> Thanks for verifying against latest. However, I did not detect the fix in
>> Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
>> upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
>> EXPLAIN plan still shows the same issue.
>>
>> Thanks,
>> Gerald
>>
>> On Wed, May 9, 2018 at 11:51 AM, Maryann Xue <maryann.xue@gmail.com>
>> wrote:
>>
>>> Hi Gerald,
>>>
>>> I have verified against latest Phoenix code that this problem has been
>>> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
>>> versions of 4.13 packages now include that fix. Would you mind getting the
>>> latest Phoenix-4.13 package and testing it again? Thank you!
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi <gsangudi@23andme.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>>>> plan:
>>>>
>>>> Table:
>>>>
>>>> *CREATE TABLE salted (       keyA BIGINT NOT NULL,       keyB BIGINT
>>>> NOT NULL,       val SMALLINT,       CONSTRAINT pk PRIMARY KEY (keyA,
>>>> keyB))SALT_BUCKETS = 64;*
>>>> EXPLAIN:
>>>>
>>>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1 JOIN
>>>> salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA = 20;*
>>>>
>>>> *+------------------------------------------------------------------------------+-----------------+-+|
>>>>                                     PLAN   | EST_BYTES_READ |
>>>> |+------------------------------------------------------------------------------+-----------------+-+|
>>>> SORT-MERGE-JOIN (INNER) TABLES
>>>>                                               | null | ||     CLIENT
>>>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]  | null
|
>>>> ||         SERVER FILTER BY FIRST KEY ONLY
>>>>                                      | null | ||     CLIENT MERGE SORT
>>>>                                                        | null | || AND
>>>> (SKIP MERGE)                                                            
|
>>>> null | ||     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,20]
>>>> - [63,20]  | null | ||         SERVER FILTER BY FIRST KEY ONLY
>>>>                                      | null | ||         SERVER SORTED BY
>>>> [T2.KEYB]                                           | null | ||     CLIENT
>>>> MERGE SORT                                                        | null
|
>>>> || CLIENT AGGREGATE INTO SINGLE ROW
>>>>                                             | null |
>>>> |+------------------------------------------------------------------------------+-----------------+-+*
>>>>
>>>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this sort
>>>> necessary? For both JOIN terms T1 and T2, the value of keyA, the leading
>>>> part of the primary key, is fixed. Furthermore, there is no corresponding
>>>> sort of T1.KEYB.
>>>>
>>>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB
>>>> nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If so, is
>>>> there an open ticket? I would be happy to file a ticket and to contribute
>>>> to a fix. I would appreciate any guidance.
>>>>
>>>> Thanks,
>>>> Gerald
>>>>
>>>>
>>>
>>
>

Mime
View raw message