phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gerald Sangudi <gsang...@23andme.com>
Subject Re: SORT_MERGE_JOIN on non-leading key: server-side sorting
Date Thu, 17 May 2018 22:48:18 GMT
Hi Maryann,

Would you mind sharing the EXPLAIN output you are getting?

I'm not able to find a download for phoenix-4.13.2-HBase or 4.14.0-HBase.
The *-cdh downloads do not work with Amazon EMR. I tried building against
the 4.14.0-rc0 tag. This produced 4.14.0 for phoenix-core.jar, but all the
other jars produced are 4.13.1, including the client jar. When I deploy
these jars on EMR and run queries via sqlline, I get a NoClassDefFoundError.

Are the plans to provide 4.13.2-HBase as a public download?

Thanks,
Gerald

On Wed, May 16, 2018 at 10:40 AM, Maryann Xue <maryann.xue@gmail.com> wrote:

> Hi Gerald,
>
> I checked again. Unfortunately this fix is included with 4.13.2 but not
> 4.13.1. Would you mind upgrading your library to 4.13.2?
>
> Thanks,
> Maryann
>
> On Wed, May 16, 2018 at 9:41 AM, Maryann Xue <maryann.xue@gmail.com>
> wrote:
>
>> Sorry for the late response. Yes, sure, I will try it right away.
>>
>> On Wed, May 16, 2018 at 9:40 AM Gerald Sangudi <gsangudi@23andme.com>
>> wrote:
>>
>>> Hi Maryann,
>>>
>>> Following up again -- I ran EXPLAIN on 4.13.1 and saw the same behavior,
>>> a sort on the RHS. Is it possible for you to try it on 4.13.1?
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Fri, May 11, 2018 at 10:52 AM, Gerald Sangudi <gsangudi@23andme.com>
>>> wrote:
>>>
>>>> Hi Maryann,
>>>>
>>>> Thanks for verifying against latest. However, I did not detect the fix
>>>> in Phoenix 4.13.1. AWS EMR currently provides Phoenix 4.13.0. I manually
>>>> upgraded an AWS EMR cluster to Phoenix 4.13.1 and rebooted the cluster. The
>>>> EXPLAIN plan still shows the same issue.
>>>>
>>>> Thanks,
>>>> Gerald
>>>>
>>>> On Wed, May 9, 2018 at 11:51 AM, Maryann Xue <maryann.xue@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Gerald,
>>>>>
>>>>> I have verified against latest Phoenix code that this problem has been
>>>>> fixed. I have also checked Phoenix 4.13 release tags. Looks like all
>>>>> versions of 4.13 packages now include that fix. Would you mind getting
the
>>>>> latest Phoenix-4.13 package and testing it again? Thank you!
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>> On Tue, May 8, 2018 at 12:24 PM, Gerald Sangudi <gsangudi@23andme.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm running Phoenix 4.13 on AWS EMR and getting the following EXPLAIN
>>>>>> plan:
>>>>>>
>>>>>> Table:
>>>>>>
>>>>>> *CREATE TABLE salted (       keyA BIGINT NOT NULL,       keyB BIGINT
>>>>>> NOT NULL,       val SMALLINT,       CONSTRAINT pk PRIMARY KEY (keyA,
>>>>>> keyB))SALT_BUCKETS = 64;*
>>>>>> EXPLAIN:
>>>>>>
>>>>>> *EXPLAINSELECT /*+ USE_SORT_MERGE_JOIN */COUNT(*) cFROM salted t1
>>>>>> JOIN salted t2ON (t1.keyB = t2.keyB)WHERE t1.keyA = 10AND t2.keyA
= 20;*
>>>>>>
>>>>>> *+------------------------------------------------------------------------------+-----------------+-+|
>>>>>>                                     PLAN   | EST_BYTES_READ |
>>>>>> |+------------------------------------------------------------------------------+-----------------+-+|
>>>>>> SORT-MERGE-JOIN (INNER) TABLES
>>>>>>                                               | null | ||     CLIENT
>>>>>> 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED [0,10] - [63,10]
 | null |
>>>>>> ||         SERVER FILTER BY FIRST KEY ONLY
>>>>>>                                      | null | ||     CLIENT MERGE
SORT
>>>>>>                                                        | null | ||
AND
>>>>>> (SKIP MERGE)                                                    
        |
>>>>>> null | ||     CLIENT 64-CHUNK PARALLEL 64-WAY RANGE SCAN OVER SALTED
[0,20]
>>>>>> - [63,20]  | null | ||         SERVER FILTER BY FIRST KEY ONLY
>>>>>>                                      | null | ||         SERVER SORTED
BY
>>>>>> [T2.KEYB]                                           | null | || 
   CLIENT
>>>>>> MERGE SORT                                                      
 | null |
>>>>>> || CLIENT AGGREGATE INTO SINGLE ROW
>>>>>>                                             | null |
>>>>>> |+------------------------------------------------------------------------------+-----------------+-+*
>>>>>>
>>>>>> In the EXPLAIN output, I see "SERVER SORTED BY [T2.KEYB]". Is this
>>>>>> sort necessary? For both JOIN terms T1 and T2, the value of keyA,
the
>>>>>> leading part of the primary key, is fixed. Furthermore, there is
no
>>>>>> corresponding sort of T1.KEYB.
>>>>>>
>>>>>> When I EXPLAIN the same query on a non-salted table, neither T1.KEYB
>>>>>> nor T2.KEYB is re-sorted. I'm hoping the sort is unnecessary. If
so, is
>>>>>> there an open ticket? I would be happy to file a ticket and to contribute
>>>>>> to a fix. I would appreciate any guidance.
>>>>>>
>>>>>> Thanks,
>>>>>> Gerald
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

Mime
View raw message