phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ethan <ew...@apache.org>
Subject Re: Query optimization
Date Fri, 22 Dec 2017 20:44:16 GMT
Hello Flavio,

From the plan looks like to me the second query is doing the filter at parent table (PEOPLE).
So what is the size of your PEOPLE and MYTABLE (after filtered) respectively?

For sort merge join, anyone knows are the both sides get shipped to client to do the merge
sort?

Thanks,


On December 22, 2017 at 9:58:30 AM, Flavio Pompermaier (pompermaier@okkam.it) wrote:

Any help here...?

On 20 Dec 2017 17:58, "Flavio Pompermaier" <pompermaier@okkam.it> wrote:
Hi to all,
I'm trying to find the best query for my use case but I found that one version work and the
other one does not (unless that I don't apply some tuning to timeouts etc like explained in
[1]).

The 2 queries extract the same data but, while the first query terminates the second does
not.
PS:  without the USE_SORT_MERGE_JOIN both queries weren't working

SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*) 
FROM PEOPLE ds JOIN MYTABLE l ON ds.PERSON_ID = l.LOCALID
WHERE l.EID IS NULL AND l.DSID = 'PEOPLE' AND l.HAS_CANDIDATES = FALSE;
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                     PLAN   
                                                  | EST_BYTES_READ 
| EST_ROWS_READ  |  EST_INFO_TS   |
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| SORT-MERGE-JOIN (INNER) TABLES                                       
                                        | 14155777900     | 12077867 
     | 1513754378759  |
|     CLIENT 42-CHUNK 6168903 ROWS 11324622221 BYTES PARALLEL 3-WAY FULL SCAN OVER PEOPLE 
               | 14155777900     | 12077867       | 1513754378759  |
|         SERVER FILTER BY FIRST KEY ONLY                             
                                         | 14155777900     | 12077867 
     | 1513754378759  |
|     CLIENT MERGE SORT                                           
                                             | 14155777900     |
12077867       | 1513754378759  |
| AND (SKIP MERGE)                                               
                                              | 14155777900     |
12077867       | 1513754378759  |
|     CLIENT 15-CHUNK 5908964 ROWS 2831155679 BYTES PARALLEL 15-WAY RANGE SCAN OVER MYTABLE [0]
- [2]  | 14155777900     | 12077867       | 1513754378759  |
|         SERVER FILTER BY (EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = false) 
                 | 14155777900     | 12077867       | 1513754378759  |
|         SERVER SORTED BY [L.LOCALID]                               
                                              | 14155777900     |
12077867       | 1513754378759  |
|     CLIENT MERGE SORT                                           
                                             | 14155777900     |
12077867       | 1513754378759  |
| CLIENT AGGREGATE INTO SINGLE ROW                                     
                                        | 14155777900     | 12077867 
     | 1513754378759  |
+---------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+


SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(*) 
FROM (SELECT LOCALID FROM MYTABLE
WHERE EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = FALSE) l JOIN PEOPLE  ds ON ds.PERSON_ID
= l.LOCALID;
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                     PLAN   
                                                 | EST_BYTES_READ 
| EST_ROWS_READ  |  EST_INFO_TS   |
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| SORT-MERGE-JOIN (INNER) TABLES                                       
                                       | 14155777900     | 12077867 
     | 1513754378759  |
|     CLIENT 15-CHUNK 5908964 ROWS 2831155679 BYTES PARALLEL 3-WAY RANGE SCAN OVER MYTABLE [0]
- [2]  | 14155777900     | 12077867       | 1513754378759  |
|         SERVER FILTER BY (EID IS NULL AND DSID = 'PEOPLE' AND HAS_CANDIDATES = false) 
                | 14155777900     | 12077867       | 1513754378759  |
|     CLIENT MERGE SORT                                           
                                            | 14155777900     | 12077867 
     | 1513754378759  |
| AND (SKIP MERGE)                                               
                                             | 14155777900     |
12077867       | 1513754378759  |
|     CLIENT 42-CHUNK 6168903 ROWS 11324622221 BYTES PARALLEL 42-WAY FULL SCAN OVER PEOPLE 
             | 14155777900     | 12077867       | 1513754378759  |
|         SERVER FILTER BY FIRST KEY ONLY                             
                                        | 14155777900     | 12077867 
     | 1513754378759  |
|         SERVER SORTED BY [DS.PERSON_ID]                             
                                      | 14155777900     | 12077867 
     | 1513754378759  |
|     CLIENT MERGE SORT                                           
                                            | 14155777900     | 12077867 
     | 1513754378759  |
| CLIENT AGGREGATE INTO SINGLE ROW                                     
                                       | 14155777900     | 12077867 
     | 1513754378759  |
+--------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
10 rows selected (0.041 seconds)

What do you think? Whould I try to give more resources to HBase/Phoenix or is the first query
the best (and reliable) one?
Any insight about this is highly appreciated..

Best,
Flavio
Mime
View raw message