Hello,

Any guidance or thoughts on the thread below?

Thanks,
Gerald

On Fri, May 18, 2018 at 11:39 AM, Gerald Sangudi <gsangudi@23andme.com> wrote:
Maryann,

Can Phoenix provide hash aggregation on the client side? Are there design / implementation reasons not to, or should I file a ticket for this?

Thanks,
Gerald

On Fri, May 18, 2018 at 11:29 AM, Maryann Xue <maryann.xue@gmail.com> wrote:
Hi Gerald,

Phoenix does have hash aggregation. The reason why sort-based aggregation is used in your query plan is that the aggregation happens on the client side. And that is because sort-merge join is used (as hinted) which is a client driven join, and after that join stage all operations can only be on the client-side.


Thanks,
Marynn

On Fri, May 18, 2018 at 10:57 AM, Gerald Sangudi <gsangudi@23andme.com> wrote:
Hello,

Does Phoenix provide hash aggregation? If not, is it on the roadmap, or should I file a ticket? We have aggregation queries that do not require sorted results.

For example, this EXPLAIN plan shows a CLIENT SORT.

CREATE TABLE unsalted (

      keyA BIGINT NOT NULL,

      keyB BIGINT NOT NULL,

      val SMALLINT,

      CONSTRAINT pk PRIMARY KEY (keyA, keyB)

);


EXPLAIN

SELECT /*+ USE_SORT_MERGE_JOIN */

t1.val v1, t2.val v2, COUNT(*) c

FROM unsalted t1 JOIN unsalted t2

ON (t1.keyA = t2.keyA)

GROUP BY t1.val, t2.val;


+------------------------------------------------------------+-----------------+----------------+--+

|                            PLAN   | EST_BYTES_READ | EST_ROWS_READ  | |

+------------------------------------------------------------+-----------------+----------------+--+

| SORT-MERGE-JOIN (INNER) TABLES                             | null | null | |

|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED  | null | null | |

| AND                                                        | null | null | |

|     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED  | null | null | |

| CLIENT SORTED BY [TO_DECIMAL(T1.VAL), T2.VAL]              | null | null | |

| CLIENT AGGREGATE INTO DISTINCT ROWS BY [T1.VAL, T2.VAL]    | null | null | |

+------------------------------------------------------------+-----------------+----------------+--+


Thanks,
Gerald