phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From NaHeon Kim <honey.and...@gmail.com>
Subject Re: Select date range performance issue
Date Fri, 24 Feb 2017 02:59:28 GMT
Thanks for suggestion.

Here's further questions:
1. create_dt (not obj_id, I think you confused) would have large sets of
date, so SKIP_SCAN hint might be not useful.

2. I created secondary index on create_dt
       create index IDX1_CREATE_DT on MY_TABLE(CREATE_DT;

However, EXPLAIN still shows query plan of FULL SCAN.
Giving index hint on SELECT doesn't work as well.

CLIENT 2-CHUNK 0 ROWS 0 BYTES PARALLEL 2-WAY FULL SCAN OVER
NEWS_KEYWORD_COUNT
   SERVER FILTER BY (CREATE_DT > TIMESTAMP '2017-03-01 00:00:00.000' AND
CREATE_DT < TIMESTAMP '2017-04-01 00:00:00.000')
   SERVER AGGREGATE INTO DISTINCT ROWS BY [KEYWORD]
CLIENT MERGE SORT
CLIENT 100 ROW LIMIT

3. ROW_TIMESTAMP is time of current query execution time, right?
Then it's not a right choice. :-(


2017-02-24 1:54 GMT+09:00 Jonathan Leech <jonathaz@gmail.com>:

> If there are not a large number of distinct values of obj_id, try a
> SKIP_SCAN hint. Otherwise, the secondary index should work, make sure it's
> actually used via explain. Finally, you might try the ROW_TIMESTAMP feature
> if it fits your use case.
>
> On Feb 22, 2017, at 11:30 PM, NaHeon Kim <honey.and.sw@gmail.com> wrote:
>
> Hi all,
> I've seen performance problem when selecting rows within date range.
>
> My table schema is:
>
> CREATE TABLE MY_TABLE (
>    OBJ_ID varchar(20) not null,
>    CREATE_DT timestamp not null,
>    KEYWORD varchar(100) not null,
>    COUNT integer,
>    CONSTRAINT PK PRIMARY KEY (OBJ_ID,CREATE_DT,KEYWORD)
> );
>
> MY_TABLE has almost 5,200,000 rows,
> CREATE_DT has about 6 months range.
>
> And executed query:
>
> SELECT KEYWORD, SUM(COUNT)
> FROM MY_TABLE
> WHERE CREATE_DT > to_timestamp('2016-03-01 00:00')
> AND CREATE_DT < to_timestamp('2016-04-01 00:00')
> GROUP BY KEYWORD;
>
> It tooks 46 seconds, too slow than expected, cause CREATE_DT is one of row
> key.
> I created a secondary index on CREATE_DT but there's no improvement.
>
> Query plan looks weird:
> CLIENT 2-CHUNK 0 ROWS 0 BYTES PARALLEL 2-WAY FULL SCAN OVER
> NEWS_KEYWORD_COUNT
>      SERVER FILTER BY (CREATE_DT > TIMESTAMP '2017-03-01 00:00:00.000' AND
> CREATE_DT < TIMESTAMP '2017-04-01 00:00:00.000')
>      SERVER AGGREGATE INTO DISTINCT ROWS BY [KEYWORD]
> CLIENT MERGE SORT
> CLIENT 100 ROW LIMIT
>
> BUT If CREATE_DT comes first of row key, plan says range scan will be done.
>
> Any suggestion? : )
>
> Thanks,
> NaHeon
>
>

Mime
View raw message