phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergey Soldatov <sergeysolda...@gmail.com>
Subject Re: hint to use a global index is not working - need to find out why
Date Thu, 19 Apr 2018 21:19:19 GMT
That looks strange. Could you please provide full DDLs for table and
indexes? I just tried a similar scenario and obviously index is used:

0: jdbc:phoenix:> create table VARIANTJOIN_RTSALTED24 (id integer primary
key, chrom_int integer, genomic_range integer);
No rows affected (6.339 seconds)
0: jdbc:phoenix:>    create index jv2_chrom_int on VARIANTJOIN_RTSALTED24
(chrom_int);
No rows affected (10.016 seconds)
0: jdbc:phoenix:> explain SELECT/*+ INDEX(VJ jv2_chrom_int) */
VJ.chrom_int, genomic_range  FROM VARIANTJOIN_RTSALTED24 as VJ WHERE
(chrom_int =18 ) limit 5;
+---------------------------------------------------------------------------------------+
|                                         PLAN
            |
+---------------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER
VARIANTJOIN_RTSALTED24       |
| CLIENT 5 ROW LIMIT
            |
|     SKIP-SCAN-JOIN TABLE 0
            |
|         CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER
JV2_CHROM_INT [18]  |
|             SERVER FILTER BY FIRST KEY ONLY
             |
|     DYNAMIC SERVER FILTER BY "VJ.ID" IN ($2.$4)
             |
|     JOIN-SCANNER 5 ROW LIMIT
            |
+---------------------------------------------------------------------------------------+
7 rows selected (0.936 seconds)


Thanks,
Sergey

On Thu, Apr 19, 2018 at 7:31 PM, Taylor, Ronald (Ronald) <
Ronald.Taylor@cchmc.org> wrote:

> Hello Phoenix users,
>
> I am a novice Phoenix user and this is my first post to this user list. I
> did some searching in the list archives, but could not find an answer to
> what I hope is a simple question: my global index is being ignored, even
> after I add a Hint, and I want to know why.
>
> We are using Phoenix 4.7 in the Hortonworks distribution. Looks like
> Hortonworks has been backporting at least some phoenix updates into their
> version of phoenix 4.7, so I guess it is a custom distribution. See
>
>
>
>      https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/
> bk_release-notes/content/patch_phoenix.html
>
>
>
> I have created a simple table of about 8 million rows, and about 15
> columns, with several fields having global indexes. I created the main
> table (variantjoin_rtsalted24) and its indexes, and then used a bulk loader
> to populate them from a tab-delimited file. That appeared to work fine.
>
> chrom_int is one field on which there is a global index, named
> vj2_chrom_int. And you can see the index being automatically being used
> below, where it is the only field being returned. Time required is 0.124
> sec.
>
> 0: jdbc:phoenix:> SELECT VJ.chrom_int  FROM VARIANTJOIN_RTSALTED24 as VJ
> WHERE (chrom_int =18 ) limit 5;
>
> +------------+
>
> | CHROM_INT  |
>
> +------------+
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> +------------+
>
> 5 rows selected (0.124 seconds)
>
> 0: jdbc:phoenix:>
>
> You can see that the vj2_chrom_int index is automatically being used, as I
> understand things  by the "RANGE SCAN" wording and "[0,1" in the explain
> plan:
>
> 0: jdbc:phoenix:> explain SELECT VJ.chrom_int  FROM VARIANTJOIN_RTSALTED24
> as VJ WHERE (chrom_int =18 ) limit 5;
>
> +-----------------------------------------------------------
> -------------------+
>
> |                                           PLAN
> |
>
> +-----------------------------------------------------------
> -------------------+
>
> | CLIENT 24-CHUNK SERIAL 24-WAY ROUND ROBIN RANGE SCAN OVER VJ2_CHROM_INT
> [0,1 |
>
> |     SERVER FILTER BY FIRST KEY ONLY
> |
>
> |     SERVER 5 ROW LIMIT
> |
>
> | CLIENT 5 ROW LIMIT
>                                   |
>
> +-----------------------------------------------------------
> -------------------+
>
> 4 rows selected (0.043 seconds)
>
> 0: jdbc:phoenix:>
>
>
> I can use a Hint to tell Phoenix to NOT use this index, as seen below. And
> that increases the time needed to 1.97 sec, over an order of magnitude more
> time than the 0.124 sec required with index use.
>
> 0: jdbc:phoenix:> SELECT /*+ NO_INDEX */ VJ.chrom_int  FROM
> VARIANTJOIN_RTSALTED24 as VJ WHERE (chrom_int =18 ) limit 5;
>
> +------------+
>
> | CHROM_INT  |
>
> +------------+
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> | 18         |
>
> +------------+
>
> 5 rows selected (1.977 seconds)
>
> 0: jdbc:phoenix:>
>
> And here is the explain plan for that:
>
>
> 0: jdbc:phoenix:> explain SELECT /*+ NO_INDEX */ VJ.chrom_int  FROM
> VARIANTJOIN_RTSALTED24 as VJ WHERE (chrom_int =18 ) limit 5;
>
> +-----------------------------------------------------------
> -------------------+
>
> |
> PLAN                  |
>
> +-----------------------------------------------------------
> -------------------+
>
> | CLIENT 72-CHUNK 14325202 ROWS 15099524157 BYTES PARALLEL 24-WAY ROUND
> ROBIN  |
>
> |     SERVER FILTER BY CHROM_INT = 18
> |
>
> |     SERVER 5 ROW LIMIT
> |
>
> | CLIENT 5 ROW LIMIT
>                                   |
>
> +-----------------------------------------------------------
> -------------------+
>
> 4 rows selected (0.009 seconds)
>
> Now,  I want to add other fields for retrieval. For example,
> "genomic_range". The Phoenix documentation says in such a case I must add a
> Hint to force Phoenix to make use of the index (since it is a simple global
> index, not a covered index wherein genomic_range has been added.) So I
> tried that. See below. Alas, the response time is about the same as what I
> get with NO_INDEX.  It appears that, even with the Hint, the index is not
> being used.
>
> 0: jdbc:phoenix:> SELECT/*+ INDEX(VJ jv2_chrom_int) */ VJ.chrom_int,
> genomic_range  FROM VARIANTJOIN_RTSALTED24 as VJ WHERE (chrom_int =18 )
> limit 5;
> +------------+----------------------+
> | CHROM_INT  |    GENOMIC_RANGE     |
> +------------+----------------------+
> | 18         | [49546,49547)        |
> | 18         | [20003625,20003626)  |
> | 18         | [19618749,19618752)  |
> | 18         | [47561,47583)        |
> | 18         | [20024261,20024272)  |
> +------------+----------------------+
> 5 rows selected (1.799 seconds)
> 0: jdbc:phoenix:>
>
>
> And below is the explain plan for the query with the index failure. No
> indication of index use, that I can tell.
>
> 0: jdbc:phoenix:> explain SELECT/*+ INDEX(VJ jv2_chrom_int) */
> VJ.chrom_int, genomic_range  FROM VARIANTJOIN_RTSALTED24 as VJ WHERE
> (chrom_int =18 ) limit 5;
>
> +-----------------------------------------------------------
> -------------------+
> |
> PLAN                  |
> +-----------------------------------------------------------
> -------------------+
> | CLIENT 72-CHUNK 14325202 ROWS 15099524157 BYTES PARALLEL 24-WAY ROUND
> ROBIN  |
> |     SERVER FILTER BY CHROM_INT = 18
> |
> |     SERVER 5 ROW LIMIT
> |
> | CLIENT 5 ROW LIMIT
>                                   |
> +-----------------------------------------------------------
> -------------------+
> 4 rows selected (0.042 seconds)
> 0: jdbc:phoenix:>
>
> So I am puzzled and frustrated. How do I get my index Hint to work?  The
> difference in timing between automatic use (when the index is the only
> field being retrieved) and when NO_INDEX is used tells me that the index
> table is there and can indeed be used. But something is going wrong when I
> try to force its use via a Hint. Guidance would be very much appreciated on
> this basic point.
>
>   - Ron Taylor
>
> Ronald C. Taylor, Ph.D.
> Divisions of Immunobiology and Biomedical Informatics
>
> Cincinnati Children's Hospital Medical Center
>
> Office phone: 513-803-4880
>
> Cell phone:     509-783-7308
>
> Email: ronald.taylor@cchmc.org
>
>
>
>
>

Mime
View raw message