phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Phoenix ignoring index and index hint with some queries over mapped hbase tables.
Date Fri, 03 Mar 2017 18:52:54 GMT
Hi Marcin,
There's a few things going on here:
1) An index created on a view over HBase data won't be maintained by
Phoenix. You'd need to maintain it yourself through some other external
means. If you create a table that maps to your HBase data, then it will be
maintained.
2) An index would only be used if you match against a constant on the
right-hand side (while you're matching against the "number" column). For
example, the following query would use the index and limit the scan to only
rows in which "number" starts with 'Queen':

0: jdbc:phoenix:> explain select * from "traces" where
regexp_substr("number", 'Q.*') = 'Queen';
+------------------------------------------------------------------------------+
|                                            PLAN
   |
+------------------------------------------------------------------------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER traces
[1,'Queen'] |
|     SERVER FILTER BY FIRST KEY ONLY AND REGEXP_SUBSTR("number", 'Q.*', 1)
=  |
+------------------------------------------------------------------------------+

Note that with local indexes, interpreting when the index is used is a bit
subtle, but you'll see a [1, ...] after the RANGE SCAN as an indication.
The index may be used if it's a full table scan (since the data contained
in the index table may be smaller than that in the data table), but that
won't buying you very much.

3) The index would only be used if your REGEXP_SUBSTR has a constant string
before any wildcard matches in the second argument. You could also
potentially use a function index [1], but it'd only use the index if the
REGEXP_SUBSTR makes the same exact call (i.e. same pattern argument) as was
used when the functional index was created.

HTH.  Thanks,

    James

[1] https://phoenix.apache.org/secondary_indexing.html#Functional_Indexes

On Fri, Mar 3, 2017 at 12:59 AM, Marcin Januszkiewicz <
katamaran666@gmail.com> wrote:

> Hi,
>
> I have a table in hbase and created a view of it in phoenix, along
> with a local index:
>
> create view "traces" (rowkey VARCHAR PRIMARY KEY, "cf"."time" VARCHAR,
> "cf"."number" VARCHAR, "cf"."class" VARCHAR);
> create local index idxf on "dmstab_vehicle_traces" ("cf"."number",
> "cf"."class", rowkey);
>
> I need to filter rows based on a regex condition on the "number"
> column, so I use queries like these:
>
> explain select * from "traces" where regexp_substr("number", 'Q.*') =
> "number";
>
> CLIENT 2-CHUNK PARALLEL 2-WAY ROUND ROBIN RANGE SCAN OVER traces [2]
>     SERVER FILTER BY FIRST KEY ONLY AND REGEXP_SUBSTR("number", 'Q.*',
> 1) = "number"
>
>
> It's a little ugly and less efficient than using LIKE, but the
> performance is still relatively acceptable thanks to the index.
>
> However, if I want to range of rowkeys to include, Phoenix stops using
> the index, which slows down the query significantly:
>
> explain select * from "traces" where regexp_substr("number", 'Q.*') =
> "number" and rowkey < 'BY';
>
> CLIENT 4-CHUNK 687413 ROWS 629145681 BYTES PARALLEL 1-WAY ROUND ROBIN
> RANGE SCAN OVER traces [*] - ['BY']
>     SERVER FILTER BY REGEXP_SUBSTR(cf."number", 'Q.*', 1) = cf."number"
>
> Using an index hint doesn't change anything. Is there a way to make
> this work, and is this a bug?
>

Mime
View raw message