phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Justin Workman <justinjwork...@gmail.com>
Subject Re: Indexes
Date Thu, 30 Jan 2014 16:50:08 GMT
I don't think that is the issue we are hitting. Details are below. The
Hbase table does have more columns than we are defining the phoenix table.
We were hoping to just be able to use the dynamic column features for
if/when we need to access data in other columns in the underlying table. As
you can see from the output of the explain statement below, a simple query
does not use the index.

However if I create another identical table using Phoenix and upsert into
that new table from the table below, create the same index on that table
and then run the same select query, it does use the index on that table.

So I am still very confused as to why the index is not invoked when the
table is created on top of an existing Hbase table.

Hbase Create Table
> create 'SEO.KEYWORDIDEAS', { NAME=>'keyword', COMPRESSION=>'SNAPPY' }

Phoenix Create Table
CREATE TABLE SEO.KEYWORDIDEAS (
    "pk" VARCHAR PRIMARY KEY,
    "keyword"."jobId" VARCHAR,
    "keyword"."jobName" VARCHAR,
    "keyword"."jobType" VARCHAR,
    "keyword"."keywordText" VARCHAR,
    "keyword"."parentKeywordText" VARCHAR,
    "keyword"."refinementName" VARCHAR,
    "keyword"."refinementValue" VARCHAR,
    "keyword"."relatedKeywordRank" VARCHAR
    ) IMMUTABLE_ROWS=true;

Create Index
CREATE INDEX KWDIDX ON SEO.KEYWORDIDEAS ("parentKeywordText");

Show and count indexes
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | NON_UNIQUE | INDEX_QUALIFIER |
INDEX_NAME | TYPE | ORDINAL_POSITION | COLUMN_NAME | ASC_OR_DESC |
CARDINALITY | PAGES  | FILTER_CONDITION | DATA_TYPE | TYPE_NAME |
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+
| null      | SEO         | KEYWORDIDEAS | true       | null            |
KWDIDX     | 3    | 1                | keyword:parentKeywordText | A
    | null        | null   | null             | |
| null      | SEO         | KEYWORDIDEAS | true       | null            |
KWDIDX     | 3    | 2                | :pk         | A           | null
   | null   | null             | 12        | V |
| null      | SEO         | KEYWORDIDEAS | true       | null            |
RA_TEST_ID | 3    | 1                | keyword:jobId | A           | null
     | null   | null             | 12        | |
| null      | SEO         | KEYWORDIDEAS | true       | null            |
RA_TEST_ID | 3    | 2                | :pk         | A           | null
   | null   | null             | 12        | V |
+-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+

> select count(1) from seo.keywordideas;
+----------+
| COUNT(1) |
+----------+
| 423229   |
+----------+
> select count(1) from seo.kwdidx;
+----------+
| COUNT(1) |
+----------+
| 423229   |
+----------+

> explain select count(1) from seo.keywords where "parentKeywordText" =
'table';
+------------+
|    PLAN    |
+------------+
| CLIENT PARALLEL 18-WAY FULL SCAN OVER SEO.KEYWORDIDEAS |
|     SERVER FILTER BY keyword.parentKeywordText = 'sheets' |
|     SERVER AGGREGATE INTO SINGLE ROW |
+------------+

Now here is where I can get the index to be invoked.
> CREATE TABLE SEO.NEW_KEYWORDIDEAS (
    PK VARCHAR PRIMARY KEY,
    JOB_ID VARCHAR
    JOB_NAME VARCHAR,
    JOB_TYPE VARCHAR,
    KEYWORD_TEXT VARCHAR,
    PARENT_KEYWORD_TEXT VARCHAR,
    REFINEMENT_NAME VARCHAR,
    REFINEMENT_VALUE VARCHAR,
    RELATED_KEYWORD_RANK VARCHAR
    ) IMMUTABLE_ROWS=true;

> UPSERT INTO SEO.NEW_KEYWORDIEAS SELECT * FROM SEO.KEYWORDIDEAS;

> CREATE INDEX NEW_KWD_IDX ON SEO.NEW_KEYWORDIDEAS (PARENT_KEYWORD_TEXT);

> explain select count(1) from seo.new_keywordideas where
parent_keyword_text = 'table';

+------------+

|    PLAN    |

+------------+

| CLIENT PARALLEL 1-WAY RANGE SCAN OVER SEO.NEW_KWD_IDX ['table'] |

|     SERVER AGGREGATE INTO SINGLE ROW |





On Wed, Jan 29, 2014 at 5:21 PM, James Taylor <jamestaylor@apache.org>wrote:

> Hi Justin,
> Please take a look at this FAQ:
> http://phoenix.incubator.apache.org/faq.html#/Why_isnnullt_my_secondary_index_being_used
>
> If that's not the case for you, can you include your CREATE TABLE, CREATE
> INDEX, SELECT statement, and EXPLAIN plan?
>
> Thanks,
> James
>
>
> On Wed, Jan 29, 2014 at 4:13 PM, Justin Workman <justinjworkman@gmail.com>wrote:
>
>> I am seeing some odd behavior with indexes and want some clarification on
>> how they are used.
>>
>> When I create an table in phoenix on top of an existing Hbase table, and
>> then create an index on this table, I can see the index get built and
>> populated properly, however no queries show that they are using this index
>> when I run an explain on the query.
>>
>> However, if I create an seperete table in Phoenix and do an upsert from
>> my hbase table into the new table that I created, and create the same index
>> as on the previous table. Then my queries show that they would use the
>> index when running them through the explain plan.
>>
>> Are we not able to create or use an index on a table we create over an
>> exiting Hbase table, or am I doing something wrong?
>>
>> Thanks in advance for any help.
>> Justin
>>
>
>

Mime
View raw message