phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Indexes
Date Thu, 30 Jan 2014 20:24:52 GMT
Thanks for all the detail, Justin. Based on this, it looks like a bug
related to using case sensitive column names. Maryann checked in a fix
related to this, so it might be fixed in the latest on master.

If it's not fixed, would you mind filing a JIRA?

FWIW, you may want to consider a shorter column family name, like "k" or
"kw" as that'll make your table smaller. Also, did you know you can provide
your HBase table and column family config parameters in your CREATE TABLE
statement and it'll create the HBase table and the column families, like
below?

CREATE TABLE SEO.KEYWORDIDEAS (
    "pk" VARCHAR PRIMARY KEY,
    "keyword"."jobId" VARCHAR,
    "keyword"."jobName" VARCHAR,
    "keyword"."jobType" VARCHAR,
    "keyword"."keywordText" VARCHAR,
    "keyword"."parentKeywordText" VARCHAR,
    "keyword"."refinementName" VARCHAR,
    "keyword"."refinementValue" VARCHAR,
    "keyword"."relatedKeywordRank" VARCHAR
    ) IMMUTABLE_ROWS=true, COMPRESSION='SNAPPY' ;




On Thu, Jan 30, 2014 at 8:50 AM, Justin Workman <justinjworkman@gmail.com>wrote:

> I don't think that is the issue we are hitting. Details are below. The
> Hbase table does have more columns than we are defining the phoenix table.
> We were hoping to just be able to use the dynamic column features for
> if/when we need to access data in other columns in the underlying table. As
> you can see from the output of the explain statement below, a simple query
> does not use the index.
>
> However if I create another identical table using Phoenix and upsert into
> that new table from the table below, create the same index on that table
> and then run the same select query, it does use the index on that table.
>
> So I am still very confused as to why the index is not invoked when the
> table is created on top of an existing Hbase table.
>
> Hbase Create Table
> > create 'SEO.KEYWORDIDEAS', { NAME=>'keyword', COMPRESSION=>'SNAPPY' }
>
> Phoenix Create Table
> CREATE TABLE SEO.KEYWORDIDEAS (
>     "pk" VARCHAR PRIMARY KEY,
>     "keyword"."jobId" VARCHAR,
>     "keyword"."jobName" VARCHAR,
>     "keyword"."jobType" VARCHAR,
>     "keyword"."keywordText" VARCHAR,
>     "keyword"."parentKeywordText" VARCHAR,
>     "keyword"."refinementName" VARCHAR,
>     "keyword"."refinementValue" VARCHAR,
>     "keyword"."relatedKeywordRank" VARCHAR
>     ) IMMUTABLE_ROWS=true;
>
> Create Index
> CREATE INDEX KWDIDX ON SEO.KEYWORDIDEAS ("parentKeywordText");
>
> Show and count indexes
>
> +-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+
> | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | NON_UNIQUE | INDEX_QUALIFIER |
> INDEX_NAME | TYPE | ORDINAL_POSITION | COLUMN_NAME | ASC_OR_DESC |
> CARDINALITY | PAGES  | FILTER_CONDITION | DATA_TYPE | TYPE_NAME |
>
> +-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+
> | null      | SEO         | KEYWORDIDEAS | true       | null            |
> KWDIDX     | 3    | 1                | keyword:parentKeywordText | A
>     | null        | null   | null             | |
> | null      | SEO         | KEYWORDIDEAS | true       | null            |
> KWDIDX     | 3    | 2                | :pk         | A           | null
>    | null   | null             | 12        | V |
> | null      | SEO         | KEYWORDIDEAS | true       | null            |
> RA_TEST_ID | 3    | 1                | keyword:jobId | A           | null
>      | null   | null             | 12        | |
> | null      | SEO         | KEYWORDIDEAS | true       | null            |
> RA_TEST_ID | 3    | 2                | :pk         | A           | null
>    | null   | null             | 12        | V |
>
> +-----------+-------------+------------+------------+-----------------+------------+------+------------------+-------------+-------------+-------------+--------+------------------+-----------+-----------+
>
> > select count(1) from seo.keywordideas;
> +----------+
> | COUNT(1) |
> +----------+
> | 423229   |
> +----------+
> > select count(1) from seo.kwdidx;
> +----------+
> | COUNT(1) |
> +----------+
> | 423229   |
> +----------+
>
> > explain select count(1) from seo.keywords where "parentKeywordText" =
> 'table';
> +------------+
> |    PLAN    |
> +------------+
> | CLIENT PARALLEL 18-WAY FULL SCAN OVER SEO.KEYWORDIDEAS |
> |     SERVER FILTER BY keyword.parentKeywordText = 'sheets' |
> |     SERVER AGGREGATE INTO SINGLE ROW |
> +------------+
>
> Now here is where I can get the index to be invoked.
> > CREATE TABLE SEO.NEW_KEYWORDIDEAS (
>     PK VARCHAR PRIMARY KEY,
>     JOB_ID VARCHAR
>     JOB_NAME VARCHAR,
>     JOB_TYPE VARCHAR,
>     KEYWORD_TEXT VARCHAR,
>     PARENT_KEYWORD_TEXT VARCHAR,
>     REFINEMENT_NAME VARCHAR,
>     REFINEMENT_VALUE VARCHAR,
>     RELATED_KEYWORD_RANK VARCHAR
>     ) IMMUTABLE_ROWS=true;
>
> > UPSERT INTO SEO.NEW_KEYWORDIEAS SELECT * FROM SEO.KEYWORDIDEAS;
>
> > CREATE INDEX NEW_KWD_IDX ON SEO.NEW_KEYWORDIDEAS (PARENT_KEYWORD_TEXT);
>
> > explain select count(1) from seo.new_keywordideas where
> parent_keyword_text = 'table';
>
> +------------+
>
> |    PLAN    |
>
> +------------+
>
> | CLIENT PARALLEL 1-WAY RANGE SCAN OVER SEO.NEW_KWD_IDX ['table'] |
>
> |     SERVER AGGREGATE INTO SINGLE ROW |
>
>
>
>
>
> On Wed, Jan 29, 2014 at 5:21 PM, James Taylor <jamestaylor@apache.org>wrote:
>
>> Hi Justin,
>> Please take a look at this FAQ:
>> http://phoenix.incubator.apache.org/faq.html#/Why_isnnullt_my_secondary_index_being_used
>>
>> If that's not the case for you, can you include your CREATE TABLE, CREATE
>> INDEX, SELECT statement, and EXPLAIN plan?
>>
>> Thanks,
>> James
>>
>>
>> On Wed, Jan 29, 2014 at 4:13 PM, Justin Workman <justinjworkman@gmail.com
>> > wrote:
>>
>>> I am seeing some odd behavior with indexes and want some clarification
>>> on how they are used.
>>>
>>> When I create an table in phoenix on top of an existing Hbase table, and
>>> then create an index on this table, I can see the index get built and
>>> populated properly, however no queries show that they are using this index
>>> when I run an explain on the query.
>>>
>>> However, if I create an seperete table in Phoenix and do an upsert from
>>> my hbase table into the new table that I created, and create the same index
>>> as on the previous table. Then my queries show that they would use the
>>> index when running them through the explain plan.
>>>
>>> Are we not able to create or use an index on a table we create over an
>>> exiting Hbase table, or am I doing something wrong?
>>>
>>> Thanks in advance for any help.
>>> Justin
>>>
>>
>>
>

Mime
View raw message