phoenix-user mailing list archives

From Serega Sheypak <serega.shey...@gmail.com>
Subject Re: Select by first part of composite primary key, is it effective?
Date Tue, 02 Feb 2016 09:34:57 GMT
> then you would be better off not using salt buckets altogether rather
> than having 100 parallel scans and block reads in your case.

I didn't understand you correctly. What is the difference between a salted
and a non-salted table in the case of a "primary key leading-part select"?
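
A minimal sketch of where the difference would come from, assuming the salt byte is derived from a hash of the full row key (id1 plus id2) modulo SALT_BUCKETS. The function names and the use of zlib.crc32 are illustrative assumptions only, not Phoenix's actual hash or API. With only id1 known, the salt byte cannot be computed, so every bucket has to be scanned:

```python
import zlib
from typing import Optional

SALT_BUCKETS = 100  # value taken from the DDL in this thread


def full_row_key(id1: str, id2: str) -> bytes:
    # Simplified composite key: id1, a zero-byte separator, then id2.
    return id1.encode() + b"\x00" + id2.encode()


def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    # Stand-in hash (assumption): crc32 of the full row key modulo bucket count.
    return zlib.crc32(row_key) % buckets


def scans_needed(id1: str, id2: Optional[str], salted: bool) -> int:
    """Number of key ranges a lookup has to visit in this simplified model."""
    if not salted:
        return 1  # rows with the same id1 are contiguous: one range scan
    if id2 is not None:
        return 1  # full key known: the salt byte can be computed up front
    return SALT_BUCKETS  # only the prefix known: one range per salt bucket


# Rows sharing id1 = 'xxx' land in many different buckets once salted.
buckets_hit = {salt_byte(full_row_key("xxx", str(i))) for i in range(1000)}

print(scans_needed("xxx", None, salted=True))
print(scans_needed("xxx", None, salted=False))
print(len(buckets_hit) > 1)
```

In this model the non-salted table answers "where id1 = 'xxx'" with a single contiguous range, while the salted table needs one range per bucket plus a client-side merge.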

2016-02-02 1:18 GMT+01:00 Mujtaba Chohan <mujtaba@apache.org>:

> If you are filtering on a leading part of the row key that is highly selective,
> then you would be better off not using salt buckets altogether rather
> than having 100 parallel scans and block reads in your case. In our tests
> with a billion+ row table, the non-salted table offered much better performance
> since it ends up reading fewer blocks from a single region.
>
> //mujtaba
>
> On Mon, Feb 1, 2016 at 1:16 PM, Serega Sheypak <serega.sheypak@gmail.com>
> wrote:
>
>> Hi, here is my table DDL:
>> CREATE TABLE IF NOT EXISTS id_ref
>> (
>>    id1     VARCHAR NOT NULL,
>>    value1  VARCHAR,
>>    id2     VARCHAR NOT NULL,
>>    value2  VARCHAR
>>    CONSTRAINT id_ref_pk PRIMARY KEY (id1, id2)
>> ) IMMUTABLE_ROWS=true, SALT_BUCKETS=100, VERSIONS=1, TTL=691200
>>
>> I'm trying to analyze the result of EXPLAIN:
>>
>> explain select id1, value1, id2, value2 from id_ref where id1 = 'xxx';
>>
>> +------------------------------------------+
>> |                   PLAN                   |
>> +------------------------------------------+
>> | CLIENT 100-CHUNK PARALLEL 100-WAY RANGE SCAN OVER ID_REF [0,'1fd5c44a75549162ca1602dda55f6d129cab61a6'] |
>> | CLIENT MERGE SORT                        |
>> +------------------------------------------+
>>
>> What happens here? Does the client spawn 100 parallel scans (because of the
>> salt buckets) and wait for 100 responses?
>>
>> Is that efficient? What is the right way to optimize this query pattern,
>> "select by first part of primary key"? Reduce the number of buckets? I get
>> this exception a while after restarting the app:
>>
>> Task org.apache.phoenix.job.JobManager$JobFutureTask@60a40644 rejected
>> from org.apache.phoenix.job.JobManager$1@58e3fe9aRunning, pool size = 128,
>> active threads = 121, queued tasks = 5000, completed tasks = 2629565
>>
>>
>>
>
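
As a rough back-of-the-envelope check on the rejected-task message, here is a small sketch. It assumes a simplified capacity model: a task is rejected once every thread is busy and the queue is full, and the 128 threads and 5000 queued tasks from the message are the configured limits. The function name is hypothetical, and queries are assumed to submit all their scans at once.

```python
def max_concurrent_queries(pool_size: int, queue_size: int, scans_per_query: int) -> int:
    # Total tasks the pool can hold at once: running threads plus queued tasks.
    capacity = pool_size + queue_size
    # Each query submits scans_per_query tasks; integer division gives how many
    # whole queries fit before the next submission would be rejected.
    return capacity // scans_per_query


# 100 scans per query (SALT_BUCKETS=100) versus a single scan per query.
print(max_concurrent_queries(128, 5000, 100))  # 51
print(max_concurrent_queries(128, 5000, 1))    # 5128
```

Under these assumptions, roughly 51 in-flight leading-key queries would be enough to fill the pool and its queue on the salted table, which is consistent with rejections showing up under load.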
