The salt byte is calculated by hashing the whole row key (all PK columns). If the WHERE clause constrains only some of the PK columns, Phoenix cannot work out which salt byte (bucket number) applies, so it runs a scan for every salt bucket. Those threads are lightweight and spend most of their time waiting for a response from the HBase server, so you could simply raise the nproc limit. Alternatively, you can reduce the number of Phoenix threads with the phoenix.query.threadPoolSize property, or reduce the number of salt buckets.
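
To illustrate why a partial key forces a scan per bucket, here is a minimal sketch in Python. It is not Phoenix's actual code: the hash (zlib.crc32) is a stand-in chosen for illustration, and the names salt_byte and scan_prefixes are hypothetical. The only point it makes is that the salt depends on the complete row key, so a client that knows only a prefix has to try every bucket.

```python
import zlib

SALT_BUCKETS = 30  # matches the 30 salt buckets in the question


def salt_byte(row_key: bytes, buckets: int = SALT_BUCKETS) -> int:
    """Toy stand-in for the salting hash: it is computed over the WHOLE row key.

    Assumption: crc32 is used only for illustration, not Phoenix's real hash.
    """
    return zlib.crc32(row_key) % buckets


def scan_prefixes(known_key: bytes, full_key_known: bool,
                  buckets: int = SALT_BUCKETS) -> list:
    """Return the salted key prefixes a client would have to scan."""
    if full_key_known:
        # Every PK column is known, so the salt byte can be computed directly.
        return [bytes([salt_byte(known_key, buckets)]) + known_key]
    # Only part of the key is known: the salt could be any bucket,
    # so one scan range is needed per bucket.
    return [bytes([b]) + known_key for b in range(buckets)]


partial = scan_prefixes(b"1F64F5DY0J0A03692", full_key_known=False)
full = scan_prefixes(b"4|1F64F5DY0J0A03692|some-id2", full_key_known=True)
print(len(partial), len(full))
```

With only id1 (or type and id1) known, the sketch yields one range per bucket, which corresponds to the 30-way parallel scan in the plans below. With all PK columns supplied it yields a single range.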

Thanks,
Sergey

On Tue, May 22, 2018 at 8:52 AM, Pradheep Shanmugam <Pradheep.Shanmugam@infor.com> wrote:

Hi,

 

We have a table with primary key (type, id1, id2), salted with 30 salt buckets. type is the same for all rows, whereas id1 and id2 are unique for each row.
The load on this table is about 30 queries/sec, with each query taking ~6 ms.
We are using the Phoenix 4.7.0 non-thin client.
We have a query like the one below:


SELECT tab.a, tab.b
FROM tab
WHERE tab.id1 = '1F64F5DY0J0A03692'
AND tab.type = 4
AND tab.isActive = 1;

 

CLIENT 30-CHUNK 0 ROWS 0 BYTES PARALLEL 30-WAY ROUND ROBIN RANGE SCAN OVER TAB [0,4, '1F64F5DY0J0A03692']
    SERVER FILTER BY TS.ISACTIVE = 1

 

Here I could see that about 30 threads are being used for this query. Since 'type' is the same for all rows, I thought that was the reason it looks into all the chunks to find the key, and hence uses 30 threads.

 

Then I ran the same query on a similar table with the key rearranged as (id1, id2, type) and salted (30 buckets).

 

But I still see the same 30 threads being used, even though I thought it could uniquely identify a row with the given id1, which should be in just one of the chunks. (Is this because, with salting, it does not know where the key is?)

 

CLIENT 30-CHUNK PARALLEL 30-WAY ROUND ROBIN RANGE SCAN OVER TAB [0, '1F64F5DY0J0A03692']
    SERVER FILTER BY (TYPE = 4 AND TS.ISACTIVE = 1)

 

Currently I am exceeding the nproc limit set on my app server (128 Phoenix threads plus hconnection threads reaching 256, for 384 threads in total). Can you please shed some light on Phoenix connections and HConnections and how to reduce them to a reasonable level, and also on the above query plans? Should we consider reducing the number of salt buckets to 10 (we have 10 region servers)?

 

Thanks,

Pradheep