phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pradheep Shanmugam <Pradheep.Shanmu...@infor.com>
Subject Re: Phoenix query performance
Date Thu, 23 Feb 2017 16:59:41 GMT
Hi Arvind,

The row key is PARENTID, OWNERORGID,  MILESTONETYPEID, PARENTTYPE
Each parentid will have a list of  MILESTONETYPEID (19661, 1, 2 , etc..). So your query will
return all the parentids.. I am looking of rparentid that does not have a MILESTONETYPEID

Thanks,
Pradheep

From: Arvind S <arvind18352@gmail.com<mailto:arvind18352@gmail.com>>
Reply-To: "user@phoenix.apache.org<mailto:user@phoenix.apache.org>" <user@phoenix.apache.org<mailto:user@phoenix.apache.org>>
Date: Thursday, February 23, 2017 at 1:19 AM
To: "user@phoenix.apache.org<mailto:user@phoenix.apache.org>" <user@phoenix.apache.org<mailto:user@phoenix.apache.org>>
Subject: Re: Phoenix query performance

why cant you reduce your query to

  select msbo1.PARENTID
  from msbo_phoenix_comp_rowkey
  where msbo1.PARENTTYPE = 'SHIPMENT'
   and msbo1.OWNERORGID = 100
   and msbo1.MILESTONETYPEID != 19661
   and msbo1.PARENTREFERENCETIME between 1479964000 and 1480464000
  group by msbo1.PARENTID
  order by msbo1.PARENTID


??

looks like that's what you get by this left outer.



Cheers !!
Arvind

On 22 February 2017 at 22:58, Pradheep Shanmugam <Pradheep.Shanmugam@infor.com<mailto:Pradheep.Shanmugam@infor.com>>
wrote:
Hi,

We have a hbase cluster with 8 region servers with 20G memory
We have a table  with 1 column family along with a secondary index.
Following query took only few milliseconds when we had less data(< 1 million)
After adding more data(~30M rows) the performance declined and took about a minute or more(not
stable)

select msbo1.PARENTID
  from msbo_phoenix_comp_rowkey msbo1
  left outer join (
         select PARENTID,MILESTONETYPEID
           from msbo_phoenix_comp_rowkey
          where PARENTREFERENCETIME between 1479964000 and 1480464000<tel:01480%20464%20000>
            and OWNERORGID = 100
            and PARENTTYPE = 'SHIPMENT'
            and MILESTONETYPEID = 19661
            group by PARENTID,MILESTONETYPEID
             ) msbo2
  on msbo1.PARENTID = msbo2.PARENTID
  where msbo1.PARENTTYPE = 'SHIPMENT'
   and msbo1.OWNERORGID = 100
   and msbo2.MILESTONETYPEID is null
   and msbo1.PARENTREFERENCETIME between 1479964000 and 1480464000<tel:01480%20464%20000>
group by msbo1.PARENTID
  order by msbo1.PARENTID

The RHS return about a 500K rows ..LHS about 18M rows…final result about 500K rows

MSBO_PHOENIX_COMP_ROWKEY_INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME 
is the index

Query plan:
CLIENT 8-CHUNK PARALLEL 8-WAY RANGE SCAN OVER MSBO_PHOENIX_COMP_ROWKEY_INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME
[0,'SHIPMENT',100]
    SERVER FILTER BY FIRST KEY ONLY AND (TO_UNSIGNED_LONG("PARENTREFERENCETIME") >= 1477958400
AND TO_UNSIGNED_LONG("PARENTREFERENCETIME") <= 1480464000<tel:01480%20464%20000>)
    SERVER AGGREGATE INTO DISTINCT ROWS BY ["MSBO1.:PARENTID"]
CLIENT MERGE SORT
    PARALLEL LEFT-JOIN TABLE 0
        CLIENT 8-CHUNK PARALLEL 8-WAY RANGE SCAN OVER MSBO_PHOENIX_COMP_ROWKEY_INDEX_PARENTTYPE_OWNERORGID_MILESTONETYPEID_PARENTREFERENCETIME
[0,'SHIPMENT',100,19661,1,477,958,400] - [0,'SHIPMENT',100,19661,1,480,464,000]
            SERVER FILTER BY FIRST KEY ONLY
            SERVER AGGREGATE INTO DISTINCT ROWS BY ["PARENTID", "MILESTONETYPEID"]
        CLIENT MERGE SORT
    AFTER-JOIN SERVER FILTER BY MSBO2.MILESTONETYPEID IS NULL

Attached the phoenix log.
I see the caching to set as 100..and "maxResultSize”:2097152..is that something that can
be tuned will help?
Is that the client merge sort consuming more time can be improved? Is there any other tuning
possible?

Thanks,
Pradheep

Mime
View raw message