phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Phoenix JDBC driver hangs/timeouts
Date Tue, 20 Oct 2015 18:57:49 GMT
Hi Alok,
Thanks for the additional information. I'm curious about your use of
salting on your table. We typically recommend salting to overcome
hotspotting which occurs when you have a row key that is monotonically
increasing. A salted table will put a higher load on your cluster during
range queries because Phoenix needs to query every salt bucket. Have you
tried perf testing with and without salting?
Thanks,
James
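
P.S. To make the cost of salting on range queries concrete, here is a rough
sketch in Java. It assumes a simplified model in which the salt is a one-byte
prefix on the row key, so a single logical range turns into one scan per
bucket. The class SaltFanOut, its methods, and the hash used are hypothetical
stand-ins for illustration, not Phoenix's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Phoenix code.
final class SaltFanOut {
    // Stand-in salt function: maps a row key to a bucket in [0, buckets).
    // Phoenix uses its own hash; this one only shows the idea.
    static int saltByte(byte[] rowKey, int buckets) {
        return Math.floorMod(Arrays.hashCode(rowKey), buckets);
    }

    // One logical range [start, stop) becomes one range per salt bucket,
    // because matching rows may live under any salt prefix.
    static List<String> scanRanges(String start, String stop, int buckets) {
        List<String> ranges = new ArrayList<>();
        for (int b = 0; b < buckets; b++) {
            ranges.add(String.format("[%02d|%s, %02d|%s)", b, start, b, stop));
        }
        return ranges;
    }

    public static void main(String[] args) {
        int buckets = 16; // "Salt is 16" from the thread
        byte[] key = "1234".getBytes(StandardCharsets.UTF_8);
        System.out.println("bucket for key: " + saltByte(key, buckets));
        List<String> ranges = scanRanges("2015-07-21", "2015-07-27", buckets);
        System.out.println(ranges.size() + " scans for one logical range");
    }
}
```

Without salting, the same range would need a single contiguous scan in this
model, which is why comparing both setups in a perf test is worthwhile.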

On Mon, Oct 19, 2015 at 1:08 PM, Alok Singh <alok@cloudability.com> wrote:

> Tracked the issue down to "phoenix.query.threadPoolSize" value being
> greater than "hbase.hconnection.meta.lookup.threads.max". It looks like
> "hbase.hconnection.meta..." value is used to create a pool for bookkeeping
> calls that phoenix makes to SYSTEM.CATALOG table, and having more threads
> in the phoenix queryPool causes the hang. Will keep looking to figure out
> the root cause...
>
>
> Alok
>
> alok@cloudability.com
>
>
> On Sun, Oct 18, 2015 at 12:09 PM, Alok Singh <alok@cloudability.com>
> wrote:
> >
> > Hi Samarth,
> >
> > 1) How many region servers are on the cluster?
> > 12 regionservers
> >
> > 2) What is the value configured for hbase.regionserver.handler.count?
> > 128
> >
> > 3) What kind of queries is your test executing - point look up / range /
> > aggregate/ full table scan/ with limit clause / with order by ?
> > The queries are aggregations over a time period, grouped by one or more
> > columns
> > e.g:
> > SELECT dimension_1,
> >        Sum(metric_1),
> >        Count(metric_1)
> > FROM   fact_table
> > WHERE  (dimension_1 IN ('12312321'))
> >   AND (START >= TO_DATE('2015-07-21 00:00:00'))
> >   AND (START <= TO_DATE('2015-07-27 00:00:00'))
> >   AND (PRECISION = 1 AND account_id IN ('1234', '5678',....))
> > GROUP BY dimension_1
> >
> > 4) What does the schema look like for the tables? Are they salted? How
> > big are the row keys?
> > All the queries run against a single fact table. It has 32 cols, 11 of
> > which are part of the primary key.
> > CREATE TABLE IF NOT EXISTS FACT_TABLE (
> >      ACCOUNT_ID VARCHAR NOT NULL,
> >      PRECISION TINYINT NOT NULL,
> >      START TIMESTAMP NOT NULL,
> >      SECONDARY_ACCOUNT_ID VARCHAR NOT NULL,
> >      DIMENSION_1 VARCHAR NOT NULL,
> >      DIMENSION_2 VARCHAR NOT NULL,
> > ....
> > ....
> >      METRIC_1 DECIMAL,
> >      METRIC_2 DECIMAL,
> > .....
> >      UPDATED_AT TIMESTAMP,
> >      CONSTRAINT PK PRIMARY KEY (
> >                     ACCOUNT_ID,
> >                     PRECISION,
> >                     START,
> >                     SECONDARY_ACCOUNT_ID,
> >                     DIMENSION_1,
> >                     DIMENSION_2,
> >                     ....
> >                    DIMENSION_7
> >      )
> > )
> >
> > Salt is 16
> >
> > 5) Are you executing these queries concurrently or serially? If
> > concurrently, what is the concurrency number?
> > The test runs the queries serially.
> >
> > 6) Do you have Phoenix stats enabled? If yes, can you tell us what
> > the query below returns for the tables your test is running queries on:
> > Stats are disabled (we have truncated system.stats table).
> >
> > Alok
> >
> > alok@cloudability.com
> >
> >
> > On Sun, Oct 18, 2015 at 11:25 AM, Samarth Jain <samarth@apache.org>
> > wrote:
> > >
> > > Alok,
> > >
> > > Please answer the below questions to help us figure out what might be
> > > going on:
> > >
> > > 1) How many region servers are on the cluster?
> > >
> > > 2) What is the value configured for hbase.regionserver.handler.count?
> > >
> > > 3) What kind of queries is your test executing - point look up / range
> > > / aggregate/ full table scan/ with limit clause / with order by ?
> > >
> > > 4) What does the schema look like for the tables? Are they salted? How
> > > big are the row keys?
> > >
> > > 5) Are you executing these queries concurrently or serially? If
> > > concurrently, what is the concurrency number?
> > >
> > > 6) Do you have Phoenix stats enabled? If yes, can you tell us what
> > > the query below returns for the tables your test is running queries on:
> > >  SELECT SUM(GUIDE_POSTS_ROW_COUNT) FROM SYSTEM.STATS WHERE
> > >  PHYSICAL_NAME='your_table_name';
> > >
> > > - Samarth
> > >
> > >
> > >
> > >
> > > On Sun, Oct 18, 2015 at 11:05 AM, Alok Singh <alok@cloudability.com>
> > > wrote:
> > >>
> > >> HBase/Phoenix Environment:
> > >> HBase 1.1.2/Phoenix 4.5.1
> > >> JDK: 1.7
> > >> Regions: ~1900
> > >>
> > >> Client environment:
> > >> JDK: 1.8
> > >> Phoenix JDBC Driver: 4.5.1
> > >> hbase.rpc.timeout=600000
> > >> phoenix.query.threadPoolSize=256
> > >> phoenix.query.queueSize=20000
> > >>
> > >>
> > >> As part of validation testing, we run a set of queries against our
> > >> production cluster. But, we have been unable to complete a full test run as
> > >> the client performing the test starts timing out after a few minutes.
> > >> Though we run the queries in the same order, no two test runs will hang at
> > >> the same query.  Here is the link to the thread dumps from one such run:
> > >> https://gist.githubusercontent.com/aloksingh/bc6b72acf79da366aa75/raw/e527ee3e7bc267e6007fc36250e4f2a914eac9f6/gistfile1.txt
> > >>
> > >> There are 3 thread dumps in the file, taken a few seconds apart.
> > >>
> > >> The client creates a new JDBC connection for each query
> > >> (DriverManager.getConnection(...)) and closes it after the query is
> > >> complete.
> > >>
> > >> Any ideas?
> > >>
> > >>
> > >> Alok
> > >>
> > >
>
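
Following up on the finding quoted above (the hang appears when
"phoenix.query.threadPoolSize" is greater than
"hbase.hconnection.meta.lookup.threads.max"), a client could check its own
settings before opening connections. The sketch below is only an illustration
of that comparison: the class PoolSizeCheck and the method isSuspect are
hypothetical names, and the value 10 for the meta lookup pool is an assumed
example, not a value reported in this thread.

```java
import java.util.Properties;

// Hypothetical helper; not part of Phoenix or HBase.
final class PoolSizeCheck {
    static final String PHOENIX_POOL = "phoenix.query.threadPoolSize";
    static final String META_POOL = "hbase.hconnection.meta.lookup.threads.max";

    // Returns true when the Phoenix query pool is larger than the meta
    // lookup pool, the combination reported in this thread to cause hangs.
    // Returns false when either property is unset, since the effective
    // defaults are not known to this helper.
    static boolean isSuspect(Properties props) {
        String queryPool = props.getProperty(PHOENIX_POOL);
        String metaPool = props.getProperty(META_POOL);
        if (queryPool == null || metaPool == null) {
            return false;
        }
        return Integer.parseInt(queryPool.trim()) > Integer.parseInt(metaPool.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PHOENIX_POOL, "256"); // client setting from the thread
        props.setProperty(META_POOL, "10");     // assumed value for illustration
        if (isSuspect(props)) {
            System.out.println("Warning: " + PHOENIX_POOL + " exceeds " + META_POOL);
        }
        // The same Properties object could then be passed to
        // DriverManager.getConnection(url, props).
    }
}
```

This only flags the combination; it does not explain the root cause, which
was still under investigation in the thread.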