phoenix-user mailing list archives

From "Riesland, Zack"
Subject Help tuning for bursts of high traffic?
Date Fri, 04 Dec 2015 14:09:10 GMT
SHORT EXPLANATION: a much higher percentage of queries to Phoenix return exceptionally slowly
after querying very heavily for several minutes.


I've been using Phoenix for about a year as a data store for web-based reporting tools and
it works well.

Now, I'm trying to use the data in a different (much more request-intensive) way and encountering
some issues.

The scenario is basically this:

Daily, ingest very large CSV files with data for widgets.

Each input file has hundreds of rows of data for each widget, and tens of thousands of unique
widgets.

As a first step, I want to de-duplicate this data against my Phoenix-based DB (I can't rely
on just upserting the data for de-dup because it will go through several ETL steps before
being stored into Phoenix/HBase).

So, per-widget, I perform a query against Phoenix (the table is keyed against the unique widget
ID + sample point). I get all the data for a given widget id, within a certain period of time,
and then I only ingest rows for that widget that are new to me.

I'm doing this in Java in a single step: I loop through my input file and perform one query
per widget, using the same Connection object to Phoenix.
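
For illustration, the per-widget lookup and filter described above might look roughly like the following Java sketch. The table name `WIDGET_DATA`, the column names, and the choice of a timestamp as the sample point are assumptions made for the example; the actual schema is not shown in this message. The sketch reuses one `PreparedStatement` across all widgets, which is one common way to run the same query many times over a single Connection.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WidgetDedup {
    // Hypothetical table and column names; the real schema is not given here.
    static final String SQL =
        "SELECT SAMPLE_POINT FROM WIDGET_DATA "
      + "WHERE WIDGET_ID = ? AND SAMPLE_POINT >= ? AND SAMPLE_POINT < ?";

    /** Fetches the sample points already stored for one widget in [from, to). */
    static Set<Long> existingSamplePoints(PreparedStatement ps, String widgetId,
                                          Timestamp from, Timestamp to)
            throws SQLException {
        ps.setString(1, widgetId);
        ps.setTimestamp(2, from);
        ps.setTimestamp(3, to);
        Set<Long> seen = new HashSet<>();
        // Close the ResultSet promptly so each query releases its resources.
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                seen.add(rs.getTimestamp(1).getTime());
            }
        }
        return seen;
    }

    /** Keeps only the incoming sample points that are not already stored. */
    static List<Long> keepNew(List<Long> incoming, Set<Long> existing) {
        List<Long> fresh = new ArrayList<>();
        for (Long point : incoming) {
            if (!existing.contains(point)) {
                fresh.add(point);
            }
        }
        return fresh;
    }
}
```

The statement would be created once with `connection.prepareStatement(WidgetDedup.SQL)` and passed to `existingSamplePoints` for each widget in the input file.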


What I'm finding is that for the first several thousand queries, I almost always get a very
fast (less than 10 ms) response (good).

But after 15-20 thousand queries, the response starts to get MUCH slower. Some queries respond
as expected, but many take as long as 2-3 minutes, pushing the total time to prime the data
structure into the 12-15 hour range, when it would only take 2-3 hours if all the queries
were fast.

The exact same queries, when run manually and not as part of this bulk process, return in the
(expected) < 10 ms.

So it SEEMS like the burst of queries puts Phoenix into some sort of busy state that causes
it to respond far too slowly.
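
One way to pin down when the slowdown starts is to time every query and record the position of each slow one. The class below is a minimal sketch of that idea; the name `SlowQueryLog` and the threshold are hypothetical and not part of the existing process.

```java
import java.util.ArrayList;
import java.util.List;

class SlowQueryLog {
    private final long thresholdMs;
    private final List<Integer> slowIndexes = new ArrayList<>();
    private int count = 0;

    SlowQueryLog(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /** Records one query's elapsed time and remembers its position if it was slow. */
    void record(long elapsedMs) {
        if (elapsedMs > thresholdMs) {
            slowIndexes.add(count);
        }
        count++;
    }

    /** Number of queries recorded so far. */
    int total() {
        return count;
    }

    /** Zero-based positions of the queries that exceeded the threshold. */
    List<Integer> slowIndexes() {
        return slowIndexes;
    }
}
```

Around each query, the caller would take `long t0 = System.nanoTime();` before executing it and call `log.record((System.nanoTime() - t0) / 1_000_000);` afterwards. The recorded positions would show whether slow responses cluster after a certain number of queries or appear at regular intervals.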

The connection properties I'm setting are:

phoenix.query.timeoutMs: 90000
phoenix.query.keepAliveMs: 90000
phoenix.query.threadPoolSize: 256
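
As a sketch, these settings could be passed to the JDBC driver through a `java.util.Properties` object, using the lowercase property names that the Phoenix documentation lists. The class and method names here are hypothetical.

```java
import java.util.Properties;

class PhoenixProps {
    /** Builds the client-side query properties with the values listed above. */
    static Properties queryProps() {
        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", "90000");
        props.setProperty("phoenix.query.keepAliveMs", "90000");
        props.setProperty("phoenix.query.threadPoolSize", "256");
        return props;
    }
}
```

The result would then be handed to `DriverManager.getConnection(url, PhoenixProps.queryProps())`, where `url` is the cluster's `jdbc:phoenix:` connection string (not shown in this message).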

Our cluster has 9 (beefy) region servers, and the table I'm referencing has 511 regions. We went
through a lot of pain to get the data split extremely well, and I don't think schema design
is the issue here.

Can anyone help me understand how to make this better? Is there a better approach I could
take? A better set of configuration parameters? Is our cluster just too small for this?

