phoenix-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Help tuning for bursts of high traffic?
Date Fri, 04 Dec 2015 16:28:52 GMT
Kumar - I believe you mentioned you are seeing this in a cluster of ~20
regionservers.

Zack - Yours is smaller yet, at 9.

These clusters are small enough that getting stack dumps through the HBase
debug servlet during periods of unusually slow response is feasible. Perhaps
you can write a script that queries all of the debug servlets (you can use
curl) and dumps the received output into per-regionserver files? Scrape every
10 or so seconds during the observed periods of slowness, then compress the
files and make them available to the Phoenix devs up on S3? Consider it a
poor man's sampler. I don't know what we might find, but this could prove
very helpful.
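
A rough sketch of such a sampler, in Java rather than curl (the hostnames and
info port below are placeholders; the dump servlet is served from the
regionserver info port, 16030 on HBase 1.x and 60030 on 0.98):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class PoorMansSampler {
    // Placeholders: list your regionserver hostnames here.
    static final String[] HOSTS = {"rs1.example.com", "rs2.example.com"};
    static final int INFO_PORT = 16030;

    public static void main(String[] args) throws Exception {
        byte[] buf = new byte[8192];
        while (true) {
            for (String host : HOSTS) {
                Path out = Paths.get(host + ".dump");
                try (InputStream in = new URL("http://" + host + ":"
                         + INFO_PORT + "/dump").openStream();
                     OutputStream os = Files.newOutputStream(out,
                         StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                    // Stamp and append each sample to a per-regionserver file.
                    os.write(("==== " + System.currentTimeMillis() + " ====\n")
                            .getBytes(StandardCharsets.UTF_8));
                    for (int n; (n = in.read(buf)) > 0; ) {
                        os.write(buf, 0, n);
                    }
                } catch (IOException e) {
                    System.err.println(host + ": " + e);
                }
            }
            Thread.sleep(10_000L); // scrape every ~10 seconds
        }
    }
}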


On Fri, Dec 4, 2015 at 8:11 AM, Kumar Palaniappan <
kpalaniappan@marinsoftware.com> wrote:

> I'm in the same exact position as Zack described. Appreciate your feedback.
>
> So far we have tried tuning the call queue and the handlers; no luck.
> Planning to try the off-heap cache next.
>
> Kumar Palaniappan <http://about.me/kumar.palaniappan>
> <http://www.linkedin.com/in/kumarpalaniappan>
>
> On Dec 4, 2015, at 6:45 AM, Riesland, Zack <Zack.Riesland@sensus.com>
> wrote:
>
> Thanks Satish,
>
>
>
> To clarify: I’m not looking up single rows. I’m looking up the history of
> each widget, which returns hundreds to thousands of rows per widget (per
> query).
>
>
>
> Each query is a range scan; it’s just that I’m performing thousands of
> them.
>
>
>
> *From:* Satish Iyengar [mailto:satysh@gmail.com]
> *Sent:* Friday, December 04, 2015 9:43 AM
> *To:* user@phoenix.apache.org
> *Subject:* Re: Help tuning for bursts of high traffic?
>
>
>
> Hi Zack,
>
>
>
> Did you consider avoiding hitting HBase for every single widget by doing
> that step in offline mode? I was thinking you could take some kind of daily
> export of the HBase table and then use Pig to perform the join (a co-group,
> perhaps) to do the same thing. Obviously this would work only when your
> HBase table is not maintained by a stream-based system. HBase is really good
> at range scans and may not be ideal for a large number of single-row
> lookups.
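>
> To illustrate the idea, a minimal local sketch of that dedup-by-join in
> plain Java (Pig's co-group would do the equivalent at scale; the file names
> and the assumption that column 0 of both files is the widget key are
> hypothetical):
>
> import java.io.IOException;
> import java.nio.file.*;
> import java.util.*;
> import java.util.stream.*;
>
> public class OfflineDedup {
>     public static void main(String[] args) throws IOException {
>         // Keys already stored, taken from the daily HBase export.
>         Set<String> existing;
>         try (Stream<String> lines = Files.lines(Paths.get("hbase-export.csv"))) {
>             existing = lines.map(l -> l.split(",", 2)[0])
>                             .collect(Collectors.toSet());
>         }
>         // Keep only the input rows whose key is not already stored.
>         try (Stream<String> input = Files.lines(Paths.get("daily-input.csv"))) {
>             Files.write(Paths.get("to-ingest.csv"),
>                         input.filter(l -> !existing.contains(l.split(",", 2)[0]))
>                              .collect(Collectors.toList()));
>         }
>     }
> }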
>
>
>
> Thanks,
>
> Satish
>
> On Fri, Dec 4, 2015 at 9:09 AM, Riesland, Zack <Zack.Riesland@sensus.com>
> wrote:
>
> SHORT EXPLANATION: a much higher percentage of queries to Phoenix respond
> exceptionally slowly after several minutes of very heavy querying.
>
>
>
> LONGER EXPLANATION:
>
>
>
> I’ve been using Phoenix for about a year as a data store for web-based
> reporting tools, and it works well.
>
>
>
> Now, I’m trying to use the data in a different (much more
> request-intensive) way and encountering some issues.
>
>
>
> The scenario is basically this:
>
>
>
> Daily, I ingest very large CSV files with data for widgets.
>
>
>
> Each input file has hundreds of rows of data for each widget, and tens of
> thousands of unique widgets.
>
>
>
> As a first step, I want to de-duplicate this data against my Phoenix-based
> DB (I can’t rely on just upserting the data for de-dup because it will go
> through several ETL steps before being stored into Phoenix/HBase).
>
>
>
> So, per widget, I perform a query against Phoenix (the table is keyed on
> the unique widget ID + sample point). I get all the data for a given widget
> ID within a certain period of time, and then I only ingest rows for that
> widget that are new to me.
>
>
>
> I’m doing this in Java in a single step: I loop through my input file and
> perform one query per widget, using the same Connection object to Phoenix.
>
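> For reference, a minimal sketch of that loop, assuming one reused
> PreparedStatement (the table and column names here are hypothetical; the
> real key is widget ID + sample point, as described above):
>
> import java.sql.*;
> import java.util.List;
>
> public class DedupLookup {
>     static void lookUpAll(List<String> widgetIds, Timestamp start,
>                           Timestamp end) throws SQLException {
>         try (Connection conn =
>                  DriverManager.getConnection("jdbc:phoenix:zkhost:2181");
>              PreparedStatement ps = conn.prepareStatement(
>                  "SELECT SAMPLE_POINT, VAL FROM WIDGET_HISTORY "
>                + "WHERE WIDGET_ID = ? AND SAMPLE_POINT >= ? AND SAMPLE_POINT < ?")) {
>             // One range scan per widget, thousands of executions total.
>             for (String id : widgetIds) {
>                 ps.setString(1, id);
>                 ps.setTimestamp(2, start);
>                 ps.setTimestamp(3, end);
>                 try (ResultSet rs = ps.executeQuery()) {
>                     while (rs.next()) {
>                         // Compare each stored row against the input rows here.
>                     }
>                 }
>             }
>         }
>     }
> }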
>
>
> THE ISSUE:
>
>
>
> What I’m finding is that for the first several thousand queries, I almost
> always get a very fast (less than 10 ms) response (good).
>
>
>
> But after 15-20 thousand queries, the responses start to get MUCH slower.
> Some queries respond as expected, but many take as long as 2-3 minutes,
> pushing the total time to prime the data structure into the 12-15 hour
> range, when it would take only 2-3 hours if all the queries were fast.
>
>
>
> The same exact queries, when run manually and not as part of this bulk
> process, return in the (expected) < 10 ms.
>
>
>
> So it SEEMS like the burst of queries puts Phoenix into some sort of busy
> state that causes it to respond far too slowly.
>
>
>
> The connection properties I’m setting are:
>
>
>
> phoenix.query.timeoutMs: 90000
>
> phoenix.query.keepAliveMs: 90000
>
> phoenix.query.threadPoolSize: 256
>
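> For what it's worth, a sketch of how I pass these to the Phoenix JDBC
> driver (the ZooKeeper quorum is a placeholder; as far as I understand, the
> client thread pool is sized once per JVM from the first connection's
> properties):
>
> import java.sql.*;
> import java.util.Properties;
>
> public class PhoenixConnect {
>     public static Connection open() throws SQLException {
>         Properties props = new Properties();
>         props.setProperty("phoenix.query.timeoutMs", "90000");
>         props.setProperty("phoenix.query.keepAliveMs", "90000");
>         props.setProperty("phoenix.query.threadPoolSize", "256");
>         return DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181",
>                                            props);
>     }
> }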
>
>
> Our cluster is 9 (beefy) region servers, and the table I’m referencing
> spans 511 regions. We went through a lot of pain to get the data split
> extremely well, and I don’t think schema design is the issue here.
>
>
>
> Can anyone help me understand how to make this better? Is there a better
> approach I could take? A better set of configuration parameters? Is our
> cluster just too small for this?
>
> Thanks!
>
> --
>
> Satish Iyengar
>
> "Anyone who has never made a mistake has never tried anything new."
> Albert Einstein
>
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
