phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Estimating the "cost" of a query
Date Sat, 03 Oct 2015 01:20:46 GMT
Hi Alok,
Yes, you could calculate an estimate for this information, but it isn't
currently exposed through JDBC or through the explain plan (which would be
a good place for it to live). You'd need to dip down to the implementation
to get it. Something like this:

PhoenixStatement statement =
connection.createStatement().unwrap(PhoenixStatement.class);
ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
QueryPlan plan = statement.getQueryPlan();
List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges will be going over a configurable amount of bytes
(determined by phoenix.stats.guidepost.width
and/or phoenix.stats.guidepost.per.region), so a simple worst case estimate
would be to multiply the ranges.size() by this config value (using a
default of QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or
300MB). If the query is a point lookup (which you can check with
plan.getContext().getScanRanges().isPointLookup()), then the cost would be
ranges.size() * average_row_size.

Since these aren't exposed APIs, they're subject to change. Please file a
JIRA if you're interested in helping figure out what the "official" APIs
for this should be.

HTH. Thanks,

    James

On Fri, Oct 2, 2015 at 5:35 PM, Alok Singh <alok@cloudability.com> wrote:

> Is there a way to figure out how many rows/cells were scanned in
> hbase perform a phoenix query? I tried using the explain command, but, it
> is not clear how to estimate the number of rows touched by looking at the
> explain plan. Essentially, I want to be able to report back to users the
> "cost" of performing a phoenix query, where "cost" is some function of
> rows/cells scanned.
>
> Alok
>
> alok@cloudability.com
>

Mime
View raw message