phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alok Singh <a...@cloudability.com>
Subject Re: Estimating the "cost" of a query
Date Mon, 05 Oct 2015 16:05:56 GMT
https://issues.apache.org/jira/browse/PHOENIX-2306

Alok

alok@cloudability.com

On Sat, Oct 3, 2015 at 12:37 PM, Alok Singh <alok@cloudability.com> wrote:

> Thanks James.
>
> I will create a jira ticket for exposing the APIs in a standard way.
>
> Alok
>
> Alok
>
> alok@cloudability.com
>
> On Fri, Oct 2, 2015 at 6:20 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>
>> Hi Alok,
>> Yes, you could calculate an estimate for this information, but it isn't
>> currently exposed through JDBC or through the explain plan (which would be
>> a good place for it to live). You'd need to dip down to the implementation
>> to get it. Something like this:
>>
>> PhoenixStatement statement =
>> connection.createStatement().unwrap(PhoenixStatement.class);
>> ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
>> QueryPlan plan = statement.getQueryPlan();
>> List<KeyRange> ranges = plan.getSplits();
>>
>> Each KeyRange in ranges will be going over a configurable amount of bytes
>> (determined by phoenix.stats.guidepost.width
>> and/or phoenix.stats.guidepost.per.region), so a simple worst case estimate
>> would be to multiply the ranges.size() by this config value (using a
>> default of QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or
>> 300MB). If the query is a point lookup (which you can check with
>> plan.getContext().getScanRanges().isPointLookup()), then the cost would be
>> ranges.size() * average_row_size.
>>
>> Since these aren't exposed APIs, they're subject to change. Please file a
>> JIRA if you're interested in helping figure out what the "official" APIs
>> for this should be.
>>
>> HTH. Thanks,
>>
>>     James
>>
>> On Fri, Oct 2, 2015 at 5:35 PM, Alok Singh <alok@cloudability.com> wrote:
>>
>>> Is there a way to figure out how many rows/cells were scanned in
>>> hbase perform a phoenix query? I tried using the explain command, but, it
>>> is not clear how to estimate the number of rows touched by looking at the
>>> explain plan. Essentially, I want to be able to report back to users the
>>> "cost" of performing a phoenix query, where "cost" is some function of
>>> rows/cells scanned.
>>>
>>> Alok
>>>
>>> alok@cloudability.com
>>>
>>
>>
>

Mime
View raw message