phoenix-user mailing list archives

From James Taylor <>
Subject Re: Secondary index row explosion due to N key combos to handle ad-hoc queries?
Date Fri, 28 Mar 2014 02:12:05 GMT
On Thu, Mar 27, 2014 at 3:02 PM, Otis Gospodnetic <> wrote:

> Hi,
> I wanted to extract the following in a separate thread:
>> I was going to ask about partitioning as a way to handle (querying
>> against) large volumes of data.  This is related to my Q above about
>> date-based partitioning.  But I'm wondering if one can go further.
>>  Partitioning by date, partitioning by tenant, but then also partitioning
>> by some other columns, which would be different for each type of data being
>> inserted. e.g. for sales data maybe the partitions would be date, tenantID,
>> but then also customerCountry, customerGender, etc.  For performance
>> metrics data maybe it would be date, tenantID, but then also environment
>> (prod vs. dev), or applicationType (e.g. my HBase cluster performance
>> metrics vs. my Tomcat performance metrics), and so on.
> > Essentially, a secondary index is declaring a partitioning. The indexed
> > columns make up the row key, which in HBase determines the partitioning.
> Aha!  Hmmm.  But, as far as I know, how one constructs the key is.... the
> key.  That is, doesn't one typically construct the key based on access
> patterns?
> How would that work in the scenario I described in my other email -
> unknown number of columns and ad-hoc SQL queries?
> How do you handle the above without having to create all possible
> combinations of columns (to anticipate any sort of query) and having to
> insert N rows in the index table for each 1 row in the primary table?
> Don't you have to do that in order to handle any ad-hoc query one may
> choose to run?

That's true - you'd want to selectively add indexes, based on anticipated
access patterns. It's similar to the RDBMS world in that regard.
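
To put rough numbers on that trade-off, here is a minimal Python sketch. It
is not Phoenix code and does not model the Phoenix optimizer; it assumes
equality filters only and treats an index as an ordered list of columns whose
leading columns must all appear in the query's filter. The column names are
hypothetical, borrowed from the sales example above.

```python
import math

def count_index_key_orders(num_columns: int) -> int:
    """Count every non-empty ordered column list over num_columns columns.

    This is the number of indexes needed to anticipate *any* leading-column
    combination, i.e. the "all possible combinations" case.
    """
    return sum(math.perm(num_columns, k) for k in range(1, num_columns + 1))

def usable_prefix_len(index_cols, filter_cols) -> int:
    """Return how many leading index columns are constrained by the filter.

    Simplified model: an index helps only if its leading column(s) are
    filtered on; 0 means the query cannot use this index to narrow the scan.
    """
    n = 0
    for col in index_cols:
        if col not in filter_cols:
            break
        n += 1
    return n

# Indexing for every possible query grows factorially with column count.
# Each index also costs one extra index row per data row written.
print(count_index_key_orders(3))   # 15
print(count_index_key_orders(5))   # 325

# Selective alternative: index only the anticipated access patterns
# (hypothetical column names).
indexes = [
    ("tenant_id", "date"),
    ("tenant_id", "customer_country"),
]

query_filter = {"tenant_id", "customer_country"}
best = max(indexes, key=lambda idx: usable_prefix_len(idx, query_filter))
print(best)

# An unanticipated filter matches no index prefix and falls back to a scan.
print([usable_prefix_len(idx, {"customer_gender"}) for idx in indexes])
```

With two indexes, each upsert writes two extra index rows instead of hundreds,
at the price of unanticipated queries falling back to a scan of the data table.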

> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support *
