phoenix-user mailing list archives

From Mujtaba Chohan <mujt...@apache.org>
Subject Re: Index tables at scale
Date Mon, 11 Jul 2016 21:41:57 GMT
FYI, if your keys are not written in order, i.e. you are not concerned about
write hot-spotting/write throughput, then try writing your data to an
un-salted table. Read performance for an un-salted table can be comparable to
or better than a salted one with stats
<https://phoenix.apache.org/update_statistics.html>.
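
As a rough sketch of the two options, the snippet below only builds the Phoenix statements as strings; it does not connect to a cluster. The table names, the column list, and the helper functions are made up for illustration. The `SALT_BUCKETS` table option and the `UPDATE STATISTICS` statement are the only Phoenix pieces it relies on.

```python
# Hypothetical column list and table names, for illustration only.
COLUMNS = "id VARCHAR PRIMARY KEY, payload VARCHAR"


def create_table_ddl(table, salt_buckets=None):
    """Build a Phoenix CREATE TABLE statement, salted only if requested."""
    ddl = f"CREATE TABLE IF NOT EXISTS {table} ({COLUMNS})"
    if salt_buckets is not None:
        # A salted table is pre-split into one region per salt bucket.
        ddl += f" SALT_BUCKETS = {salt_buckets}"
    return ddl


def update_statistics_sql(table):
    """Build the statement that asks Phoenix to collect stats for a table."""
    return f"UPDATE STATISTICS {table}"


salted = create_table_ddl("EVENTS_SALTED", salt_buckets=256)
unsalted = create_table_ddl("EVENTS_UNSALTED")

print(salted)
print(unsalted)
print(update_statistics_sql("EVENTS_UNSALTED"))
```

Whether the un-salted variant actually reads as fast depends on the key distribution and on stats being collected, so treat this as something to benchmark rather than a given.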

On Mon, Jul 11, 2016 at 2:31 PM, Simon Wang <simon.wang@airbnb.com> wrote:

> These indexes will indeed be salted (as is the data table). If all indexes
> reside in the same table, there will be only 512 regions in total (256 for
> the data table, 256 for the combined index table). Indeed, the combined index
> table will be 12x as large as a single index table. But it doesn’t cover all
> columns, so it should be fine.
>
> On Jul 11, 2016, at 2:26 PM, James Taylor <jamestaylor@apache.org> wrote:
>
> Will the index be salted (and that's why it's 256 regions per table)? If
> not, how many regions would there be if all indexes are in the same table
> (assuming the table is 12x bigger than one index table)?
>
> On Monday, July 11, 2016, Simon Wang <simon.wang@airbnb.com> wrote:
>
>> Thanks, Mujtaba. What you wrote is exactly what I meant. While not all
>> our tables need this many regions and indexes, the number of regions per
>> region server can grow quickly.
>>
>> -Simon
>>
>> On Jul 11, 2016, at 2:17 PM, Mujtaba Chohan <mujtaba@apache.org> wrote:
>>
>> 12 index tables * 256 regions per table = ~3K regions for index tables,
>> assuming we are talking about covered indexes, which implies 200+ regions per
>> region server on a 15-node cluster.
>>
>> On Mon, Jul 11, 2016 at 1:58 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Hi Simon,
>>>
>>> I might be missing something, but with 12 separate index tables or 1
>>> index table, the amount of data will be the same. Won't there be the same
>>> number of regions either way?
>>>
>>> Thanks,
>>> James
>>>
>>> On Sun, Jul 10, 2016 at 10:50 PM, Simon Wang <simon.wang@airbnb.com>
>>> wrote:
>>>
>>>> Hi James,
>>>>
>>>> Thanks for the response.
>>>>
>>>> In our use case, there is a 256-region table, and we want to build ~12
>>>> indexes on it. We have 15 region servers. If each index is in its own
>>>> table, that would be a total of ~221 regions per region server for this
>>>> single table. I think the extra write-time cost is okay. But the number of
>>>> regions is too high for us.
>>>>
>>>> Best,
>>>> Simon
>>>>
>>>>
>>>> On Jul 9, 2016, at 1:18 AM, James Taylor <jamestaylor@apache.org>
>>>> wrote:
>>>>
>>>> Hi Simon,
>>>> The reason we've taken this approach with views is that it's possible
>>>> with multi-tenancy that the number of views would grow unbounded since you
>>>> might end up with a view per tenant (100K or 1M views or more - clearly too
>>>> many for HBase to handle as separate tables).
>>>>
>>>> With secondary indexes directly on physical tables, you're somewhat
>>>> bounded by the hit you're willing to take on the write side, as the cost
>>>> of maintaining the index is similar to the cost of the write to the data
>>>> table. So the extra number of physical tables for indexes seems within the
>>>> bounds of what HBase could handle.
>>>>
>>>> How many secondary indexes are you creating and are you ok with the
>>>> extra write-time cost?
>>>>
>>>> From a code consistency standpoint, using the same approach across
>>>> local, global, and view indexes might simplify things, though. Please file
>>>> a JIRA with a bit more detail on your use case.
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>>
>>>>
>>>> On Fri, Jul 8, 2016 at 8:59 PM, Simon Wang <simon.wang@airbnb.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I am writing to ask if there is a way to let Phoenix store all indexes
>>>>> on a single table in the same HBase table. If each index must be stored
>>>>> in a separate table, creating more than a few indexes on a table with a
>>>>> large number of regions will not scale well.
>>>>>
>>>>> From what I have learned, when Phoenix builds indexes on a view, it
>>>>> stores all indexes in a table associated with the underlying table of
>>>>> the view. e.g. if V1 is a view of T1, all indexes on V1 will be stored in
>>>>> _IDX_T1. It would be great if this behavior can be optionally turned
>>>>> on for indexes on tables.
>>>>>
>>>>> Best,
>>>>> Simon

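For reference, the region counts discussed in this thread can be reproduced with a short calculation. This is only a sketch using the numbers quoted above (256 regions per salted table, 12 indexes, 15 region servers); the constant and function names are illustrative, and it assumes regions are spread evenly across region servers.

```python
# Numbers quoted in the thread.
REGIONS_PER_SALTED_TABLE = 256
INDEX_COUNT = 12
REGION_SERVERS = 15


def regions_per_server(total_regions, servers=REGION_SERVERS):
    """Average regions per region server, assuming an even spread."""
    return total_regions / servers


# One physical table per index: 12 salted index tables.
separate_index_regions = INDEX_COUNT * REGIONS_PER_SALTED_TABLE
separate_total = REGIONS_PER_SALTED_TABLE + separate_index_regions

# All indexes in one salted table: data table plus one combined index table.
combined_total = REGIONS_PER_SALTED_TABLE + REGIONS_PER_SALTED_TABLE

print(f"index regions (separate tables): {separate_index_regions}")
print(f"per server, index tables only:   {regions_per_server(separate_index_regions):.1f}")
print(f"per server, data + 12 indexes:   {regions_per_server(separate_total):.1f}")
print(f"per server, data + 1 combined:   {regions_per_server(combined_total):.1f}")
```

The first two figures line up with the "~3K regions" and "200+ regions per region server" estimates, the third with the ~221 figure, and the last shows how far the count drops when the indexes share one 256-region table.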