phoenix-user mailing list archives

From Matthew Van Wely <mvanw...@salesforce.com>
Subject Re: LOCAL vs TRANSACTIONAL indexes
Date Wed, 21 Sep 2016 05:30:55 GMT
Thanks James, knowing that there are no race conditions (or that they are
very unlikely) from the same client on a mutable table is really helpful.

Thx,
--Matt

On Sat, Sep 17, 2016 at 4:26 PM, James Taylor <jamestaylor@apache.org>
wrote:

> On Fri, Sep 16, 2016 at 7:22 PM, Matthew Van Wely <mvanwely@salesforce.com>
> wrote:
>
>> All,
>>
>> I would like some guidance on LOCAL vs TRANSACTIONAL indexes and I
>> cannot quite get the details I need from the Phoenix site:
>> https://phoenix.apache.org/secondary_indexing.html
>>
>> Transactional Tables
>>
>> <snip>
>> transactional tables with secondary indexes potentially lowers your
>> availability of being able to write to your data table, as both the data
>> table and its secondary index tables must be available as otherwise the
>> write will fail
>> </snip>
>>
>> 1) What is the likelihood that an index is not available?
>>
> This is rare. If a region server goes down, HBase relocates the regions
> it was hosting to another region server. If you write data exactly when
> this happens, it's possible that you'll get an exception back if the
> relocation takes longer than your retry count and timeout settings allow.
>
>
>>
>> 2) If rebuilding, is this on the order of minutes, hours?
>>
> Not sure what rebuilding you're asking about. For mutable,
> non-transactional secondary indexes, Phoenix has the ability to partially
> rebuild them if a write failure occurs. This is relatively fast because
> it only rebuilds index rows that were added after the writes began
> failing. See the options listed under
> https://phoenix.apache.org/secondary_indexing.html#Mutable_Tables
>
> If, on the other hand, you're asking how long it takes to completely
> rebuild the index, then that depends on how much data the table has (so
> then you're really asking how fast HBase writes).
>
>
>>
>> 3) Does Phoenix give an indication that the write failed due to an
>> unavailable table/index (because if so, the client could handle this with
>> other write options)?
>>
>
> Yes, Phoenix throws an exception if the write fails. It never fails
> silently. If your data is immutable, then it's up to you to handle the
> write failure (usually by just continually retrying the failed write). If
> mutable, then Phoenix has some options that can automate catching the index
> up with the data table (see
> https://phoenix.apache.org/secondary_indexing.html#Consistency_Guarantees).
> If your table is transactional, then it cannot get out of sync with the
> index.
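
The "keep retrying the failed write" approach mentioned above can be sketched as a bounded retry loop. This is only an illustration: `WriteFailed`, `retry_write`, and `flaky_write` are hypothetical names that stand in for whatever exception and write call your client actually uses, and nothing here is Phoenix API.

```python
import time


class WriteFailed(Exception):
    """Hypothetical stand-in for the exception a client sees on a failed write."""


def retry_write(write, max_attempts=5, base_delay=0.5):
    """Call write() until it succeeds or max_attempts is reached.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries.
    Re-raises the last failure so the error is never swallowed silently.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except WriteFailed:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


# Simulated write that fails twice (e.g. while a region is being relocated)
# and then succeeds. Purely illustrative.
calls = {"n": 0}


def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise WriteFailed("index region temporarily unavailable")
    return "committed"


result = retry_write(flaky_write, base_delay=0)
```

Bounding the number of attempts is a design choice for the sketch; a client that truly retries "continually" would loop without a limit instead.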
>
>
>>
>> Local Indexes
>>
>> <snip>
>> all local index data in the separate shadow column families in the
>> same data table. At read time when the local index is used, every region
>> must be examined for the data as the exact region location of index data
>> cannot be predetermined. Thus some overhead occurs at read-time.
>> </snip>
>>
>> 4) Are there any requirements on table PK and index key regarding key
>> ordering?
>>
> No
>
>
>>
>> 5) How is something locally indexed if the keys are completely mismatched?
>> I get the sense that it doesn't matter given that "every region must be
>> examined".
>>
>
> The rows of a local index are sorted in each region. The client just has
> to do a merge sort between all the data it gets back for the scans over
> each region. This is very fast, so not too much overhead here.
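
The client-side merge described above can be pictured with a small sketch. It assumes each region returns its index rows already sorted by the indexed value; the row shapes, region names, and `merge_region_scans` are made up for illustration and do not reflect Phoenix's actual client code.

```python
import heapq


def merge_region_scans(region_results):
    """Merge per-region result lists, each already sorted by index key.

    heapq.merge streams the inputs, so it never re-sorts the full result set.
    """
    return list(heapq.merge(*region_results, key=lambda row: row[0]))


# Hypothetical index rows: (indexed value, data-table row key).
region_a = [("adams", "row-17"), ("miller", "row-03")]
region_b = [("baker", "row-42"), ("young", "row-08")]
region_c = [("clark", "row-25")]

merged = merge_region_scans([region_a, region_b, region_c])
```

Because every input is already sorted, each output row costs only a comparison against the current head of each region's results, which is consistent with the low overhead described above.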
>
>
>>
>> Mutable Tables
>>
>> <snip>
>> indexes on non transactional mutable tables are only ever a single
>> batch of edits behind the primary table
>> </snip>
>>
>> 6) If my use case updates a table and then reads from an index, it seems
>> likely that a race condition could prevent me from reading my own write.
>>
>
> From the same client, there is no race condition. The upsert statement is
> synchronous, so when control returns back to you, all of your data has been
> written (both to the data and index table(s)).
>
> If the read happens from a different client than the write, with global,
> mutable, non-transactional indexes, it's possible that a read could occur
> after the write to the data table but before the write to the index
> table(s) (since with global indexes, the regions for the index table are
> potentially on different region servers than the regions of the data
> table).
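
The window described above can be shown with a toy model. This is not how Phoenix or HBase is implemented; the two dictionaries and the helper names are hypothetical and only model "data table written first, index table written second" as two separate steps.

```python
data_table = {}   # row key -> value
index_table = {}  # indexed value -> row key


def write_steps(row_key, value):
    """Perform the two writes as separate steps, pausing after each one."""
    data_table[row_key] = value
    yield "data written"
    index_table[value] = row_key
    yield "index written"


def read_via_index(value):
    """Look a row up through the index, as a second client might."""
    row_key = index_table.get(value)
    return None if row_key is None else (row_key, data_table[row_key])


steps = write_steps("row-1", "blue")
next(steps)                              # data table updated, index not yet
seen_mid_write = read_via_index("blue")  # another client reads in the gap
next(steps)                              # index table now updated
seen_after = read_via_index("blue")
```

The writing client never observes the gap in this model, because it only regains control after both steps have run, which matches the same-client guarantee described above.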
>
> With local indexes the above is even more unlikely because the writes are
> all occurring to the same region server, but in theory it's still possible.
> With the fix that was made as part of HBASE-15600, this wouldn't be
> possible at all, though.
>
> With transactional tables, this scenario isn't possible.
>
>
>>
>> 7) Would you be willing to bet that most reads are consistent with the
>> table and only in rare scenarios is the table/index out of sync?
>>
> Yes
>
>>
>> I appreciate your help and feedback on these questions.  Thanks,
>> --Matthew
>>
>
> Thanks,
> James
>
>
