phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Table replication
Date Wed, 15 Jun 2016 01:37:33 GMT
On Tue, Jun 14, 2016 at 5:43 PM, Saurabh Agarwal (BLOOMBERG/ 731 LEX) <
sagarwal144@bloomberg.net> wrote:

> Hi James,
>
> Thanks for providing the detailed info on replication. Two questions.
>
> 1. I am not clear how replication works in terms of views. Is this an open
> issue wrt replication?
>

The data for a view resides in its physical HBase table and is replicated
with that table. The metadata of a view resides in the SYSTEM.CATALOG table
and is replicated with it.

>
> 2. You mention that there is still work required wrt the combination
> of transactions and replication. Does this work need to be done in HBase or
> Phoenix? Are there any existing JIRAs for this work?
>

This could be done at the HBase level or at the Phoenix level. The code
would need to be part of Tephra or part of Phoenix, though, since the
transaction code is outside of HBase. There are no JIRAs filed for this yet.


>
> Thanks,
> Saurabh.
> Sent from Bloomberg Professional for iPhone
>
>
> ----- Original Message -----
> From: James Taylor <user@phoenix.apache.org>
> To: user@phoenix.apache.org
> At: 09-Jun-2016 11:42:46
>
> Hi JM,
> Are you looking toward replication to support DR? If so, you can rely on
> HBase-level replication with a few gotchas and some operational hurdles:
>
> - When upgrading Phoenix versions, upgrade the server-side first for both
> the primary and secondary cluster. You can do a rolling upgrade and old
> clients will continue to work with the upgraded server, so no downtime is
> required (see Backward Compatibility[1] for more details).
> - Execute Phoenix DDL (i.e. user-level changes to existing Phoenix tables,
> creation of new tables, indexes, sequences) against both the primary and
> secondary cluster with replication suspended (as otherwise you end up with
> a race condition for the replication of the SYSTEM.CATALOG table and any
> not yet existing tables). If you've upgraded Phoenix, then even if there's
> no DDL, you should at a minimum connect a Phoenix client to both the
> primary and secondary cluster to trigger any upgrades to Phoenix system
> tables. Once the DDL is complete, resume replication.
> - Do not replicate the SYSTEM.SEQUENCE table since replication is
> asynchronous and may fall behind which would be a big issue if switching
> over to the secondary cluster as sequence values could start repeating.
> Instead, incorporate a cluster ID into any sequence-based identifiers and
> concatenate this with the sequence value. In that way, the identifiers will
> continue to be unique after a DR event.
> - Replicate Phoenix indexes just like data tables as the HBase-level
> replication of the data table will not trigger index updates.
> - In theory, you really only need to replicate views from SYSTEM.CATALOG
> since you're executing DDL on both the primary and secondary cluster,
> however I don't think HBase has that capability (but it sure would be
> nice). FWIW, we're thinking of separating views from table definitions into
> separate Phoenix tables but need to first make these tables transactional
> (we're using an HBase mechanism that allows all or none commits to the
> SYSTEM.CATALOG, but it only works if all updates are to the same RS which
> is too limiting).
> - It's a good idea to monitor the depth of the replication queue so you
> know if/when replication is falling behind.
> - Care has to be taken wrt keeping deleted cells on both clusters if you
> want to support point-in-time backup and restore, as it's possible that
> compaction would remove cells before your backup window has passed (this
> is orthogonal to replication, but just wanted to bring it up).
> - Given the asynchronous nature of HBase replication, there's no good way
> of knowing the transaction ID (i.e. timestamp) at which you have all of the
> data. Also, replication of the state that is kept by the transaction
> manager in terms of inflight and invalid transactions is left as an
> exercise to the reader. :-) In short - there's still some work to do wrt
> the combination of transactions and replication (but it'd be really
> interesting work if anyone is interested).
>
> HTH. Thanks,
>
> James
>
> [1] https://phoenix.apache.org/upgrading.html
>
> On Thu, Jun 9, 2016 at 7:56 AM, anil gupta <anilgupta84@gmail.com> wrote:
>
>> Hi Jean,
>>
>> Phoenix does not support replication at present. (It would be super
>> awesome if it could.) So, if you want to replicate Phoenix tables, you
>> will need to set up replication of all the underlying HBase tables for
>> the corresponding Phoenix tables.
>>
>> I think you will need to replicate all the Phoenix system HBase tables,
>> the global/local secondary index tables, and then the primary Phoenix table.
>>
>> I haven't done it yet, but the above is the way I would approach it.
>>
>> Thanks,
>> Anil Gupta.
>>
>>
>> On Thu, Jun 9, 2016 at 6:49 AM, Jean-Marc Spaggiari <
>> jean-marc@spaggiari.org> wrote:
>>
>>> Hi,
>>>
>>> When Phoenix is used, what is the recommended way to do replication?
>>>
>>> Replication acts as a client on the 2nd cluster, so should we simply
>>> configure Phoenix on both clusters and let the destination take care
>>> of updating the index tables, etc.? Or should all the tables, including
>>> the Phoenix tables, be replicated to the destination side too? I
>>> searched a bit about that on the Phoenix site and Google and did not
>>> find anything.
>>>
>>> Thanks,
>>>
>>> JMS
>>>
>>
>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>
>
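
The advice quoted above about SYSTEM.SEQUENCE (do not replicate it, and
instead concatenate a cluster ID with the sequence value) can be sketched
roughly as follows. This is only an illustration of the idea, not Phoenix
code: the function name make_global_id, the "-" separator, the zero-padding
width, and the example cluster IDs "dc1" and "dc2" are all assumptions made
for this sketch, and the sequence value is assumed to be a non-negative
number already fetched from a Phoenix sequence on the local cluster.

```python
def make_global_id(cluster_id: str, sequence_value: int, width: int = 19) -> str:
    """Build an identifier that stays unique across clusters.

    cluster_id     -- hypothetical short name for the cluster that allocated
                      the value (e.g. "dc1" for primary, "dc2" for secondary)
    sequence_value -- value obtained from the local, non-replicated sequence
    width          -- zero-padding width; 19 digits fits any non-negative
                      64-bit value (an assumption about the sequence type)
    """
    if not cluster_id or "-" in cluster_id:
        raise ValueError("cluster_id must be non-empty and must not contain '-'")
    if sequence_value < 0:
        raise ValueError("this sketch assumes non-negative sequence values")
    # Fixed-width padding keeps the identifiers sortable within one cluster.
    return f"{cluster_id}-{sequence_value:0{width}d}"


if __name__ == "__main__":
    # Before the DR event the primary has handed out value 42.
    print(make_global_id("dc1", 42))
    # After failover the secondary's own sequence may hand out 42 again,
    # but the resulting identifier still differs because of the prefix.
    print(make_global_id("dc2", 42))
```

Because each cluster only ever emits identifiers carrying its own prefix, a
sequence value that repeats on the secondary after a switchover does not
collide with an identifier issued earlier by the primary.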
