phoenix-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Replication?
Date Thu, 11 Dec 2014 14:42:52 GMT
Never too much.

So, more questions! ;)

1) If we don't use sequences in our application, we don't have this
constraint, right? We already have an external ID generator that we use for
all our IDs. So we don't use Phoenix sequences and might not face this
issue? Or are they used behind the scenes by Phoenix for some operations?

2) Regarding secondary indexes, is it not possible to create an index
on a view?

Thanks,

JM

2014-12-10 22:27 GMT-05:00 James Taylor <jamestaylor@apache.org>:
>
> bq. Then we run a Phoenix DDL SQL script to create the views.
> In your CREATE TABLE statements, use the following syntax:
> CREATE TABLE IF NOT EXISTS my_table ...
> This will prevent an error message from occurring if the replication
> of SYSTEM.CATALOG rows for the table occurs before the Phoenix DDL
> statement is run.
>
> bq. There are no indexes yet but for sure we will want them too.
> You mentioned that you're creating views, but if you're planning on
> using indexes, I'd recommend creating tables, not views so that your
> secondary indexes are maintained wrt your data tables automatically by
> Phoenix.
>
> bq. So based on what you said below, can I "simply" add the
> replication scope to ALL the tables including all Phoenix tables, both
> ways?
> Yes, that sounds correct, but I'm not familiar with the particular
> HBase settings for replication.
>
> bq. Regarding the sequence number, how can we bump it when we detect a
> failure of the other cluster?
> // Set autocommit to true on your connection first for better performance.
> // connection.setAutoCommit(true);
> UPSERT INTO SYSTEM.SEQUENCE(
>         TENANT_ID,SEQUENCE_SCHEMA,SEQUENCE_NAME,
>         CURRENT_VALUE)
>     SELECT
>         TENANT_ID,SEQUENCE_SCHEMA,SEQUENCE_NAME,
>         CURRENT_VALUE + CACHE_SIZE
>     FROM SYSTEM.SEQUENCE;
>
> bq. And related question, how to safely detect the failure ;)
> You mean how can you detect when a cluster has failed? Good question
> for the HBase dev list :-)
>
> In the spirit of "more information is better", let me try to
> illustrate the potential problem with sequences when a cluster failure
> occurs through an example.
> 1) assume table T and sequence S are being used in the application
> with a statement like this:
>     UPSERT INTO T (ID, VAL) VALUES (NEXT VALUE FOR S, 'my_value');
> 2) to increment the sequence (that's what the NEXT VALUE FOR will do),
> Phoenix will make an RPC to allocate 1000 sequence values by
> incrementing the CURRENT_VALUE of S in the SYSTEM.SEQUENCE table by
> 1000 (the sequence is represented by a row in the SYSTEM.SEQUENCE
> table with the 1000 coming from the CACHE value when you create the
> sequence).
> 3) the client will dole out sequences from these 1000 and once
> exhausted will rinse and repeat with (2) again.
> 4) let's say that the following sequence of events occurs:
> a) the SYSTEM.SEQUENCE row S is incremented by 1000
> b) the client commits rows that use one or more of these sequences
> c) the rows that use these new sequence values are replicated to the
> other cluster.
> d) the cluster goes down after the commit, but before the replication
> of the increment of the SYSTEM.SEQUENCE row
>
> Now you have an issue, because when you fail over to the other
> cluster, the sequence value won't have been incremented, but the rows
> using the new sequence values were replicated.
>
> So, if you bump up the sequence values, you can lower the possibility
> of this corner case occurring. Note that it's also possible that more
> than one batch of 1000 sequence values was allocated before the
> SYSTEM.SEQUENCE row was replicated. If the rate of data insertion is
> very high, then in theory you wouldn't know by how much to bump up the
> sequence values.
>
> Andrew pointed out one way around this by allocating IDs through a
> stateless mechanism (PHOENIX-1422) which seems like a good solution
> for many use cases (often sequences don't need to be monotonically
> increasing). Another solution if that doesn't work would be if the
> SYSTEM.SEQUENCE table could be replicated synchronously (HBASE-12672).
>
> TMI? HTH.
>
>     James
>
> On Wed, Dec 10, 2014 at 7:42 AM, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
> > Thanks James (and Andrew). I think there cannot be too much information.
> > The more information we share, the more knowledge we get.
> >
> > So here is the situation.
> >
> > We have 2 clusters that we want to configure in master/master mode.
> > Application is built using HBase 0.98 and Phoenix.
> >
> > We deploy our HBase schema with an RPM. This creates all the tables we
> > need and activates replication for all of them. Then we run a Phoenix
> > DDL SQL script to create the views. We do this on both clusters so they
> > are identical; only the peer ID changes.
> >
> > There are no indexes yet but for sure we will want them too.
> >
> > The goal is to have 2 identical clusters with the same performance in
> > case one of them fails.
> >
> > So based on what you said below, can I "simply" add the replication
> > scope to ALL the tables including all Phoenix tables, both ways?
> >
> > Regarding the sequence number, how can we bump it when we detect a
> > failure of the other cluster? And related question, how to safely detect
> > the failure ;)
> >
> > Thanks,
> >
> > JM
> >
> >
> >
> > 2014-12-09 20:48 GMT-05:00 James Taylor <jamestaylor@apache.org>:
> >
> >> No, we're not saying to avoid replication: at SFDC, we rely on
> >> replication to provide an active/active configuration for failover.
> >> Lars H. & co. can explain in more detail, but there are some nuances
> >> of which you should be aware. For example, the HBase table metadata
> >> needs to exist on both clusters. How is this done in your environment?
> >> One way to do this is to run the Phoenix DDL statements on both
> >> sides, but this requires some extra processing, as replication won't
> >> know about Phoenix DDL.
> >>
> >> Whether or not you replicate indexes depends on 1) how much your use
> >> case depends on them - if they're not available, will crucial queries
> >> become so slow that it's as if the system is down?, and 2) the size of
> >> your data and how long it takes to regenerate the index. Our current
> >> thinking is to replicate the indexes just as we replicate tables (an
> >> index just looks like any other HBase table as far as HBase is
> >> concerned), as we want to be able to failover immediately without
> >> performance degradation.
> >>
> >> As far as replicating the SYSTEM.CATALOG table, that's important
> >> depending on your use case as well. If you're using views (including
> >> multi-tenant tables) that are created dynamically/on-the-fly, then
> >> you'd likely want to replicate this table as otherwise this DDL has
> >> the potential to be lost. Adding the IF NOT EXISTS that Andrew
> >> referred to would prevent an error message when running the DDL on the
> >> secondary cluster if the row from the SYSTEM.CATALOG table was already
> >> replicated.
> >>
> >> For the SYSTEM.SEQUENCE table, as Andrew pointed out, we allocate
> >> chunks of sequences and dole them out on the client. You'd want to
> >> replicate this table, as otherwise when you switch to the other
> >> cluster, you'd start repeating the same sequence values. Once
> >> replicated, if the primary cluster goes down, then the sequences will
> >> pick up at the value after the already allocated chunk (which is fine,
> >> as it's fine to have "holes" in the sequence values that get doled
> >> out). There is a potential for a race condition if the primary cluster
> >> returns a batch of new sequences and then dies before replicating the
> >> updated sequence value to the other cluster. This can be mitigated, as
> >> Andrew points out, by bumping up the sequence values on a failover
> >> event.
> >>
> >> HTH. Maybe more information than you wanted? Tell us more about how
> >> you're relying on replication when you get a chance.
> >>
> >> Thanks,
> >> James
> >>
> >>
> >>
> >> On Tue, Dec 9, 2014 at 5:00 PM, Jean-Marc Spaggiari
> >> <jean-marc@spaggiari.org> wrote:
> >> > Hum. Thanks for all those updates.
> >> >
> >> > So are we saying that master/master HBase replication should be
> >> > avoided when using Phoenix with the latest stable version?
> >> >
> >> > 2014-12-09 19:51 GMT-05:00 Andrew Purtell <apurtell@apache.org>:
> >> >
> >> >> You also need to replicate the Phoenix system tables. It's still
> >> >> necessary to run DDL operations on both clusters to keep Phoenix
> >> >> schema and HBase tables in sync. Use IF EXISTS or IF NOT EXISTS to
> >> >> avoid DDL statement failures. Phoenix should do the right thing. If
> >> >> not, it's a bug.
> >> >>
> >> >> The sequence table is interesting. The Phoenix client caches a range
> >> >> of sequence values to use when inserting data that includes generated
> >> >> sequence values. You'll want to always grab a new cached range of
> >> >> sequence values when failing over from one site to another and back
> >> >> to avoid potential duplication. It's possible upon site failure that
> >> >> the latest updates to the sequence table did not replicate. Or,
> >> >> https://issues.apache.org/jira/browse/PHOENIX-1422 would sidestep
> >> >> this issue if implemented.
> >> >>
> >> >>
> >> >> On Mon, Dec 8, 2014 at 10:22 PM, Jeffrey Zhong
> >> >> <jzhong@hortonworks.com> wrote:
> >> >>>
> >> >>>
> >> >>> You need to enable replication on both the data and index tables at
> >> >>> the HBase level using Phoenix 4.2 (Phoenix versions prior to 4.2 may
> >> >>> have issues with local indexes). There is a test case,
> >> >>> MutableIndexReplicationIT, where you can see some details. Ideally
> >> >>> Phoenix should provide a custom replication sink so that a user
> >> >>> doesn't have to set up replication on the index table.
> >> >>>
> >> >>> From: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> >> >>> Reply-To: <user@phoenix.apache.org>
> >> >>> Date: Monday, December 8, 2014 at 9:29 AM
> >> >>> To: user <user@phoenix.apache.org>
> >> >>> Subject: Replication?
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> How do we replicate data between 2 clusters when Phoenix is in the
> >> >>> picture?
> >> >>>
> >> >>> Can we simply replicate the table we want from A to B and on
> >> >>> cluster B Phoenix will do the required re-indexing? Or should we
> >> >>> also replicate the Phoenix tables?
> >> >>>
> >> >>> Thanks,
> >> >>>
> >> >>> JM
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Best regards,
> >> >>
> >> >>    - Andy
> >> >>
> >> >> Problems worthy of attack prove their worth by hitting back. - Piet
> >> >> Hein (via Tom White)
> >> >
> >> >
> >
> >
>
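
For readers who want to play with the failure scenario described in steps
(a) to (d) of the quoted message, here is a toy model in Python. It is only a
sketch under simplifying assumptions: it does not talk to Phoenix or HBase,
the names Cluster, allocate, bump and simulate are made up for illustration,
and it assumes the sequence starts at 1 with a CACHE of 1000 and that the
stored value is the next value to hand out. It shows that bumping by one
batch avoids the duplicates when one increment was lost, and that one batch
is not enough when two increments were lost.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy cluster: one sequence row plus the IDs already stored in table T."""
    current_value: int = 1                      # next value the sequence will hand out
    ids_in_t: set = field(default_factory=set)  # IDs used by committed rows


def allocate(cluster, cache_size):
    """Reserve cache_size values by advancing the sequence row (step 2)."""
    start = cluster.current_value
    cluster.current_value += cache_size
    return range(start, start + cache_size)


def bump(cluster, cache_size, batches):
    """Failover mitigation: skip ahead by a number of whole batches."""
    cluster.current_value += cache_size * batches


def simulate(unreplicated_batches, bump_batches, cache_size=1000):
    """Return the IDs the standby would hand out again after a failover."""
    primary, standby = Cluster(), Cluster()
    used = []
    for _ in range(unreplicated_batches):
        # (a) the sequence row on the primary is incremented
        batch = allocate(primary, cache_size)
        # (b) the client commits rows using a few of the new values
        used.extend(batch[:3])
    primary.ids_in_t.update(used)
    # (c) the data rows reach the standby ...
    standby.ids_in_t.update(used)
    # (d) ... but the primary dies before the sequence row is replicated,
    # so standby.current_value still holds its old value here.
    bump(standby, cache_size, bump_batches)
    next_batch = allocate(standby, cache_size)
    return sorted(standby.ids_in_t.intersection(next_batch))


print(simulate(1, 0))  # no bump: the standby reuses IDs that rows already hold
print(simulate(1, 1))  # bump by one batch: no overlap
print(simulate(2, 1))  # two lost increments, one-batch bump: still overlaps
```

In this model the only safe bump is one that covers every batch allocated
since the last replicated increment, which is exactly the number that is
unknown at failover time when the insertion rate is high.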
