kafka-commits mailing list archives

From jun...@apache.org
Subject svn commit: r1561522 - /kafka/site/08/design.html
Date Sun, 26 Jan 2014 17:35:34 GMT
Author: junrao
Date: Sun Jan 26 17:35:33 2014
New Revision: 1561522

URL: http://svn.apache.org/r1561522
Log:
minor change to unclean leader election

Modified:
    kafka/site/08/design.html

Modified: kafka/site/08/design.html
URL: http://svn.apache.org/viewvc/kafka/site/08/design.html?rev=1561522&r1=1561521&r2=1561522&view=diff
==============================================================================
--- kafka/site/08/design.html (original)
+++ kafka/site/08/design.html Sun Jan 26 17:35:33 2014
@@ -213,15 +213,15 @@ Another important design distinction is 
 
 <h4>Unclean leader election: What if they all die?</h4>
 
-Note that Kafka's guarantee with respect to data loss is predicated on at least one replica remaining in sync. If all the nodes replicating a partition die, this guarantee no longer holds.
+Note that Kafka's guarantee with respect to data loss is predicated on at least one replica remaining in sync. If the current leader dies and no remaining live replicas are in the ISR, this guarantee no longer holds. If you have more than one replica assigned to a partition, this should be relatively rare since at least two brokers have to fail for this to happen.
 <p>
-However a practical system needs to do something reasonable when all the replicas die. If you are unlucky enough to have this occur, it is important to consider what will happen. There are two behaviors that could be implemented:
+However a practical system needs to do something reasonable when all in-sync replicas die. If you are unlucky enough to have this occur, it is important to consider what will happen. There are two behaviors that could be implemented:
 <ol>
 	<li>Wait for a replica in the ISR to come back to life and choose this replica as the leader (hopefully it still has all its data).
 	<li>Choose the first replica (not necessarily in the ISR) that comes back to life as the leader.
 </ol>
 <p>
-This is a simple tradeoff between availability and consistency. If we wait for replicas in the ISR, then we will remain unavailable as long as those replicas are down. If such replicas were destroyed or their data was lost, then we are permanently down. If, on the other hand, a non-in-sync replica comes back to life and we allow it to become leader, then its log becomes the source of truth even though it is not guaranteed to have every committed message. In our current release we choose the second strategy and favor choosing a potentially inconsistent replica when all replicas in the ISR are dead. In the future, we would like to make this configurable to better support use cases where downtime is preferable to inconsistency.
+This is a simple tradeoff between availability and consistency. If we wait for replicas in the ISR, then we will remain unavailable as long as those replicas are down. If such replicas were destroyed or their data was lost, then we are permanently down. If, on the other hand, a non-in-sync replica comes back to life and we allow it to become the leader, then its log becomes the source of truth even though it is not guaranteed to have every committed message. In our current release we choose the second strategy and favor choosing a potentially inconsistent replica when all replicas in the ISR are dead. In the future, we would like to make this configurable to better support use cases where downtime is preferable to inconsistency.
 <p>
 This dilemma is not specific to Kafka. It exists in any quorum-based scheme. For example in a majority voting scheme, if a majority of servers suffer a permanent failure, then you must either choose to lose 100% of your data or violate consistency by taking what remains on an existing server as your new source of truth.
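Editor's note: to make the two strategies in the diff concrete, below is a minimal Scala sketch of the leader-choice logic described above. It is illustrative only, not Kafka's actual controller code; the object and method names (UncleanElectionSketch, electLeader) and the inputs (assignedReplicas, isr, liveBrokers, allowUnclean) are hypothetical.

    // Illustrative sketch only -- not Kafka's controller implementation.
    // Replicas and brokers are identified here by integer ids.
    object UncleanElectionSketch {

      // Picks a new leader for a partition whose current leader has died.
      //   assignedReplicas: all replicas assigned to the partition, in preference order
      //   isr:              the in-sync replica set at the time of failure
      //   liveBrokers:      brokers currently alive
      //   allowUnclean:     whether strategy 2 (unclean election) is permitted
      def electLeader(assignedReplicas: Seq[Int],
                      isr: Set[Int],
                      liveBrokers: Set[Int],
                      allowUnclean: Boolean): Option[Int] = {
        // Strategy 1: prefer a live replica still in the ISR, which is
        // guaranteed to have every committed message.
        val liveIsrReplica = assignedReplicas.find(r => isr(r) && liveBrokers(r))
        liveIsrReplica.orElse {
          if (allowUnclean)
            // Strategy 2: fall back to any live replica, accepting that its log
            // may be missing committed messages (availability over consistency).
            assignedReplicas.find(liveBrokers)
          else
            // No live ISR member and unclean election disabled: the partition
            // stays offline until an ISR replica returns (consistency over availability).
            None
        }
      }
    }

For reference, later Kafka releases did make this tradeoff configurable, via the unclean.leader.election.enable setting, as the diff's closing paragraph anticipates.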
 