kafka-commits mailing list archives

From nehanarkh...@apache.org
Subject svn commit: r1521324 - /kafka/site/08/design.html
Date Tue, 10 Sep 2013 01:04:50 GMT
Author: nehanarkhede
Date: Tue Sep 10 01:04:49 2013
New Revision: 1521324

URL: http://svn.apache.org/r1521324
Log:
Some minor changes to the replication design section

Modified:
    kafka/site/08/design.html

Modified: kafka/site/08/design.html
URL: http://svn.apache.org/viewvc/kafka/site/08/design.html?rev=1521324&r1=1521323&r2=1521324&view=diff
==============================================================================
--- kafka/site/08/design.html (original)
+++ kafka/site/08/design.html Tue Sep 10 01:04:49 2013
@@ -168,7 +168,7 @@ Kafka replicates the log for each topic'
 <p>
 Other messaging systems provide some replication-related features, but, in our (totally biased)
opinion, this appears to be a tacked-on thing, not heavily used, and with large downsides:
slaves are inactive, throughput is heavily impacted, it requires fiddly manual configuration,
etc. Kafka is meant to be used with replication by default&mdash;in fact we implement
un-replicated topics as replicated topics where the replication factor is one.
 <p>
-The unit of replication is the topic partition. Under non-failure conditions, each partition
Kafka has a single leader and zero or more followers. We call the total number of replicas
including the leader the replication factor. All reads and writes go to the leader of the
partition. Typically, there are many more partitions than brokers and the leaders are evenly
distributed among brokers. The logs on the followers are identical to the leader's log&mdash;all
have the same offsets and messages in the same order (though, of course, at any given time
the leader may have a few as-yet unreplicated messages at the end of its log).
+The unit of replication is the topic partition. Under non-failure conditions, each partition
in Kafka has a single leader and zero or more followers. The total number of replicas including
the leader constitutes the replication factor. All reads and writes go to the leader of the
partition. Typically, there are many more partitions than brokers and the leaders are evenly
distributed among brokers. The logs on the followers are identical to the leader's log&mdash;all
have the same offsets and messages in the same order (though, of course, at any given time
the leader may have a few as-yet unreplicated messages at the end of its log).
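As a concrete (made-up) illustration of the scale involved: a topic with 12 partitions and a
replication factor of 3 has 36 replicas in total; on a 4-broker cluster each broker would hold
roughly 9 replicas and, with leaders spread evenly, act as leader for about 3 partitions while
following the rest.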
 <p>
 Followers consume messages from the leader just as a normal Kafka consumer would and apply
them to their own log. Having the followers pull from the leader has the nice property of
allowing the follower to naturally batch together log entries they are applying to their log.
 <p>
@@ -177,21 +177,21 @@ As with most distributed systems automat
 	<li>A node must be able to maintain its session with Zookeeper (via Zookeeper's heartbeat
mechanism)
 	<li>If it is a slave it must replicate the writes happening on the leader and not
fall "too far" behind
 </ol>
-We refer to nodes satisfying these two conditions as being "in sync" to avoid the vagueness
of "alive" or "failed". The leader keeps track of the set of "in sync" nodes. If a follower
dies or falls behind, the leader will remove it from the list of in sync replicas. The definition
of how far behind is too far is controlled by the replica.lag.max.messages configuration.
+We refer to nodes satisfying these two conditions as being "in sync" to avoid the vagueness
of "alive" or "failed". The leader keeps track of the set of "in sync" nodes. If a follower
dies, gets stuck, or falls behind, the leader will remove it from the list of in sync replicas.
The definition of how far behind is too far is controlled by the replica.lag.max.messages
configuration and the definition of a stuck replica is controlled by the replica.lag.time.max.ms
configuration.
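Both settings named above are broker-side properties. A minimal sketch of how they might appear
in a broker's server.properties (the values below are purely illustrative, not recommendations):

    # Drop a follower from the in sync set once it falls more than this many messages behind the leader.
    replica.lag.max.messages=4000
    # Treat a follower as stuck (and drop it from the in sync set) if it has not fetched for this long, in ms.
    replica.lag.time.max.ms=10000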
 <p>
 In distributed systems terminology we only attempt to handle a "fail/recover" model of failures
where nodes suddenly cease working and then later recover (perhaps without knowing that they
have died). Kafka does not handle so-called "Byzantine" failures in which nodes produce arbitrary
or malicious responses (perhaps due to bugs or foul play).
 <p>
-A message is considered "committed" when all in sync replicas for that partition have applied
it to their log. Only committed messages are ever given out to the consumer. This means that
the consumer need not worry about potentially seeing a message that could be lost if the leader
fails. Producers, on the other hand, have the option of either waiting for the message to
be committed or not, depending on their preference for latency and durability. This preference
is controlled by the request.required.acks setting that the producer uses.
+A message is considered "committed" when all in sync replicas for that partition have applied
it to their log. Only committed messages are ever given out to the consumer. This means that
the consumer need not worry about potentially seeing a message that could be lost if the leader
fails. Producers, on the other hand, have the option of either waiting for the message to
be committed or not, depending on their preference for the tradeoff between latency and durability.
This preference is controlled by the request.required.acks setting that the producer uses.
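To make the producer-side choice concrete, here is a minimal sketch against the 0.8 Java producer
API; the class name, broker list, topic, and message below are invented for illustration. A value
of 0 means do not wait for any acknowledgement, 1 waits for the leader's local write, and -1 waits
until the message is committed, i.e. applied by all in sync replicas:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class AckExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Illustrative broker list; point this at real brokers.
            props.put("metadata.broker.list", "broker1:9092,broker2:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            // -1 = only return once the message is committed (applied by all in sync replicas).
            props.put("request.required.acks", "-1");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            producer.send(new KeyedMessage<String, String>("my-topic", "my-key", "hello"));
            producer.close();
        }
    }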
 <p>
-The guarantee that Kafka offers is that a committed message will not be lost as long as a
single in sync replica remains.
+The guarantee that Kafka offers is that a committed message will not be lost as long as
there is at least one in sync replica alive at all times.
 <p>
 Kafka will remain available in the presence of node failures after a short fail-over period,
but may not remain available in the presence of network partitions.
 
 <h4>Replicated Logs: Quorums, ISRs, and State Machines (Oh my!)</h4>
 
-At it's heart a Kafka partition is a replicated log. The replicated log is one of the most
basic primitives in distributed data systems, and there are many approaches to implementation.
A replicated log can be used by other systems as a primitive for implementing other distributed
systems in the <a href="http://en.wikipedia.org/wiki/State_machine_replication">state-machine
style</a>.
+At its heart, a Kafka partition is a replicated log. The replicated log is one of the most
basic primitives in distributed data systems, and there are many approaches for implementing
one. A replicated log can be used by other systems as a primitive for implementing other distributed
systems in the <a href="http://en.wikipedia.org/wiki/State_machine_replication">state-machine
style</a>.
 <p>
-A replicated log models the process of coming into consensus on the order of a series of
values (generally numbering the log entries 0, 1, 2, ...). There are many ways to implement
this, but the simplest and fastest is with a leader who chooses the ordering of values provided
to it. As long as the leader remains alive and all followers need only copy the values and
ordering the leader chooses.
+A replicated log models the process of coming into consensus on the order of a series of
values (generally numbering the log entries 0, 1, 2, ...). There are many ways to implement
this, but the simplest and fastest is with a leader who chooses the ordering of values provided
to it. As long as the leader remains alive, all followers need only copy the values and
ordering the leader chooses.
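As a toy sketch of that idea (illustrative Java, not Kafka's implementation): the leader alone
assigns each value its position, and a follower only ever copies the leader's entries in that
same order.

    import java.util.ArrayList;
    import java.util.List;

    public class ToyLeaderLog {
        static class Leader {
            final List<String> log = new ArrayList<String>();
            // The leader alone chooses the order: the offset is just the append position.
            long append(String value) {
                log.add(value);
                return log.size() - 1;
            }
        }

        static class Follower {
            final List<String> log = new ArrayList<String>();
            // A follower never reorders; it copies whatever the leader has beyond its own end.
            void fetchFrom(Leader leader) {
                for (int i = log.size(); i < leader.log.size(); i++) {
                    log.add(leader.log.get(i));
                }
            }
        }

        public static void main(String[] args) {
            Leader leader = new Leader();
            Follower follower = new Follower();
            leader.append("a");
            leader.append("b");
            follower.fetchFrom(leader); // the follower now holds ["a", "b"] in the leader's order
        }
    }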
 <p>
 Of course if leaders didn't fail we wouldn't need followers! When the leader does die we
need to choose a new leader from among the followers. But followers themselves may fall behind
or crash so we must ensure we choose an up-to-date follower. The fundamental guarantee a log
replication algorithm must provide is that if we tell the client a message is committed, and
the leader fails, the new leader we elect must also have that message. This yields a tradeoff:
if the leader waits for more followers to acknowledge a message before declaring it committed
then there will be more potentially electable leaders.
 <p>


