kafka-commits mailing list archives

From jkr...@apache.org
Subject svn commit: r1499548 - /kafka/site/introduction.html
Date Wed, 03 Jul 2013 20:37:17 GMT
Author: jkreps
Date: Wed Jul  3 20:37:16 2013
New Revision: 1499548

URL: http://svn.apache.org/r1499548
Log:
Fix broken link.


Modified:
    kafka/site/introduction.html

Modified: kafka/site/introduction.html
URL: http://svn.apache.org/viewvc/kafka/site/introduction.html?rev=1499548&r1=1499547&r2=1499548&view=diff
==============================================================================
--- kafka/site/introduction.html (original)
+++ kafka/site/introduction.html Wed Jul  3 20:37:16 2013
@@ -28,10 +28,12 @@ A topic is the category or feed name. Fo
 </div>
 Each partition is an ordered, immutable sequence of messages that is continually appended
to&mdash;a commit log. The messages in the partitions are each assigned a sequential id
number called the <i>offset</i> that uniquely identifies each message within the
partition.
 <p>
-The Kafka cluster retains all published messages&mdash;whether or not they have been
consumed&mdash;for a configurable period of time. For example if the retention is set
for two days, then for the two days after a message is published it is available for consumption,
after which it will be discarded to free up space. This approach to data retention allows
more flexibility for consumers: unlike most systems Kafka does not retain a separate copy
of the data for each consumer, and consumers can reprocess data if they need to.
+The Kafka cluster retains all published messages&mdash;whether or not they have been
consumed&mdash;for a configurable period of time. For example if the retention is set
for two days, then for the two days after a message is published it is available for consumption,
after which it will be discarded to free up space. Kafka's performance is effectively constant
with respect to data size so retaining lots of data is not a problem.
 <p>
 In fact the only metadata retained on a per-consumer basis is the position of the consumer
in in the log, called the "offset". This offset is controlled by the consumer: normally a
consumer will read sequentially advance its offset in the same way, but the consumer can reset
its position if need be.
 <p>
+This combination of features means consumers are very cheap&mdash;they can come and go
without much impact on the performance of the system. For example, you can use our command
line tools to "tail" the contents of any topic without impacting other consumers.
+<p>
 The partitions in the log serve several purposes. First, they allow the log to scale beyond
a size that will fit on a single server. Each individual partition must fit on the servers
that host it, but a topic may have many partitions so it can handle an arbitrary amount of
data. Second they act as the unit of parallelism&mdash;more on that in a bit. 
 
 <h3>Distribution</h3>
@@ -70,6 +72,6 @@ Kafka gives the following guarantees
 
 <h3>Getting Started</h3>
 
-For more detailed information on how things work and help getting started see our <a href="/08/design.html">design
page</a> and overview of <a href="/uses.html">use cases</a>.
+For more detailed information on how things work and help getting started see our <a href="/design.html">design
page</a> and overview of <a href="/uses.html">use cases</a>.
 
 <!--#include virtual="includes/footer.html" -->
\ No newline at end of file
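
Illustration (not part of the commit): the paragraphs touched by this diff describe a partition as an append-only log addressed by offset, time-based retention, and a per-consumer position that the consumer itself controls. The Python sketch below models that idea in memory only. It is not Kafka code, and the names Partition, Consumer, retention_seconds, poll and seek are hypothetical, chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Partition:
    """Toy append-only log; an assumption-laden model, not Kafka's implementation."""
    retention_seconds: float
    base_offset: int = 0  # offset of the oldest message still retained
    entries: List[Tuple[float, str]] = field(default_factory=list)

    def append(self, message: str, now: float) -> int:
        """Append a message and return its sequential offset."""
        self.entries.append((now, message))
        return self.base_offset + len(self.entries) - 1

    def read(self, offset: int) -> Optional[str]:
        """Return the message at `offset`, or None if it is expired or not yet written."""
        index = offset - self.base_offset
        if 0 <= index < len(self.entries):
            return self.entries[index][1]
        return None

    def expire(self, now: float) -> None:
        """Discard messages older than the retention period, consumed or not."""
        while self.entries and now - self.entries[0][0] > self.retention_seconds:
            self.entries.pop(0)
            self.base_offset += 1


@dataclass
class Consumer:
    """The only per-consumer state is the offset, and the consumer owns it."""
    partition: Partition
    offset: int = 0

    def poll(self) -> Optional[str]:
        """Read the next message and advance; return None if nothing is readable."""
        message = self.partition.read(self.offset)
        if message is not None:
            self.offset += 1
        return message

    def seek(self, offset: int) -> None:
        """Reset the position, for example to reprocess older messages."""
        self.offset = offset
```

In this sketch two Consumer objects over the same Partition advance independently, and seek(0) lets one of them re-read messages that are still within the retention window. A consumer whose offset has fallen behind base_offset simply gets None from poll; how a real client handles that case is outside the scope of this illustration.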


