kafka-commits mailing list archives

From guozh...@apache.org
Subject kafka git commit: KAFKA-2665: Add images to code github
Date Sat, 17 Oct 2015 00:47:24 GMT
Repository: kafka
Updated Branches:
  refs/heads/trunk 636e14a99 -> 78a2e2f8f

KAFKA-2665: Add images to code github

…art of the code github

Author: Gwen Shapira <cshapi@gmail.com>

Reviewers: Guozhang Wang

Closes #325 from gwenshap/KAFKA-2665

Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/78a2e2f8
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/78a2e2f8
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/78a2e2f8

Branch: refs/heads/trunk
Commit: 78a2e2f8f39b24774e5edc8cf0c1093889db2fdf
Parents: 636e14a
Author: Gwen Shapira <cshapi@gmail.com>
Authored: Fri Oct 16 17:52:06 2015 -0700
Committer: Guozhang Wang <wangguoz@gmail.com>
Committed: Fri Oct 16 17:52:06 2015 -0700

 docs/design.html                      |   4 ++--
 docs/images/consumer-groups.png       | Bin 0 -> 26820 bytes
 docs/images/kafka_log.png             | Bin 0 -> 134321 bytes
 docs/images/kafka_multidc.png         | Bin 0 -> 33959 bytes
 docs/images/kafka_multidc_complex.png | Bin 0 -> 38559 bytes
 docs/images/log_anatomy.png           | Bin 0 -> 19579 bytes
 docs/images/log_cleaner_anatomy.png   | Bin 0 -> 18638 bytes
 docs/images/log_compaction.png        | Bin 0 -> 41414 bytes
 docs/images/mirror-maker.png          | Bin 0 -> 17054 bytes
 docs/images/producer_consumer.png     | Bin 0 -> 8691 bytes
 docs/images/tracking_high_level.png   | Bin 0 -> 82759 bytes
 docs/implementation.html              |   2 +-
 docs/introduction.html                |   6 +++---
 docs/ops.html                         |   2 +-
 14 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/design.html b/docs/design.html
index b1e4387..f0ab6ca 100644
--- a/docs/design.html
+++ b/docs/design.html
@@ -306,7 +306,7 @@ This functionality is inspired by one of LinkedIn's oldest and most successful
 Here is a high-level picture that shows the logical structure of a Kafka log with the offset
for each message.
-<img src="/images/log_cleaner_anatomy.png">
+<img src="images/log_cleaner_anatomy.png">
 The head of the log is identical to a traditional Kafka log. It has dense, sequential offsets
and retains all messages. Log compaction adds an option for handling the tail of the log.
The picture above shows a log with a compacted tail. Note that the messages in the tail of
the log retain the original offset assigned when they were first written&mdash;that never
changes. Note also that all offsets remain valid positions in the log, even if the message
with that offset has been compacted away; in this case this position is indistinguishable
from the next highest offset that does appear in the log. For example, in the picture above
the offsets 36, 37, and 38 are all equivalent positions and a read beginning at any of these
offsets would return a message set beginning with 38.
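The offset semantics described above can be sketched in a few lines. This is an illustrative model, not Kafka's actual code: a compacted log is modeled as a map from surviving offset to message, and a fetch at any offset returns messages starting from the next offset that still exists.

```python
# Sketch (not Kafka's implementation): fetching from a compacted log.
# Offsets removed by compaction are simply gaps; a read at a gap
# behaves like a read at the next surviving offset.

def fetch_from(log, offset):
    """log maps offset -> message; compacted-away offsets are absent."""
    return [(o, m) for o, m in sorted(log.items()) if o >= offset]

# Offsets 36 and 37 were compacted away; 38 survives.
compacted = {35: "k1=v3", 38: "k2=v7", 40: "k1=v9"}
for start in (36, 37, 38):
    # All three starting positions yield a message set beginning at 38.
    assert fetch_from(compacted, start)[0][0] == 38
```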
@@ -314,7 +314,7 @@ Compaction also allows for deletes. A message with a key and a null payload
 The compaction is done in the background by periodically recopying log segments. Cleaning
does not block reads and can be throttled to use no more than a configurable amount of I/O
throughput to avoid impacting producers and consumers. The actual process of compacting a
log segment looks something like this:
-<img src="/images/log_compaction.png">
+<img src="images/log_compaction.png">
 <h4>What guarantees does log compaction provide?</h4>
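The recopying step described above can be sketched as follows. This is a simplified stand-in for the real cleaner: rewrite a segment keeping only the latest value per key, and drop keys whose latest value is a tombstone (modeled here as a `None` payload).

```python
# Illustrative compaction of one segment (list of (key, value) pairs).
# Later entries for a key win; a None value acts as a delete marker.

def compact_segment(segment):
    latest = {}
    for key, value in segment:   # scan in append order so later writes win
        latest[key] = value
    # Keep only keys whose most recent value is not a tombstone.
    return [(k, v) for k, v in latest.items() if v is not None]

segment = [("a", 1), ("b", 2), ("a", 3), ("b", None)]
assert compact_segment(segment) == [("a", 3)]
```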

diff --git a/docs/images/consumer-groups.png b/docs/images/consumer-groups.png
new file mode 100644
index 0000000..16fe293
Binary files /dev/null and b/docs/images/consumer-groups.png differ

diff --git a/docs/images/kafka_log.png b/docs/images/kafka_log.png
new file mode 100644
index 0000000..75abd96
Binary files /dev/null and b/docs/images/kafka_log.png differ

diff --git a/docs/images/kafka_multidc.png b/docs/images/kafka_multidc.png
new file mode 100644
index 0000000..7bc56f4
Binary files /dev/null and b/docs/images/kafka_multidc.png differ

diff --git a/docs/images/kafka_multidc_complex.png b/docs/images/kafka_multidc_complex.png
new file mode 100644
index 0000000..ab88deb
Binary files /dev/null and b/docs/images/kafka_multidc_complex.png differ

diff --git a/docs/images/log_anatomy.png b/docs/images/log_anatomy.png
new file mode 100644
index 0000000..a649499
Binary files /dev/null and b/docs/images/log_anatomy.png differ

diff --git a/docs/images/log_cleaner_anatomy.png b/docs/images/log_cleaner_anatomy.png
new file mode 100644
index 0000000..fb425b0
Binary files /dev/null and b/docs/images/log_cleaner_anatomy.png differ

diff --git a/docs/images/log_compaction.png b/docs/images/log_compaction.png
new file mode 100644
index 0000000..4e4a833
Binary files /dev/null and b/docs/images/log_compaction.png differ

diff --git a/docs/images/mirror-maker.png b/docs/images/mirror-maker.png
new file mode 100644
index 0000000..b25e8cb
Binary files /dev/null and b/docs/images/mirror-maker.png differ

diff --git a/docs/images/producer_consumer.png b/docs/images/producer_consumer.png
new file mode 100644
index 0000000..4b10cc9
Binary files /dev/null and b/docs/images/producer_consumer.png differ

diff --git a/docs/images/tracking_high_level.png b/docs/images/tracking_high_level.png
new file mode 100644
index 0000000..b643230
Binary files /dev/null and b/docs/images/tracking_high_level.png differ

diff --git a/docs/implementation.html b/docs/implementation.html
index 25f9b39..d9ffa46 100644
--- a/docs/implementation.html
+++ b/docs/implementation.html
@@ -191,7 +191,7 @@ payload        : n bytes
 The use of the message offset as the message id is unusual. Our original idea was to use
a GUID generated by the producer, and maintain a mapping from GUID to offset on each broker.
But since a consumer must maintain an ID for each server, the global uniqueness of the GUID
provides no value. Furthermore the complexity of maintaining the mapping from a random id
to an offset requires a heavy weight index structure which must be synchronized with disk,
essentially requiring a full persistent random-access data structure. Thus to simplify the
lookup structure we decided to use a simple per-partition atomic counter which could be coupled
with the partition id and node id to uniquely identify a message; this makes the lookup structure
simpler, though multiple seeks per consumer request are still likely. However once we settled
on a counter, the jump to directly using the offset seemed natural&mdash;both after all
are monotonically increasing integers unique to a partition. Since the offset is hidden from the consumer API this decision is ultimately an implementation detail and
we went with the more efficient approach.
-<img src="../images/kafka_log.png">
+<img src="images/kafka_log.png">
 The log allows serial appends which always go to the last file. This file is rolled over
to a fresh file when it reaches a configurable size (say 1GB). The log takes two configuration
parameters: <i>M</i>, which gives the number of messages to write before forcing
the OS to flush the file to disk, and <i>S</i>, which gives the number of seconds
after which a flush is forced. This gives a durability guarantee of losing at most <i>M</i>
messages or <i>S</i> seconds of data in the event of a system crash.
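The two ideas in this passage, a per-partition counter for offsets and the M/S flush policy, can be sketched together. The class below is hypothetical (Kafka's log is not implemented this way); the parameter names `m` and `s` follow the text, and `flush()` stands in for forcing the file to disk.

```python
import time

# Illustrative sketch: offsets come from a simple per-partition counter,
# and a flush is forced after M appends or S seconds, whichever first.

class PartitionLog:
    def __init__(self, m, s):
        self.m, self.s = m, s
        self.next_offset = 0            # per-partition counter
        self.unflushed = 0
        self.last_flush = time.monotonic()
        self.flushes = 0

    def append(self, message):
        offset = self.next_offset       # offset doubles as the message id
        self.next_offset += 1
        self.unflushed += 1
        if (self.unflushed >= self.m
                or time.monotonic() - self.last_flush >= self.s):
            self.flush()
        return offset

    def flush(self):                    # stands in for an fsync of the file
        self.flushes += 1
        self.unflushed = 0
        self.last_flush = time.monotonic()

log = PartitionLog(m=2, s=60)
offsets = [log.append(b"msg") for _ in range(5)]
assert offsets == [0, 1, 2, 3, 4]
assert log.flushes == 2                 # after the 2nd and 4th appends
```

Under this policy a crash loses at most `m` messages or `s` seconds of data, which is the durability guarantee the paragraph states.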

diff --git a/docs/introduction.html b/docs/introduction.html
index 92a7826..7e0b150 100644
--- a/docs/introduction.html
+++ b/docs/introduction.html
@@ -30,7 +30,7 @@ First let's review some basic messaging terminology:
 So, at a high level, producers send messages over the network to the Kafka cluster which
in turn serves them up to consumers like this:
 <div style="text-align: center; width: 100%">
-  <img src="../images/producer_consumer.png">
+  <img src="images/producer_consumer.png">
 Communication between the clients and the servers is done with a simple, high-performance,
language agnostic <a href="https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol">TCP
protocol</a>. We provide a Java client for Kafka, but clients are available in <a
href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
@@ -40,7 +40,7 @@ Let's first dive into the high-level abstraction Kafka provides&mdash;the
 A topic is a category or feed name to which messages are published. For each topic, the Kafka
cluster maintains a partitioned log that looks like this:
 <div style="text-align: center; width: 100%">
-  <img src="../images/log_anatomy.png">
+  <img src="images/log_anatomy.png">
 Each partition is an ordered, immutable sequence of messages that is continually appended
to&mdash;a commit log. The messages in the partitions are each assigned a sequential id
number called the <i>offset</i> that uniquely identifies each message within the
@@ -76,7 +76,7 @@ More commonly, however, we have found that topics have a small number of
 <div style="float: right; margin: 20px; width: 500px" class="caption">
-  <img src="../images/consumer-groups.png"><br>
+  <img src="images/consumer-groups.png"><br>
   A two server Kafka cluster hosting four partitions (P0-P3) with two consumer groups. Consumer
group A has two consumer instances and group B has four.
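The caption's arrangement can be sketched as a partition-to-consumer mapping. Round-robin assignment is used here for illustration; Kafka's actual assignment strategy may differ, and the consumer names are made up.

```python
# Sketch: within one consumer group, each partition is consumed by
# exactly one consumer instance (round-robin assignment here).

def assign(partitions, consumers):
    mapping = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        mapping[consumers[i % len(consumers)]].append(p)
    return mapping

parts = ["P0", "P1", "P2", "P3"]
group_a = assign(parts, ["C1", "C2"])              # two consumers
group_b = assign(parts, ["C3", "C4", "C5", "C6"])  # four consumers
assert group_a == {"C1": ["P0", "P2"], "C2": ["P1", "P3"]}
assert all(len(v) == 1 for v in group_b.values())
```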

diff --git a/docs/ops.html b/docs/ops.html
index 2164ab7..0645d1c 100644
--- a/docs/ops.html
+++ b/docs/ops.html
@@ -98,7 +98,7 @@ Since running this command can be tedious you can also configure Kafka to do this
 We refer to the process of replicating data <i>between</i> Kafka clusters as "mirroring"
to avoid confusion with the replication that happens amongst the nodes in a single cluster.
Kafka comes with a tool for mirroring data between Kafka clusters. The tool reads from one
or more source clusters and writes to a destination cluster, like this:
-<img src="/images/mirror-maker.png">
+<img src="images/mirror-maker.png">
 A common use case for this kind of mirroring is to provide a replica in another datacenter.
This scenario will be discussed in more detail in the next section.
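The read-from-sources, write-to-destination flow described above can be sketched with stand-in in-memory "clusters" instead of real Kafka clients (the real tool, MirrorMaker, uses consumers and a producer; everything below is illustrative).

```python
# Sketch of mirroring: drain one or more source clusters and append
# every message to a single destination cluster. Lists stand in for
# actual Kafka consumer/producer clients.

def mirror(sources, destination):
    for source in sources:
        while source:                     # consume until the source is drained
            destination.append(source.pop(0))

dc1, dc2 = ["m1", "m2"], ["m3"]           # two source datacenters
aggregate = []                            # destination cluster
mirror([dc1, dc2], aggregate)
assert aggregate == ["m1", "m2", "m3"]
```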
