kafka-commits mailing list archives

From gwens...@apache.org
Subject kafka-site git commit: additional improvements to 0.10.0 docs
Date Tue, 10 May 2016 01:44:48 GMT
Repository: kafka-site
Updated Branches:
  refs/heads/asf-site 1ad8525f1 -> 76217f0b9


additional improvements to 0.10.0 docs


Project: http://git-wip-us.apache.org/repos/asf/kafka-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka-site/commit/76217f0b
Tree: http://git-wip-us.apache.org/repos/asf/kafka-site/tree/76217f0b
Diff: http://git-wip-us.apache.org/repos/asf/kafka-site/diff/76217f0b

Branch: refs/heads/asf-site
Commit: 76217f0b996e0c563359fa3b8aad32d3f2ed46de
Parents: 1ad8525
Author: Gwen Shapira <cshapi@gmail.com>
Authored: Mon May 9 18:44:25 2016 -0700
Committer: Gwen Shapira <cshapi@gmail.com>
Committed: Mon May 9 18:44:25 2016 -0700

----------------------------------------------------------------------
 0100/connect.html        |  8 +++++---
 0100/implementation.html | 10 +++++-----
 0100/introduction.html   |  2 +-
 0100/ops.html            |  4 ++--
 0100/upgrade.html        |  5 +++++
 5 files changed, 18 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/connect.html
----------------------------------------------------------------------
diff --git a/0100/connect.html b/0100/connect.html
index 5cd4130..c3cf583 100644
--- a/0100/connect.html
+++ b/0100/connect.html
@@ -95,7 +95,9 @@ Since Kafka Connect is intended to be run as a service, it also provides a REST
     <li><code>GET /connectors/{name}</code> - get information about a specific connector</li>
     <li><code>GET /connectors/{name}/config</code> - get the configuration parameters for a specific connector</li>
     <li><code>PUT /connectors/{name}/config</code> - update the configuration parameters for a specific connector</li>
+    <li><code>GET /connectors/{name}/status</code> - get current status of the connector, including if it is running, failed, paused, etc., which worker it is assigned to, error information if it has failed, and the state of all its tasks</li>
     <li><code>GET /connectors/{name}/tasks</code> - get a list of tasks currently running for a connector</li>
+    <li><code>GET /connectors/{name}/tasks/{taskid}/status</code> - get current status of the task, including if it is running, failed, paused, etc., which worker it is assigned to, and error information if it has failed</li>
     <li><code>DELETE /connectors/{name}</code> - delete a connector, halting all tasks and deleting its configuration</li>
 </ul>
 
@@ -191,8 +193,8 @@ public List&lt;Map&lt;String, String&gt;&gt; getTaskConfigs(int maxTasks) {
 }
 </pre>
 
-Although not used in the example, <code>SourceTask</code> also provides two APIs to commit offsets in the source system: <code>commit</code> and <code>commitSourceRecord</code>. The APIs are provided for source systems which have an acknowledgement mechanism for messages. Overriding these methods allows the source connector to acknowledge messages in the source system, either in bulk or individually, once they have been written to Kafka.
-The <code>commit<code> API stores the offsets in the source system, up to the offsets that have been returned by <code>poll</code>. The implementation of this API should block until the commit is complete. The <code>commitSourceRecord</code> API saves the offset in the source system for each <code>SourceRecord</code> after it is written to Kafka. As Kafka Connect will record offsets automatically, <code>SourceTask<code>s are not required to implement them. In cases where a connector does need to acknowledge messages in the source system, only one of the APIs is typically required.
+Although not used in the example, <code>SourceTask</code> also provides two APIs to commit offsets in the source system: <code>commit</code> and <code>commitRecord</code>. The APIs are provided for source systems which have an acknowledgement mechanism for messages. Overriding these methods allows the source connector to acknowledge messages in the source system, either in bulk or individually, once they have been written to Kafka.
+The <code>commit</code> API stores the offsets in the source system, up to the offsets that have been returned by <code>poll</code>. The implementation of this API should block until the commit is complete. The <code>commitRecord</code> API saves the offset in the source system for each <code>SourceRecord</code> after it is written to Kafka. As Kafka Connect will record offsets automatically, <code>SourceTask</code>s are not required to implement them. In cases where a connector does need to acknowledge messages in the source system, only one of the APIs is typically required.
 
 Even with multiple tasks, this method implementation is usually pretty simple. It just has to determine the number of input tasks, which may require contacting the remote service it is pulling data from, and then divvy them up. Because some patterns for splitting work among tasks are so common, some utilities are provided in <code>ConnectorUtils</code> to simplify these cases.
 
@@ -232,7 +234,7 @@ Next, we implement the main functionality of the task, the <code>poll()</code> m
 public List&lt;SourceRecord&gt; poll() throws InterruptedException {
     try {
         ArrayList&lt;SourceRecord&gt; records = new ArrayList&lt;&gt;();
-        while (streamValid(stream) && records.isEmpty()) {
+        while (streamValid(stream) &amp;&amp; records.isEmpty()) {
             LineAndOffset line = readToNextLine(stream);
             if (line != null) {
                 Map<String, Object> sourcePartition = Collections.singletonMap("filename", filename);

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/implementation.html
----------------------------------------------------------------------
diff --git a/0100/implementation.html b/0100/implementation.html
index be81227..0a36c22 100644
--- a/0100/implementation.html
+++ b/0100/implementation.html
@@ -282,7 +282,7 @@ When an element in a path is denoted [xyz], that means that the value of xyz is
 
 <h4><a id="impl_zkbroker" href="#impl_zkbroker">Broker Node Registry</a></h4>
 <pre>
-/brokers/ids/[0...N] --> host:port (ephemeral node)
+/brokers/ids/[0...N] --> {"jmx_port":...,"timestamp":...,"endpoints":[...],"host":...,"version":...,"port":...} (ephemeral node)
 </pre>
 <p>
 This is a list of all present broker nodes, each of which provides a unique logical broker id which identifies it to consumers (which must be given as part of its configuration). On startup, a broker node registers itself by creating a znode with the logical broker id under /brokers/ids. The purpose of the logical broker id is to allow a broker to be moved to a different physical machine without affecting consumers. An attempt to register a broker id that is already in use (say because two servers are configured with the same broker id) results in an error.
@@ -292,7 +292,7 @@ Since the broker registers itself in ZooKeeper using ephemeral znodes, this regi
 </p>
 <h4><a id="impl_zktopic" href="#impl_zktopic">Broker Topic Registry</a></h4>
 <pre>
-/brokers/topics/[topic]/[0...N] --> nPartitions (ephemeral node)
+/brokers/topics/[topic]/partitions/[0...N]/state --> {"controller_epoch":...,"leader":...,"version":...,"leader_epoch":...,"isr":[...]} (ephemeral node)
 </pre>
 
 <p>
@@ -317,7 +317,7 @@ The consumers in a group divide up the partitions as fairly as possible, each pa
 <p>
 In addition to the group_id which is shared by all consumers in a group, each consumer is given a transient, unique consumer_id (of the form hostname:uuid) for identification purposes. Consumer ids are registered in the following directory.
 <pre>
-/consumers/[group_id]/ids/[consumer_id] --> {"topic1": #streams, ..., "topicN": #streams} (ephemeral node)
+/consumers/[group_id]/ids/[consumer_id] --> {"version":...,"subscription":{...:...},"pattern":...,"timestamp":...} (ephemeral node)
 </pre>
 Each of the consumers in the group registers under its group and creates a znode with its consumer_id. The value of the znode contains a map of &lt;topic, #streams&gt;. This id is simply used to identify each of the consumers which is currently active within a group. This is an ephemeral node so it will disappear if the consumer process dies.
 </p>
@@ -327,7 +327,7 @@ Each of the consumers in the group registers under its group and creates a znode
 Consumers track the maximum offset they have consumed in each partition. This value is stored in a ZooKeeper directory if <code>offsets.storage=zookeeper</code>.
 </p>
 <pre>
-/consumers/[group_id]/offsets/[topic]/[broker_id-partition_id] --> offset_counter_value ((persistent node)
+/consumers/[group_id]/offsets/[topic]/[partition_id] --> offset_counter_value ((persistent node)
 </pre>
 
 <h4><a id="impl_zkowner" href="#impl_zkowner">Partition Owner registry</a></h4>
@@ -337,7 +337,7 @@ Each broker partition is consumed by a single consumer within a given consumer g
 </p>
 
 <pre>
-/consumers/[group_id]/owners/[topic]/[broker_id-partition_id] --> consumer_node_id (ephemeral node)
+/consumers/[group_id]/owners/[topic]/[partition_id] --> consumer_node_id (ephemeral node)
 </pre>
 
 <h4><a id="impl_brokerregistration" href="#impl_brokerregistration">Broker node registration</a></h4>

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/introduction.html
----------------------------------------------------------------------
diff --git a/0100/introduction.html b/0100/introduction.html
index ad81e97..c2e3554 100644
--- a/0100/introduction.html
+++ b/0100/introduction.html
@@ -33,7 +33,7 @@ So, at a high level, producers send messages over the network to the Kafka clust
   <img src="images/producer_consumer.png">
 </div>
 
-Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol">TCP protocol</a>. We provide a Java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
+Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://kafka.apache.org/protocol.html">TCP protocol</a>. We provide a Java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
 
 <h4><a id="intro_topics" href="#intro_topics">Topics and Logs</a></h4>
 Let's first dive into the high-level abstraction Kafka provides&mdash;the topic.

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/ops.html
----------------------------------------------------------------------
diff --git a/0100/ops.html b/0100/ops.html
index 8b1cc23..f64a701 100644
--- a/0100/ops.html
+++ b/0100/ops.html
@@ -134,7 +134,7 @@ my-group        my-topic                       1   0               0
 </pre>
 
 
-Note, however, after 0.9.0, the kafka.tools.ConsumerOffsetChecker tool is deprecated and you should use the kafka.admin.ConsumerGroupCommand (or the bin/kafka-consumer-groups.sh script) to manage consumer groups, including consumers created with the <a href="https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design">new consumer-groups API</a>.
+Note, however, after 0.9.0, the kafka.tools.ConsumerOffsetChecker tool is deprecated and you should use the kafka.admin.ConsumerGroupCommand (or the bin/kafka-consumer-groups.sh script) to manage consumer groups, including consumers created with the <a href="http://kafka.apache.org/documentation.html#newconsumerapi">new consumer API</a>.
 
 <h4><a id="basic_ops_consumer_group" href="#basic_ops_consumer_group">Managing Consumer Groups</a></h4>
 
@@ -156,7 +156,7 @@ test-consumer-group            test-foo                       0      
   1
 </pre>
 
 
-When you're using the <a href="https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design">new consumer-groups API</a> where the broker handles coordination of partition handling and rebalance, you can manage the groups with the "--new-consumer" flags:
+When you're using the <a href="http://kafka.apache.org/documentation.html#newconsumerapi">new consumer API</a> where the broker handles coordination of partition handling and rebalance, you can manage the groups with the "--new-consumer" flags:
 
 <pre>
  &gt; bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server broker1:9092 --list

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/upgrade.html
----------------------------------------------------------------------
diff --git a/0100/upgrade.html b/0100/upgrade.html
index b9c4bec..486954c 100644
--- a/0100/upgrade.html
+++ b/0100/upgrade.html
@@ -80,6 +80,11 @@ work with 0.10.0.x brokers. Therefore, 0.9.0.0 clients should be upgraded to 0.9
     <li> MirrorMakerMessageHandler no longer exposes the <code>handle(record: MessageAndMetadata[Array[Byte], Array[Byte]])</code> method as it was never called. </li>
     <li> The 0.7 KafkaMigrationTool is no longer packaged with Kafka. If you need to migrate from 0.7 to 0.10.0, please migrate to 0.8 first and then follow the documented upgrade process to upgrade from 0.8 to 0.10.0. </li>
     <li> The new consumer has standardized its APIs to accept <code>java.util.Collection</code> as the sequence type for method parameters. Existing code may have to be updated to work with the 0.10.0 client library. </li>
+    <li> LZ4-compressed message handling was changed to use an interoperable framing specification (LZ4f v1.5.1).
+         To maintain compatibility with old clients, this change only applies to Message format 0.10.0 and later.
+         Clients that Produce/Fetch LZ4-compressed messages using v0/v1 (Message format 0.9.0) should continue
+         to use the 0.9.0 framing implementation. Clients that use Produce/Fetch protocols v2 or later
+         should use interoperable LZ4f framing. A list of interoperable LZ4 libraries is available at http://www.lz4.org/
 </ul>
 
 <h5><a id="upgrade_10_notable" href="#upgrade_10_notable">Notable changes in 0.10.0.0</a></h5>
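
The LZ4 framing rule added to upgrade.html in the hunk above can be summarized as a small decision function. This is only an illustrative sketch of the rule as the note states it (Produce/Fetch v0/v1 keep the 0.9.0 framing, v2 or later use interoperable LZ4f). The class name Lz4FramingChoice, the enum Framing, and the method framingFor are hypothetical names chosen for this example and are not part of any Kafka client API.

```java
public class Lz4FramingChoice {

    // Hypothetical labels for the two framing schemes named in the upgrade note.
    enum Framing { LEGACY_0_9_0, INTEROPERABLE_LZ4F }

    // Maps a Produce/Fetch protocol version to the framing the note says to use:
    // v0 and v1 keep the 0.9.0 framing, v2 or later use interoperable LZ4f.
    static Framing framingFor(int protocolVersion) {
        if (protocolVersion < 0) {
            throw new IllegalArgumentException(
                "protocol version must be non-negative: " + protocolVersion);
        }
        return protocolVersion >= 2 ? Framing.INTEROPERABLE_LZ4F : Framing.LEGACY_0_9_0;
    }

    public static void main(String[] args) {
        // Print the framing choice for a few protocol versions.
        for (int v = 0; v <= 3; v++) {
            System.out.println("Produce/Fetch v" + v + " -> " + framingFor(v));
        }
    }
}
```

A client library would make this choice per request version rather than per connection, since the note ties the framing to the Produce/Fetch version in use.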

