kafka-commits mailing list archives

From nehanarkh...@apache.org
Subject svn commit: r1577937 - /kafka/site/081/ops.html
Date Sat, 15 Mar 2014 20:45:31 GMT
Author: nehanarkhede
Date: Sat Mar 15 20:45:30 2014
New Revision: 1577937

URL: http://svn.apache.org/r1577937
Log:
Improved documentation for partition reassignment in 0.8.1

Modified:
    kafka/site/081/ops.html

Modified: kafka/site/081/ops.html
URL: http://svn.apache.org/viewvc/kafka/site/081/ops.html?rev=1577937&r1=1577936&r2=1577937&view=diff
==============================================================================
--- kafka/site/081/ops.html (original)
+++ kafka/site/081/ops.html Sat Mar 15 20:45:30 2014
@@ -112,16 +112,34 @@ my-group        my-topic                
 
 <h4><a id="basic_ops_cluster_expansion">Expanding your cluster</a></h4>
 
-Adding servers to a Kafka cluster is easy, just assign them a unique broker id and start
up Kafka on your the new servers. However these new servers will not automatically be assigned
any data partitions, so unless partitions are moved to them they won't be doing any work until
new topics are created. So usually when you add machines to your cluster you will want to
migrate some existing data to these machines.
+Adding servers to a Kafka cluster is easy, just assign them a unique broker id and start
up Kafka on your new servers. However these new servers will not automatically be assigned
any data partitions, so unless partitions are moved to them they won't be doing any work until
new topics are created. So usually when you add machines to your cluster you will want to
migrate some existing data to these machines.
 <p>
 The process of migrating data is manually initiated but fully automated. Under the covers
what happens is that Kafka will add the new server as a follower of the partition it is migrating
and allow it to fully replicate the existing data in that partition. When the new server has
fully replicated the contents of this partition and joined the in-sync replica set, one of the
existing replicas will delete its partition's data.
-
+<p>
+The partition reassignment tool can be used to move partitions across brokers. An ideal partition
distribution would ensure even data load and partition sizes across all brokers. In 0.8.1,
the partition reassignment tool does not have the capability to automatically study the data
distribution in a Kafka cluster and move partitions around to attain an even load distribution.
As such, the admin has to figure out which topics or partitions should be moved around. 
+<p>
+The partition reassignment tool can run in 3 mutually exclusive modes -
+<ul>
+<li>--generate: In this mode, given a list of topics and a list of brokers, the tool
generates a candidate reassignment to move all partitions of the specified topics to the new
brokers. This option merely provides a convenient way to generate a partition reassignment
plan given a list of topics and target brokers.</li>
+<li>--execute: In this mode, the tool kicks off the reassignment of partitions based
on the user-provided reassignment plan (supplied using the --reassignment-json-file option). This
can either be a custom reassignment plan handcrafted by the admin or one produced by the
--generate option.</li>
+<li>--verify: In this mode, the tool verifies the status of the reassignment for all
partitions listed during the last --execute. The status can be one of: successfully completed,
failed, or in progress.</li>
+</ul>
 <h5>Automatically migrating data to new machines</h5>
-The partition reassignment tool can be used to move some topics off of the current set of
brokers to the newly added brokers. When used to do this, the user should provide a list of
topics that should be moved to the new set of brokers and a target list of new brokers. The
tool then evenly distributes all partitions for the given list of topics to the new set of
brokers. During this move, the replication factor of the topic is kept constant. Effectively
the replicas for all partitions are moved from the old set of brokers to the newly added brokers.

-
-For example, the following will move all partitions for topics foo1,foo2 to the new set of
brokers 5,6. At the end of this move, all partitions for topics foo1 and foo2 will only exist
on brokers 5,6
+The partition reassignment tool can be used to move some topics off of the current set of
brokers to the newly added brokers. This is typically useful while expanding an existing cluster
since it is easier to move entire topics to the new set of brokers, than moving one partition
at a time. When used to do this, the user should provide a list of topics that should be moved
to the new set of brokers and a target list of new brokers. The tool then evenly distributes
all partitions for the given list of topics across the new set of brokers. During this move,
the replication factor of the topic is kept constant. Effectively the replicas for all partitions
for the input list of topics are moved from the old set of brokers to the newly added brokers.

+<p>
+For example, the following will move all partitions for topics foo1,foo2 to the new
set of brokers 5,6. At the end of this move, all partitions for topics foo1 and foo2 will
<i>only</i> exist on brokers 5,6.
+<p>
+Since the tool accepts the input list of topics as a json file, you first need to identify
the topics you want to move and create the json file as follows-
+<pre>
+> cat topics-to-move.json
+{"topics":
+     [{"topic": "foo1"},{"topic": "foo2"}],
+     "version":1
+}
+</pre>
+Once the json file is ready, use the partition reassignment tool to generate a candidate
assignment-
 <pre>
-bin/kafka-reassign-partitions.sh --topics-to-move-json-file topics-to-move.json --broker-list
"5,6" --generate 
+> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file
topics-to-move.json --broker-list "5,6" --generate 
 Current partition replica assignment
 
 {"version":1,"partitions":[{"topic":"foo1","partition":2,"replicas":[1,2]},{"topic":"foo1","partition":0,"replicas":[3,4]},{"topic":"foo2","partition":2,"replicas":[1,2]},{"topic":"foo2","partition":0,"replicas":[3,4]},{"topic":"foo1","partition":1,"replicas":[2,3]},{"topic":"foo2","partition":1,"replicas":[2,3]}]}
@@ -129,18 +147,11 @@ Current partition replica assignment
 Proposed partition reassignment configuration
 
 {"version":1,"partitions":[{"topic":"foo1","partition":2,"replicas":[5,6]},{"topic":"foo1","partition":0,"replicas":[5,6]},{"topic":"foo2","partition":2,"replicas":[5,6]},{"topic":"foo2","partition":0,"replicas":[5,6]},{"topic":"foo1","partition":1,"replicas":[5,6]},{"topic":"foo2","partition":1,"replicas":[5,6]}]}
-
-cat topics-to-move.json
-{"topics":
-     [{"topic": "foo1"},{"topic": "foo2"}],
-     "version":1
-}
 </pre>
 <p>
-The tool generates a candidate assignment that will move all partitions from topics foo1,foo2
to brokers 5,6. Note, however, that at this point, the partition movement has not started,
it merely tells you the current assignment and the proposed new assignment. The current assignment
should be saved in case you want to rollback to it. The new assignment should be input to
the tool with the --execute option as follows
-
+The tool generates a candidate assignment that will move all partitions from topics foo1,foo2
to brokers 5,6. Note, however, that at this point, the partition movement has not started,
it merely tells you the current assignment and the proposed new assignment. The current assignment
should be saved in case you want to rollback to it. The new assignment should be saved in
a json file (e.g. expand-cluster-reassignment.json) to be input to the tool with the --execute
option as follows-
 <pre>
-bin/kafka-reassign-partitions.sh --reassignment-json-file expand-cluster-reassignment.json
--execute
+> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file
expand-cluster-reassignment.json --execute
 Current partition replica assignment
 
 {"version":1,"partitions":[{"topic":"foo1","partition":2,"replicas":[1,2]},{"topic":"foo1","partition":0,"replicas":[3,4]},{"topic":"foo2","partition":2,"replicas":[1,2]},{"topic":"foo2","partition":0,"replicas":[3,4]},{"topic":"foo1","partition":1,"replicas":[2,3]},{"topic":"foo2","partition":1,"replicas":[2,3]}]}
@@ -150,10 +161,9 @@ Successfully started reassignment of par
 {"version":1,"partitions":[{"topic":"foo1","partition":2,"replicas":[5,6]},{"topic":"foo1","partition":0,"replicas":[5,6]},{"topic":"foo2","partition":2,"replicas":[5,6]},{"topic":"foo2","partition":0,"replicas":[5,6]},{"topic":"foo1","partition":1,"replicas":[5,6]},{"topic":"foo2","partition":1,"replicas":[5,6]}]}
 </pre>
 <p>
-The --verify option can be used with the tool to check the status of the partition reassignment.
Note that the same expand-cluster-reassignment.json (used with the --execute option) should
be used with the --verify option
-
+Finally, the --verify option can be used with the tool to check the status of the partition
reassignment. Note that the same expand-cluster-reassignment.json (used with the --execute
option) should be used with the --verify option
 <pre>
-bin/kafka-reassign-partitions.sh --reassignment-json-file expand-cluster-reassignment.json
--verify
+> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file
expand-cluster-reassignment.json --verify
 Status of partition reassignment:
 Reassignment of partition [foo1,0] completed successfully
 Reassignment of partition [foo1,1] is in progress
@@ -164,15 +174,18 @@ Reassignment of partition [foo2,2] compl
 </pre>
 
 <h5>Custom partition assignment and migration</h5>
-The partition reassignment tool can also be used to selectively move replicas of a partition
to a specific set of brokers. When used in this manner, it is assumed that the user knows
the reassignment and does not require the tool to generate a candidate reassignment, effectively
skipping the --generate step and moving straight to the --execute step
+The partition reassignment tool can also be used to selectively move replicas of a partition
to a specific set of brokers. When used in this manner, it is assumed that the user knows
the reassignment plan and does not require the tool to generate a candidate reassignment,
effectively skipping the --generate step and moving straight to the --execute step
 <p>
-For example, the following moves partition 0 of topic foo1 to brokers 5,6 and partition 1
of topic foo2 to brokers 2,3
-
+For example, the following moves partition 0 of topic foo1 to brokers 5,6 and partition
1 of topic foo2 to brokers 2,3
+<p>
+The first step is to hand craft the custom reassignment plan in a json file-
 <pre>
-cat custom-reassignment.json
+> cat custom-reassignment.json
 {"version":1,"partitions":[{"topic":"foo1","partition":0,"replicas":[5,6]},{"topic":"foo2","partition":1,"replicas":[2,3]}]}
-
-bin/kafka-reassign-partitions.sh --reassignment-json-file custom-reassignment.json --execute
+</pre>
+Then, use the json file with the --execute option to start the reassignment process-
+<pre>
+> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file
custom-reassignment.json --execute
 Current partition replica assignment
 
 {"version":1,"partitions":[{"topic":"foo1","partition":0,"replicas":[1,2]},{"topic":"foo2","partition":1,"replicas":[3,4]}]}
@@ -183,16 +196,15 @@ Successfully started reassignment of par
 </pre>
 <p>
 The --verify option can be used with the tool to check the status of the partition reassignment.
Note that the same custom-reassignment.json (used with the --execute option) should
be used with the --verify option
-
 <pre>
-bin/kafka-reassign-partitions.sh --reassignment-json-file custom-reassignment.json --verify
+bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file custom-reassignment.json
--verify
 Status of partition reassignment:
 Reassignment of partition [foo1,0] completed successfully
 Reassignment of partition [foo2,1] completed successfully 
 </pre>
 
 <h5>Decommissioning machines</h5>
-The partition reassignment tool does not have the ability to decommission machines yet. As
such, you cannot decommission machines without effectively reducing the replication factor
of the partitions that existed on the decommissioned machine. We plan to add support for this
in 0.8.2
+The partition reassignment tool does not have the ability to automatically generate a reassignment
plan for decommissioning brokers yet. As such, the admin has to come up with a reassignment
plan to move the replicas for all partitions hosted on the broker to be decommissioned to
the rest of the brokers. This can be relatively tedious, as the reassignment needs to ensure
that the replicas from the decommissioned broker are not all moved to only one other broker.
To make this process effortless, we plan to add tooling support for decommissioning brokers
in 0.8.2.
 
 <h3><a id="datacenters">6.2 Datacenters</a></h3>
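
The decommissioning paragraph in this diff leaves the reassignment plan to the admin. The following is a minimal sketch of one way such a plan could be built, assuming the current assignment is available in the same JSON shape the tool prints. Each replica on the retiring broker is replaced by the next remaining broker in round-robin order, so the moved replicas do not all land on one broker. The function name plan_decommission and the round-robin choice are illustrative assumptions; they are not part of Kafka or of this commit.

```python
import json
from itertools import cycle


def plan_decommission(current, retiring, remaining):
    """Build a reassignment plan that moves replicas off `retiring`.

    current   -- dict shaped like {"version": 1, "partitions": [...]}
    retiring  -- id of the broker to be decommissioned
    remaining -- ids of the brokers that stay in the cluster

    Hypothetical helper: only partitions hosted on `retiring` appear
    in the returned plan.
    """
    candidates = cycle(remaining)  # round-robin over the surviving brokers
    moved = []
    for part in current["partitions"]:
        replicas = list(part["replicas"])
        if retiring not in replicas:
            continue  # partition is unaffected, leave it out of the plan
        # Try each remaining broker at most once for this partition.
        for _ in range(len(remaining)):
            broker = next(candidates)
            if broker not in replicas:
                replicas[replicas.index(retiring)] = broker
                break
        else:
            raise ValueError(
                "no free broker for %s-%d" % (part["topic"], part["partition"])
            )
        moved.append(
            {"topic": part["topic"], "partition": part["partition"], "replicas": replicas}
        )
    return {"version": 1, "partitions": moved}


if __name__ == "__main__":
    current = {
        "version": 1,
        "partitions": [
            {"topic": "foo1", "partition": 2, "replicas": [1, 2]},
            {"topic": "foo1", "partition": 0, "replicas": [3, 4]},
            {"topic": "foo1", "partition": 1, "replicas": [2, 3]},
        ],
    }
    # Retire broker 2 and spread its replicas over brokers 1, 3 and 4.
    print(json.dumps(plan_decommission(current, retiring=2, remaining=[1, 3, 4])))
```

The printed JSON could be saved to a file and passed to the tool with --reassignment-json-file and --execute, in the same way as the custom reassignment example in the diff.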
 


