kafka-commits mailing list archives

From guozh...@apache.org
Subject kafka git commit: MINOR: updated incorrect JavaDocs for joins
Date Wed, 22 Mar 2017 01:36:00 GMT
Repository: kafka
Updated Branches:
  refs/heads/0.10.2 6e2698b25 -> 51ffddcbc


MINOR: updated incorrect JavaDocs for joins

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy, Eno Thereska, Guozhang Wang

Closes #2670 from mjsax/fixJavaDocsRepartitioning


Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/51ffddcb
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/51ffddcb
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/51ffddcb

Branch: refs/heads/0.10.2
Commit: 51ffddcbc8846be57936166d2d43eed75336e0fa
Parents: 6e2698b
Author: Matthias J. Sax <matthias@confluent.io>
Authored: Tue Mar 21 16:21:28 2017 -0700
Committer: Guozhang Wang <wangguoz@gmail.com>
Committed: Tue Mar 21 18:35:53 2017 -0700

----------------------------------------------------------------------
 .../apache/kafka/streams/kstream/KStream.java   | 76 +++++++++++++++-----
 .../apache/kafka/streams/kstream/KTable.java    | 45 ++----------
 2 files changed, 64 insertions(+), 57 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka/blob/51ffddcb/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
----------------------------------------------------------------------
diff --git a/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java b/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
index 21135fb..1b49f40 100644
--- a/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
+++ b/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
@@ -927,7 +927,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -998,7 +1002,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1082,7 +1090,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1157,7 +1169,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1244,7 +1260,11 @@ public interface KStream<K, V> {
      * <td>&lt;K3:ValueJoiner(null,c)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1320,7 +1340,11 @@ public interface KStream<K, V> {
      * <td>&lt;K3:ValueJoiner(null,c)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1408,8 +1432,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1418,7 +1446,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
     * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
     * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1477,8 +1505,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1487,7 +1519,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
     * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
     * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1556,8 +1588,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1566,7 +1602,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
     * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
     * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1628,8 +1664,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1638,7 +1678,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
     * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
     * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
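The JavaDoc above states the two halves of the co-partitioning requirement: equal partition counts on the underlying source topics, plus the same partitioner for both join inputs. As a stdlib-only illustration (not Kafka's actual implementation), the sketch below uses `hashCode()` modulo the partition count as a stand-in for Kafka's `DefaultPartitioner`, which really hashes the serialized key bytes with murmur2. It shows why equal partition counts matter: with the same partitioner and the same count, a key always lands on the same partition in both topics, so the join can pair records task-locally; with different counts, the same key can land on different partition indices.

```java
// Sketch only: why co-partitioned joins require equal partition counts.
// partitionFor() is a stand-in for Kafka's DefaultPartitioner, which
// actually applies murmur2 to the serialized key bytes, not hashCode().
public class CoPartitioningSketch {

    public static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the result is a valid partition index.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        String key = "user-42"; // hypothetical join key

        // Same partitioner + same partition count => same partition on
        // both topics, so matching records meet in the same stream task.
        System.out.println(partitionFor(key, 6) == partitionFor(key, 6)); // true

        // Different partition counts: the same key may hash to different
        // partition indices, so the two join inputs are no longer aligned
        // and Kafka Streams would have to repartition first.
        System.out.println(partitionFor(key, 6));
        System.out.println(partitionFor(key, 4));
    }
}
```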

http://git-wip-us.apache.org/repos/asf/kafka/blob/51ffddcb/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
----------------------------------------------------------------------
diff --git a/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java b/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
index 9de2644..eebac8f 100644
--- a/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
+++ b/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
@@ -628,19 +628,8 @@ public interface KTable<K, V> {
      * <td>&lt;K1:null&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
@@ -720,19 +709,8 @@ public interface KTable<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
@@ -812,19 +790,8 @@ public interface KTable<K, V> {
      * <td>&lt;K1:null&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
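The JavaDoc in this diff describes the automatic repartitioning mechanism: when inputs are not co-partitioned, Kafka Streams writes all records, keyed on the join key, to an internal "${applicationId}-XXX-repartition" topic and re-reads them, so that afterwards every record for a given key sits in exactly one partition. A stdlib-only sketch of that redistribution step follows; the in-memory map stands in for the internal topic, and `partitionFor` is again an assumed stand-in for Kafka's actual murmur2-based partitioner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: the redistribution step behind an internal repartition
// topic. Records are re-written keyed on the join key, so the re-reading
// side sees each key in exactly one partition.
public class RepartitionSketch {

    public static int partitionFor(String key, int numPartitions) {
        // Stand-in for Kafka's DefaultPartitioner (really murmur2 on bytes).
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Models the repartition topic as a map from partition index to the
    // records "written" to that partition.
    public static Map<Integer, List<String>> repartition(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> topic = new HashMap<>();
        for (String key : keys) {
            // "Write" each record to the partition its join key hashes to.
            topic.computeIfAbsent(partitionFor(key, numPartitions), p -> new ArrayList<>())
                 .add(key);
        }
        // After "re-reading", all records sharing a key are co-located,
        // which is exactly the property the join needs.
        return topic;
    }
}
```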

