kafka-commits mailing list archives

From guozh...@apache.org
Subject kafka git commit: MINOR: updated incorrect JavaDocs for joins
Date Tue, 21 Mar 2017 23:21:30 GMT
Repository: kafka
Updated Branches:
  refs/heads/trunk c2653754d -> 6f7780cf1


MINOR: updated incorrect JavaDocs for joins

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy, Eno Thereska, Guozhang Wang

Closes #2670 from mjsax/fixJavaDocsRepartitioning


Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/6f7780cf
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/6f7780cf
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/6f7780cf

Branch: refs/heads/trunk
Commit: 6f7780cf127c8f4bb0edabaa5cb228fc566681f2
Parents: c265375
Author: Matthias J. Sax <matthias@confluent.io>
Authored: Tue Mar 21 16:21:28 2017 -0700
Committer: Guozhang Wang <wangguoz@gmail.com>
Committed: Tue Mar 21 16:21:28 2017 -0700

----------------------------------------------------------------------
 .../apache/kafka/streams/kstream/KStream.java   | 76 +++++++++++++++-----
 .../apache/kafka/streams/kstream/KTable.java    | 45 ++----------
 2 files changed, 64 insertions(+), 57 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka/blob/6f7780cf/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
----------------------------------------------------------------------
diff --git a/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java b/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
index f41218d..bb37af8 100644
--- a/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
+++ b/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java
@@ -941,7 +941,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1012,7 +1016,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1096,7 +1104,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1171,7 +1183,11 @@ public interface KStream<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1258,7 +1274,11 @@ public interface KStream<K, V> {
      * <td>&lt;K3:ValueJoiner(null,c)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1334,7 +1354,11 @@ public interface KStream<K, V> {
      * <td>&lt;K3:ValueJoiner(null,c)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} (for one input stream) before doing the
+     * join, using a pre-created topic with the "correct" number of partitions.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner).
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1422,8 +1446,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1432,7 +1460,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
      * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
      * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1491,8 +1519,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1501,7 +1533,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
      * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
      * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1570,8 +1602,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1580,7 +1616,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
      * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
      * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
@@ -1642,8 +1678,12 @@ public interface KStream<K, V> {
      * <td>&lt;K1:ValueJoiner(C,b)&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key (cf.
-     * {@link #leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)}).
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
+     * If this is not the case, you would need to call {@link #through(String)} for this {@code KStream} before doing
+     * the join, using a pre-created topic with the same number of partitions as the given {@link KTable}.
+     * Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner);
+     * cf. {@link #join(GlobalKTable, KeyValueMapper, ValueJoiner)}.
      * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
      * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
      * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
@@ -1652,7 +1692,7 @@ public interface KStream<K, V> {
      * "-repartition" is a fixed suffix.
      * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
      * <p>
-     * Repartitioning can happen only for this {@code KStream}s.
+     * Repartitioning can happen only for this {@code KStream} but not for the provided {@link KTable}.
      * For this case, all data of the stream will be redistributed through the repartitioning topic by writing all
      * records to it, and rereading all records from it, such that the join input {@code KStream} is partitioned
      * correctly on its key.
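
The co-partitioning requirement the new Javadoc spells out can be illustrated outside of Kafka. The sketch below is NOT Kafka's actual partitioner (the default partitioner hashes the serialized key with murmur2); `String.hashCode()` stands in to show why "same partitioner plus same partition count" makes matching keys land in the same partition of both source topics, and why unequal counts break that.

```java
// Simplified, self-contained illustration of co-partitioning. The hashing here
// is a stand-in assumption, not Kafka's real partitioning logic.
public class CoPartitioningSketch {

    // A key lands in partition hash(key) mod partitionCount (simplified).
    static int partition(String key, int partitionCount) {
        return Math.abs(key.hashCode() % partitionCount);
    }

    public static void main(String[] args) {
        // Same partitioner AND same partition count: records with the same join
        // key end up in the same partition of both topics, so the join can be
        // computed partition-by-partition without any repartitioning.
        String joinKey = "user-42";
        System.out.println(partition(joinKey, 4) == partition(joinKey, 4)); // true

        // Different partition counts break co-partitioning for some keys, which
        // is why the Javadoc requires equal counts (or a manual through()).
        boolean allMatch = true;
        for (int i = 0; i < 100; i++) {
            String key = "key-" + i;
            if (partition(key, 4) != partition(key, 6)) {
                allMatch = false; // this key would land in different partitions
            }
        }
        System.out.println(allMatch); // false
    }
}
```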

http://git-wip-us.apache.org/repos/asf/kafka/blob/6f7780cf/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
----------------------------------------------------------------------
diff --git a/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java b/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
index de7c153..2833a01 100644
--- a/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
+++ b/streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java
@@ -629,19 +629,8 @@ public interface KTable<K, V> {
      * <td>&lt;K1:null&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
@@ -721,19 +710,8 @@ public interface KTable<K, V> {
      * <td></td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
@@ -813,19 +791,8 @@ public interface KTable<K, V> {
      * <td>&lt;K1:null&gt;</td>
      * </tr>
      * </table>
-     * Both input streams need to be co-partitioned on the join key.
-     * If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an
-     * internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join.
-     * The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is
-     * user-specified in {@link StreamsConfig} via parameter
-     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "XXX" is an internally generated name, and
-     * "-repartition" is a fixed suffix.
-     * You can retrieve all generated internal topic names via {@link KafkaStreams#toString()}.
-     * <p>
-     * Repartitioning can happen both input {@code KTable}s.
-     * For this case, all data of a {@code KTable} will be redistributed through the repartitioning topic by writing all
-     * update records to and rereading all update records from it, such that the join input {@code KTable} is
-     * partitioned correctly on its key.
+     * Both input streams (or to be more precise, their underlying source topics) need to have the same number of
+     * partitions.
      *
      * @param other  the other {@code KTable} to be joined with this {@code KTable}
     * @param joiner a {@link ValueJoiner} that computes the join result for a pair of matching records
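
The KStream Javadoc retained in this commit names internal repartitioning topics "${applicationId}-XXX-repartition". A minimal sketch of how such a name is assembled; both the application id and the internally generated "XXX" segment below are hypothetical values for illustration, not names Kafka Streams would necessarily produce:

```java
// Hypothetical assembly of an internal repartition topic name, following the
// "${applicationId}-XXX-repartition" convention quoted in the Javadoc.
public class RepartitionTopicName {
    public static void main(String[] args) {
        // Assumed value of StreamsConfig.APPLICATION_ID_CONFIG (hypothetical).
        String applicationId = "my-streams-app";
        // "XXX" is an internally generated processor name; this one is made up.
        String internalName = "KSTREAM-KEY-SELECT-0000000003";
        String topic = applicationId + "-" + internalName + "-repartition";
        System.out.println(topic);
        // -> my-streams-app-KSTREAM-KEY-SELECT-0000000003-repartition
    }
}
```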

