kafka-commits mailing list archives

From: vvcep...@apache.org
Subject: [kafka] branch 2.2 updated: MINOR: clarify node grouping of input topics using pattern subscription (#7793)
Date: Sat, 07 Dec 2019 05:20:11 GMT
This is an automated email from the ASF dual-hosted git repository.

vvcephei pushed a commit to branch 2.2
in repository https://gitbox.apache.org/repos/asf/kafka.git


The following commit(s) were added to refs/heads/2.2 by this push:
     new eca17e5  MINOR: clarify node grouping of input topics using pattern subscription (#7793)
eca17e5 is described below

commit eca17e51cb7a7240f47850f72341b3eecc345676
Author: A. Sophie Blee-Goldman <sophie@confluent.io>
AuthorDate: Fri Dec 6 14:03:42 2019 -0800

    MINOR: clarify node grouping of input topics using pattern subscription (#7793)
    
    Updates the HTML docs and the javadoc.
    
    Reviewers: John Roesler <vvcephei@apache.org>
---
 docs/streams/developer-guide/dsl-api.html                         | 2 +-
 .../src/main/java/org/apache/kafka/streams/StreamsBuilder.java    | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/docs/streams/developer-guide/dsl-api.html b/docs/streams/developer-guide/dsl-api.html
index f5c3df9..07a02da 100644
--- a/docs/streams/developer-guide/dsl-api.html
+++ b/docs/streams/developer-guide/dsl-api.html
@@ -259,7 +259,7 @@
                         <p>You <strong>must specify SerDes explicitly</strong> if the key or value types of the records in the Kafka input
                             topics do not match the configured default SerDes. For information about configuring default SerDes, available
                             SerDes, and implementing your own custom SerDes see <a class="reference internal" href="datatypes.html#streams-developer-guide-serdes"><span class="std std-ref">Data Types and Serialization</span></a>.</p>
-                        <p class="last">Several variants of <code class="docutils
literal"><span class="pre">stream</span></code> exist, for example to
specify a regex pattern for input topics to read from).</p>
+                        <p class="last">Several variants of <code class="docutils
literal"><span class="pre">stream</span></code> exist. For example, you
can specify a regex pattern for input topics to read from (note that all matching topics will
be part of the same input topic group, and the work will not be parallelized for different
topics if subscribed to in this way).</p>
                     </td>
                 </tr>
                 <tr class="row-odd"><td><p class="first"><strong>Table</strong></p>
diff --git a/streams/src/main/java/org/apache/kafka/streams/StreamsBuilder.java b/streams/src/main/java/org/apache/kafka/streams/StreamsBuilder.java
index 1b3b4a2..7b31174 100644
--- a/streams/src/main/java/org/apache/kafka/streams/StreamsBuilder.java
+++ b/streams/src/main/java/org/apache/kafka/streams/StreamsBuilder.java
@@ -145,7 +145,9 @@ public class StreamsBuilder {
      * deserializers as specified in the {@link StreamsConfig config} are used.
      * <p>
      * If multiple topics are matched by the specified pattern, the created {@link KStream} will read data from all of
-     * them and there is no ordering guarantee between records from different topics.
+     * them and there is no ordering guarantee between records from different topics. This also means that the work
+     * will not be parallelized for multiple topics, and the number of tasks will scale with the maximum partition
+     * count of any matching topic rather than the total number of partitions across all topics.
      * <p>
      * Note that the specified input topics must be partitioned by key.
      * If this is not the case it is the user's responsibility to repartition the data before any key based operation
@@ -164,7 +166,9 @@ public class StreamsBuilder {
      * are defined by the options in {@link Consumed} are used.
      * <p>
      * If multiple topics are matched by the specified pattern, the created {@link KStream} will read data from all of
-     * them and there is no ordering guarantee between records from different topics.
+     * them and there is no ordering guarantee between records from different topics. This also means that the work
+     * will not be parallelized for multiple topics, and the number of tasks will scale with the maximum partition
+     * count of any matching topic rather than the total number of partitions across all topics.
      * <p>
      * Note that the specified input topics must be partitioned by key.
      * If this is not the case it is the user's responsibility to repartition the data before any key based operation
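
By contrast, a sketch (same hypothetical topics and settings as above) of
subscribing to each topic as a separate stream; since the two streams are never
joined or merged, each gets its own sub-topology and the work parallelizes
across all partitions of all topics:

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class SeparateStreamsSketch {

    public static void main(final String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();

        // Each stream() call creates its own source node and, with no join or
        // merge connecting them, its own sub-topology. If "orders-eu" has 3
        // partitions and "orders-us" has 4, this yields 3 + 4 = 7 tasks,
        // versus max(3, 4) = 4 tasks under one pattern subscription.
        final KStream<String, String> ordersEu = builder.stream("orders-eu");
        final KStream<String, String> ordersUs = builder.stream("orders-us");
        ordersEu.foreach((key, value) -> System.out.println("eu: " + key + " -> " + value));
        ordersUs.foreach((key, value) -> System.out.println("us: " + key + " -> " + value));

        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "separate-streams-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        final KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}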

