kafka-commits mailing list archives

From guozh...@apache.org
Subject kafka git commit: KAFKA-5839: Upgrade Guide doc changes for KIP-130
Date Wed, 20 Sep 2017 08:50:22 GMT
Repository: kafka
Updated Branches:
  refs/heads/trunk bd0146d98 -> 398714b75


KAFKA-5839: Upgrade Guide doc changes for KIP-130

Author: Florian Hussonnois <florian.hussonnois@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3811 from fhussonnois/KAFKA-5839

minor fixes


Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/398714b7
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/398714b7
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/398714b7

Branch: refs/heads/trunk
Commit: 398714b75888fca00319493c38c8ead2a02fd7cc
Parents: bd0146d
Author: Florian Hussonnois <florian.hussonnois@gmail.com>
Authored: Wed Sep 20 16:39:49 2017 +0800
Committer: Guozhang Wang <wangguoz@gmail.com>
Committed: Wed Sep 20 16:50:08 2017 +0800

----------------------------------------------------------------------
 docs/streams/developer-guide.html | 10 +++++
 docs/streams/upgrade-guide.html   | 70 ++++++++++++++++++++++++----------
 2 files changed, 59 insertions(+), 21 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka/blob/398714b7/docs/streams/developer-guide.html
----------------------------------------------------------------------
diff --git a/docs/streams/developer-guide.html b/docs/streams/developer-guide.html
index ab5a823..3368757 100644
--- a/docs/streams/developer-guide.html
+++ b/docs/streams/developer-guide.html
@@ -2905,6 +2905,16 @@ Note that in the <code>WordCountProcessor</code> implementation, users need to r
     </pre>
 
     <p>
+        To retrieve information about the local running threads, you can use the <code>localThreadsMetadata()</code> method after you start the application.
+    </p>
+
+    <pre class="brush: java;">
+    // For instance, use this method to print/monitor the partitions assigned to each local task.
+    Set&lt;ThreadMetadata&gt; threads = streams.localThreadsMetadata();
+    ...
+    </pre>
+
+    <p>
         To stop the application instance call the <code>close()</code> method:
     </p>
 
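
A minimal sketch of the new method in use (the ThreadMetadata accessors threadName() and activeTasks(), and the TaskMetadata accessors taskId() and topicPartitions(), are assumed here and are not part of the diff above):

    import java.util.Set;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.processor.TaskMetadata;
    import org.apache.kafka.streams.processor.ThreadMetadata;

    // Assumes 'streams' is a KafkaStreams instance that has already been started.
    Set<ThreadMetadata> threads = streams.localThreadsMetadata();
    for (ThreadMetadata thread : threads) {
        System.out.println("thread: " + thread.threadName());
        for (TaskMetadata task : thread.activeTasks()) {
            // Print the partitions assigned to each active task of this local thread.
            System.out.println("  task " + task.taskId() + " -> " + task.topicPartitions());
        }
    }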

http://git-wip-us.apache.org/repos/asf/kafka/blob/398714b7/docs/streams/upgrade-guide.html
----------------------------------------------------------------------
diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html
index 96c5941..c4ae54f 100644
--- a/docs/streams/upgrade-guide.html
+++ b/docs/streams/upgrade-guide.html
@@ -49,7 +49,7 @@
 
     <p>
        With 1.0 a major API refactoring was accomplished and the new API is cleaner and easier to use.
-        This change includes the five main classes <code>KafkaStreams<code>, <code>KStreamBuilder</code>,
+        This change includes the five main classes <code>KafkaStreams</code>, <code>KStreamBuilder</code>,
         <code>KStream</code>, <code>KTable</code>, and <code>TopologyBuilder</code> (and some more others).
         All changes are fully backward compatible as old API is only deprecated but not removed.
         We recommend to move to the new API as soon as you can.
@@ -59,7 +59,7 @@
     <p>
         The two main classes to specify a topology via the DSL (<code>KStreamBuilder</code>)
         or the Processor API (<code>TopologyBuilder</code>) were deprecated and replaced by
-        <code>StreamsBuilder</code> and <code>Topology<code> (both new classes are located in
+        <code>StreamsBuilder</code> and <code>Topology</code> (both new classes are located in
         package <code>org.apache.kafka.streams</code>).
         Note, that <code>StreamsBuilder</code> does not extend <code>Topology</code>, i.e.,
         the class hierarchy is different now.
@@ -74,7 +74,7 @@
     </p>
 
     <p>
-        Changing how a topology is specified also affects <code>KafkaStreams<code> constructors,
+        Changing how a topology is specified also affects <code>KafkaStreams</code> constructors,
         that now only accept a <code>Topology</code>.
         Using the DSL builder class <code>StreamsBuilder</code> one can get the constructed
         <code>Topology</code> via <code>StreamsBuilder#build()</code>.
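
A short sketch of the new setup, assuming hypothetical topic names and a minimal configuration; Topology#describe() is shown as the source of static topology information (see the deprecation notes further down in this diff):

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.Topology;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "upgrade-guide-example"); // hypothetical
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // hypothetical

    StreamsBuilder builder = new StreamsBuilder();
    builder.stream("input-topic").to("output-topic"); // hypothetical topics

    // StreamsBuilder does not extend Topology; build() returns the constructed Topology.
    Topology topology = builder.build();
    System.out.println(topology.describe()); // static information about the topology

    // KafkaStreams constructors now accept a Topology.
    KafkaStreams streams = new KafkaStreams(topology, props);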
@@ -86,33 +86,61 @@
     </p>
 
     <p>
-        With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
-        you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
-        If you can't rely on using <code>toString</code> to print your keys an values, you should instead you provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
+        New methods in <code>KafkaStreams</code>:
     </p>
-
+    <ul>
+        <li> retrieve the current runtime information about the local threads via <code>#localThreadsMetadata()</code> </li>
+    </ul>
+    <p>
+        Deprecated methods in <code>KafkaStreams</code>:
+    </p>
+    <ul>
+        <li><code>toString()</code></li>
+        <li><code>toString(final String indent)</code></li>
+    </ul>
     <p>
-        Windowed aggregations have moved from <code>KGroupedStream</code> to <code>WindowedKStream</code>.
-        You can now perform a windowed aggregation by, for example, using <code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
-        Note: the previous aggregate functions on <code>KGroupedStream</code> still work, but have been deprecated.
+        Previously the above methods were used to return static and runtime information.
+        They have been deprecated in favor of using the new classes/methods <code>#localThreadsMetadata()</code> / <code>ThreadMetadata</code> (returning runtime information) and
+        <code>TopologyDescription</code> / <code>Topology#describe()</code> (returning static information).
     </p>
 
     <p>
-        The Processor API was extended to allow users to schedule <code>punctuate</code> functions either based on data-driven <b>stream time</b> or wall-clock time.
-        As a result, the original <code>ProcessorContext#schedule</code> is deprecated with a new overloaded function that accepts a user customizable <code>Punctuator</code> callback interface, which triggers its <code>punctuate</code> API method periodically based on the <code>PunctuationType</code>.
-        The <code>PunctuationType</code> determines what notion of time is used for the punctuation scheduling: either <a href="/{{version}}/documentation/streams/core-concepts#streams_time">stream time</a> or wall-clock time (by default, <b>stream time</b> is configured to represent event time via <code>TimestampExtractor</code>).
-        In addition, the <code>punctuate</code> function inside <code>Processor</code> is also deprecated.
+        More deprecated methods in <code>KafkaStreams</code>:
     </p>
+    <ul>
+        <li>With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
+            you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
+            If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
+        </li>
+        <li>
+            Windowed aggregations have moved from <code>KGroupedStream</code> to <code>WindowedKStream</code>.
+            You can now perform a windowed aggregation by, for example, using <code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
+            Note: the previous aggregate functions on <code>KGroupedStream</code> still work, but have been deprecated.
+        </li>
+    </ul>
 
     <p>
-        Before this, users could only schedule based on stream time (i.e. <code>PunctuationType.STREAM_TIME</code>) and hence the <code>punctuate</code> function was data-driven only because stream time is determined (and advanced forward) by the timestamps derived from the input data.
-        If there is no data arriving at the processor, the stream time would not advance and hence punctuation will not be triggered.
-        On the other hand, When wall-clock time (i.e. <code>PunctuationType.WALL_CLOCK_TIME</code>) is used, <code>punctuate</code> will be triggered purely based on wall-clock time.
-        So for example if the <code>Punctuator</code> function is scheduled based on <code>PunctuationType.WALL_CLOCK_TIME</code>, if these 60 records were processed within 20 seconds,
-        <code>punctuate</code> would be called 2 times (one time every 10 seconds);
-        if these 60 records were processed within 5 seconds, then no <code>punctuate</code> would be called at all.
-        Users can schedule multiple <code>Punctuator</code> callbacks with different <code>PunctuationType</code>s within the same processor by simply calling <code>ProcessorContext#schedule</code> multiple times inside processor's <code>init()</code> method.
+        Modified methods in <code>Processor</code>:
     </p>
+    <ul>
+        <li>
+            <p>
+                The Processor API was extended to allow users to schedule <code>punctuate</code> functions either based on data-driven <b>stream time</b> or wall-clock time.
+                As a result, the original <code>ProcessorContext#schedule</code> is deprecated with a new overloaded function that accepts a user customizable <code>Punctuator</code> callback interface, which triggers its <code>punctuate</code> API method periodically based on the <code>PunctuationType</code>.
+                The <code>PunctuationType</code> determines what notion of time is used for the punctuation scheduling: either <a href="/{{version}}/documentation/streams/core-concepts#streams_time">stream time</a> or wall-clock time (by default, <b>stream time</b> is configured to represent event time via <code>TimestampExtractor</code>).
+                In addition, the <code>punctuate</code> function inside <code>Processor</code> is also deprecated.
+            </p>
+            <p>
+                Before this, users could only schedule based on stream time (i.e. <code>PunctuationType.STREAM_TIME</code>) and hence the <code>punctuate</code> function was data-driven only because stream time is determined (and advanced forward) by the timestamps derived from the input data.
+                If there is no data arriving at the processor, the stream time would not advance and hence punctuation will not be triggered.
+                On the other hand, when wall-clock time (i.e. <code>PunctuationType.WALL_CLOCK_TIME</code>) is used, <code>punctuate</code> will be triggered purely based on wall-clock time.
+                So for example if the <code>Punctuator</code> function is scheduled based on <code>PunctuationType.WALL_CLOCK_TIME</code> and these 60 records were processed within 20 seconds,
+                <code>punctuate</code> would be called 2 times (one time every 10 seconds);
+                if these 60 records were processed within 5 seconds, then no <code>punctuate</code> would be called at all.
+                Users can schedule multiple <code>Punctuator</code> callbacks with different <code>PunctuationType</code>s within the same processor by simply calling <code>ProcessorContext#schedule</code> multiple times inside the processor's <code>init()</code> method.
+            </p>
+        </li>
+    </ul>
 
     <p>
         If you are monitoring on task level or processor-node / state store level Streams metrics, please note that the metrics sensor name and hierarchy was changed:
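
As a rough illustration of the two KIP-182 related deprecations listed above, a sketch assuming a hypothetical "words" topic with String keys and values (Serialized.with() and TimeWindows.of() as in the 1.0 API):

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Printed;
    import org.apache.kafka.streams.kstream.Serialized;
    import org.apache.kafka.streams.kstream.TimeWindows;
    import org.apache.kafka.streams.kstream.Windowed;

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> words = builder.stream("words"); // hypothetical topic

    // No Serde is passed to print() anymore; formatting goes through Printed#withKeyValueMapper.
    words.print(Printed.<String, String>toSysOut()
            .withKeyValueMapper((key, value) -> key + ": " + value));

    // Windowed aggregation via KGroupedStream#windowedBy(Windows)#reduce(Reducer).
    KTable<Windowed<String>, String> reduced = words
            .groupByKey(Serialized.with(Serdes.String(), Serdes.String()))
            .windowedBy(TimeWindows.of(60_000L)) // 1-minute tumbling windows
            .reduce((value1, value2) -> value1 + " " + value2); // concatenate values per window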

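Similarly, a minimal sketch of the new punctuation scheduling described above; the processor logic and the punctuation actions are placeholders, and the schedule(long, PunctuationType, Punctuator) signature returning a Cancellable reflects the API at the time of this commit:

    import org.apache.kafka.streams.processor.Cancellable;
    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.PunctuationType;

    public class ExampleProcessor implements Processor<String, String> { // hypothetical processor

        private Cancellable wallClockSchedule;

        @Override
        public void init(ProcessorContext context) {
            // Data-driven punctuation: fires as stream time advances in 10-second steps.
            context.schedule(10_000L, PunctuationType.STREAM_TIME, timestamp -> {
                // e.g. forward buffered aggregation results downstream
            });
            // Wall-clock punctuation: fires every 10 seconds regardless of incoming data.
            wallClockSchedule = context.schedule(10_000L, PunctuationType.WALL_CLOCK_TIME, timestamp -> {
                // e.g. emit a periodic heartbeat record
            });
        }

        @Override
        public void process(String key, String value) { }

        @Override
        public void punctuate(long timestamp) { } // deprecated; not needed when using Punctuator callbacks

        @Override
        public void close() {
            wallClockSchedule.cancel(); // the Cancellable returned by schedule() can cancel a punctuation
        }
    }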
