kafka-commits mailing list archives

From bbej...@apache.org
Subject [kafka] branch 2.3 updated: MINOR: Extend RocksDB section of Memory Management Docs (#6793)
Date Thu, 30 May 2019 11:31:33 GMT
This is an automated email from the ASF dual-hosted git repository.

bbejeck pushed a commit to branch 2.3
in repository https://gitbox.apache.org/repos/asf/kafka.git


The following commit(s) were added to refs/heads/2.3 by this push:
     new f268917  MINOR: Extend RocksDB section of Memory Management Docs (#6793)
f268917 is described below

commit f268917f0285b7acec82e2f7cbb4ccc5c2e0b002
Author: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
AuthorDate: Thu May 30 04:28:43 2019 -0700

    MINOR: Extend RocksDB section of Memory Management Docs (#6793)
    
    Now that we can configure RocksDB to bound the total memory we should include docs describing
    how, as well as touching on some possible options that should be considered when taking advantage
    of this feature.
    
    Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jim Galasyn <jim.galasyn@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
---
 docs/streams/developer-guide/memory-mgmt.html | 61 ++++++++++++++++++++++++---
 1 file changed, 56 insertions(+), 5 deletions(-)

diff --git a/docs/streams/developer-guide/memory-mgmt.html b/docs/streams/developer-guide/memory-mgmt.html
index f21ed34..68c379b 100644
--- a/docs/streams/developer-guide/memory-mgmt.html
+++ b/docs/streams/developer-guide/memory-mgmt.html
@@ -167,7 +167,61 @@
 </pre></div>
       </div>
     </div>
-    <div class="section" id="other-memory-usage">
+    <div class="section" id="rocksdb">
+      <h2><a class="toc-backref" href="#id3">RocksDB</a><a class="headerlink"
href="#rocksdb" title="Permalink to this headline"></a></h2>
+      <p> Each instance of RocksDB allocates off-heap memory for a block cache (with
data), index and filter blocks, and memtable (write buffer). Critical configs (for RocksDB
version 4.1.0) include
+        <code class="docutils literal"><span class="pre">block_cache_size</span></code>,
<code class="docutils literal"><span class="pre">write_buffer_size</span></code>
and <code class="docutils literal"><span class="pre">max_write_buffer_number</span></code>.
 These can be specified through the
+        <code class="docutils literal"><span class="pre">rocksdb.config.setter</span></code>
configuration.</li>
+      <p> As of 2.3.0 the memory usage across all instances can be bounded, limiting
the total off-heap memory of your Streams app. To do so you must configure RocksDB to cache
the index and filter blocks in the block cache, limit the memtable memory through a shared
<a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager">WriteBufferManager</a>
and count its memory against the block cache, and then pass the same Cache object to each
instance. Se [...]
+
+      <div class="highlight-java"><div class="highlight"><pre><span></span>
   <span class="kd">public</span> <span class="kd">static</span> <span
class="kd">class</span> <span class="nc">BoundedMemoryRocksDBConfig</span>
<span class="kd">implements</span> <span class="n">RocksDBConfigSetter</span>
<span class="o">{</span>
+
+       <span class="kd">private</span> <span class="kt">static</span>
<span class="n">org.rocksdb.Cache</span> <span class="n">cache</span>
<span class="o">=</span> <span class="k">new</span> <span class="n">org</span><span
class="o">.</span><span class="na">rocksdb</span><span class="o">.</span><span
class="na">LRUCache</span><span class="o">(</span><span class="mi">TOTAL_OFF_HEAP_MEMORY</span><span
class="o">,</span> <span class="n">-1</span><span class="o">,</span>
<span class="n">fal [...]
+       <span class="kd">private</span> <span class="kt">static</span>
<span class="n">org.rocksdb.WriteBufferManager</span> <span class="n">writeBufferManager</span>
<span class="o">=</span> <span class="k">new</span> <span class="n">org</span><span
class="o">.</span><span class="na">rocksdb</span><span class="o">.</span><span
class="na">WriteBufferManager</span><span class="o">(</span><span
class="mi">TOTAL_MEMTABLE_MEMORY</span><span class="o">,</span> cache<span
class="o">);</span>
+       <span class="kd">private</span> <span class="n">org.rocksdb.Filter</span>
<span class="n">filter</span> <span class="o">=</span> <span class="k">new</span>
<span class="n">org</span><span class="o">.</span><span class="na">rocksdb</span><span
class="o">.</span><span class="na">BloomFilter</span><span class="o">();</span>
+
+       <span class="nd">@Override</span>
+       <span class="kd">public</span> <span class="kt">void</span>
<span class="nf">setConfig</span><span class="o">(</span><span
class="kd">final</span> <span class="n">String</span> <span class="n">storeName</span><span
class="o">,</span> <span class="kd">final</span> <span class="n">Options</span>
<span class="n">options</span><span class="o">,</span> <span class="kd">final</span>
<span class="n">Map</span><span class="o">&lt;</span><span
class="n">String</span><span class="o">,</span [...]
+
+         <span class="n">BlockBasedTableConfig</span> <span class="n">tableConfig</span>
<span class="o">=</span> <span class="k">new</span> <span class="n">org</span><span
class="o">.</span><span class="na">rocksdb</span><span class="o">.</span><span
class="na">BlockBasedTableConfig</span><span class="o">();</span>
+
+         <span class="c1"> // These three options in combination will limit the memory
used by RocksDB to the size passed to the block cache (TOTAL_OFF_HEAP_MEMORY)</span>
+         <span class="n">tableConfig</span><span class="o">.</span><span
class="na">setBlockCache</span><span class="o">(</span><span class="mi">cache</span><span
class="o">);</span>
+         <span class="n">tableConfig</span><span class="o">.</span><span
class="na">setCacheIndexAndFilterBlocks</span><span class="o">(</span><span
class="kc">true</span><span class="o">);</span>
+         <span class="n">options</span><span class="o">.</span><span
class="na">setWriteBufferManager</span><span class="o">(</span><span
class="mi">writeBufferManager</span><span class="o">);</span>
+
+         <span class="c1"> // These options are recommended to be set when bounding
the total memory</span>
+         <span class="n">tableConfig</span><span class="o">.</span><span
class="na">setCacheIndexAndFilterBlocksWithHighPriority</span><span class="o">(</span><span
class="mi">true</span><span class="o">);</span>
+         <span class="n">tableConfig</span><span class="o">.</span><span
class="na">setPinTopLevelIndexAndFilter</span><span class="o">(</span><span
class="mi">true</span><span class="o">);</span>
+         <span class="n">tableConfig</span><span class="o">.</span><span
class="na">setBlockSize</span><span class="o">(</span><span class="mi">BLOCK_SIZE</span><span
class="o">);</span><sup><a href="#fn3" id="ref3">3</a></sup>
+         <span class="n">options</span><span class="o">.</span><span
class="na">setMaxWriteBufferNumber</span><span class="o">(</span><span
class="mi">N_MEMTABLES</span><span class="o">);</span><sup><a
href="#fn4" id="ref4">4</a></sup>
+         <span class="n">options</span><span class="o">.</span><span
class="na">setWriteBufferSize</span><span class="o">(</span><span
class="mi">MEMTABLE_SIZE</span><span class="o">);</span>
+
+         <span class="n">options</span><span class="o">.</span><span
class="na">setTableFormatConfig</span><span class="o">(</span><span
class="n">tableConfig</span><span class="o">);</span>
+       <span class="o">}</span>
+
+       <span class="nd">@Override</span>
+       <span class="kd">public</span> <span class="kt">void</span>
<span class="nf">close</span><span class="o">(</span><span class="kd">final</span>
<span class="n">String</span> <span class="n">storeName</span><span
class="o">,</span> <span class="kd">final</span> <span class="n">Options</span>
<span class="n">options</span><span class="o">)</span> <span class="o">{</span>
+         <span class="c1">// Cache and WriteBufferManager should not be closed here,
as the same objects are shared by every store instance.</span>
+         <span class="c1">// The filter, however, is not shared and should be closed
to avoid leaking memory.</span>
+         <span class="n">filter</span><span class="o">.</span><span
class="na">close</span><span class="o">();</span>
+       <span class="o">}</span>
+    <span class="o">}</span>
+      </div>
+        <sup id="fn1">1. INDEX_FILTER_BLOCK_RATIO can be used to set a fraction of
the block cache to set aside for "high priority" (aka index and filter) blocks, preventing
them from being evicted by data blocks. See the full signature of the LRUCache constructor
<a class="reference external" href="https://github.com/facebook/rocksdb/blob/master/java/src/main/java/org/rocksdb/LRUCache.java#L72">here</a>.
</sup>
+        <br>
+        <sup id="fn2">2. This must be set in order for INDEX_FILTER_BLOCK_RATIO to
take effect (see footnote 1) as described <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-and-filter-blocks">here</a></sup>
+        <br>
+        <sup id="fn3">3. You may want to modify the default <a class="reference
external" href="https://github.com/apache/kafka/blob/2.3/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java#L79">block
size</a> per these instructions from the <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB#indexes-and-filter-blocks">RocksDB
GitHub</a>. A larger block size means index blocks will be smaller, but the cached dat
[...]
+          <br>
+          <dl class="docutils">
+            <dt>Note:</dt>
+            While we recommend setting at least the above configs, the specific options that
yield the best performance are workload dependent and you should consider experimenting with
these to determine the best choices for your specific use case. Keep in mind that the optimal
configs for one app may not apply to one with a different topology or input topic.
+            In addition to the recommended configs above, you may want to consider using
partitioned index filters as described by the <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters">RocksDB
docs</a>
+
+          </dl>
+      </div>
+      <div class="section" id="other-memory-usage">
       <h2><a class="toc-backref" href="#id3">Other memory usage</a><a
class="headerlink" href="#other-memory-usage" title="Permalink to this headline"></a></h2>
       <p>There are other modules inside Apache Kafka that allocate memory during runtime.
They include the following:</p>
       <ul class="simple">
@@ -179,9 +233,6 @@
         <li>Deserialized objects buffering: after <code class="docutils literal"><span
class="pre">consumer.poll()</span></code> returns records, they will be deserialized
to extract
           timestamp and buffered in the streams space. Currently this is only indirectly
controlled by
           <code class="docutils literal"><span class="pre">buffered.records.per.partition</span></code>.</li>
-        <li>RocksDB&#8217;s own memory usage, both on-heap and off-heap; critical
configs (for RocksDB version 4.1.0) include
-          <code class="docutils literal"><span class="pre">block_cache_size</span></code>,
<code class="docutils literal"><span class="pre">write_buffer_size</span></code>
and <code class="docutils literal"><span class="pre">max_write_buffer_number</span></code>.
 These can be specified through the
-          <code class="docutils literal"><span class="pre">rocksdb.config.setter</span></code>
configuration.</li>
       </ul>
       <div class="admonition tip">
         <p><b>Tip</b></p>
@@ -237,4 +288,4 @@
         // Display docs subnav items
         $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
     });
-</script>
\ No newline at end of file
+</script>
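
Illustration only, not part of the commit: the documentation added in this patch says the RocksDB options are supplied through the rocksdb.config.setter configuration. A minimal sketch of setting that key, using only the JDK, might look like the following. The class name RocksDbSetterProps, the helper streamsProps, and the setter class name com.example.BoundedMemoryRocksDBConfig are hypothetical and chosen for this example. It assumes the config setter is given as a fully qualified class name string.

```java
import java.util.Properties;

public class RocksDbSetterProps {

    // Hypothetical helper: builds a Properties object that names the
    // RocksDBConfigSetter implementation under the "rocksdb.config.setter" key.
    public static Properties streamsProps(final String setterClassName) {
        final Properties props = new Properties();
        // The key string is taken from the docs in this patch; the value is an
        // assumed fully qualified class name of the config setter.
        props.put("rocksdb.config.setter", setterClassName);
        return props;
    }

    public static void main(final String[] args) {
        final Properties props = streamsProps("com.example.BoundedMemoryRocksDBConfig");
        System.out.println(props.getProperty("rocksdb.config.setter"));
    }
}
```

A real application would add its other required Streams properties (for example the application id and bootstrap servers) to the same Properties object before constructing the Streams instance.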

