mesos-reviews mailing list archives

From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 68132: Batch '/state' requests on Master.
Date Wed, 01 Aug 2018 01:59:54 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68132/#review206714
-----------------------------------------------------------



Overall the approach looks good; I didn't see any bugs, so just minor comments below.

* I'm not sure where to put it, but it seems we need a TODO to de-duplicate response processing when the principal is identical. E.g. if "ben" asks for state three times in one batch, ideally we compute the response for "ben" only once, since all three responses within a principal are identical.
* Can you document the consistency model in the description?
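To make the de-duplication idea above concrete, here is a minimal standalone sketch. The `BatchedRequest` struct, `processBatch` function, and `computeResponse` callback are hypothetical stand-ins, not Mesos types: responses are cached per principal so each one is computed only once per batch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a batched '/state' request: just the
// authenticated principal and a slot for the finished response.
struct BatchedRequest
{
  std::string principal;
  std::string response;
};

// Process a batch, invoking the expensive response computation only once
// per distinct principal; returns how many computations actually ran.
int processBatch(
    std::vector<BatchedRequest>& batch,
    const std::function<std::string(const std::string&)>& computeResponse)
{
  std::unordered_map<std::string, std::string> cache;
  int computations = 0;

  for (BatchedRequest& request : batch) {
    auto it = cache.find(request.principal);
    if (it == cache.end()) {
      it = cache.emplace(
          request.principal, computeResponse(request.principal)).first;
      ++computations;
    }
    request.response = it->second;
  }

  return computations;
}
```

With three requests from "ben" and one from "alex" in a batch, `computeResponse` runs twice rather than four times.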


src/master/http.cpp
Line 2812 (original), 2815 (patched)
<https://reviews.apache.org/r/68132/#comment289793>

    Can we keep the existing name? I believe the idea is to have them match the path, so "/state" -> Http::state seems ideal as is.



src/master/http.cpp
Lines 2838-2910 (original), 2841-2853 (patched)
<https://reviews.apache.org/r/68132/#comment289799>

    It seems a little brittle to inline the batching-related logic and deal with promises here; could we use a function?
    
    ```
    .then(... {
      return _state(request, approvers);
    })
    ```
    
    ```
    Future<Response> _state(request, approvers)
    {
      bool scheduleBatch = batchedStateRequests.empty();
      
      ... Add entry and grab `future` ...
      
      if (scheduleBatch) { dispatch ... }
      
      return future;
    }
    ```
    
    We could name it differently, e.g. `s/_state/deferStateRequest/`. This way the handler doesn't have to inline batching and promise logic.



src/master/http.cpp
Lines 2887-2889 (original), 2849-2851 (patched)
<https://reviews.apache.org/r/68132/#comment289796>

    Can you move in the request and the promise (without Owned) here? (The lambda will need to be `mutable` for the request to be moved.)
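As a self-contained illustration of that point (the `Request` type and `consume` helper below are hypothetical, not the patch's code): moving an object into a lambda uses an init-capture, and the lambda must be `mutable` so the captured copy can itself be moved out when the body runs.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an HTTP request that should not be copied.
struct Request
{
  std::string body;
};

// Takes ownership of the request by move and returns its body length.
size_t consume(Request&& request)
{
  Request owned = std::move(request);
  return owned.body.size();
}

size_t dispatchByMove()
{
  Request request{"GET /state"};

  // Init-capture moves `request` into the closure; `mutable` makes the
  // captured copy non-const so it can be moved out inside the body.
  auto handler = [request = std::move(request)]() mutable {
    return consume(std::move(request));
  };

  return handler();
}
```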



src/master/http.cpp
Lines 5173 (patched)
<https://reviews.apache.org/r/68132/#comment289800>

    Per my comment above, I guess we could name these like:
    
    ```
    state
    _state
    __state
    ```
    
    Or:
    
    ```
    state
    deferStateRequest
    processStateRequestBatch
    ```
    
    The former makes the flow a little easier to guess; the latter names the functions a bit more meaningfully (which can often make the flow harder to see from the function names alone).



src/master/http.cpp
Lines 5175-5178 (patched)
<https://reviews.apache.org/r/68132/#comment289795>

    No need for the special case and the early return? The code will handle 0 items correctly.
    
    If this is trying to let us know in the future about a bug where the batching is firing incorrectly such that there are 0 items, we could CHECK:
    
    ```
    CHECK(!batchedStateRequests.empty())
      << "Bug in state batching logic";
    ```
    
    Seems ok without the CHECK to me as well.



src/master/http.cpp
Lines 5319 (patched)
<https://reviews.apache.org/r/68132/#comment289797>

    This makes batchedRequest not so const :), might as well have it come in as a `BatchedRequest&&` unless `process::async` doesn't support moving yet?



src/master/http.cpp
Lines 5323-5328 (patched)
<https://reviews.apache.org/r/68132/#comment289801>

    It seems a little odd for the lambda to have to know about the batch struct and do the promise setting, instead of just returning the Response:
    
    ```
    auto response = [this](Owned<ObjectApprovers> approvers) {
      ...
      
      return http::OK(...);
    }
    ```
    
    Then this code here is the one that deals with promise setting, e.g.
    
    ```
      // Fire off the workers.
      foreach (const BatchedStateRequest& request, batchedStateRequests) {
        request.promise.associate(process::async(response, request.approvers));
      }
      
      // Wait for all responses to transition.
      vector<Future<Response>> responses;
      foreach (const BatchedStateRequest& request, batchedStateRequests) {
        responses.push_back(request.promise.future());
      }
      process::await(responses).await();
    ```
    
    This lets us keep the response lambda agnostic of batching, and we could more cleanly move it up in the future.
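The shape above, fire off the workers and then wait for every promise to transition, can be sketched with the standard library. This uses `std::promise`/`std::future` in place of the libprocess primitives, with the worker step run inline for brevity; the struct and function names are hypothetical stand-ins.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for a batched request whose response is delivered
// through a promise.
struct BatchedStateRequest
{
  std::string principal;
  std::promise<std::string> promise;
};

std::vector<std::string> processBatch(std::vector<BatchedStateRequest>& batch)
{
  // "Fire off the workers": fulfill each request's promise. In the real
  // suggestion this step would run concurrently via process::async.
  for (BatchedStateRequest& request : batch) {
    request.promise.set_value("state-for-" + request.principal);
  }

  // Wait for all responses to transition by collecting each future.
  std::vector<std::string> responses;
  for (BatchedStateRequest& request : batch) {
    responses.push_back(request.promise.get_future().get());
  }

  return responses;
}
```

The point is the separation of concerns: the code that fulfills promises never needs to know how callers wait, so the response-building lambda itself stays agnostic of batching.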



src/master/http.cpp
Lines 5327 (patched)
<https://reviews.apache.org/r/68132/#comment289798>

    Can we move the request in here? If async doesn't support it, can you add a TODO?



src/master/master.hpp
Lines 1842 (patched)
<https://reviews.apache.org/r/68132/#comment289794>

    Can we avoid Owned and std::move this struct instead of copying it?
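A small self-contained sketch of why that works (a hypothetical struct, not the real one): giving the struct a move-only member means it cannot be copied at all, yet `std::move` still lets it be placed into a container without any `Owned` indirection. The `unique_ptr` member here stands in for a move-only field such as a promise.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical batching struct. The unique_ptr member makes the whole
// struct move-only, so it can never be accidentally copied.
struct BatchedStateRequest
{
  std::string principal;
  std::unique_ptr<std::string> response;
};

// Append by move: ownership of the struct's contents is transferred into
// the container; nothing is duplicated and no Owned wrapper is needed.
void enqueue(
    std::vector<BatchedStateRequest>& queue, BatchedStateRequest&& request)
{
  queue.push_back(std::move(request));
}
```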


- Benjamin Mahler


On July 31, 2018, 5:24 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68132/
> -----------------------------------------------------------
> 
> (Updated July 31, 2018, 5:24 p.m.)
> 
> 
> Review request for mesos, Benno Evers and Benjamin Mahler.
> 
> 
> Bugs: MESOS-9122
>     https://issues.apache.org/jira/browse/MESOS-9122
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> With this patch, handlers for '/state' requests are no longer scheduled
> directly after authorization; instead, requests are accumulated and then
> scheduled for later parallel processing.
> 
> If there are N '/state' requests in the Master's mailbox and T is the
> time to compute one response, this approach blocks the Master actor only
> once for O(T), instead of the N*T for which it was blocked prior to this
> patch.
> 
> This batching technique reduces both the time the Master spends answering
> '/state' requests and the average response time when multiple requests
> are queued in the Master's mailbox. However, for infrequent '/state'
> requests that do not accumulate in the mailbox, the response time might
> increase due to the added trip through the mailbox.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 6947031da3ce3523408d69d6dac60551a39a4601 
>   src/master/master.hpp 0353d550308816f219aedb6afe15c643fc8bb340 
>   src/master/master.cpp 2af976f7ea7f81d4b06a45ce13286dbd61b9b144 
> 
> 
> Diff: https://reviews.apache.org/r/68132/diff/1/
> 
> 
> Testing
> -------
> 
> `make check` on Mac OS 10.13.5 and various Linux distros.
> 
> Run `MasterStateQueryLoad_BENCHMARK_Test.v0State` benchmark.
> 
> Average improvement without optimization: 62%–70%.
> Average improvement with optimization: 17%–62%.
> 
> **[No batching, no optimization](https://dobianchi.files.wordpress.com/2011/11/no-barrique-no-berlusconi.jpg?w=638)**
> ```
> Test setup: 100 agents with a total of 10000 running tasks and 10000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 1.102349605secs, 10 responses are in [2.662342ms, 2.143755433secs]
> '/state' response on average took 1.549122019secs, 10 responses are in [494.278454ms, 2.633971927secs]
> 
> Test setup: 1000 agents with a total of 100000 running tasks and 100000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 18.436968137secs, 10 responses are in [2.578238ms, 33.210561732secs]
> '/state' response on average took 23.916379537secs, 10 responses are in [5.170660597secs, 43.008091744secs]
> ```
> 
> **With batching but no optimization**
> ```
> Test setup: 100 agents with a total of 10000 running tasks and 10000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 417.211022ms, 10 responses are in [4.066901ms, 728.045442ms]
> '/state' response on average took 830.351291ms, 10 responses are in [459.033455ms, 1.208880892secs]
> 
> Test setup: 1000 agents with a total of 100000 running tasks and 100000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 5.439950928secs, 10 responses are in [3.246906ms, 9.343994388secs]
> '/state' response on average took 16.764607823secs, 10 responses are in [4.980333091secs, 18.461983916secs]
> ```
> 
> **No batching but `-O3` optimization**
> ```
> Test setup: 100 agents with a total of 10000 running tasks and 10000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 2.396221ms, 10 responses are in [1.628583ms, 2.816639ms]
> '/state' response on average took 113.469574ms, 10 responses are in [104.218099ms, 134.477062ms]
> 
> Test setup: 1000 agents with a total of 100000 running tasks and 100000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 3.892615876secs, 10 responses are in [2.480517ms, 7.630934838secs]
> '/state' response on average took 5.205245306secs, 10 responses are in [1.578161651secs, 8.789315237secs]
> ```
> 
> **Batching and `-O3` optimization**
> ```
> Test setup: 100 agents with a total of 10000 running tasks and 10000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 1.973573ms, 10 responses are in [1.221193ms, 2.694713ms]
> '/state' response on average took 113.331551ms, 10 responses are in [102.593397ms, 142.028555ms]
> 
> Test setup: 1000 agents with a total of 100000 running tasks and 100000 completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
> Launching 10 '/state' requests in background
> Launching 10 '/flags' requests
> '/flags' response on average took 1.475842691secs, 10 responses are in [2.437217ms, 3.815589561secs]
> '/state' response on average took 4.742303751secs, 10 responses are in [4.047655443secs, 6.00752698secs]
> ```
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>

