mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 64713: Fixed a crash when resubscribing resource providers.
Date Tue, 19 Dec 2017 15:28:39 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64713/#review194146
-----------------------------------------------------------




src/resource_provider/manager.cpp
Lines 602-607 (patched)
<https://reviews.apache.org/r/64713/#comment272884>

    I am not sure this is the semantics we want.
    
    The protocol to add a RP is to first subscribe, then update state, potentially with resources.
If we do not explicitly propagate the disconnected event to the master we might continue to
perform operations on the previous resources which the resource provider might accept or not,
depending on e.g., how it manages updating its resource version. This sounds pretty complicated
and with potential for mistakes, and I'd prefer to just propagate the state change to the
master which would only make the previous resources available after the RP has updated it
state to the outside.



src/tests/resource_provider_manager_tests.cpp
Line 1150 (original), 1150 (patched)
<https://reviews.apache.org/r/64713/#comment272885>

    Let's add a comment here outlining that we start a second RP with the same ID which will
take `resourceProvider1`'s place with the connection being closed by the manager.



src/tests/resource_provider_manager_tests.cpp
Line 1157 (original), 1157 (patched)
<https://reviews.apache.org/r/64713/#comment272882>

    It would be great to check here that `resourceProvider1` actually got disconnected.



src/tests/resource_provider_manager_tests.cpp
Line 1184 (original), 1184 (patched)
<https://reviews.apache.org/r/64713/#comment272886>

    Let's add a comment here that this closes the connection on the RP side.


- Benjamin Bannier


On Dec. 19, 2017, 4:14 p.m., Jan Schlicht wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64713/
> -----------------------------------------------------------
> 
> (Updated Dec. 19, 2017, 4:14 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Jie Yu.
> 
> 
> Bugs: MESOS-8346
>     https://issues.apache.org/jira/browse/MESOS-8346
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> If a resource provider resubscribed while its old HTTP connection was
> still open, the agent would crash, as a continuation would be called
> erroneously. This continuation is now only called when a HTTP connection
> is closed by a remote side (i.e. the resource provider) and not when
> the resource provider manager closes the connection.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/manager.cpp 2a167aa5c50598758cab4ab6c13f1b1e33e692dd 
>   src/tests/resource_provider_manager_tests.cpp 4e86016a450c30fea90e9c6e5c1f2f1aab5f42c0

> 
> 
> Diff: https://reviews.apache.org/r/64713/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message