mesos-reviews mailing list archives

From Jan Schlicht <...@mesosphere.io>
Subject Re: Review Request 64713: Fixed a crash when resubscribing resource providers.
Date Wed, 20 Dec 2017 13:38:25 GMT


> On Dec. 19, 2017, 4:28 p.m., Benjamin Bannier wrote:
> > src/tests/resource_provider_manager_tests.cpp
> > Line 1150 (original), 1150 (patched)
> > <https://reviews.apache.org/r/64713/diff/1/?file=1923996#file1923996line1150>
> >
> >     Let's add a comment here outlining that we start a second RP with the same ID
> >     which will take `resourceProvider1`'s place with the connection being closed by the manager.

Test case is no longer changed. Dropping.


> On Dec. 19, 2017, 4:28 p.m., Benjamin Bannier wrote:
> > src/tests/resource_provider_manager_tests.cpp
> > Line 1157 (original), 1157 (patched)
> > <https://reviews.apache.org/r/64713/diff/1/?file=1923996#file1923996line1157>
> >
> >     It would be great to check here that `resourceProvider1` actually got disconnected.

I ran into problems doing that. It seems that the `disconnected` callback is never called. And
if it were called, the changes in this test case wouldn't work, because the old RP instance would
immediately try to resubscribe. Both instances would then keep trying to resubscribe,
which is not what we want. I've removed the test for now. I will investigate why `disconnected`
isn't called and come up with a separate test case for that. I've created
https://issues.apache.org/jira/browse/MESOS-8349 to track this.


> On Dec. 19, 2017, 4:28 p.m., Benjamin Bannier wrote:
> > src/tests/resource_provider_manager_tests.cpp
> > Line 1184 (original), 1184 (patched)
> > <https://reviews.apache.org/r/64713/diff/1/?file=1923996#file1923996line1184>
> >
> >     Let's add a comment here that this closes the connection on the RP side.

Test case is no longer changed. Dropping.


- Jan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64713/#review194146
-----------------------------------------------------------


On Dec. 20, 2017, 2:24 p.m., Jan Schlicht wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64713/
> -----------------------------------------------------------
> 
> (Updated Dec. 20, 2017, 2:24 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Jie Yu.
> 
> 
> Bugs: MESOS-8346
>     https://issues.apache.org/jira/browse/MESOS-8346
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> If a resource provider resubscribed while its old HTTP connection was
> still open, the agent would crash, as a continuation would be called
> erroneously. This continuation is now only called when an HTTP connection
> is closed by the remote side (i.e. the resource provider) and not when
> the resource provider manager closes the connection.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/manager.cpp e3fcb64b630924e1bb497625708cad3f0fdc064a 
> 
> 
> Diff: https://reviews.apache.org/r/64713/diff/2/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>
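
The behavior in the description above can be illustrated with a minimal, self-contained sketch. It is not the code in src/resource_provider/manager.cpp and does not use the Mesos or libprocess APIs: `Connection`, `ClosedBy`, `Manager`, `subscribe` and `removals` are hypothetical names chosen for illustration. The sketch assumes the cleanup continuation should run only when the remote side closes the connection, and that a resubscription makes the manager close the old connection itself.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical: who closed the connection.
enum class ClosedBy { REMOTE, LOCAL };

// Hypothetical stand-in for an HTTP connection with a close continuation.
struct Connection {
  std::function<void(ClosedBy)> onClosed;
  bool closed = false;

  void close(ClosedBy who) {
    if (closed) {
      return; // The continuation runs at most once.
    }
    closed = true;

    // Invoke a copy so the continuation stays valid even if it
    // drops the last reference to this connection.
    std::function<void(ClosedBy)> callback = onClosed;
    if (callback) {
      callback(who);
    }
  }
};

// Hypothetical stand-in for the resource provider manager.
struct Manager {
  std::map<std::string, std::shared_ptr<Connection>> providers;
  int removals = 0; // Counts how often the cleanup continuation ran.

  std::shared_ptr<Connection> subscribe(const std::string& id) {
    auto it = providers.find(id);
    if (it != providers.end()) {
      // Resubscription: the manager closes the old connection itself.
      std::shared_ptr<Connection> old = it->second;
      providers.erase(it);
      old->close(ClosedBy::LOCAL);
    }

    auto connection = std::make_shared<Connection>();
    Connection* raw = connection.get();

    connection->onClosed = [this, id, raw](ClosedBy who) {
      // Only clean up when the remote side closed the connection.
      if (who != ClosedBy::REMOTE) {
        return;
      }

      // Ignore closes of connections that were already replaced.
      auto found = providers.find(id);
      if (found != providers.end() && found->second.get() == raw) {
        providers.erase(found);
        ++removals;
      }
    };

    providers[id] = connection;
    return connection;
  }
};
```

In this sketch, subscribing twice with the same ID leaves exactly one entry in `providers` and does not run the cleanup, while a later `close(ClosedBy::REMOTE)` on the newer connection removes the entry.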

