mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 69338: Change agent to not report disconnected resource providers.
Date Thu, 13 Dec 2018 13:51:18 GMT


> On Dec. 7, 2018, 3:28 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Line 7595 (original), 7598 (patched)
> > <https://reviews.apache.org/r/69338/diff/1/?file=2107898#file2107898line7598>
> >
> >     It seems to me that the compiler should be able to hoist `message.mutable_resource_providers()`
outside the loop so there is no need to do it manually. But I'm fine with this change.

We want to set this field even if there are no `resourceProviders` to iterate over. I'll add
a comment.


> On Dec. 7, 2018, 3:28 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 7873-7874 (original), 7875-7876 (patched)
> > <https://reviews.apache.org/r/69338/diff/1/?file=2107898#file2107898line7876>
> >
> >     These comments need to be updated.
> >     
> >     I was wondering now that, since we already have this empty resource logic here,
why do we still see stale resource in the master? Is it because that in the case of agent
restart, no `DISCONNECT` message is triggered, and thus we never clean up the resources?
> >     
> >     It seems to me that we have the original logic in place for a reason. Is it
because we want the master to cache the resource provider infos without any resource for a
reason? If yes, we may want to revisit  this and think about if we want to erase it, or keep
the original behavior but just clean up the resources/operations in the master.

Fixed the comment.

This only handles the case were we successfully transported a `DISCONNECT` even to the master,
but if e.g., the slave never observed it (e.g., RP doesn't come back after agent failover)
we wouldn't send this.

AFAIR we added the caching behavior to the master because we thought it would be useful for
clients to have a single point where they could get all RP-related information. I don't think
that turned out to be very useful, and also always violated the encapsulation we had in mind
for LRPs. I'd suggest we carry out the agent-related changes in this patch and let the master
decide independently what to GC and what not to GC, e.g., in https://reviews.apache.org/r/69337/.

Dropping for now; feel free to reopen.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69338/#review211106
-----------------------------------------------------------


On Dec. 13, 2018, 2:51 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69338/
> -----------------------------------------------------------
> 
> (Updated Dec. 13, 2018, 2:51 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Bugs: MESOS-9384
>     https://issues.apache.org/jira/browse/MESOS-9384
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch changes the agent to not report disconnected resource
> providers in the `UpdateSlaveMessage` it emits. We also fix how it
> generates resource provider-related updates when no resource providers
> are connected to allow the master to react properly when a resource
> provider disappears.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 8467a60ad84a4e900c2eb587fdf2fc6ff9c72c54 
>   src/tests/resource_provider_manager_tests.cpp b61c50f839b7a0f49a781a6f44aa4f10ad7ebafe

> 
> 
> Diff: https://reviews.apache.org/r/69338/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message