mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 61183: Triggered 'UpdateSlaveMessage' when 'ResourceProviderManager' updates.
Date Thu, 07 Sep 2017 15:13:18 GMT


> On Sept. 7, 2017, 1:32 a.m., Jie Yu wrote:
> > src/slave/slave.hpp
> > Lines 658 (patched)
> > <https://reviews.apache.org/r/61183/diff/5/?file=1816402#file1816402line658>
> >
> >     I am not sure whether keeping another field in the agent just for resources
> >     provided by resource providers is a good idea.
> >     
> >     I'd much prefer we keep a single `totalResources` for regular resources (whether
> >     provided by a resource provider or not), and `oversubscribedResources` only for
> >     oversubscribed resources.

The advantage of a dedicated member is that it lets us detect whether any resources
came from resource providers. With that we can avoid sending `UpdateSlaveMessage`s,
e.g., on reregistration when no resource providers have registered (I am not sure how
to do this without storing at least some data).

I'd suggest introducing this member as it captures everything we need.
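
A minimal sketch of why a dedicated member helps, using simplified stand-in types rather than the actual Mesos classes. `AgentState`, `hasProviderResources`, and the method names are hypothetical, and `Resources` is reduced to a plain map; the real patch may be structured differently (in Mesos an `Option<Resources>` would probably be the more natural choice than a flag).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for Mesos' `Resources`: resource name -> amount.
typedef std::map<std::string, double> Resources;

// Hypothetical, heavily simplified agent state; not the real `Slave` class.
struct AgentState
{
  AgentState() : hasProviderResources(false) {}

  // Regular agent resources.
  Resources totalResources;

  // Dedicated member: only meaningful once a resource provider has
  // reported resources, which is tracked by `hasProviderResources`.
  Resources resourceProviderResources;
  bool hasProviderResources;

  // Called when the resource provider manager reports a new total.
  void updateFromResourceProviders(const Resources& resources)
  {
    resourceProviderResources = resources;
    hasProviderResources = true;
  }

  // On reregistration an update is only needed if some resource
  // provider has reported resources at least once.
  bool shouldSendUpdateOnReregistration() const
  {
    return hasProviderResources;
  }

  // Accessor that makes the "must have been set" precondition explicit.
  const Resources& providerResources() const
  {
    assert(hasProviderResources);
    return resourceProviderResources;
  }
};
```

With a single merged `totalResources`, the agent could not tell these two situations apart without some extra bookkeeping.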


> On Sept. 7, 2017, 1:32 a.m., Jie Yu wrote:
> > src/slave/slave.cpp
> > Lines 1361 (patched)
> > <https://reviews.apache.org/r/61183/diff/5/?file=1816403#file1816403line1361>
> >
> >     is it possible that agent is terminating?

This was the result of a bad copy-paste.

I removed the `CHECK` here and below since we can send the message in any state,
as is already done for oversubscribed resources.


> On Sept. 7, 2017, 1:32 a.m., Jie Yu wrote:
> > src/slave/slave.cpp
> > Lines 1370 (patched)
> > <https://reviews.apache.org/r/61183/diff/5/?file=1816403#file1816403line1370>
> >
> >     What about checkpointed resources? I think we need to use `totalResources` (with
> >     checkpointed resources applied) here.
> >     
> >     Also, I checked the master handler for `UpdateSlaveMessage`. Should we rescind
> >     offers too (like the oversubscription case)? Add a TODO there?

Fixed the code to use the correct resources now.

Regarding the master's handling of `UpdateSlaveMessage` with `TOTAL`: it makes sense to rescind these offers
as well. I pushed a minimal implementation, https://reviews.apache.org/r/62158/.
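
As a rough illustration of the rescinding idea only, and not of the code in the linked review: the sketch below assumes a hypothetical, simplified `Master` that tracks outstanding offer IDs per agent and rescinds all of them when an agent's total changes. All names and the reduction of the total to a single CPU count are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified master bookkeeping; not the real Mesos master.
struct Master
{
  // Agent ID -> IDs of offers currently outstanding for that agent.
  std::map<std::string, std::vector<std::string> > offers;

  // Agent ID -> total CPUs (stand-in for the agent's total resources).
  std::map<std::string, double> totalCpus;

  // IDs of offers that have been rescinded, in order.
  std::vector<std::string> rescinded;

  // Handles an update of an agent's total: outstanding offers were
  // computed against the old total, so they are all rescinded.
  void updateTotal(const std::string& agentId, double cpus)
  {
    assert(cpus >= 0.0);
    totalCpus[agentId] = cpus;

    std::vector<std::string>& outstanding = offers[agentId];
    rescinded.insert(rescinded.end(), outstanding.begin(), outstanding.end());
    outstanding.clear();
  }
};
```

Rescinding every outstanding offer is the simplest conservative choice; a finer-grained version could rescind only the offers affected by the change.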


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61183/#review184751
-----------------------------------------------------------


On Sept. 7, 2017, 5:13 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61183/
> -----------------------------------------------------------
> 
> (Updated Sept. 7, 2017, 5:13 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Jan Schlicht.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The agent's resource provider manager sends a
> 'ResourceProviderMessage' when its managed resources change. This
> commit adds handling in the agent so that an 'UpdateSlaveMessage' is
> sent to the master to update the total resources available on the
> agent. We also store this total in the agent memory so that it can be
> resent on agent resubscription.
> 
> In order to provide push-like handling of the resource provider
> manager's message queue, we chain recursive calls to the handler for
> continuous processing. Initially, processing is kicked off from
> 'Slave::initialize'. This simple implementation does not yet provide a
> direct way to stop processing of messages, but that can be achieved,
> e.g., by replacing the manager with a new instance (this would also
> require updating routes).
> 
> Since the agent can only send an 'UpdateSlaveMessage' when it is
> registered with a master, a simple back-off of 5 s is implemented which
> will defer processing of a ready message should the agent not yet have
> registered.
> 
> To facilitate logging we add a stringification function for
> 'ResourceProviderMessage's.
> 
> 
> Diffs
> -----
> 
>   src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
>   src/slave/slave.hpp 7d07868451e93d34ba694d40216c1e4036fd4094 
>   src/slave/slave.cpp 6d1516a5d5b5db684f79385e60d892ff75fd00fd 
>   src/tests/slave_tests.cpp 1bdadce4c50cbff958f2be2a4261e130b414acfd 
> 
> 
> Diff: https://reviews.apache.org/r/61183/diff/6/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
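
The chained handling and the 5 s back-off from the description quoted above can be sketched as follows. This is a self-contained toy model and not the code under review: `Loop` is a hypothetical single-threaded event loop with virtual time standing in for libprocess, `Agent` is a stand-in for the agent, and strings stand in for both the manager's messages and the resulting `UpdateSlaveMessage`s. Unlike the real handler, which would wait for the next message, the chain here simply ends when the queue is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event loop with virtual time, measured in seconds.
class Loop
{
public:
  Loop() : now(0) {}

  // Schedules `f` to run `seconds` after the current virtual time.
  void delay(int seconds, std::function<void()> f)
  {
    assert(seconds >= 0);
    timers.emplace_back(now + seconds, std::move(f));
  }

  // Runs timers in order of due time until none are left.
  void run()
  {
    while (!timers.empty()) {
      std::size_t next = 0;
      for (std::size_t i = 1; i < timers.size(); ++i) {
        if (timers[i].first < timers[next].first) {
          next = i;
        }
      }

      // Remove the timer before invoking it, as it may schedule more.
      std::pair<int, std::function<void()> > timer = std::move(timers[next]);
      timers.erase(timers.begin() + next);

      now = timer.first;
      timer.second();
    }
  }

  int now;

private:
  std::vector<std::pair<int, std::function<void()> > > timers;
};

// Hypothetical name for the 5 s back-off mentioned in the description.
const int REGISTRATION_BACKOFF_SECONDS = 5;

// Hypothetical, simplified agent.
struct Agent
{
  explicit Agent(Loop& _loop) : loop(_loop), registered(false) {}

  // Handles one message and then chains a call to itself.
  void handleResourceProviderMessage()
  {
    if (messages.empty()) {
      return; // Toy model only: nothing left to process.
    }

    if (!registered) {
      // Not registered yet: keep the message and retry later.
      loop.delay(REGISTRATION_BACKOFF_SECONDS,
                 [this]() { handleResourceProviderMessage(); });
      return;
    }

    sent.push_back(messages.front());
    messages.pop_front();

    // Chain the next round of processing.
    loop.delay(0, [this]() { handleResourceProviderMessage(); });
  }

  Loop& loop;
  bool registered;
  std::deque<std::string> messages; // Stand-in for the manager's queue.
  std::vector<std::string> sent;    // Stand-in for messages to the master.
};
```

Because a deferred message stays at the front of the queue, ordering is preserved across back-offs.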

