mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 69876: Removed operations from master state when an agent is downgraded.
Date Tue, 12 Feb 2019 21:43:41 GMT


> On Feb. 5, 2019, 12:28 a.m., Gastón Kleiman wrote:
> > src/tests/master_tests.cpp
> > Lines 9419 (patched)
> > <https://reviews.apache.org/r/69876/diff/1/?file=2123554#file2123554line9419>
> >
> >     We should consider making the agent not recover the operation status update
> >     manager if it isn't started with the `AGENT_OPERATION_FEEDBACK` capability.
> 
> Gastón Kleiman wrote:
>     If we don't, we should not drop this message and make sure that the framework can
>     acknowledge the update, so that the agent stops resending it.
> 
> Greg Mann wrote:
>     I created a ticket to track this work: https://issues.apache.org/jira/browse/MESOS-9561
> 
> Greg Mann wrote:
>     Unfortunately I need to remove this test entirely since I'm making the `AGENT_OPERATION_FEEDBACK`
>     capability required for agent startup in https://reviews.apache.org/r/69958/

To address the original comment: as we discussed offline, it seems reasonable to let the agent
recover the operation status update manager (SUM) even when it is started without the new
capability, since this allows it to keep sending updates for operations submitted while the
capability was enabled. The master will simply refuse to forward to the agent any future
operations on agent default resources that request feedback.
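
For illustration only, here is a minimal, self-contained sketch of the two master-side
decisions discussed in this review: refusing to forward feedback-requesting operations on
agent default resources to an agent without the capability, and dropping
terminal-but-unACKed operations on agent default resources after a downgrade. It is not
the code from this patch. The `Operation` struct, its fields, and both function names are
hypothetical simplifications, and the convention that an empty `resourceProviderId` means
"agent default resources" is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the master's view of an operation.
// None of these names are taken from the Mesos source.
struct Operation {
  std::string id;
  bool terminal = false;          // Operation reached a terminal status.
  bool acknowledged = false;      // Framework ACKed the terminal status update.
  bool requestsFeedback = false;  // Framework asked for operation status updates.
  std::string resourceProviderId; // Assumption: empty means agent default resources.
};

bool onAgentDefaultResources(const Operation& op) {
  return op.resourceProviderId.empty();
}

// Returns false only for operations that request feedback on agent default
// resources while the agent lacks the feedback capability.
bool shouldForward(const Operation& op, bool agentHasFeedbackCapability) {
  if (agentHasFeedbackCapability) {
    return true;
  }
  return !(op.requestsFeedback && onAgentDefaultResources(op));
}

// If the agent lacks the feedback capability, erases terminal-but-unACKed
// operations on agent default resources and returns how many were removed.
std::size_t removeStaleOperations(
    std::vector<Operation>& operations,
    bool agentHasFeedbackCapability) {
  if (agentHasFeedbackCapability) {
    return 0;
  }

  const std::size_t before = operations.size();

  operations.erase(
      std::remove_if(
          operations.begin(),
          operations.end(),
          [](const Operation& op) {
            return op.terminal && !op.acknowledged &&
                   onAgentDefaultResources(op);
          }),
      operations.end());

  return before - operations.size();
}
```

In this sketch, operations that belong to a resource provider are left untouched in both
functions, since the description in this review only concerns agent default resources.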


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69876/#review212538
-----------------------------------------------------------


On Feb. 12, 2019, 9:42 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69876/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2019, 9:42 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Chun-Hung Hsiao, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9535
>     https://issues.apache.org/jira/browse/MESOS-9535
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When an agent is downgraded from one with the AGENT_OPERATION_FEEDBACK
> capability to one without this capability, the master needs to remove
> terminal-but-unACKed operations from its state which operate on agent
> default resources, since the downgraded agent will not resend status
> updates for these operations.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp cf2210ec26642028d5e4fb7fc1841eb0a1ed3396 
> 
> 
> Diff: https://reviews.apache.org/r/69876/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> `bin/mesos-tests.sh --gtest_filter="*CleanupOperationsAfterAgentDowngrade*" --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

