mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gaston Kleiman <gas...@mesosphere.io>
Subject Re: Review Request 65695: Made the default executor allow schedulers to retry task kills.
Date Tue, 20 Mar 2018 21:34:18 GMT


> On March 20, 2018, 5:53 a.m., Alexander Rukletsov wrote:
> > A high level question I have is the semantics of a task kill. You retry `SIGKILL`
but not `SIGTERM`. Why is it so? Do you expect frameworks to retry kills at some point of
time (which should be less that the kill policy timeout to make sense) if they do not receive
a terminal status update? Since the escalation timer is started anyway, is it correct to transition
the container back to `killing=false`? This can also lead to multiple `SIGKILL` requests going
to the agent for the same container, which is likely not a big deal, but maybe is worth checking?
> > 
> > My suggestion would be to retry everything ourselves and do not expect frameworks
to do this for us. When we faced a similar problem with the docker executor, we decided that
once the task has been transitioned to `killed`, there is no way back, even if the kill attempt
failed.

Yes, I expect schedulers to retry kills at some point. Some schedulers already do this, i.e.,
Marathon.

A problem with the approach is that Mesos doesn't send any kind of signal when a kill fails.
So probably the easiest way for a scheduler to detect a KILL failure is to check if a task
is still in `TASK_KILLING` after the grace period elapses. This means that the retries will
probably happen after the kill grace period lapsed.

Given that the escalation is not cancelled when a kill call fails, it is possible for the
task to get a `SIGKILL` and no `SIGTERM`. We could fix this by only scheduling the escalation
once the `SIGTERM` kill call succeeds.

We definitely want the default executor to retry kills that it initiates (see [MESOS-8532](https://issues.apache.org/jira/browse/MESOS-8532]).
I am not sure if the default executor should also retry kills initiated by a scheduler. It
could be better to let it somehow signal the scheduler that the kill request failed and then
to let the scheduler decide when to retry the kill, also giving it the chance to adjust the
kill policy.


> On March 20, 2018, 5:53 a.m., Alexander Rukletsov wrote:
> > src/launcher/default_executor.cpp
> > Lines 1242-1243 (original)
> > <https://reviews.apache.org/r/65695/diff/3/?file=1978910#file1978910line1242>
> >
> >     Is this change related to your chain? Or does this fix a different issue with
the broken invariant?

This method can be called by `taskHealthUpdated()` when the executor is not subscribed, so
it is wrong to assume that the executor will always be subscribed.

Furthermore, the method doesn't require the executor to be subscribed, because it triggers
a `KILL_NESTED_CONTAINER` call, which is part of the agent API and not of the executor API.

After having talked to Anand, I think that when this was added, the following was assumed:

1) That the executor will be unsubscribed only for brief periods of time, mostly during agent
recovery, when `KILL_NESTED_CONTAINER` calls will most likely fail. Crashing the executor
leads to the destruction of all the child containers, so it was a way of working around `KILL_NESTED_CONTAINER`
failures.
2) At that time the default executor supported launching only one task group. Assuming this,
killing a task would eventually trigger the destruction of all the child containers, which
will also happen (in a less graceful) when the executor crashes. This made it acceptable to
fail this CHECK.

If the executor detects kill failures and allows the scheduler to retry or retries by itself,
we don't need this check as a workaround, addressing #1.

The default executor now supports running multiple task groups, which makes assumption #2
invalid and dangerous. If this CHECK fails, all the containers started by the executor will
be killed, regardles of which task group they belong to. This is bad and could lead to data
loss.


This patch addresses all of this by removing the CHECK, making the executor detect kill failures,
and allowing schedulers to retry kills.


- Gaston


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65695/#review199540
-----------------------------------------------------------


On March 20, 2018, 1:50 p.m., Gaston Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65695/
> -----------------------------------------------------------
> 
> (Updated March 20, 2018, 1:50 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
>     https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The default executor transitions a task to `TASK_KILLING` and marks its
> child container as being killed before posting a `KILL` call to the
> agent.
> 
> The executor ignores kill requests for containers that are marked as
> being killed, and it doesn't remove this mark if the `KILL` call fails.
> This means that it's possible for tasks to get stuck in a `TASK_KILLING`
> state.
> 
> This patch makes the default executor remove the killing mark if a
> `KILL` call fails. That way a scheduler can retry a kill.
> 
> 
> Diffs
> -----
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/65695/diff/4/
> 
> 
> Testing
> -------
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message