mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jiang Yan Xu <...@jxu.me>
Subject Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.
Date Wed, 10 Jan 2018 06:52:32 GMT


> On Jan. 4, 2018, 5:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp
> > Lines 10037-10056 (original), 10039-10062 (patched)
> > <https://reviews.apache.org/r/64940/diff/1/?file=1930130#file1930130line10039>
> >
> >     I think we shouldn't create a TASK_UNREACHABLE status update and call `updateTask`
or `forward` for a terminal task at all. . Also, `forward` sends TASK_UNREACHABLE update for
terminal task to the framework which looks incorrect.
> >     
> >     
> >     Ideally, we want terminal but unacknowledged tasks to still be marked unreachable
in some way, either via task state being TASK_UNREACHABLE or task being present in `unreachableTasks`.
This allows, for example, the WebUI to not show sandbox links for unreachable tasks irrespective
of whether they were terminal or not before going unreachable. 
> >     
> >     But doing this is tricky for various reasons:
> >     
> >     --> `updateTask()` doesn't allow a terminal state to be transitioned to TASK_UNREACHABLE.
Right now when we call `updateTask` for a terminal task, it adds TASK_UNREACHABLE status to
`Task.statuses` and also sends it to operator API stream subscribers which looks incorrect.
The fact that `updateTask` internally deals with already terminal tasks is a bad design decision
in retrospect. I think the callers shouldn't call it for terminal tasks instead.
> >     
> >     --> It's not clear to our users what a `completed` task means. The intention
was for this to hold a cache of terminal and acknowledged tasks for storing recent history.
The users of the WebUI probably equate "Completed Tasks" to terminal tasks irrespective of
their acknowledgement status, which is why it is confusing for them to see terminal but unacknowledged
tasks in the "Active tasks" section in the WebUI.
> >     
> >     --> When a framework reconciles the state of a task on an unreachable agent,
master replies with TASK_UNREACHABLE irrespective of whether the task was in a non-terminal
state or terminal but un-acknowledged state or terminal and acknowledged state when the agent
went unreachable.  
> >     
> >     I think the direction we want to go towards is
> >     
> >     --> Completed tasks should consist of terminal unacknowledged and terminal
acknowled tasks, likely in two different data structures.
> >     --> Unreachable tasks should consist of all non-complete tasks on an unreachable
agent.  All the tasks in this map should be in TASK_UNREACHABLE state.
> >     
> >     
> >     Given all the above is a very involved change, I would recommend keeping what
you have here but with a giant TODO (your current comment in #10058 doesn't go into enough
detail about the complexity here) that talks about the above stuff. Your change at least keeps
the parity with the (broken) semantics that we have in 1.4 and earlier so that's a bit better.
> 
> Vinod Kone wrote:
>     Ignore the first line. Forgot to delete it.
> 
> Jiang Yan Xu wrote:
>     Future direction
>     
>     1. If completed == terminal unacknowledged + terminal acknowledged, then completed
== terminal right? Should we then unify the terminology and pick one?
>     2. Unreachable tasks == non-terminal tasks on an unreachable agent: this is what
this RR is going to do but IIUC you want a different behavior.
>     
>     Current semantics
>     
>     1. In 1.4 the the master (in `updateTask` sends `TASK_UNREACHABLE` to the operator
API subsribers for terminal tasks), as it stands right now we are going to send `TASK_UNREACHABLE`
to the schedulers as well. Should we change that?
>     2. You also said above that "Ideally, we want terminal but unacknowledged tasks to
still be marked unreachable in some way" which seems to contradict your later point that "Unreachable
tasks should consist of all non-complete (terminal) tasks", could you clarify?
>     
>     Overall it sounds to me that the most correct semantic is still to set `TASK_UNREACHABLE`
only for the tasks that are non-terminal (because otherwise we know that the state is not
going to change to something else that we don't know yet) but perhaps we can use another field
in the status to signal the fact that the agent is partitioned?
> 
> James Peach wrote:
>     https://issues.apache.org/jira/browse/MESOS-8405
> 
> Vinod Kone wrote:
>     Sorry for the confusion. In the future direction section it should be "Unreachable
tasks should consist of all tasks on an unreachable agent.". Note that this is what is returned
when a framework does explicit reconciliation for example. I guess this means a terminal task
can go to UNREACHABLE and back depending on the state of the agent, which I agree is also
a bit weird. Basically the TASK_UNREACHABLE state is being used as a proxy for the agent state,
it's not really task state. I think that's source of the confusion. In retrospect, maybe the
unreachability of an agent should've been indicated to a framework via a different signal
(say another field in the status update) than task state. 
>     
>     
>     As an aside, I think we want to send TASK_UNREACHABLE updates for unreachable tasks
during implicit reconciliation; I think we overlooked sending them.

> unreachability of an agent should've been indicated to a framework via a different signal
(say another field in the status update) than task state. 

Agreed. I was suggesting something similar above as well.

So, coming back to the behavior we want for 1.5 for unacknowledged terminal tasks when the
agent is partitioned:

- Scheduler and operator subscribers are sent TASK_UNREACHABLE updates (we are doing this
right now but we need to clarify the differences: I see Vinod already pointed this out).
- **I missed this earlier**: Unacknowledged terminal tasks need to be put in `framework.unreachableTasks`
so they are not put in `framework.completedTasks` and are removed from this map when the agent
comes back. The current version of this RR (and the 1.4 version) would've put these tasks
into `framework.completedTasks` when parititioning happens and they are not removed from `framework.completedTasks`
when the agent comes back while also being added back to `framework.tasks` (duplication occurs).
So I think we probably need to maintain the behavior of the current master version (terminal
but unacknowlwedged tasks are put to `framework.unreachableTasks` with the state `TASK_UNREACHABLE`)
but remove the checks that fail.
- This way, during parition whe webUI would show these terminal tasks in `TASK_UNREACHABLE`
state and the operator API calls (not subscribers) would get the tasks in the `unreachable_tasks`
section (otherwise they would've showed up in the `completed_tasks` section which would seem
worse in the current semantics.

Does it make sense?


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194794
-----------------------------------------------------------


On Jan. 5, 2018, 11:37 a.m., James Peach wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2018, 11:37 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod Kone, and Jiang
Yan Xu.
> 
> 
> Bugs: MESOS-8337
>     https://issues.apache.org/jira/browse/MESOS-8337
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> If an agent is lost, we try to remove all the tasks that might
> have been lost. However, if a task is already terminal, it hasn't
> really been lost so we should not be tracking it in the framework's
> unreachable tasks list.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
>   src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
>   src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 
> 
> 
> Diff: https://reviews.apache.org/r/64940/diff/5/
> 
> 
> Testing
> -------
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message