mesos-reviews mailing list archives

From Anand Mazumdar <an...@apache.org>
Subject Re: Review Request 51477: Implemented `RunTaskGroupMessage` handler on the agent.
Date Wed, 07 Sep 2016 22:10:49 GMT


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, lines 3120-3135
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line3120>
> >
> >     Can you please add more detail and update the comments here to explain which case
> >     causes `executor->queuedTasks` and `executor->queuedTaskGroups` to have the same taskId?

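We currently store the tasks of a queued task group in `executor->queuedTasks` as well, so the
same taskId can appear in both `executor->queuedTasks` and `executor->queuedTaskGroups`.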
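
To illustrate the double bookkeeping, here is a minimal sketch. It uses simplified, hypothetical stand-ins (`TaskID` as a string, plain `Executor` and `TaskGroup` structs, and the helpers `queueTaskGroup()`/`killQueuedTask()`), not the actual definitions in `src/slave/slave.hpp`. It only shows why a kill has to check the queued task groups before falling back to the individual queued task.

```cpp
#include <algorithm>
#include <list>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
using TaskID = std::string;

struct TaskGroup {
  std::vector<TaskID> tasks;
};

struct Executor {
  std::set<TaskID> queuedTasks;           // every queued task, grouped or not
  std::list<TaskGroup> queuedTaskGroups;  // groups waiting to be launched
};

// Queue a task group: its tasks are recorded in BOTH containers.
void queueTaskGroup(Executor& executor, const TaskGroup& group) {
  for (const TaskID& id : group.tasks) {
    executor.queuedTasks.insert(id);
  }
  executor.queuedTaskGroups.push_back(group);
}

// Kill a queued task and return every task ID that was removed.
// If the task belongs to a queued group, the whole group is removed.
std::vector<TaskID> killQueuedTask(Executor& executor, const TaskID& taskId) {
  for (auto it = executor.queuedTaskGroups.begin();
       it != executor.queuedTaskGroups.end(); ++it) {
    if (std::find(it->tasks.begin(), it->tasks.end(), taskId) !=
        it->tasks.end()) {
      std::vector<TaskID> killed = it->tasks;  // copy before erasing the group
      for (const TaskID& id : killed) {
        executor.queuedTasks.erase(id);
      }
      executor.queuedTaskGroups.erase(it);
      return killed;
    }
  }

  // Not part of any group: remove just this task, if it was queued.
  std::vector<TaskID> killed;
  if (executor.queuedTasks.erase(taskId) > 0) {
    killed.push_back(taskId);
  }
  return killed;
}
```

In this sketch, killing one task of a two-task group returns both task IDs, while a standalone queued task is removed on its own.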


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, line 2391
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line2391>
> >
> >     Just a question: here we are killing a task in a task group, which effectively kills
> >     the whole task group, but the status is still `TASK_KILLED`. Do we need to introduce
> >     a new `TASKGROUP_KILLED` for this?
> >     
> >     Ditto for other places.

We might consider doing it later, but for now this is consistent with the behavior
on the Master.


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, lines 1988-1989
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1988>
> >
> >     What about adding `executor state` to the log message?

Why? Also, we weren't logging the state before either.


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, line 1711
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1711>
> >
> >     s/for/to

Why? This looks fine to me.


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.hpp, line 1027
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492260#file1492260line1027>
> >
> >     How about const?

Can you elaborate? Non-member functions can't be marked `const`. Or did you mean `const`
on the return type?
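
For reference, a minimal sketch of the distinction. The `Framework` struct and the `describe()` function are hypothetical names used only for illustration; they are not the declaration at `slave.hpp` line 1027.

```cpp
#include <string>

// Hypothetical type for illustration only.
struct Framework {
  std::string name;

  // Trailing `const` is allowed here: it promises that this member
  // function does not modify `*this`. The `const` before the return
  // type is separate and qualifies the returned reference.
  const std::string& getName() const { return name; }
};

// A non-member function has no `this`, so a trailing `const` is ill-formed:
//
//   std::string describe(const Framework& framework) const;  // does not compile
//
// Constness is expressed on the parameter (or the return type) instead.
std::string describe(const Framework& framework) {
  return "framework " + framework.getName();
}
```

Because `getName()` is a `const` member function, it can be called through the `const Framework&` parameter of `describe()`.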


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, line 1748
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1748>
> >
> >     What about enhancing the reason to `Task killed before it was launched due to
> >     one task killed in the task group`?

We haven't been doing such fine-grained error messaging on the Master either. We might consider
doing it in the future. I would file a JIRA for posterity.


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, lines 1825-1847
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1825>
> >
> >     I think the reason is not correct for other tasks in the task group.
> >     
> >     What about sending out the TASK_LOST reasons separately, using the following logic:
> >     
> >     if checkpoint failure:
> >       task lost with reason as checkpoint failure
> >       
> >     if kill:
> >       foreach task other than the checkpoint failure task:
> >          task lost with reason as one task failure causing the whole task group to fail

See above comment.


> On Sept. 7, 2016, 10:17 a.m., Guangya Liu wrote:
> > src/slave/slave.cpp, lines 1866-1887
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1866>
> >
> >     ditto as above for updating `reasons` separately

See above comment.


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51477/#review147994
-----------------------------------------------------------


On Sept. 6, 2016, 9:25 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51477/
> -----------------------------------------------------------
> 
> (Updated Sept. 6, 2016, 9:25 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6076
>     https://issues.apache.org/jira/browse/MESOS-6076
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This change implements the `runTaskGroup()` handler on the
> agent, ensuring that the task group is sent atomically to the executor
> via the `LAUNCH_GROUP` event. It also refactors `runTask()`/`_runTask()`
> to go through a common handler function. Also, it ensures that killing
> any task in `framework->pending`/`queuedTasks` before the task group
> is run results in all the tasks in the group being killed.
> 
> Review: https://reviews.apache.org/r/51477/
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 4add4c0180ea56039e0d5009bad4d9346128bde6 
>   src/slave/slave.cpp 11664779ed78c0a5913598bb7dd1bb0e793d6b93 
> 
> Diff: https://reviews.apache.org/r/51477/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>

