mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anand Mazumdar <an...@apache.org>
Subject Re: Review Request 51477: Implemented `RunTaskGroupMessage` handler on the agent.
Date Wed, 07 Sep 2016 22:10:41 GMT


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1616
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1616>
> >
> >     couldn't this have been?
> >     
> >     ```
> >     for (auto it = tasks.begin(); it != tasks.end(); ++it)
> >     ```

Was trying to be consistent with the loop a couple of lines before. Would file another CR
to clean it up too! https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1562


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1711
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1711>
> >
> >     you say "other" tasks but you send TASK_KILLED for "all" tasks in the group.
can you remind me why we can't send TASK_KILLED for the killed task in `killTask()` right
away and send TASK_KILLED for the rest of the tasks in the group here, similar to what we
do in `Master::_accept()`?

I don't know _yet_ the reasoning for us doing this in `Master::_accept()`. It is pretty non-intuitive
and leads to two code paths where we send the status update rather then just doing them all
in one place as the current code in this CR does. 

Also, there is a relevant `TODO` in the code that wants to do the opposite in the Master!

```// TODO(bmahler): Do this killing when processing
        // the `Kill` call, rather than doing it here.
```

I don't think we even have enough information in `Master::killTask()` to do this unless we
start storing task group information on the Master. I would change the `TODO` in the master
to reflect this i.e. send all the status updates in `_accept()` similar to what we are doing
on the agent.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1752
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1752>
> >
> >     back ticks instead of quotes.

Other places in this function use quotes for this comment. We might want to do a sweep later.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1903
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1903>
> >
> >     back ticks.

It seems weird that we have a quote and a back tick in the same comment! We might want to
do a clean sweep later.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.hpp, lines 327-329
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492260#file1492260line327>
> >
> >     kill this comment.

Would do in a separate CR.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 1712
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line1712>
> >
> >     if you follow `Master::_accept()` this should be a hashset of TaskIDs.

See earlier comment.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 2182
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line2182>
> >
> >     but `killTask` is not sending a TASK_KILLED update anymore?

It does so still for queued tasks.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 2332-2334
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line2332>
> >
> >     s/bigger//
> >     
> >     can you remind me why we don't send TASK_KILLED here? AFAICT `_run()` has information
about the task group based on the argument passed to it, even if `framework->pending` doesn't
itself have that info. if there is no good reason, lets make this consistent with how we do
it in the master.

See earlier comment.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 3137-3159
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line3137>
> >
> >     it's unfortunate that you have to make 2 duplicate containerizer->update
calls here. is there any reason why `__run` work with both tasks and taskGroups arguments
set?

Good Suggestion. The only reason I made `__run()` to not work with both tasks and task groups
was to maintain consistency with `run()/_run()`. Looks like not doing that would allow us
to get rid of this redundant code.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 2115
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line2115>
> >
> >     i think you need to also insert a space after each task group? it's probably
best to store taskgroups as a vector of vectors and call stringify.

I modified it a bit to have a `,` after every task group.


> On Sept. 7, 2016, 10:26 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 3123-3126
> > <https://reviews.apache.org/r/51477/diff/3/?file=1492261#file1492261line3123>
> >
> >     you couldn't use the default assignment operator for LinkedHashMap?

hmm, seems like I ran into some issues with using the default assignment operator. It seems
that we store the iterator in the values array too and we would copy those pointers with the
assignment operator? I did not have the time to investigate but I would follow up with a JIRA
later.


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51477/#review147990
-----------------------------------------------------------


On Sept. 6, 2016, 9:25 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51477/
> -----------------------------------------------------------
> 
> (Updated Sept. 6, 2016, 9:25 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6076
>     https://issues.apache.org/jira/browse/MESOS-6076
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This changes implements the `runTaskGroup()` handler on the
> agent ensuring that task group is sent atomically to the executor
> via the `LAUNCH_GROUP` event. It also refactors `runTask()`/`_runTask()`
> to go through a common handler function. Also, it ensures that all
> tasks in `framework->pending`/`queuedTasks` that are killed before
> running the task group result in all the tasks being killed.
> 
> Review: https://reviews.apache.org/r/51477/
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 4add4c0180ea56039e0d5009bad4d9346128bde6 
>   src/slave/slave.cpp 11664779ed78c0a5913598bb7dd1bb0e793d6b93 
> 
> Diff: https://reviews.apache.org/r/51477/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message