mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 71858: WIP: Set resource limits when launching executor container.
Date Wed, 15 Jan 2020 02:51:38 GMT


> On Jan. 7, 2020, 9:43 p.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3140-3201 (patched)
> > <https://reviews.apache.org/r/71858/diff/4/?file=2191535#file2191535line3140>
> >
> >     I think maybe this logic could be easier to read if we do something like:
> >     
> >     ```
> >     auto limitsAreSet = [](const vector<TaskInfo>& tasks) {
> >       foreach (const TaskInfo& task, tasks) {
> >         if (!task.limits().empty()) {
> >           return true;
> >         }
> >       }
> >       
> >       return false;
> >     };
> >     
> >     Option<Map<string, Value::Scalar>> executorLimits;
> >     if (limitsAreSet(tasks)) {
> >       executorLimits = Map<string, Value::Scalar>();
> >       foreach (const TaskInfo& task, tasks) {
> >         // Add up task limits/requests here.
> >       }
> >     }
> >     ```
> >     
> >     What do you think?
> 
> Qian Zhang wrote:
>     I am a bit confused how it will simplify the logic here, like: how will you do the
`Add up task limits/requests here`? I guess you still need the code from L3140 to L3201, right?
> 
> Greg Mann wrote:
>     Ah sorry, before I answer your question I have another one: currently, your code
will only set the executor's cpu limit if one or more tasks have a cpu limit set, and will
only set the executor's mem limit if one or more tasks have the memory limit set. However,
I think we also want to set the cpu limit if one or more tasks has a _memory_ limit set, and
we want to set the memory limit if one or more tasks has a _cpu_ limit set, right? This way,
if a single task under an executor sets either a cpu or memory limit, then all tasks will
have both the cpu and memory limit set (and if it wasn't specified for a particular task,
it will be set to the default, which is the value of the request).
> 
> Qian Zhang wrote:
>     So if a framework launches two tasks (t1 and t2) with a same executor, t1 has CPU
limit specified but no memory limit specified, t2 has both of the CPU and memory limits not
specified, then we should not only set CPU hard limit but also the memory hard limit (i.e.,
set it to t1's memory request + t2's memory request) in the executor container's cgroups,
right? I think we have already done it because executor's resource requests always includes
all task's resource requests (see L3209 in this patch https://reviews.apache.org/r/71858/diff/6
), and in the memory cgroups (`memory.cpp`) we will set the executor container's memory hard
limit (`memory.limit_in_bytes`) to its memory request if its memory limit is not specified
(see https://reviews.apache.org/r/71943/ for details).
>     
>     And similarly if t1 has memory limit specified but no CPU limit specified, in the
CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's
CPU request if `--cgroups_enable_cfs` is true.
> 
> Greg Mann wrote:
>     > I think we have already done it because executor's resource requests always
includes all task's resource requests (see L3209 in this patch https://reviews.apache.org/r/71858/diff/6
)
>     
>     That works when the executor is first launched, but will it be updated when additional
task groups are sent to the same executor?
>     
>     > And similarly if t1 has memory limit specified but no CPU limit specified, in
the CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's
CPU request if --cgroups_enable_cfs is true.
>     
>     If t1 has a memory limit specified, then don't we want to set the CFS quota regardless
of whether or not the '--cgroups_enable_cfs' flag is set?
> 
> Qian Zhang wrote:
>     > That works when the executor is first launched, but will it be updated when
additional task groups are sent to the same executor?
>     
>     Yes, see https://reviews.apache.org/r/71952/ for details.
>     
>     > If t1 has a memory limit specified, then don't we want to set the CFS quota
regardless of whether or not the '--cgroups_enable_cfs' flag is set?
>     
>     I do not think we want to do it. In that case, there is no CPU limit specified for
both t1 and t2 (which means user does not want CPU hard limit set) and CFS quota is disabled
via the `--cgroups_enable_cfs` flag, so why do we want to set CFS quota when user does not
want it?
>     
>     For memory limit, it is a bit different, the original behavior is we always set both
soft limit and hard limit to the memory allocated to the task (i.e. memory request) when launching
a container and there is no agent flag to control it. So to keep backward compatibility, we
want to set memory hard limit even there is no memory limit specified in the task (i.e., set
hard limit to memory request). And what we want to change here is, setting hard limit to a
value which is larger than soft limit if memory limit is specified in the task.

In the design doc, the table outlining different states of 'shared_cgroups' says that when
'shared_cgroups = false', we would always set the hard limits for both CPU and memory cgroups,
even if the user has not set limits. If we still go with that approach, then I think we would
ignore the `--cgroups_enable_cfs` flag in the above case of t1.

Is there a good reason we should pay attention to the `--cgroups_enable_cfs` flag when 'shared_cgroups
= false'? It seems intuitive to me that opting in to per-nested-container cgroups means that
hard limits will also be set for those cgroups, WDYT?


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219154
-----------------------------------------------------------


On Jan. 8, 2020, 2:40 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Jan. 8, 2020, 2:40 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> WIP: Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 77b5bc0082c6bb73fbd48a2ebe812629921645cb 
>   src/slave/slave.cpp 3839a120446339fea8aa857f431a2dba28ed4002 
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message