mesos-reviews mailing list archives

From Ian Downes <ian.dow...@gmail.com>
Subject Re: Review Request 59294: Optionally scale egress bandwidth with CPU.
Date Mon, 19 Jun 2017 19:43:00 GMT


> On May 17, 2017, 2:14 p.m., Jie Yu wrote:
> > src/slave/flags.cpp
> > Lines 770-786 (patched)
> > <https://reviews.apache.org/r/59294/diff/1/?file=1719990#file1719990line770>
> >
> >     This sounds like a heuristic. Is there any justification for this particular
> >     heuristic? I wonder whether a label-based solution would be better. For instance,
> >     the isolator could look for a special label on the task/executor. The label would
> >     specify an egress rate limit that overrides the default rate limit. Something
> >     along these lines?
> >     
> >     Then the custom logic can be injected into a label decorator, rather than
> >     making it first class here.
> 
> Ian Downes wrote:
>     It's not really a heuristic; it's a simple linear model with a min and max. The major
>     benefit is that it enables more effective allocation of a host's egress bandwidth
>     without exposing bandwidth as a resource. A fixed egress bandwidth allocates poorly
>     for either a small number of very large containers (underutilizing) or a large
>     number of small containers (overcommitting). Scaling with CPU means a large container
>     can get a larger share of the bandwidth.
>     
>     I thought about a label-based solution, but it doesn't work well with a heterogeneous
>     cluster. We have a mix of 1G and 10G hosts and we'd like to use a different
>     egress_rate_per_cpu depending on the link speed, e.g., 40 Mbps / core for 1G and
>     120 Mbps / core for 10G. The scheduler doesn't (and shouldn't) know the specifics of
>     hosts beyond resources, so unless we make bandwidth a first class resource I think
>     the logic should be at the isolator. Host bandwidth could be exposed via an agent
>     attribute but that's *really* breaking the resource abstraction.
> 
> Santhosh Kumar Shanmugham wrote:
>     To add to the point: scaling network bandwidth with CPU is typical in the cloud,
>     and we are just mirroring the same feature here.
>     
>     `Each core is subject to a 2 Gbits/second (Gbps) cap for peak performance. Each
>     additional core increases the network cap, up to a theoretical maximum of 16 Gbps
>     for each virtual machine; however, the actual performance you experience can vary
>     depending on your workload.`
>     
>     https://cloud.google.com/compute/docs/networks-and-firewalls
> 
> Jie Yu wrote:
>     > The scheduler doesn't (and shouldn't) know the specifics of hosts beyond resources
>     > so unless we make bandwidth a first class resource I think the logic should be at
>     > the isolator
>     
>     The scheduler is not the one that sets the label. The label decorator on the agent
>     will be the one doing that. Since the label decorator lives in the agent, it can
>     automatically detect the link speed and set the label properly.

The label decorator could work, but the slaveRunTaskLabelDecorator is only invoked on task
start. One of the key benefits of the posted code is that it updates existing containers
when this policy is introduced, and again whenever the config changes in the future, all
without restarting containers. We're going to deploy this code to large running clusters,
incrementally increasing the config values. Rolling the clusters at each increment is not
feasible, and we don't want out-of-band tooling reconfiguring containers separately.
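
For illustration, the linear model with min/max discussed above, and the idea of
recomputing limits for running containers after a config change, can be sketched as
follows. This is a minimal sketch and not the code from the review: the struct, field,
and function names are hypothetical, and the container-ID map is an assumed stand-in
for whatever state the isolator keeps.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-host settings; these are NOT the actual Mesos flag names.
struct EgressConfig {
  uint64_t ratePerCpuMbps;  // Slope of the linear model.
  uint64_t minRateMbps;     // Floor, so very small containers are not starved.
  uint64_t maxRateMbps;     // Ceiling, so very large containers are capped.
};

// Egress limit that is linear in the CPU allocation, clamped to
// [minRateMbps, maxRateMbps]. Fractional Mbps are truncated.
uint64_t egressRateMbps(double cpus, const EgressConfig& config) {
  const double nonNegativeCpus = std::max(cpus, 0.0);
  const uint64_t scaled =
      static_cast<uint64_t>(nonNegativeCpus * config.ratePerCpuMbps);
  return std::min(std::max(scaled, config.minRateMbps), config.maxRateMbps);
}

// Recompute the limit of every already-running container under a new config.
// Keys are hypothetical container IDs; no container restart is modeled here.
std::map<std::string, uint64_t> recomputeLimits(
    const std::map<std::string, double>& cpusByContainer,
    const EgressConfig& config) {
  std::map<std::string, uint64_t> limits;
  for (const auto& entry : cpusByContainer) {
    limits[entry.first] = egressRateMbps(entry.second, config);
  }
  return limits;
}
```

With the 40 Mbps / core figure mentioned for 1G hosts and an assumed floor of 20 Mbps
and ceiling of 800 Mbps, a 4-CPU container would get 160 Mbps, a 0.1-CPU container
would be raised to the 20 Mbps floor, and a 32-CPU container would be capped at
800 Mbps.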


- Ian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59294/#review175114
-----------------------------------------------------------


On May 26, 2017, 11:23 a.m., Ian Downes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59294/
> -----------------------------------------------------------
> 
> (Updated May 26, 2017, 11:23 a.m.)
> 
> 
> Review request for mesos, Dmitry Zhuk, Ilya Pronin, and Jie Yu.
> 
> 
> Bugs: MESOS-7508
>     https://issues.apache.org/jira/browse/MESOS-7508
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add support to isolators/port_mapping for optionally scaling egress bandwidth with CPU,
> with minimum and maximum limits.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/network/port_mapping.hpp 9d38289c7161d5e931053b587d115684ccc44c94
>   src/slave/containerizer/mesos/isolators/network/port_mapping.cpp cd008aaebcd42554a9a81d2b059269546f59c966
>   src/slave/flags.hpp b66995630f89dfb95a6d0cf66efc5d7590e90cbc
>   src/slave/flags.cpp 0c8276e425a6a7d22ee68edc6cc25b331635ec44
>   src/tests/containerizer/port_mapping_tests.cpp d062f2f6bcf7b44dbcde951cdca23b0a2cd42115
> 
> 
> Diff: https://reviews.apache.org/r/59294/diff/2/
> 
> 
> Testing
> -------
> 
> # added a new test 
> $ make check
> 
> 
> Thanks,
> 
> Ian Downes
> 
>

