mesos-reviews mailing list archives

From Ian Downes <ian.dow...@gmail.com>
Subject Re: Review Request 59294: Optionally scale egress bandwidth with CPU.
Date Wed, 24 May 2017 20:38:08 GMT


> On May 17, 2017, 2:14 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/port_mapping.cpp
> > Lines 586-588 (patched)
> > <https://reviews.apache.org/r/59294/diff/1/?file=1719988#file1719988line586>
> >
> >     Instead of shelling out, I'd say we just introduce support in the nl library.
> >     In fact, we already have a patch chain, starting here, to support that:
> >     https://reviews.apache.org/r/45605/

Definitely agree that's a better approach, but I'd like to get this change in first. I do plan
to revive those reviews so that shaping can move out to the host side and enable token
sharing, but that's a much larger code change and also a much harder deploy. This patch
modifies running containers in place with the new policy, so it is easy to deploy.


> On May 17, 2017, 2:14 p.m., Jie Yu wrote:
> > src/slave/flags.cpp
> > Lines 770-786 (patched)
> > <https://reviews.apache.org/r/59294/diff/1/?file=1719990#file1719990line770>
> >
> >     This sounds like a heuristic. Is there any justification for this heuristic? I wonder
> >     whether a label-based solution would be better. For instance, the isolator could look
> >     for a special label on the task/executor. The label specifies the egress rate limit,
> >     which can override the default rate limit. Something along these lines?
> >
> >     Then the custom logic can be injected into a label decorator, rather than
> >     being made first class here?

It's not really a heuristic; it's a simple linear model in CPU, clamped to a min and a max. The
major benefit is that it enables more effective allocation of a host's egress bandwidth without
exposing bandwidth as a resource. A fixed per-container egress bandwidth allocates poorly for
either a small number of very large containers (underutilizing the link) or a large number of
small containers (overcommitting it). Scaling with CPU means a large container gets a larger
share of the bandwidth.
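
As a rough sketch of the model described above (the function and parameter names here are illustrative assumptions, not the identifiers used in the patch), the egress limit is the container's CPU allocation times a per-CPU rate, clamped to a minimum and a maximum:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not taken from the patch: computes a container's
// egress limit as a linear function of its CPU allocation, clamped to
// [minRateMbps, maxRateMbps]. All rates are in Mbps.
double scaledEgressRateMbps(
    double cpus,
    double ratePerCpuMbps,
    double minRateMbps,
    double maxRateMbps)
{
  // The clamp is only meaningful if the bounds are ordered.
  assert(minRateMbps <= maxRateMbps);

  // Linear part of the model: bandwidth grows with CPU.
  const double linear = cpus * ratePerCpuMbps;

  // Clamp so small containers still get a floor and large ones hit a cap.
  return std::max(minRateMbps, std::min(linear, maxRateMbps));
}
```

With an assumed 40 Mbps per core, a 50 Mbps floor, and a 1000 Mbps cap, a 0.5-CPU container would be raised to the floor, a 4-CPU container would get 160 Mbps, and a 64-CPU container would be held at the cap.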

I thought about a label-based solution, but it doesn't work well with a heterogeneous cluster.
We have a mix of 1G and 10G hosts and we'd like to use a different egress_rate_per_cpu depending
on the link speed, e.g., 40 Mbps / core for 1G and 120 Mbps / core for 10G. The scheduler
doesn't (and shouldn't) know the specifics of hosts beyond resources, so unless we make bandwidth
a first-class resource I think the logic should be at the isolator. Host bandwidth could be
exposed via an agent attribute, but that's *really* breaking the resource abstraction.
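
To make the heterogeneous case concrete, here is a minimal sketch assuming each agent is configured with its own per-CPU rate. The struct and function names are hypothetical and not from the patch; only the 40 and 120 Mbps / core figures come from the example above. The same task lands on either host class and the agent-side setting alone decides its limit, so the scheduler never needs to know the link speed:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-agent configuration: the link speed and the per-CPU
// egress rate chosen for that host class. Values are in Mbps.
struct AgentEgressConfig
{
  std::uint64_t linkMbps;
  std::uint64_t ratePerCpuMbps;
};

// Hypothetical helper: the unclamped egress limit for a container with a
// whole number of CPUs on the given agent.
inline std::uint64_t containerEgressMbps(
    const AgentEgressConfig& agent,
    std::uint64_t cpus)
{
  // A per-CPU rate above the link speed would be a misconfiguration.
  assert(agent.ratePerCpuMbps <= agent.linkMbps);

  return cpus * agent.ratePerCpuMbps;
}
```

For a 4-CPU container this gives 160 Mbps on a 1G host configured at 40 Mbps / core and 480 Mbps on a 10G host configured at 120 Mbps / core, with no change to the task definition.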


- Ian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59294/#review175114
-----------------------------------------------------------


On May 15, 2017, 1:56 p.m., Ian Downes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59294/
> -----------------------------------------------------------
> 
> (Updated May 15, 2017, 1:56 p.m.)
> 
> 
> Review request for mesos, Dmitry Zhuk, Ilya Pronin, Jie Yu, and Santhosh Kumar Shanmugham.
> 
> 
> Bugs: MESOS-7508
>     https://issues.apache.org/jira/browse/MESOS-7508
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add support to isolators/port_mapping for optionally scaling egress bandwidth with CPU,
> with minimum and maximum limits.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/network/port_mapping.hpp 9d38289c7161d5e931053b587d115684ccc44c94
>   src/slave/containerizer/mesos/isolators/network/port_mapping.cpp cd008aaebcd42554a9a81d2b059269546f59c966
>   src/slave/flags.hpp e5784ef81ad0720c7ec061ee0b28b8fadae77afd
>   src/slave/flags.cpp bc63a6a4cb6115b4b4d592e67e34045f52b50d4c
>   src/tests/containerizer/port_mapping_tests.cpp a528382e8b4831b9c7e8dcc877a5e242909f0cd5
> 
> 
> Diff: https://reviews.apache.org/r/59294/diff/1/
> 
> 
> Testing
> -------
> 
> # added a new test 
> $ make check
> 
> 
> Thanks,
> 
> Ian Downes
> 
>

