mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Qian Zhang <zhang...@cn.ibm.com>
Subject Re: Review Request 49851: Implemented `MemorySubsystem`.
Date Mon, 08 Aug 2016 05:18:39 GMT


> On July 27, 2016, 10:05 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp, line 494
> > <https://reviews.apache.org/r/49851/diff/5/?file=1453262#file1453262line494>
> >
> >     How do we recover this field `updatedLimit`? I mean during agent recovery, `updatedLimit`
of each Info will be reset to `false`, right? But I think that is not correct.
> 
> haosdent huang wrote:
>     I think this problem exists at `mem.cpp` as well, and update the limit twice here
does not bring problems here. So I drop this. Feel free to reopen this if you think it would
bring problems here.
> 
> Qian Zhang wrote:
>     Yes, it is also a bug in `mem.cpp` which uses `info->pid` to check if it is the
first time to update the memory limit, but the value of `info->pid` will also be lost during
agent recovery.
>     
>     And I think updating the memory hard limit twice will bring some problems, as mentioned
in this comments (https://github.com/apache/mesos/blob/1.0.0/src/slave/containerizer/mesos/isolators/cgroups/mem.cpp#L369:#L371),
decreasing memory hard limit may induce an OOM.
>     
>     I had a discussion with Jie on this issue, and his suggestion is that we should not
use such kind of field (e.g., `pid` or `updatedLimit`) because they can not be recovered,
instead, before updating the hard limit, we should check if its current value is still the
initial value (should be -1? And it may be different in different Linux distributions, so
you may need to double check it), if it is, then that means it is the first time to update
the hard limit, then we can just update it, if it is not, then that means it is not the first
time, then we can only increase it but not decrease it.
> 
> haosdent huang wrote:
>     Thank you for your help. The initial value of limit_in_bytes is -1 (unlimited). Let's
update to check by this.
> 
> haosdent huang wrote:
>     Refer to http://lxr.free-electrons.com/source/kernel/res_counter.c?v=3.10 , http://lxr.free-electrons.com/source/kernel/res_counter.c?v=3.12
and http://lxr.free-electrons.com/source/include/linux/page_counter.h 
>     
>     There are two possible value of the initail values of `limit_in_bytes`: 9223372036854775807
(LONG_MAX) and 18446744073709551615 (ULONG_MAX).

I see the third value of it in Ubuntu 16.04:
```
cat /sys/fs/cgroup/memory/memory.limit_in_bytes                                  
9223372036854771712
```
We need to cover it as well?


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49851/#review143517
-----------------------------------------------------------


On Aug. 7, 2016, 12:26 p.m., haosdent huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49851/
> -----------------------------------------------------------
> 
> (Updated Aug. 7, 2016, 12:26 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5045
>     https://issues.apache.org/jira/browse/MESOS-5045
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Implemented `MemorySubsystem`.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp b191b2a52a9645fc902a35ed52909b2142f0b4c0

>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 2659252d8cffcefc233bc85fb4707c8147272737

>   src/slave/containerizer/mesos/isolators/cgroups/constants.hpp c45d88092f3fe497373dfeaa8346aef9126c7b8b

>   src/slave/containerizer/mesos/isolators/cgroups/subsystem.hpp 5f52a076a1fa3a21d886cb961ddeed5046a38d7c

>   src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp a30ecafcbecc9d3b6eeea2b04dcb4d278750af41

> 
> Diff: https://reviews.apache.org/r/49851/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> haosdent huang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message