mesos-reviews mailing list archives

From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 72305: Sent appropriate task status reason when task over memory request.
Date Thu, 02 Apr 2020 17:13:28 GMT


> On April 2, 2020, 6:27 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 715 (patched)
> > <https://reviews.apache.org/r/72305/diff/2/?file=2216777#file2216777line715>
> >
> >     We already get max memory usage at L671, so I think we should directly use it rather than getting usage here.

I'm concerned that using the max usage will result in many false positives, where we send
REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED when it's not correct. A container may exceed its
memory request at one point in time, leading to 'max_usage > soft_limit', but that doesn't
mean it was using that much memory at the time it was OOM-killed.

My rationale for using 'usage_in_bytes' is that, while there is some uncertainty in that value
(it is read after the OOM event, so it may already have changed), I prefer that race to the
false positives that would be caused by relying on 'max_usage_in_bytes'.
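
To make the difference concrete, here is a minimal sketch of the two checks. It is not the
patch itself: the 'MemorySample' struct, the 'OomReason' enum, and both function names are
hypothetical and exist only for illustration. The only thing taken from the review is the
condition (over the soft limit but below the hard limit).

```cpp
#include <cstdint>

// Hypothetical types for illustration; these are not Mesos APIs.
enum class OomReason { Other, MemoryRequestExceeded };

struct MemorySample {
  uint64_t usage;      // current usage, read after the OOM event
  uint64_t maxUsage;   // high-water mark over the container's lifetime
  uint64_t softLimit;  // soft memory limit (the memory request)
  uint64_t hardLimit;  // hard memory limit
};

// Shared condition: over the soft limit but below the hard limit.
static bool overRequest(uint64_t bytes, const MemorySample& s) {
  return bytes > s.softLimit && bytes < s.hardLimit;
}

// Decide based on the current usage. This value is racy, since it may
// have changed by the time it is read.
OomReason classifyByUsage(const MemorySample& s) {
  return overRequest(s.usage, s) ? OomReason::MemoryRequestExceeded
                                 : OomReason::Other;
}

// Decide based on the high-water mark. Any earlier spike above the soft
// limit triggers this, even if usage was low when the OOM kill happened.
OomReason classifyByMaxUsage(const MemorySample& s) {
  return overRequest(s.maxUsage, s) ? OomReason::MemoryRequestExceeded
                                    : OomReason::Other;
}
```

For a container that once spiked above its request but had dropped back below it before being
OOM-killed, the two functions disagree: the max-usage variant reports the request as exceeded,
which is the false positive I'd like to avoid.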


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72305/#review220183
-----------------------------------------------------------


On April 2, 2020, 5:10 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72305/
> -----------------------------------------------------------
> 
> (Updated April 2, 2020, 5:10 p.m.)
> 
> 
> Review request for mesos and Qian Zhang.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When a container is OOM-killed and its memory usage is over its
> soft memory limit but below its hard memory limit, then we send
> schedulers REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED to indicate
> that the scheduler's task was preferentially OOM-killed because
> it had exceeded its memory request.
> 
> 
> Diffs
> -----
> 
>   src/common/protobuf_utils.cpp 723d85a8656e61f77ab99e5e63f844ec95303ff0 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 15f87ba8c0a1b44fb3380beb0e739af566ab08fc
> 
> 
> Diff: https://reviews.apache.org/r/72305/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

