mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gastón Kleiman <gas...@mesosphere.io>
Subject Re: Review Request 69978: Added garbage collection of terminated operations status update streams.
Date Thu, 21 Feb 2019 00:21:32 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69978/#review212999
-----------------------------------------------------------




src/slave/slave.cpp
Line 4711 (original), 4735 (patched)
<https://reviews.apache.org/r/69978/#comment298854>

    I think the patch is good as-is — if we add a `removeOperation()` call in the `err()`
helper, the framework won't be able to ack the update in the case that there is an error calling
`operationStatusUpdateManager.acknowledgement()` which prevents the SUM from checkpointing
the ack and cleaning up the stream, causing it to resend the status update.
    
    That means that the agent would keep resending the update until the stream is garbage
collected the next time the agent recovers.
    
    Your comment however made me notice another corner case that the SLRP handles and I had
missed: the agent could fail over right before executing this lambda. It should notice that
during recovery and garbage collect the stream — I'm adding a new patch to this chain that
does this.


- Gastón Kleiman


On Feb. 13, 2019, 3:15 p.m., Gastón Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69978/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2019, 3:15 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Greg Mann.
> 
> 
> Bugs: MESOS-9574
>     https://issues.apache.org/jira/browse/MESOS-9574
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Make the agent garbage collect an operation status update stream once
> a terminal status update is acknowledged.
> 
> This patch also improves the logging of acknowledgement failures and the
> readability of the `Slave::operationStatusAcknowledgement` method.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp e3c2c005d865b5c333e92e50e49ef398fe06ad79 
> 
> 
> Diff: https://reviews.apache.org/r/69978/diff/2/
> 
> 
> Testing
> -------
> 
> Manual testing + existing tests still pass.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message