mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Mann <g...@mesosphere.io>
Subject Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.
Date Wed, 05 Jun 2019 15:42:14 GMT


> On June 5, 2019, 1:05 a.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/metrics/metrics.cpp
> > Lines 199-207 (patched)
> > <https://reviews.apache.org/r/70783/diff/1/?file=2147733#file2147733line199>
> >
> >     If we consolidate the logging between user timeout and slow timeout can we have
one path?
> >     
> >     ```
> >       Future<Nothing> waited = <all futures ready OR user timeout OR slow
timeout>;
> >     
> >       return waited
> >         .then(defer(self(), &Self::_snapshotHandler, ...));
> >     }
> >       
> >     _snapshotHandler(...)
> >     {
> >       // LOG(INFO) not ready futures after timeout (user or built-in SLOW one)
> >       ...
> >         
> >       Future<Nothing> waited = <all futures ready OR user timeout>;
> >     
> >       return waited
> >         .then(defer(self(), &Self::__snapshotHandler, ...));
> >     }
> >       
> >     __snapshotHandler(...)
> >     {
> >       ...
> >       return response;
> >     }
> >     ```
> >     
> >     We would log those not ready after either the user provided one, or the built-in
slow one, whichever is smaller (I guess we can use a min() for that too if we want). The downside
here seems to be that if a user specifies a small one and still lots of pull gauges are used,
will lead to a lot of INFO logging. Not sure how much we should worry about this case.
> >     
> >     We should probably name the endpoint handler functions and snapshot map function
differently or house the handler ones in an 'Http' sub-struct, a bit confusing right now.

I had considered a similar option previously but ended up opting for a more minimal diff.
I wrote a new solution along these lines and although the patch is now much more complex,
I think the end result is cleaner. The complexity largely arises from the fact that if the
user specifies a timeout longer than the hard-coded timeout, then we can potentially execute
the logging code twice. Let me know what you think.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215693
-----------------------------------------------------------


On June 5, 2019, 3:39 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> -----------------------------------------------------------
> 
> (Updated June 5, 2019, 3:39 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 75711edbaf46797e5eb54ba720ea11cf3de81522

>   3rdparty/libprocess/src/metrics/metrics.cpp 623d44adbe838f995ddbe89ee26f5bcc9c600be5

>   3rdparty/libprocess/src/tests/metrics_tests.cpp 881275693e67f3c9fb670c7e70cb5014090ed7a5

> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message