mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.
Date Tue, 04 Jun 2019 20:42:12 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215689
-----------------------------------------------------------



I think it is a little unfortunate that we end up with a conditional `select()` and an extra
chaining step in this endpoint.  But since we aren't saving the vector of keys/futures/statistics
anywhere, the chaining is mostly unavoidable.

Since this adds some significant logic, I would like there to be at least an addition to the
tests.  We can't read the resulting log line, but we can at least exercise the code.  For
example, copying this chunk of existing test code and increasing the in-test timeout to 10+
seconds:

```
TEST_F(MetricsTest, THREADSAFE_SnapshotTimeout)
{
  ...
  
  // Get the snapshot.
  Future<Response> response = http::get(upid, "snapshot", "timeout=2secs"); // <-----------------
Increase this timeout.

  // Make sure the request is pending before the timeout is exceeded.
  //
  // TODO(neilc): Replace the `sleep` here with a less flaky
  // synchronization method.
  os::sleep(Milliseconds(10));
  Clock::settle();

  ASSERT_TRUE(response.isPending());

  // Advance the clock to trigger the timeout.
  Clock::advance(Seconds(2)); // <----------------------------- And this timeout.

  AWAIT_EXPECT_RESPONSE_STATUS_EQ(OK().status, response);

  // Parse the response.
  Try<JSON::Object> responseJSON = JSON::parse<JSON::Object>(response->body);
  ASSERT_SOME(responseJSON);

  // We can't use simple JSON equality testing here as initializing
  // libprocess adds metrics to the system. We want to only check if
  // the metrics from this test are correctly handled.
  map<string, JSON::Value> values = responseJSON->values;

  EXPECT_EQ(1u, values.count("test/counter"));
  EXPECT_DOUBLE_EQ(0.0, values["test/counter"].as<JSON::Number>().as<double>());

  EXPECT_EQ(1u, values.count("test/gauge"));
  EXPECT_DOUBLE_EQ(42.0, values["test/gauge"].as<JSON::Number>().as<double>());

  EXPECT_EQ(0u, values.count("test/gauge_fail"));
  EXPECT_EQ(0u, values.count("test/gauge_timeout"));
```


3rdparty/libprocess/src/metrics/metrics.cpp
Lines 40 (patched)
<https://reviews.apache.org/r/70783/#comment302499>

    It would be nice to mention here (or in the `logSlowMetrics()` header definition + commit
description) that this timeout does not cancel snapshotting of any metric.  It merely adds
an informative log line, before continuing with snapshotting as normal.


- Joseph Wu


On June 4, 2019, 7:54 a.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> -----------------------------------------------------------
> 
> (Updated June 4, 2019, 7:54 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch adds logging which, after a hard-coded delay, prints
> the keys of any metrics which have not yet become ready
> following a request for the metrics snapshot.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 75711edbaf46797e5eb54ba720ea11cf3de81522

>   3rdparty/libprocess/src/metrics/metrics.cpp 623d44adbe838f995ddbe89ee26f5bcc9c600be5

> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Greg Mann
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message