mesos-reviews mailing list archives

From "Adam B" <a...@mesosphere.io>
Subject Re: Review Request 39338: Added code that appends the fetcher log to the agent log upon fetcher failure.
Date Fri, 16 Oct 2015 07:18:17 GMT


> On Oct. 15, 2015, 6:24 a.m., Benjamin Bannier wrote:
> > src/slave/containerizer/fetcher.cpp, line 799
> > <https://reviews.apache.org/r/39338/diff/1/?file=1098840#file1098840line799>
> >
> >     It would probably be better to stream the full message into the `LOG` object
> >     to get the full message into a single block in the face of concurrent `LOG` calls
> >     from other threads.

FetcherProcess is a single-threaded actor, so it shouldn't have other threads itself. But
it's logging to the same log as the SlaveProcess, and StatusUpdateManager, so those could
interleave. And do we run multiple FetcherProcesses? It'd be extra confusing to not know which
fetcher log went to which task if two tasks failed simultaneously.
We should probably also log the containerId and fetcher command, so we can differentiate between
different failures and see the failed command without enabling VLOG.
```
LOG(WARNING) << "Begin " << containerId
             << " fetcher log (stderr in sandbox) from running: "
             << command << "\n"
             << text.get() << "\n"
             << "End " << containerId << " fetcher log";
```
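
To make the single-block idea concrete, here is a minimal standalone sketch that assembles the whole report as one string before it is logged. The helper name `formatFetcherLog` and its plain `std::string` parameters are assumptions for illustration; they are not part of the patch or of Mesos.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not in Mesos): builds the complete fetcher log
// report as one string, so that a single LOG statement can emit it
// without other log lines landing between the begin and end markers.
std::string formatFetcherLog(
    const std::string& containerId,
    const std::string& command,
    const std::string& text)
{
  std::ostringstream out;

  // The container ID appears in both markers so that two simultaneous
  // fetch failures can be told apart in the agent log.
  out << "Begin " << containerId
      << " fetcher log (stderr in sandbox) from running: "
      << command << "\n"
      << text << "\n"
      << "End " << containerId << " fetcher log";

  return out.str();
}
```

The result would then presumably be passed to one `LOG(WARNING)` statement, which is equivalent to streaming the pieces directly as in the snippet above.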


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39338/#review102777
-----------------------------------------------------------


On Oct. 15, 2015, 6:11 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39338/
> -----------------------------------------------------------
> 
> (Updated Oct. 15, 2015, 6:11 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Ben Mahler, and Till Toenshoff.
> 
> 
> Bugs: MESOS-3743
>     https://issues.apache.org/jira/browse/MESOS-3743
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added an onFailed() clause to the inspection of the fetcher subprocess run. This clause
> copies the fetcher log from <task sandbox>/stderr and appends it to the agent log.
> 
> This is to facilitate debugging spurious fetch failures in production or CI.
> 
> Similar, but not the same: https://reviews.apache.org/r/37813/ (see MESOS-3743 for an
> explanation).
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/fetcher.cpp 2b2298c329ed5fb5863cb0fed1491e478c3e5d5a 
> 
> Diff: https://reviews.apache.org/r/39338/diff/
> 
> 
> Testing
> -------
> 
> Ran make check. As expected no change in behavior.
> When I modified the fetcher to fail, 
> I observed the expected extra output.
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>

