mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 70368: Initialized resource provider manager earlier when recovering.
Date Tue, 09 Apr 2019 12:24:57 GMT


> On April 9, 2019, 11:04 a.m., Chun-Hung Hsiao wrote:
> > src/tests/slave_tests.cpp
> > Lines 11750 (patched)
> > <https://reviews.apache.org/r/70368/diff/2/?file=2137991#file2137991line11750>
> >
> >     Would it be reasonable to wait for a `TASK_LOST` for now and add a TODO to change it later?

Getting this particular task status does not currently add much value, I think. I created
MESOS-9711 to address the executor termination and added a `TODO` here. Once that fix
has landed, we should revisit the task status.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70368/#review214490
-----------------------------------------------------------


On April 9, 2019, 2:24 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70368/
> -----------------------------------------------------------
> 
> (Updated April 9, 2019, 2:24 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Greg Mann.
> 
> 
> Bugs: MESOS-9667
>     https://issues.apache.org/jira/browse/MESOS-9667
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When recovering and reusing the same agent ID, the resource provider
> manager can be initialized before, e.g., recovering executors. This patch
> moves the initialization to such an earlier point. This allows, e.g.,
> resources to be published successfully via the manager when HTTP-based
> executors resubscribe, which previously ran into an assertion failure.
> 
> If the agent ID is not reused, we still need to wait for the agent to
> register with the master, which would assign an agent ID. In that case we
> do not expect any executors to resubscribe.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 5373cee5d30c2403497939eeba2ee5405117237e 
>   src/tests/slave_tests.cpp 528a25a837513f153de2a5e89897440144385633 
> 
> 
> Diff: https://reviews.apache.org/r/70368/diff/3/
> 
> 
> Testing
> -------
> 
> * `make check`
> * the test fails without the agent change
> * ran the test for 17000 iterations without failures (failure rate <1% with 66% certainty)
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
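
The confidence figure in the Testing notes above can be sanity-checked with a short calculation. The following is only a minimal sketch: it assumes independent test runs with a constant failure probability, which the message does not state, and the function name `failure_rate_upper_bound` is hypothetical. Under that assumption, `n` consecutive passes happen with probability `(1 - p)^n`, so the largest `p` still compatible with the observation at confidence `c` is `1 - (1 - c)^(1/n)`.

```python
import math


def failure_rate_upper_bound(passes: int, confidence: float) -> float:
    """Upper bound on the per-run failure rate after `passes` clean runs.

    Assumes independent runs with a constant failure probability p
    (an assumption of this sketch, not something stated in the review).
    Solves (1 - p)^passes = 1 - confidence for p.
    """
    if passes <= 0:
        raise ValueError("passes must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # log1p/expm1 keep precision when the resulting bound is very small.
    return -math.expm1(math.log1p(-confidence) / passes)


if __name__ == "__main__":
    bound = failure_rate_upper_bound(17000, 0.66)
    print(f"upper bound on failure rate: {bound:.2e}")
```

With 17000 clean runs and 66% confidence, this bound comes out at roughly 6.3e-5, well below the 1% quoted in the Testing notes, so the stated figure is consistent with (and looser than) this estimate.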

