mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anand Mazumdar <mazumdar.an...@gmail.com>
Subject Re: Review Request 45166: Fixed flaky `MasterTest.SlavesEndpointTwoSlaves`.
Date Wed, 23 Mar 2016 15:13:57 GMT


> On March 23, 2016, 6:52 a.m., Adam B wrote:
> > While I'm not afraid of your changes, I can't understand why this broke in the first
place. Although "`FUTURE_PROTOBUF`
> > just guarantees that the method has dispatched but does not ensure that it has completed",
at the point in master when SlaveRegisteredMessage is created/dispatched, the master would
have already completed addSlave to put the slave in the master data structures that the http
endpoints read from. In fact, the only thing left in `_registerSlave()` after sending the
registered ack is just a log message. Pause/settle should clear out the master's message queue,
but what else do we really need to wait on?

Thanks for the review Adam. I explicitly listed the race in the review description now to
make things more clearer. The idea behind doing a `pause/settle` was to ensure that all retried
registration attempts from either slaves are taken care of. But, it looks like it was rather
confusing. I moved the `AWAIT_READY` around since it looked more obvious.


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45166/#review124974
-----------------------------------------------------------


On March 22, 2016, 5:29 p.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45166/
> -----------------------------------------------------------
> 
> (Updated March 22, 2016, 5:29 p.m.)
> 
> 
> Review request for mesos, Alexander Rojas and Neil Conway.
> 
> 
> Bugs: MESOS-4984
>     https://issues.apache.org/jira/browse/MESOS-4984
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> We were not correctly waiting for the master to register the
> slave before making a call to the `slaves` endpoint. This change
> adds a pause/settle to ensure that. Note that `FUTURE_PROTOBUF`
> just gurrantees that the method has dispatched but does not ensure
> that it has completed.
> 
> 
> Diffs
> -----
> 
>   src/tests/master_tests.cpp d34ba0bdd71efd261850d8c205c16cecb701ac7c 
> 
> Diff: https://reviews.apache.org/r/45166/diff/
> 
> 
> Testing
> -------
> 
> Ran it in a loop. Previously used to fail quite often.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message