mesos-reviews mailing list archives

From "Klaus Ma" <kl...@cguru.net>
Subject Re: Review Request 38003: MESOS-3351 (duplicated slave id in master after master failover)
Date Sat, 05 Sep 2015 03:00:38 GMT


> On Sept. 4, 2015, 7:35 p.m., Vinod Kone wrote:
> >
> 
> Vinod Kone wrote:
>     Also, please make the summary and description more meaningful than just the ticket ID.

Yes, both the summary and the description have been updated for this fix.


> On Sept. 4, 2015, 7:35 p.m., Vinod Kone wrote:
> > src/master/master.cpp, lines 306-317
> > <https://reviews.apache.org/r/38003/diff/1/?file=1061118#file1061118line306>
> >
> >     Just do this.
> >     
> >     ```
> >     
> >     // Master ID is generated randomly based on UUID.
> >     info_.set_id(UUID::random().toString());
> >     
> >     ```

addressed


> On Sept. 4, 2015, 7:35 p.m., Vinod Kone wrote:
> > src/tests/master_tests.cpp, line 3607
> > <https://reviews.apache.org/r/38003/diff/1/?file=1061119#file1061119line3607>
> >
> >     All our comments are expected to be proper sentences, i.e., start with a capital letter and end with a period. Please fix here and everywhere.

addressed


> On Sept. 4, 2015, 7:35 p.m., Vinod Kone wrote:
> > src/tests/master_tests.cpp, lines 3638-3666
> > <https://reviews.apache.org/r/38003/diff/1/?file=1061119#file1061119line3638>
> >
> >     Why do you need to launch a scheduler and task for this test?
> >     
> >     I think you can simplify this test by not launching them.

Agreed. The scheduler and tasks are not necessary, so both have been removed.


- Klaus


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38003/#review97785
-----------------------------------------------------------


On Sept. 5, 2015, 2:46 a.m., Klaus Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38003/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2015, 2:46 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3351
>     https://issues.apache.org/jira/browse/MESOS-3351
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> __Phenomenon:__
> Under some race conditions, the slave was shut down after a master failover.
> 
> __Root Cause:__
> The slave was shut down because of a duplicated SlaveID. In the master, the SlaveID is generated as masterInfo.id + "-S" + nextSlaveId. When the master fails over, nextSlaveId is reset to 0 and masterInfo.id (generated from date + ip + port + pid) may be unchanged, which leads to a duplicated SlaveID.
> 
> __Solution/Fix:__
> Generate masterInfo.id from a UUID instead of "date + ip + port + pid".
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 5589eca 
>   src/tests/master_tests.cpp 8a6b98b 
> 
> Diff: https://reviews.apache.org/r/38003/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
> 
> 
> Thanks,
> 
> Klaus Ma
> 
>
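
For readers following the root cause in the quoted description, here is a minimal, self-contained sketch of how the collision can arise and why a random master ID avoids it. This is not Mesos code: `FakeMaster`, `deterministicMasterId`, and `randomMasterId` are hypothetical names, the "-" separators in the deterministic ID are an assumed format, and `randomMasterId` is only a stand-in for the `UUID::random().toString()` call suggested in the review.

```cpp
#include <cstdint>
#include <iomanip>
#include <random>
#include <sstream>
#include <string>

// Hypothetical stand-in for the master's ID bookkeeping; this is not Mesos code.
class FakeMaster {
public:
  explicit FakeMaster(const std::string& id) : id_(id), nextSlaveId_(0) {}

  // Mirrors the scheme in the description: masterInfo.id + "-S" + nextSlaveId.
  std::string newSlaveId() {
    return id_ + "-S" + std::to_string(nextSlaveId_++);
  }

private:
  std::string id_;
  std::uint64_t nextSlaveId_;  // Starts from 0 in every new master instance.
};

// Assumed old-style ID: fully determined by date, ip, port and pid.
// The "-" separators are an illustrative choice.
std::string deterministicMasterId(
    const std::string& date, const std::string& ip, int port, int pid) {
  return date + "-" + ip + "-" + std::to_string(port) + "-" +
         std::to_string(pid);
}

// Stand-in for a UUID-based ID: 128 random bits rendered as hex.
std::string randomMasterId() {
  // A single shared engine, so consecutive calls yield different values.
  static std::mt19937_64 engine(std::random_device{}());
  std::ostringstream out;
  out << std::hex << std::setfill('0');
  out << std::setw(16) << engine();
  out << std::setw(16) << engine();
  return out.str();
}
```

If two `FakeMaster` instances are built from the same `deterministicMasterId(...)` inputs (modelling a master before and after failover), the first `newSlaveId()` of each returns the same string. If each instance is built from `randomMasterId()` instead, the IDs differ even though both counters restart at 0.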

