mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guangya Liu <gyliu...@gmail.com>
Subject Re: Review Request 44258: Fixed http endpoint trigger two inverse offer calls.
Date Sat, 05 Mar 2016 00:34:44 GMT


> On March 4, 2016, 8:28 p.m., Joris Van Remoortere wrote:
> > src/master/http.cpp, lines 2026-2027
> > <https://reviews.apache.org/r/44258/diff/3/?file=1280368#file1280368line2026>
> >
> >     Can you explain how machines going from `UP` to `DOWN` are handled in the next
loop?
> >     I see logic for `UP` to `DRAINING` in the next loop.
> >     
> >     Also missing a backtick after `UP`

My bad, it is from `UP` to `DRAINING`.


> On March 4, 2016, 8:28 p.m., Joris Van Remoortere wrote:
> > src/master/http.cpp, lines 2031-2033
> > <https://reviews.apache.org/r/44258/diff/3/?file=1280368#file1280368line2031>
> >
> >     For some of these early exit conditions, does it make sense to add `CHECK`s
(and maybe event comments) to document why we are exiting?
> >     Stating *that* we are exiting less helpful to readers than *why*.
> >     
> >     I think the implied invariant here (which we should call out explicitly) is
that any machine should only be "touched" by 1 of the 2 loops here. The exit conditions between
them are meant to enforce this exclusion?

I was adding some comments to address any machine should only be "touched" by 1 of the 2 loops
here. Can you please show more for how we can add `CHECK` here? I did not found a good way
to add `CHECK` for here but only by checking via some `if` conditions.


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44258/#review122146
-----------------------------------------------------------


On March 4, 2016, 2:10 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44258/
> -----------------------------------------------------------
> 
> (Updated March 4, 2016, 2:10 a.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Joris Van Remoortere, and Joseph Wu.
> 
> 
> Bugs: MESOS-4831
>     https://issues.apache.org/jira/browse/MESOS-4831
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> There is a bug when setting host maintain with http endpoint: https://github.com/apache/mesos/blob/master/src/master/http.cpp#L1987-L2021
> The logic is as this:
> 1) Get all host list from maintain window and put it to updated hashmap.
> 2) If the machine in was in updated was also in master->machines, call master updateUnavailability
to trigger recoverResources, updateUnavailability etc in allocator
> 3) Otherwise, clear the unavailabity time window for the machine.
> 4) Update each new machines in updated to call master updateUnavailability
> 
> But the logic in step 4) is getting all machines from the schedule windows but not the
machines that is new to the cluster, this caused master get two updateUnavailability calls
for a machine in the updated hashmap.
> 
> The fix is filter machines in updated hashmap when handling new machines.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 8276baa538eb4d2aaf54cc1aa516bffaadacc4dd 
>   src/tests/master_maintenance_tests.cpp 3faa8136cf57276295553910319480028f433e4c 
> 
> Diff: https://reviews.apache.org/r/44258/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
>  ./bin/mesos-tests.sh --gtest_filter="MasterMaintenanceTest.*" --verbose
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message