mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Klaus Ma <klaus1982...@gmail.com>
Subject Re: Review Request 44258: Fixed http endpoint trigger two inverse offer calls.
Date Wed, 02 Mar 2016 14:43:08 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44258/#review121643
-----------------------------------------------------------




src/master/http.cpp (lines 1999 - 2000)
<https://reviews.apache.org/r/44258/#comment183392>

    I think we can just remove `master->updateUnavailability(id, updated[id]);` here, so
other machine will `UP` and scheduler will be updated in next loop.
    
    Futhur more, we can avoid the loop for update. The draft code will be:
    
    ```
    hashmap<MachineID, Unavailability> updated;
    
    // delete the loop for "updated[id] = window.unavailability();"
    
    foreach (const mesos::maintenance::Window& window, schedule.windows()) {
      foreach (const MachineID& id, window.machine_ids()) {
        ...
        master->updateUnavailability(id, window.unavailability());
        updated[id] = window.unavailability();
      }
    }
    
    foreachkey (const MachineID& id, utils::copy(master->machines)) {
      if (updated.contains(id)) {
        continue;
      }
    
      // Transition each removed machine back to the `UP` mode and remove the
      // unavailability.
      master->machines[id].info.set_mode(MachineInfo::UP);
      master->updateUnavailability(id, None());
    }
    ```


- Klaus Ma


On March 2, 2016, 3:45 p.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44258/
> -----------------------------------------------------------
> 
> (Updated March 2, 2016, 3:45 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Joris Van Remoortere, and Joseph Wu.
> 
> 
> Bugs: MESOS-4831
>     https://issues.apache.org/jira/browse/MESOS-4831
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> There is a bug when setting host maintain with http endpoint: https://github.com/apache/mesos/blob/master/src/master/http.cpp#L1987-L2021
> The logic is as this:
> 1) Get all host list from maintain window and put it to updated hashmap.
> 2) If the machine in was in updated was also in master->machines, call master updateUnavailability
to trigger recoverResources, updateUnavailability etc in allocator
> 3) Otherwise, clear the unavailabity time window for the machine.
> 4) Update each new machines in updated to call master updateUnavailability
> 
> But the logic in step 4) is getting all machines from the schedule windows but not the
machines that is new to the cluster, this caused master get two updateUnavailability calls
for a machine in the updated hashmap.
> 
> The fix is filter machines in updated hashmap when handling new machines.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 5e9e28e904ba0045ee27eb828f47231632a91d74 
>   src/tests/master_maintenance_tests.cpp 3faa8136cf57276295553910319480028f433e4c 
> 
> Diff: https://reviews.apache.org/r/44258/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
>  ./bin/mesos-tests.sh --gtest_filter="MasterMaintenanceTest.*" --verbose
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message