mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhitao Li <zhitaoli...@gmail.com>
Subject Re: Review Request 67022: Refreshed checkpointed SlaveInfo if `--reconfiguration_policy==any`.
Date Tue, 11 Sep 2018 17:41:50 GMT


> On Sept. 11, 2018, 10:35 a.m., Benno Evers wrote:
> > src/slave/slave.cpp
> > Lines 1607 (patched)
> > <https://reviews.apache.org/r/67022/diff/1/?file=2018278#file2018278line1607>
> >
> >     If I understand the slave code correctly, I think this would checkpoint the
new state every time we're losing connection to a master (e.g. when the leader changes).
> >     
> >     I'd suggest moving it into `Slave::recover()`, i.e. the place where we set `requiredMasterCapabilities.agentUpdate`
to true in the first place.

The condition to set `requiredMasterCapabilities.agentUpdate` is:

```
     // Check for SlaveInfo compatibility.                                               
                                                                                         
  
     Try<Nothing> _compatible =                                                    
                                                                                         
        
       compatible(slaveState->info.get(), info);                                      
                                                                                         
     
                                                                                         
                                                                                         
  
     if (_compatible.isSome()) {                                                         
                                                                                         
  
       // Permitted change, so we reuse the recovered agent id and reconnect             
                                                                                         
  
       // to running executors.                                                          
                                                                                         
  
                                                                                         
                                                                                         
  
       // Prior to Mesos 1.5, the master expected that an agent would never              
                                                                                         
  
       // change its `SlaveInfo` and keep the same slave id, and therefore would         
                                                                                         
  
       // not update it's internal data structures on agent re-registration.             
                                                                                         
  
       if (!(slaveState->info.get() == info)) {                                       
                                                                                         
     
         requiredMasterCapabilities.agentUpdate = true;                                  
                                                                                         
  
       }       
```

So if slave info did not change (i.e, a simple leader change), `requiredMasterCapabilities.agentUpdate`
should be `false` and we would not update the checkpointed `SlaveInfo`.

I confirmed that in our agent log, but we can also create a new unit test to cover that case.
wdyt?


- Zhitao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67022/#review208520
-----------------------------------------------------------


On May 8, 2018, 5:40 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67022/
> -----------------------------------------------------------
> 
> (Updated May 8, 2018, 5:40 p.m.)
> 
> 
> Review request for mesos, Eric Chung and Jason Lai.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Refreshed checkpointed SlaveInfo if `--reconfiguration_policy==any`.
> 
> 
> Diffs
> -----
> 
>   src/slave/flags.cpp a319b5ea633c41fd8a252c5e1617ac52d1480ba5 
>   src/slave/slave.cpp c6d9152d9de4184f9107bb8242b41d468d76e018 
> 
> 
> Diff: https://reviews.apache.org/r/67022/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message