mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jie Yu <yujie....@gmail.com>
Subject Re: Review Request 63732: Reconciled offer operations between agent and master.
Date Mon, 27 Nov 2017 20:03:50 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63732/#review191911
-----------------------------------------------------------




src/master/master.cpp
Lines 7119 (patched)
<https://reviews.apache.org/r/63732/#comment269852>

    Can you add a comment saying that if the key is None, it means the agent default resources
that have no resource provider id.



src/master/master.cpp
Lines 7128-7129 (patched)
<https://reviews.apache.org/r/63732/#comment269845>

    Please use `Resources::hasResourceProvider` which has more CHECKs.



src/master/master.cpp
Lines 7211-7218 (patched)
<https://reviews.apache.org/r/63732/#comment269854>

    i feel this is unnecessary. It should be the responsibility of the agent to convert the
operation to post refinement format if it is every downgraded before checkpointing.
    
    We can add some CHECK here.



src/master/master.cpp
Lines 7231-7244 (patched)
<https://reviews.apache.org/r/63732/#comment269857>

    Hum, what if this is a failover master and the UpdateSlaveMessage is a result of an LRP
re-registration with the agent. In that case, master does not know about the operation but
the agent does?
    
    Also, i would be future proof and does not check the latest state of the operation. It's
possible that the operaiton is terminal but still waiting for ack from the framework, and
master will still maintain that operation. I would still check the invariant that if the master
knows about the RP, then it should know more operation than the agent.
    
    There might be one edge case worth documenting: master receives an ack from the framework,
and removed the operation and send an Ack message to the agent. The agent didn't receive the
ack yet before it sends an UpdateSlaveMessage. In that case, what we should do? Adding back
the operation does not sound right because the agent will then delete the operation on receiving
ack. Then we have an inconsistency between master and agent.



src/master/master.cpp
Lines 7302-7308 (patched)
<https://reviews.apache.org/r/63732/#comment269863>

    We do want to support that when a CSI plugin detects a chagne to the total resources.



src/master/master.cpp
Lines 7352-7362 (patched)
<https://reviews.apache.org/r/63732/#comment269864>

    This should go to addOfferOperation.



src/master/master.cpp
Lines 7389-7403 (patched)
<https://reviews.apache.org/r/63732/#comment269870>

    That's `Slave::recoverResources(operation)`?


- Jie Yu


On Nov. 24, 2017, 2:50 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63732/
> -----------------------------------------------------------
> 
> (Updated Nov. 24, 2017, 2:50 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Jan Schlicht.
> 
> 
> Bugs: MESOS-8207
>     https://issues.apache.org/jira/browse/MESOS-8207
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 53263e499d88b906b6406c24c0dfb737e589e813 
> 
> 
> Diff: https://reviews.apache.org/r/63732/diff/7/
> 
> 
> Testing
> -------
> 
> `make check`, tested as part of https://reviews.apache.org/r/63843/.
> 
> This patch requires `protobuf::isSpeculativeOperation` from https://reviews.apache.org/r/63751/
which is _not_ part of this review chain (to avoid multiple dependent RRs).
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message