mesos-reviews mailing list archives

From Megha Sharma <mshar...@apple.com>
Subject Re: Review Request 64098: Send status updates when agent re-registers.
Date Thu, 30 Nov 2017 19:07:44 GMT


> On Nov. 28, 2017, 7:01 p.m., Ilya Pronin wrote:
> > src/master/master.cpp
> > Lines 6789 (patched)
> > <https://reviews.apache.org/r/64098/diff/3/?file=1902267#file1902267line6789>
> >
> >     I think this is not specific to unreachable agents. Can be an agent that was
> >     recovered after failover.

Ilya, I agree the reason needs to be changed based on whether or not the agent was unreachable.
Also, Yan and I discussed the agent re-registration scenarios in which the master should
send status updates. If the master undergoes a failover, the current approach will make the
master send status updates for all tasks on re-registering agents, which will slow down the
critical path of agent re-registration. One good alternative is to send status updates only
for unreachable agents: since the master already sent a TASK_LOST/TASK_UNREACHABLE for their
tasks, there should definitely be a follow-up. Most frameworks today already reconcile
frequently upon re-registering with the master, so explicit status updates for agents
re-registering due to a failover seemed a bit unnecessary. How do you feel about the changed
approach?
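
For illustration only, the changed approach could be sketched roughly as below. This is a
minimal, self-contained sketch and not the code in src/master/master.cpp: the types
AgentHistory, Task and StatusUpdate and the function updatesOnReregistration are
hypothetical stand-ins, and task states are plain strings to keep the example short.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in: how the master last saw the agent before it
// re-registered. Not an actual Mesos type.
enum class AgentHistory {
  RecoveredAfterMasterFailover,  // Master failed over; agent was never marked unreachable.
  MarkedUnreachable              // Master already sent TASK_LOST/TASK_UNREACHABLE for its tasks.
};

// Hypothetical, simplified task record reported by the re-registering agent.
struct Task {
  std::string id;
  std::string state;  // e.g. "TASK_RUNNING"
};

// Hypothetical, simplified status update forwarded to a framework.
struct StatusUpdate {
  std::string taskId;
  std::string state;
};

// Returns the status updates the master would send when an agent re-registers.
// Only agents previously marked unreachable get a follow-up update; for agents
// recovered after a master failover, the framework's own reconciliation is
// relied upon, which keeps the re-registration path short.
std::vector<StatusUpdate> updatesOnReregistration(
    AgentHistory history,
    const std::vector<Task>& tasks)
{
  std::vector<StatusUpdate> updates;

  if (history != AgentHistory::MarkedUnreachable) {
    return updates;  // No explicit updates on the failover path.
  }

  for (const Task& task : tasks) {
    updates.push_back(StatusUpdate{task.id, task.state});
  }

  return updates;
}
```

With this shape, the cost of re-registration after a master failover does not grow with the
number of tasks on the agent, while frameworks that were told a task was lost or unreachable
still get the latest state reported by the agent.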


- Megha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64098/#review191979
-----------------------------------------------------------


On Nov. 28, 2017, 12:55 a.m., Megha Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64098/
> -----------------------------------------------------------
> 
> (Updated Nov. 28, 2017, 12:55 a.m.)
> 
> 
> Review request for mesos, Ilya Pronin, James Peach, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6406
>     https://issues.apache.org/jira/browse/MESOS-6406
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Master will send task status updates to frameworks when an agent
> re-registers.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f 
>   src/tests/master_allocator_tests.cpp 3400d70bb0ba564eac43c4639eee0efd4d8059e6 
>   src/tests/master_tests.cpp 9c450b9f592d9e09a468f537d9b500e97acc636b 
>   src/tests/partition_tests.cpp e49c474167076b4136a161ed29b11db9a13455a7 
>   src/tests/persistent_volume_tests.cpp acfeac16884b00581a3523607ff26f44f6dca53a 
>   src/tests/slave_recovery_tests.cpp c864aa92d9ff128a89dbc25653385de25653f56a 
>   src/tests/upgrade_tests.cpp 7f434dbba858f636719eec24e92b306b76430c4c 
> 
> 
> Diff: https://reviews.apache.org/r/64098/diff/4/
> 
> 
> Testing
> -------
> 
> with make check
> 
> 
> Thanks,
> 
> Megha Sharma
> 
>

