mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adam B" <a...@mesosphere.io>
Subject Re: Review Request 40429: Report executor exit to framework schedulers.
Date Fri, 18 Dec 2015 12:04:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40429/#review110348
-----------------------------------------------------------


You've got the basics down, but you need to put your EXPECT calls before the action that's
going to trigger the event you're expecting.


CHANGELOG (line 104)
<https://reviews.apache.org/r/40429/#comment170172>

    You're modifying the 0.26 Release Notes, but this is going into 0.27 now.
    If this is worth calling out as an "API Change" (probably), then you should add it to
an "API Changes" section of `Release Notes - Mesos - Version 0.27.0 (WIP)`.
    If not, then you don't have to change this file at all. (The 0.27 release manager will
fill in the CHANGELOG prior to release.)



docs/upgrades.md (line 24)
<https://reviews.apache.org/r/40429/#comment171206>

    This needs to go in the "to 0.27.x" section now.



src/sched/sched.cpp (lines 215 - 216)
<https://reviews.apache.org/r/40429/#comment171207>

    Why did you swap these? I think Vinod wanted you to swap ExitedExecutorMessage::slave_id
and ExitedExecutorMessage::executor_id instead.



src/tests/fault_tolerance_tests.cpp (line 294)
<https://reviews.apache.org/r/40429/#comment171210>

    This EXPECT should be set up before you call containerizer.destroy(), in case the destroy
is so fast that it calls executorLost before you can set up the expectation.



src/tests/fault_tolerance_tests.cpp (lines 296 - 300)
<https://reviews.apache.org/r/40429/#comment171211>

    If you AWAIT on the executorLost, you probably don't need the Clock::pause/settle/resume
anymore.



src/tests/fault_tolerance_tests.cpp (line 1717)
<https://reviews.apache.org/r/40429/#comment171212>

    Ditto. Move this above containerizer.destroy and consider removing the pause/settle/resume.
Here and everywhere else.



src/tests/slave_tests.cpp (line 1149)
<https://reviews.apache.org/r/40429/#comment171213>

    This executor is lost because of the EXPECT_CALL(containerizer, launch) which WillOnce(Return(Failure)),
but that's not triggered until the driver.launchTasks call. Put your new EXPECT somewhere
before driver.launchTasks.



src/tests/slave_tests.cpp (line 1391)
<https://reviews.apache.org/r/40429/#comment171214>

    This expectation should go above the detector.appoint() which triggers the offerRescinded.
I wonder why this wasn't needed before. This shouldn't be related to your executorLost change.



src/tests/slave_tests.cpp (line 1410)
<https://reviews.apache.org/r/40429/#comment171215>

    Move above the post(ShutdownMessage), and maybe WAIT on it, rather than potentially stopping
the driver before executorLost is delivered.



src/tests/slave_tests.cpp (line 1507)
<https://reviews.apache.org/r/40429/#comment171216>

    Move above driver.killTask


- Adam B


On Dec. 14, 2015, 12:09 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40429/
> -----------------------------------------------------------
> 
> (Updated Dec. 14, 2015, 12:09 p.m.)
> 
> 
> Review request for mesos, Adam B and Vinod Kone.
> 
> 
> Bugs: MESOS-313
>     https://issues.apache.org/jira/browse/MESOS-313
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Report executor exit to framework schedulers. This is a MVP to start the work of notifying
scheduler on scheduler refresh.
> 
> Next step would be sending this message reliabily, and/or splitting Event::FAILURE for
slave failure and executor termination.
> 
> 
> Diffs
> -----
> 
>   CHANGELOG dbefa5df9e9183155bee532193148988dfc1fb84 
>   docs/app-framework-development-guide.md 4a43a93d080bdac37b8aee91748fea7552a1cc67 
>   docs/upgrades.md 7c1f1814680078380ca33bbc27421675ffe61d60 
>   include/mesos/scheduler.hpp 049c041286f3167e79cc5ea8a9e0bf8d42569832 
>   src/java/src/org/apache/mesos/Scheduler.java 4f048830a2c47f747033c60730cc770cb2578815

>   src/python/interface/src/mesos/interface/__init__.py 4be502fd83029ad5fc798696caf9e27fd95f7482

>   src/sched/sched.cpp 44eb4f50e8ed84297268d94a3a0320c843ff6d8c 
>   src/tests/fault_tolerance_tests.cpp ba657d0e1d8515cffd1b37925bf91a84b2feaef1 
>   src/tests/gc_tests.cpp f939d27c58177fba052fbcd9d6c9a572d052df52 
>   src/tests/master_slave_reconciliation_tests.cpp 9afa826006fa7129da1a9c1ac8c389c0e051f717

>   src/tests/master_tests.cpp 865fa4a71f4bae2a218cd2c4e10873222d1ea3c4 
>   src/tests/scheduler_event_call_tests.cpp 03f0332ef75bbe7c4947bd6daf55d40384570f18 
>   src/tests/slave_tests.cpp 4975bea8a7a701e0414426760692720f73dea7f5 
> 
> Diff: https://reviews.apache.org/r/40429/diff/
> 
> 
> Testing
> -------
> 
> Modified test for SchedulerDriverEventTest.Failure, which verifies that MockScheduler::executorLost
is invoked.
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message