mesos-reviews mailing list archives

From Alexander Rukletsov <ruklet...@gmail.com>
Subject Re: Review Request 64033: Terminated driver-based executors if kill arrives before launch task.
Date Thu, 21 Dec 2017 13:52:14 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64033/
-----------------------------------------------------------

(Updated Dec. 21, 2017, 1:52 p.m.)


Review request for mesos, Andrei Budnik, Anand Mazumdar, Armand Grillet, and Vinod Kone.


Bugs: MESOS-8297
    https://issues.apache.org/jira/browse/MESOS-8297


Repository: mesos


Description
-------

`ExecutorRegisteredMessage` or `RunTaskMessage` may not be delivered
to a driver-based executor. Since these messages are not retried,
without this patch the executor never starts the task and remains
idle, ignoring any kill task request. This patch ensures that all
built-in driver-based executors eventually shut down if a kill task
request arrives before the task has been started.
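
The intended behavior can be illustrated with a minimal sketch. This is
not the code from the diff: `SketchExecutor` and its members are
hypothetical names used only for illustration, and the real executors
talk to the agent through the executor driver rather than through
direct method calls. The sketch assumes only what the description
states: a kill that arrives before any launch shuts the executor down.

```cpp
#include <string>

// Hypothetical stand-in for a driver-based executor (not a Mesos class).
class SketchExecutor
{
public:
  // Models delivery of a task launch to the executor.
  void launchTask(const std::string& taskId)
  {
    if (terminated_) {
      return; // A terminated executor ignores late launches.
    }

    launched_ = true;
    taskId_ = taskId;
  }

  // Models delivery of a kill task request to the executor.
  void killTask(const std::string& taskId)
  {
    if (terminated_) {
      return;
    }

    if (!launched_) {
      // The kill arrived before any launch, e.g. because the launch
      // message was dropped and never retried. Shut down instead of
      // staying idle forever.
      terminated_ = true;
      return;
    }

    if (taskId == taskId_) {
      // Regular path: the task was running, so kill it and exit.
      terminated_ = true;
    }
  }

  bool launched() const { return launched_; }
  bool terminated() const { return terminated_; }

private:
  bool launched_ = false;
  bool terminated_ = false;
  std::string taskId_;
};
```

In this sketch, calling `killTask` on a fresh `SketchExecutor` leaves
`terminated()` true and `launched()` false, which corresponds to the
shutdown path described above.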


Diffs
-----

  src/docker/executor.cpp 3974f20052e3c12eb154a5146d19d4dc1759859f 
  src/exec/exec.cpp 7fc46daa633ca42944123a7546ac4eceea93956d 
  src/launcher/executor.cpp 31a47106d7f511220afba4fb382c9252d4671f6e 


Diff: https://reviews.apache.org/r/64033/diff/6/


Testing (updated)
-------

`make check` on macOS 10.11.6.

Manual testing using a modified `exec.cpp` that drops the executor registration confirmation, and the test below:
```
TEST_P(CommandExecutorTest, KillWithNoLaunch)
{
  Try<Owned<cluster::Master>> master = StartMaster();
  ASSERT_SOME(master);

  Owned<MasterDetector> detector = master.get()->createDetector();

  slave::Flags flags = CreateSlaveFlags();
  flags.http_command_executor = false;

  Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), flags);
  ASSERT_SOME(slave);

  MockScheduler sched;
  MesosSchedulerDriver driver(
      &sched, DEFAULT_FRAMEWORK_INFO, master.get()->pid, DEFAULT_CREDENTIAL);

  EXPECT_CALL(sched, registered(&driver, _, _));

  Future<vector<Offer>> offers;
  EXPECT_CALL(sched, resourceOffers(&driver, _))
    .WillOnce(FutureArg<1>(&offers))
    .WillRepeatedly(Return()); // Ignore subsequent offers.

  driver.start();

  AWAIT_READY(offers);
  EXPECT_EQ(1u, offers->size());

  // Launch a task with the command executor.
  TaskInfo task = createTask(
      offers->front().slave_id(),
      offers->front().resources(),
      SLEEP_COMMAND(1000));

  Future<RunTaskMessage> runTaskMessage =
    FUTURE_PROTOBUF(RunTaskMessage(), _, _);

  driver.launchTasks(offers->front().id(), {task});

  AWAIT_READY(runTaskMessage);

  // Wait for the executor to start up.
  Clock::pause();
  Clock::settle();
  os::sleep(Seconds(1));
  Clock::settle();
  Clock::resume();

  // There should only be a TASK_FAILED update.
  Future<TaskStatus> statusFailed;
  EXPECT_CALL(sched, statusUpdate(_, _))
    .WillOnce(FutureArg<1>(&statusFailed));

  driver.killTask(task.task_id());

  AWAIT_READY(statusFailed);
  EXPECT_EQ(TASK_FAILED, statusFailed->state());

  driver.stop();
  driver.join();
}
```


Thanks,

Alexander Rukletsov

