mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joseph Wu <jos...@mesosphere.io>
Subject Re: Review Request 45067: Updated the long-lived-framework example.
Date Tue, 29 Mar 2016 19:09:39 GMT


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 77
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line77>
> >
> >     s/. Reject/. Reject/ <-- extraneous space

Oops, that's a habit (and the Apache license uses 2 spaces :)


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, lines 95-96
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line95>
> >
> >     So if resources are not enough we do not explicitly decline the offer?
> >     
> >     How about:
> >     
> >     ```
> >     // Reject offer if the offer belongs to a different slave than the one currently
running the executor(s).
> >     if (slaveID.isSome() && slaveID != offers[i].slave_id()) {
> >       // decline offer.
> >     }
> >     
> >     // Reject the offer if the resources are not enough.
> >     if (!Resources(offers[i].resources()).flatten().contains(TASK_RESOURCES + EXECUTOR_RESOURCES))
{
> >       // decline offer.
> >     }
> >     
> >     // Launch the task.
> >     
> >     ```
> >     
> >     Also, are you expecting the executor to be multi-task? If yes, the resources
check above should only include task resources? 
> >     
> >     Since you do not control whether the executor is single or multi-task, I would
recommend to launch each task with an unique executor id.

We actually do expect the same executor to be used for each task.  (The executor's ID is "default".)

There are really 5 cases here:
1) Executor running & offer from different slave.
2) Executor running & offer not big enough for a task.
3) Executor running & offer from same slave & offer big enough for task.
4) Executor not running & offer not big enough for task + executor.
5) Executor not running & offer big enough for task + executor.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 173
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line173>
> >
> >     why is this called "residency" instead of "slaveID" ?

I wanted a variable name that somehow expressed where the `long-lived-executor` was "living".
 But `slaveId` would work too.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 196
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line196>
> >
> >     why does the framework assume master lives on the same machine?

Hm, I guess the location of the master doesn't really play a role here.  Removed from comment.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 151
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line151>
> >
> >     I thought we could compare option directly with a value instead of doing a .get()?

Yup, my mistake.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 160
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line160>
> >
> >     seems weird that you are setting residency/slaveID to None() if one executor
is lost. isn't it still possible for other executors to be running on that slave? do we want
to launch new executors on a different slave in that case?
> >     
> >     I would recommend to keep track of running executors with `set<ExecutorID>`.

As mentioned above, this framework only launches on executor with ID "default".


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45067/#review125780
-----------------------------------------------------------


On March 29, 2016, 12:08 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45067/
> -----------------------------------------------------------
> 
> (Updated March 29, 2016, 12:08 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and Vinod Kone.
> 
> 
> Bugs: MESOS-5062
>     https://issues.apache.org/jira/browse/MESOS-5062
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This gives the example `long-lived-framework` enough options to run outside of the build
environment.
> 
> This also updates:
> 
> * The style of the framework code.
> * Gives the `ExecutorInfo` some resources (needed for some cgroups isolators).
> * Restricts the framework to one agent.  Otherwise, it would grab a small chunk of every
machine in the cluster.
> * Adds filters for declined offers.
> 
> 
> Diffs
> -----
> 
>   src/examples/long_lived_framework.cpp ef498d63bc5f0a8deb46d71edd85a76a1d38fdd0 
> 
> Diff: https://reviews.apache.org/r/45067/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> Ran this on the master node on a Mesos cluster:
> ```
> ./long-lived-framework --master=zk://localhost:2181/mesos --executor_uri="https://s3.amazonaws.com/url/to/long-lived-executor"
--executor_command="LD_LIBRARY_PATH=/path/to/libmesos && ./long-lived-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message