mesos-reviews mailing list archives

From "Neil Conway" <neil.con...@gmail.com>
Subject Re: Review Request 41896: Added guide to writing highly available Mesos frameworks.
Date Tue, 05 Jan 2016 06:37:04 GMT


> On Jan. 5, 2016, 1:45 a.m., Guangya Liu wrote:
> > docs/high-availability-framework-guide.md, lines 117-118
> > <https://reviews.apache.org/r/41896/diff/1/?file=1181077#file1181077line117>
> >
> >     What will be the final state of this task? Does the framework need to re-launch
this task even though it might already have finished on the agent? Can you please
add some best practices for this case?
> 
> Neil Conway wrote:
>     I'm not sure what else we can say here: the best practice we're recommending is to
avoid this situation entirely by ensuring that when a new framework leader is elected, it
knows about (a superset of) all the tasks the previous leader might have launched.
> 
> Guangya Liu wrote:
>     Can we clarify that the framework may need to re-launch the task in such a case?

Well, if the framework instance doesn't know about the task (because state wasn't persisted
correctly before the failover), it isn't clear what it should do in general -- maybe kill
the unknown task, maybe let it run and page an admin. This is why we don't recommend that
this situation be allowed in the first place :)
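
To make the recommendation concrete, here is a rough sketch of the "persist before launch" ordering, so that a newly elected leader sees a superset of the tasks the old leader might have launched. Everything in it is an assumption for illustration: `TaskStore`, `launch_task`, `tasks_to_reconcile`, and the JSON file standing in for a replicated store are hypothetical names, not part of the Mesos API or of the guide under review.

```python
import json
import os
import tempfile


class TaskStore:
    """Hypothetical durable store. A local JSON file stands in for
    whatever replicated storage a real framework would use."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # Return every task the framework has ever intended to launch.
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def record_intent(self, task_id, agent_id):
        tasks = self.load()
        tasks[task_id] = {"agent_id": agent_id, "state": "INTENDED"}
        # Write to a temp file, fsync, then rename, so a crash mid-write
        # never leaves a half-written state file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(tasks, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)


def launch_task(store, launch, task_id, agent_id):
    # Persist the intent first, then launch. If the leader dies between
    # the two steps, the store lists a task that may never have run,
    # which is the safe direction: a superset, never a subset.
    store.record_intent(task_id, agent_id)
    launch(task_id, agent_id)


def tasks_to_reconcile(store):
    # After failover, the new leader would ask the master about each of
    # these task IDs instead of discovering unknown tasks later.
    return sorted(store.load())
```

With this ordering, a task ID that was persisted but never actually launched is harmless: the new leader can find out that the task does not exist and decide whether to launch it again. The reverse ordering is what produces the "unknown task" case discussed above.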


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41896/#review112704
-----------------------------------------------------------


On Jan. 5, 2016, 3:14 a.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41896/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2016, 3:14 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3936
>     https://issues.apache.org/jira/browse/MESOS-3936
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added guide to writing highly available Mesos frameworks.
> 
> 
> Diffs
> -----
> 
>   docs/app-framework-development-guide.md 4a43a93d080bdac37b8aee91748fea7552a1cc67 
>   docs/high-availability-framework-guide.md PRE-CREATION 
>   docs/high-availability.md 31aa66220617a3f8606b185ef247c11f00735227 
>   docs/home.md 6f0f4b9cb9d0da1f9960ebe7f36ce186c1317535 
> 
> Diff: https://reviews.apache.org/r/41896/diff/
> 
> 
> Testing
> -------
> 
> Previewed via site-docker.
> 
> Note that there's a lot more that could be said here; also, at some point we should probably
unify the "reconciliation" page with this page, and perhaps move some of the content in the
"high-availability" page here (leaving the "high-availability" page for the operator-centric
parts of configuring Mesos to run in HA mode).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>

