mesos-reviews mailing list archives

From "Klaus Ma" <klaus1982...@gmail.com>
Subject Re: Review Request 37531: Fix master CHECK failure if a framework uses duplicated task id.
Date Wed, 13 Jan 2016 14:06:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/
-----------------------------------------------------------

(Updated Jan. 13, 2016, 10:06 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Changes
-------

rebase and ping Vinod :).


Summary (updated)
-----------------

Fix master CHECK failure if a framework uses duplicated task id.


Bugs: MESOS-3070
    https://issues.apache.org/jira/browse/MESOS-3070


Repository: mesos


Description
-------

__Phenomenon:__
The master crashes with a CHECK failure when it encounters a duplicated task ID.

__Root Cause:__
Task IDs are stored on the slave (agent). After a master failover, there is a time window
in which a new slave can launch a task with the same task ID; if the slave running the old
task then re-registers, the master crashes because of the duplicated task ID.

__Solution:__
Store task info in Master::Framework keyed by SlaveID, so that the same task ID on two slaves no longer collides.
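
To illustrate the idea only, here is a minimal sketch of a per-framework task table keyed first by slave and then by task ID. It is not the code in this diff: `Framework`, `Task`, `addTask`, `taskCount` and the string-based IDs are simplified, hypothetical stand-ins for the real Mesos types, and the actual change in src/master/master.hpp may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for the Mesos SlaveID / TaskID protobuf types.
using SlaveID = std::string;
using TaskID = std::string;

struct Task {
  TaskID id;
  SlaveID slaveId;
};

// Simplified stand-in for Master::Framework (assumption, not the real class).
struct Framework {
  // Keyed by slave first, then by task ID, so the same task ID can exist
  // under two different slaves without colliding.
  std::map<SlaveID, std::map<TaskID, Task>> tasks;

  // Returns false (instead of aborting) if this slave already has the task ID.
  bool addTask(const Task& task) {
    return tasks[task.slaveId].emplace(task.id, task).second;
  }

  // Total number of tasks tracked across all slaves.
  std::size_t taskCount() const {
    std::size_t count = 0;
    for (const auto& entry : tasks) {
      count += entry.second.size();
    }
    return count;
  }
};

// Usage: the same task ID on two slaves is accepted; a repeat on the same
// slave is rejected.
inline std::size_t demo() {
  Framework framework;
  const Task onNewSlave{"task-1", "slave-new"};
  const Task onOldSlave{"task-1", "slave-old"};

  const bool first = framework.addTask(onNewSlave);
  const bool second = framework.addTask(onOldSlave);  // same task ID, other slave
  const bool repeat = framework.addTask(onNewSlave);  // true duplicate

  assert(first && second && !repeat);
  return framework.taskCount();
}
```

With a flat table keyed by task ID alone, the second insertion above would be the colliding case described under Root Cause; keying by SlaveID lets both entries coexist.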


Diffs (updated)
-----

  src/master/http.cpp bcafc7aff89659a68352f3876ce6042f8b34bd5d 
  src/master/master.hpp f02d165874fa8023675e545793de699aeecae29b 
  src/master/master.cpp c122c30d943813fc3ce9e7025783c7231809b022 
  src/tests/master_tests.cpp 223b9d20a3a8a8194a3a6a605ec2394c37ab5957 

Diff: https://reviews.apache.org/r/37531/diff/


Testing
-------

make
make check


Thanks,

Klaus Ma

