mesos-reviews mailing list archives

From Mesos Reviewbot <revi...@mesos.apache.org>
Subject Re: Review Request 73131: Fixed agent reregistration and marking as unreachable race.
Date Tue, 12 Jan 2021 04:25:05 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73131/#review222438
-----------------------------------------------------------



Patch looks great!

Reviews applied: [73131]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/jenkins/buildbot.sh

- Mesos Reviewbot


On Jan. 12, 2021, 1:23 a.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73131/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2021, 1:23 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10209
>     https://issues.apache.org/jira/browse/MESOS-10209
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> During master failover, if agent reregistration runs concurrently with
> marking the agent as unreachable and finishes before the MarkUnreachable
> operation completes, the assertion in Master::_markUnreachable() that
> the agent is in the recovered set no longer holds. This happens because,
> after readmitting the agent, the master removes it from the recovered
> set in Master::__reregisterSlave().
> 
> We can fix this by ignoring agent reregistration requests while a
> mark-unreachable operation is in progress, as we already do for
> marking gone. Once the marking operation completes, the agent can
> reregister as usual.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 164720a3ad40773b6de0268e3a7119de04bf297e 
>   src/tests/master_tests.cpp cd0973ed4cc8fc33de714d59c7680aef05b97b47 
> 
> 
> Diff: https://reviews.apache.org/r/73131/diff/1/
> 
> 
> Testing
> -------
> 
> Ran `make check`. Verified that the new test crashes without the fix.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>
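
The race and the fix described in the quoted review can be illustrated with
a toy model. This is only a sketch under stated assumptions: ToyMaster, its
methods, and its sets are hypothetical names invented for illustration and
are not the real Mesos Master API or the code in the actual patch. The
sketch keeps a "recovered" set and an in-progress "marking unreachable" set,
and ignores reregistration while the agent is in the latter, so the
invariant checked when the marking operation completes still holds.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, heavily simplified stand-in for the master's bookkeeping.
// None of these names are real Mesos types or signatures.
class ToyMaster {
public:
  // After failover, agents known from the registry start as "recovered".
  void recover(const std::string& agentId) { recovered_.insert(agentId); }

  // Start of the asynchronous mark-unreachable operation.
  void beginMarkUnreachable(const std::string& agentId) {
    markingUnreachable_.insert(agentId);
  }

  // Returns true if the reregistration was processed, false if ignored.
  bool reregister(const std::string& agentId) {
    // The fix: ignore the request while marking is in progress.
    // The agent is expected to retry later.
    if (markingUnreachable_.count(agentId) > 0) {
      return false;
    }
    // Readmitting the agent removes it from the recovered set. Without
    // the guard above, this is what would break the invariant below.
    recovered_.erase(agentId);
    unreachable_.erase(agentId);
    registered_.insert(agentId);
    return true;
  }

  // Completion of the mark-unreachable operation.
  void finishMarkUnreachable(const std::string& agentId) {
    // Invariant: the agent must still be in the recovered set.
    assert(recovered_.count(agentId) > 0);
    markingUnreachable_.erase(agentId);
    recovered_.erase(agentId);
    unreachable_.insert(agentId);
  }

  bool isRegistered(const std::string& agentId) const {
    return registered_.count(agentId) > 0;
  }

  bool isUnreachable(const std::string& agentId) const {
    return unreachable_.count(agentId) > 0;
  }

private:
  std::set<std::string> recovered_;
  std::set<std::string> markingUnreachable_;
  std::set<std::string> registered_;
  std::set<std::string> unreachable_;
};
```

In this model, a reregister() call that arrives between
beginMarkUnreachable() and finishMarkUnreachable() returns false and leaves
the recovered set untouched; a retry after finishMarkUnreachable() succeeds.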

