mesos-reviews mailing list archives

From Neil Conway <neil.con...@gmail.com>
Subject Re: Review Request 42988: Changed ZooKeeper reconnection logic to retry more aggressively.
Date Fri, 29 Jan 2016 23:57:19 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42988/
-----------------------------------------------------------

(Updated Jan. 29, 2016, 11:57 p.m.)


Review request for mesos and Joris Van Remoortere.


Changes
-------

Tweak commit message.


Bugs: MESOS-4546
    https://issues.apache.org/jira/browse/MESOS-4546


Repository: mesos


Description (updated)
-------

The previous implementation of `GroupProcess` tried to establish a single
ZooKeeper connection on startup but never retried if that attempt failed.
ZooKeeper does retry internally, but only by attempting to reconnect to a list
of previously resolved IPs; it doesn't re-resolve the configured hostnames to
pick up updates to DNS configuration. Because DNS configuration can be quite
dynamic, we now close the current ZooKeeper handle and open a new one if
`zookeeper_init` has succeeded but we haven't connected within the ZooKeeper
session timeout.
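
To illustrate the idea, here is a minimal, self-contained sketch of the
"reopen the handle if not connected within the session timeout" logic. It is
not the code in this patch: `FakeHandle`, `Reconnector`, `onConnected`, and
`tick` are hypothetical names, the handle is a stand-in rather than a real
ZooKeeper handle, and time is passed in explicitly so the behavior is easy to
follow. In a real client, the opener would presumably wrap `zookeeper_init`
and destroying the handle would call `zookeeper_close`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a ZooKeeper client handle. Creating a new one
// models a fresh `zookeeper_init`, which resolves hostnames again.
struct FakeHandle {
  int id = 0;
  bool connected = false;
};

// Hypothetical helper (not part of Mesos): owns one handle and replaces it
// if no connection has been established within the session timeout.
class Reconnector {
public:
  using Clock = std::chrono::steady_clock;
  using Opener = std::function<std::unique_ptr<FakeHandle>()>;

  Reconnector(Opener opener,
              std::chrono::milliseconds sessionTimeout,
              Clock::time_point now)
    : opener_(std::move(opener)), timeout_(sessionTimeout) {
    assert(timeout_.count() > 0);
    open(now);
  }

  // To be invoked when the client reports that the session is connected.
  void onConnected() { handle_->connected = true; }

  // To be invoked periodically. Returns true if the handle was replaced.
  bool tick(Clock::time_point now) {
    if (handle_->connected || now - openedAt_ < timeout_) {
      return false; // Connected, or still within the timeout.
    }
    handle_.reset(); // Close the stale handle ...
    open(now);       // ... and open a new one, forcing a fresh DNS lookup.
    return true;
  }

  const FakeHandle& handle() const { return *handle_; }

private:
  void open(Clock::time_point now) {
    handle_ = opener_();
    openedAt_ = now;
  }

  Opener opener_;
  std::chrono::milliseconds timeout_;
  std::unique_ptr<FakeHandle> handle_;
  Clock::time_point openedAt_;
};
```

With a 10-second timeout, a `tick` at 5 seconds leaves the handle alone, while
a `tick` at 10 seconds without a connection replaces it. Once `onConnected`
has been called, later ticks no longer replace the handle.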


Diffs (updated)
-----

  src/zookeeper/group.hpp cf82fec290a2fa9bec122539c2eb0f12b45c2fb2 
  src/zookeeper/group.cpp 2ae3193e0e138c90b205d45400d80e80853e1b99 
  src/zookeeper/zookeeper.cpp 3c4fdad972dcd1728c52a05970646c713dcf98c8 

Diff: https://reviews.apache.org/r/42988/diff/


Testing
-------

Ran `make check` on both OS X and Arch Linux. Also manually configured a situation in which
the Mesos agent uses stale DNS information in a loop, and validated that without the patch
we don't pick up DNS changes, whereas with the patch we do.


Thanks,

Neil Conway

