mesos-reviews mailing list archives

From "Jojy Varghese" <j...@mesosphere.io>
Subject Re: Review Request 40873: SimpleRegistryPullerTest: Moved blob response preparation to the top.
Date Wed, 02 Dec 2015 23:21:22 GMT


> On Dec. 2, 2015, 4:40 p.m., Anand Mazumdar wrote:
> > @jojy, do we know why preparing the response took so long that it caused
the socket to time out previously?
> 
> Jojy Varghese wrote:
>     @anand: The observed behavior is that the server socket gets a RST. This usually happens
when the peer side of the socket is closed. I also observed asynchronous socket
closes being done (I think from the HTTP layer). This was observed when I added a log inside
the dtor of the socket (socket.hpp: L103). For failed test cases, I saw an extra socket close.
This change reduces the time between the peer interactions. This is not a "solution" to the
problem but an effort to eliminate possible false alarms and be able to focus on the real
issue. Since it's very difficult to reproduce this issue predictably, we need to eliminate
all red herrings.
> 
> Anand Mazumdar wrote:
>     Thanks for the reply @jojy. 
>     
>     Do we know why the server socket got a RST, i.e. why the peer side of the socket was
closed in the first place? Can you also help me understand how
these are false alarms? To me, it suggests a bug inside the `RegistryClient` itself, and we
are finding a workaround in the tests to mask it, no?

This fix addresses a couple of things:

- Eliminates the assumption that a socket accept call will be invoked before the client sends
the request.
- Eliminates the assumption that the HTTP client socket's timeout depends on the processing
time at the server side.
  
  Assumption no. 1 is clearly a bug in the test that this fix addresses.
  
 @anand: We discussed this and I agree that we could be hiding an underlying issue at
the HTTP layer. But we want to address the real issue and not a red herring. This test case
failure is just that. Moreover, these fixes are improvements that the test needs anyway to
eliminate the above-mentioned assumptions.
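
To make the two assumptions concrete, here is a minimal sketch using plain POSIX
sockets on loopback. It is not the Mesos test code and does not use libprocess; the
names `readExactly` and `exchangeWithLateAccept` are hypothetical. The client connects
and sends its request before the server calls accept (the kernel's listen backlog
holds the connection and the data), and the response is prepared up front so nothing
slow sits between accept and the reply.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: reads until `size` bytes arrive or the peer closes.
std::string readExactly(int fd, size_t size)
{
  std::string data;
  char buffer[256];
  while (data.size() < size) {
    size_t want = std::min(sizeof(buffer), size - data.size());
    ssize_t n = ::recv(fd, buffer, want, 0);
    if (n <= 0) break; // Error or peer closed the connection.
    data.append(buffer, static_cast<size_t>(n));
  }
  return data;
}

// Hypothetical illustration: returns what the client received, or "" on error.
std::string exchangeWithLateAccept(const std::string& request)
{
  assert(request.size() <= 1024); // Keep it well below the socket buffer size.

  // Prepare the response first, before any socket interaction.
  const std::string response = "prepared:" + request;

  int listener = ::socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0) return "";

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0; // Let the kernel pick a free port.
  socklen_t length = sizeof(addr);
  sockaddr* raw = reinterpret_cast<sockaddr*>(&addr);

  if (::bind(listener, raw, sizeof(addr)) != 0 ||
      ::listen(listener, 1) != 0 ||
      ::getsockname(listener, raw, &length) != 0) {
    ::close(listener);
    return "";
  }

  // The client connects and sends BEFORE the server has called accept().
  int client = ::socket(AF_INET, SOCK_STREAM, 0);
  if (client < 0 ||
      ::connect(client, raw, sizeof(addr)) != 0 ||
      ::send(client, request.data(), request.size(), 0) !=
        static_cast<ssize_t>(request.size())) {
    if (client >= 0) ::close(client);
    ::close(listener);
    return "";
  }

  // Only now does the server accept; the request is already queued.
  int server = ::accept(listener, nullptr, nullptr);
  std::string received;
  if (server >= 0 && readExactly(server, request.size()) == request &&
      ::send(server, response.data(), response.size(), 0) ==
        static_cast<ssize_t>(response.size())) {
    received = readExactly(client, response.size());
  }

  if (server >= 0) ::close(server);
  ::close(client);
  ::close(listener);
  return received;
}
```

Calling `exchangeWithLateAccept("GET /blob")` exercises the late-accept ordering; a
test written this way does not care whether accept or the client request happens first.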


- Jojy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40873/#review108662
-----------------------------------------------------------


On Dec. 2, 2015, 4:26 p.m., Jojy Varghese wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40873/
> -----------------------------------------------------------
> 
> (Updated Dec. 2, 2015, 4:26 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-4025
>     https://issues.apache.org/jira/browse/MESOS-4025
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Moving the preparation task above any test task avoids the socket waiting for the response
to be prepared. This change has eliminated the socket timeouts that were sometimes seen.
> 
> 
> Diffs
> -----
> 
>   src/tests/containerizer/provisioner_docker_tests.cpp c63bf53fee40ef12536a16e11f4d5224c4e4278e

> 
> Diff: https://reviews.apache.org/r/40873/diff/
> 
> 
> Testing
> -------
> 
> make check (600 count).
> 
> 
> Thanks,
> 
> Jojy Varghese
> 
>

