mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kevin Klues <klue...@gmail.com>
Subject Re: Review Request 47481: Rewrote os::read() to be friendlier to reading binary data.
Date Tue, 17 May 2016 23:39:18 GMT


> On May 17, 2016, 8:44 p.m., Jojy Varghese wrote:
> > 3rdparty/stout/include/stout/os/read.hpp, line 113
> > <https://reviews.apache.org/r/47481/diff/3/?file=1385903#file1385903line113>
> >
> >     Does this assume a certain stack size?
> 
> Kevin Klues wrote:
>     It would assume a certain stack size, yes. 
>     Trust me, I didn't want to go this route personally. I switched to this way based
on offline feedback.
> 
> Benjamin Mahler wrote:
>     > Trust me, I didn't want to go this route personally. I switched to this way
based on offline feedback.
>     
>     In the interest of transparency, the initial implementation was directly writing
to std::string managed storage. I gave feedback to Kevin to avoid this approach because it's
a bit clever and it's not obviously correct compared to writing to a character array and doing
the copy: http://stackoverflow.com/a/6700534 . While the string approach is nice because it
avoids the extra copy from a character array into the string, `fread` already maintains an
internal user-space buffer in order to provide buffered I/O. So, if we really cared about
eliminating user space copying we have to use `read` rather than `fread` AFAICT.
>     
>     Whether the array is on the heap or the stack is a separate concern IMO, if assuming
a particular stack size is a problem, then let's put it on the heap. Curious to hear more
thoughts on that however, since we have a number of stack allocated buffers in the code, 1024
bytes or less FWICT.
>     
>     Also, why 4096 instead of `BUFSIZ`?
>     http://www.cplusplus.com/reference/cstdio/BUFSIZ/
>     http://www.cplusplus.com/reference/cstdio/setbuf/
>     
>     If we used `BUFSIZ` is there still a concern about stack size assumptions?

I chose 4096 because we had talked about using a PAGE sized buffer. I'm fine with any size
though.

In general, no matter what size buffer we choose, so long as the buffer is stack allocated,
there is always a risk of overruning the stack if we are near the end of the stack. This isn't
just for buffers, it's true for any stack allocated variables -- buffers just tend to be rather
large stack allocated variables, increasing the risk of an overflow.

That said, I'd say the risk is pretty low of overruning your stack on Linux (by default the
stack can expand up to 8Mb and you can set this higher if you want). If you happen to use
ulimit to set the max stack size to something smaller (or programatically launch threads with
smaller stacks), you do increase your chances of overrunning the stack (but there was always
a non-zero cahnce of overruning it anyway).  Either way, Linux will always start with a small
stack and automatically grow it on demand up to the specified max size (page-fault -> allocate
new stack -> copy old stack -> update stack pointer -> restart task).

The bigger concern for me is accidentally writing buggy code that overruns stack allocated
buffers in the current stack frame. I've tried to make sure to avoid this in this particular
function, but I typically like to avoid using stack allocated buffers myself.


- Kevin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47481/#review133624
-----------------------------------------------------------


On May 17, 2016, 8:36 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47481/
> -----------------------------------------------------------
> 
> (Updated May 17, 2016, 8:36 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Joris Van Remoortere, and switched to 'mcypark'.
> 
> 
> Bugs: MESOS-5398
>     https://issues.apache.org/jira/browse/MESOS-5398
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The previous read() implementation was based on calling getline() to
> read in chunks of data from a file. This is fine for text-based files,
> but is a little strange for binary files.
> 
> The new implementation reads in chunks of raw bytes into a stack
> allocated buffer before copying them into their final location. I
> usually don't like stack allocated buffers because of their potential
> security implications when it comes to stack buffer overflows, but I
> took extra care here to make sure there won't be any.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/os/read.hpp e1e97c1bcb7493a734fc77721a83c230b1a23724 
> 
> Diff: https://reviews.apache.org/r/47481/diff/
> 
> 
> Testing
> -------
> 
> make check -j
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message