mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 48374: Added class to share Nvidia-specific components between containerizers.
Date Wed, 29 Jun 2016 22:11:30 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48374/#review140041
-----------------------------------------------------------


Fix it, then Ship it!




We should mention in the commit description that we're introducing this to contain the `#ifdef`s
around the Nvidia classes that we need to pass around.


src/slave/containerizer/containerizer.cpp (lines 234 - 236)
<https://reviews.apache.org/r/48374/#comment205375>

    Error context here?



src/slave/containerizer/mesos/isolators/gpu/components.hpp (line 44)
<https://reviews.apache.org/r/48374/#comment205365>

    Hm.. why the const here? It seems that the containerizers are going to need to do non-const
operations on the allocator, and having const here requires that they make a copy.



src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp (lines 110 - 111)
<https://reviews.apache.org/r/48374/#comment205377>

    Hm.. after chatting with Kevin it seems that this case occurs when they are on Linux and
ask for the gpu isolator, but NVML is not available. Therefore we realized that we shouldn't
pass the option here and instead we should surface a better error message to the user within
the isolator creation logic up in the mesos containerizer.


- Benjamin Mahler


On June 29, 2016, 12:25 a.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48374/
> -----------------------------------------------------------
> 
> (Updated June 29, 2016, 12:25 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-5562
>     https://issues.apache.org/jira/browse/MESOS-5562
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Previously, the `NvidiaGpuAllocator` component was created directly
> inside the `NvidiaGpuIsolatorProcess` even though it was designed to
> be used by multiple containerizers at the same time. To allow for
> simultaneous access, we created a new `NvidiaComponents` class to hold
> an `NvidiaGpuAllocator` instance and pass it to each containerizer as
> it is created. This component can easily be extended to include more
> cross-containerizer components later on.
> 
> We create the `NvidiaComponents` instance inside
> `Containerizer::create()` and pass it to both the docker containerizer
> and the mesos containerizer when they are created.  The docker
> containerizer currently doesn't do anything with this information, but
> its interface has been updated to accomodate it. The mesos
> containerizer has been updated to pass this all the way down to the
> `NvidiaGpuIsolatorProcess` and exploit it.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am 86c39fdf379ada470c9b1f86be263ef71dc47c41 
>   src/slave/containerizer/containerizer.cpp 2449526061f7d6ac46ff42260ccdd0278694d9d4

>   src/slave/containerizer/docker.hpp 9b0a5a76e457006119e657ee8f627dcd1326c0ce 
>   src/slave/containerizer/docker.cpp 514268fd84507eafa11fc57d38a23b0502f80ef8 
>   src/slave/containerizer/mesos/containerizer.hpp a1a00020668f6da8d611f26e5637afffc87d09ba

>   src/slave/containerizer/mesos/containerizer.cpp d984efd4742ec084d66538c48a36ea768832324d

>   src/slave/containerizer/mesos/isolators/gpu/components.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/gpu/nvidia.hpp e35a55fdf44ec01d49d6a9e396a88f4163ef79c3

>   src/slave/containerizer/mesos/isolators/gpu/nvidia.cpp b8753ec8fc0afc8f2b389250de57d5095287acf9

> 
> Diff: https://reviews.apache.org/r/48374/diff/
> 
> 
> Testing
> -------
> 
> GTEST_FILTER="" make -j check && sudo GTEST_FILTER="*NVIDIA*" src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message