mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kevin Klues <klue...@gmail.com>
Subject Re: Review Request 44979: Filled in the guts of the Nvidia GPU isolator.
Date Thu, 17 Mar 2016 22:31:43 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44979/#review124105
-----------------------------------------------------------



Thanks Rob!

Here's a quck rundown of the comments for improvement we talked about over email. I've added
comments inline as well:

1) Put hanging {} on simple constructor on same line as last
initializer if possible.
2) Initialize variables next to where they are first used, not at the
top of functions.
3) Group error logic before other logic when possible and omit the
"else" block if the error logic is surrounded by an if.
4) Don't initialize variables inside if and while statements.
5) Prefer foreach over vectors instead of regular for loop with index.
6) If something is a TODO, mark it such with a "TODO(username):" tag.
7) Be consistent about line breakages
8) The logic in the update() function is a little hard to follow.
9) Use static constants for things like the NVIDIA_MAJOR_DEVICE number.
10) Avoid typedefs and prefer structs over classes if all members
should be public.
11) Omit unused variables in code copied from other places
(specifically the embedded Info struct)

I've provided a patch at https://github.com/klueska-mesosphere/mesos/commits/nvidia-isolator-staging,
under the commit "Updated Nvidia Isolator to conform to style." that patch all of this up
properly.


src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.hpp (lines 73 - 78)
<https://reviews.apache.org/r/44979/#comment186471>

    Prefer simple struct here over a `typedef`, i.e., 
    
    ```
    struct Gpu
    {
      nvmlDevice_t handle;
      unsigned int minor;
      Option<ContainerID> containerId;
    };
    ```



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.hpp (lines 85 - 87)
<https://reviews.apache.org/r/44979/#comment186474>

    No need to make this function `virtual`.  Also, as commented below, we don't need to return
a `list<Nothing>` here (just a `Nothing`).  It looks like this was copied from the `cpu`
isolator, so I can see why some cruft got left behind.



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.hpp (lines 89 - 100)
<https://reviews.apache.org/r/44979/#comment186473>

    Some of the variables in this struct are unnecessary (i.e. it looks like it was just copied
from the mem isolator).  We should remove anything unused in this isolator.



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 63 - 64)
<https://reviews.apache.org/r/44979/#comment186476>

    We typically put the {} on the same line as the last initializer, when possible.



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 82 - 84)
<https://reviews.apache.org/r/44979/#comment186477>

    We typically don't declare variables at the top of a function like in standard "C".  Instead,
we declare them just before their first use in the body of the function.



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 86 - 87)
<https://reviews.apache.org/r/44979/#comment186478>

    `if` blocks should have their opening `{` on the same line if possible.  Please correct
throughout.
    
    Also, we typically don't call functions in the body of the conditional expression: i.e.,
I'd change this to:
    
    ```
    nvmlReturn_t result = nvmlInit();
    if (result != NVML_SUCCESS) {
    ```



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 98 - 119)
<https://reviews.apache.org/r/44979/#comment186480>

    We typically prefer `foreach` over standard for/while loops when possible. Specifically
here, I'd change this to:
    
    ```
      int i = 0;
      foreach (Gpu& gpu, gpus) {
        result = nvmlDeviceGetHandleByIndex(i++, &gpu.handle);
        if (result != NVML_SUCCESS) {
          return Error(
              "nvmlDeviceGetHandleByIndex failed: " +
              string(nvmlErrorString(result)));
        }
    
        result = nvmlDeviceGetMinorNumber(gpu.handle, &gpu.minor);
        if (result != NVML_SUCCESS) {
          return Error(
              "nvmlDeviceGetMinorNumber failed: " +
              std::string(nvmlErrorString(result)));
        }
    
        gpu.containerId = None();
      }
    ```



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (line 203)
<https://reviews.apache.org/r/44979/#comment186481>

    Barring any sort of denial of all devices by default, we should explicitly deny all GPU
devices here. I.e.:
    
    ```
    foreach (Gpu& gpu, gpus) {
      Try<Nothing> update = updateCgroup(containerId, gpu);
      if (update.isError()) {
        return Failure(update.error());
      }
    }
    ```
    
    I actually already have a follow-on patch whcih makes this unnecessary, but for this version,
we still need it to be complete.



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (line 258)
<https://reviews.apache.org/r/44979/#comment186482>

    We should short circuit here if we don't have any GPUs to update. I.e.,
    
    ```
    if (resources.gpus().isNone()) {
      return Nothing();
    }
    ```



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 282 - 356)
<https://reviews.apache.org/r/44979/#comment186484>

    The logic here is a bit to follow, and as before, we typically prefer `foreach` over standard
for loops.  Take a look at my implementation here, for a btter way to do this:
    
    https://github.com/klueska-mesosphere/mesos/commit/fc4c3a08bcd674812b8e04d413c0a752e8a75342?diff=split#diff-8a9d7d25ba00e23eb992276ab0094d46R279



src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp (lines 404 - 424)
<https://reviews.apache.org/r/44979/#comment186485>

    As mentioned in a comment above, we don't need a list here.  We should clean this up to
not do any unecessary work.


- Kevin Klues


On March 17, 2016, 9:29 p.m., Rob Todd wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44979/
> -----------------------------------------------------------
> 
> (Updated March 17, 2016, 9:29 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Kevin Klues, and Vikrama Ditya.
> 
> 
> Bugs: Mesos-4625
>     https://issues.apache.org/jira/browse/Mesos-4625
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Filled in the guts of the Nvidia GPU isolator.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.hpp PRE-CREATION

>   src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp PRE-CREATION

> 
> Diff: https://reviews.apache.org/r/44979/diff/
> 
> 
> Testing
> -------
> 
> This was the initial work to fill in the constructor/destructor, update(), create(),
and cleanup() methods pre-stylistic review by Kevin.  Will push the stylistic review changes
as a separate diff such that they can be explicitly seen by any potential contributor.
> 
> 
> Thanks,
> 
> Rob Todd
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message