mesos-reviews mailing list archives

From Anindya Sinha <anindya_si...@apple.com>
Subject Re: Review Request 51879: Autodetect value of resource when not specified in static resources.
Date Sun, 18 Dec 2016 08:38:44 GMT


> On Dec. 16, 2016, 7:58 a.m., Jiang Yan Xu wrote:
> > src/slave/containerizer/containerizer.cpp, lines 327-344
> > <https://reviews.apache.org/r/51879/diff/11/?file=1581274#file1581274line327>
> >
> >     Here what I found to be a bit awkward:
> >     
> >     We already know here that we want to process "cpus", "mem", "disk" and "ports". We don't do it directly; instead we call a helper method, which has to check that its inputs are precisely what this method provides.
> >     
> >     ```
> >     if (name != "cpus" && name != "mem" && name != "disk" &&
> >         name != "ports") {
> >       return Error(
> >         "Auto-detection of resource type '" + name + "' not supported");
> >     }
> >     ```
> >     
> >     and from this method we have to check the output.
> >     
> >     This makes the helper not a natural abstraction but rather a tightly coupled block of code to avoid duplicating logic for all predefined resources. If that's the objective, then we can just do:
> >     
> >     ```
> >     const hashset<string> predefined = {"cpus", "mem", "disk", "ports", "gpus"};
> >     
> >     foreach (const string& name, predefined) {
> >     #ifdef __linux__
> >       // GPUS are handled separately.
> >       if (name == "gpus") {
> >         Try<Resources> gpus = NvidiaGpuAllocator::resources(flags);
> >         if (gpus.isError()) {
> >           return Error("Failed to obtain GPU resources: " + gpus.error());
> >         }
> >     
> >         foreach(const Resource& gpu, gpus.get()) {
> >           result.push_back(gpu);
> >         }
> >       }
> >     #endif
> >       
> >       // Content of `detect()` except we can just add stuff to `result` without
> >       // returning.
> >     }
> >     
> >     // Custom resources.
> >     foreach(const Resource& resource, parsed.get()) {
> >       if (!predefined.contains(resource.name())) {
> >         result.push_back(resource);
> >       }
> >     }
> >     ```
> >     
> >     The result should be less code without any increase in code duplication.

I removed the following block from `Try<vector<Resource>> detect()`: if `name` does not
match one of the resources that can be auto-detected, the call is a no-op, i.e. it
returns the resources with no modifications.

```
if (name != "cpus" && name != "mem" && name != "disk" &&
    name != "ports") {
  return Error(
    "Auto-detection of resource type '" + name + "' not supported");
}
```

I think I would still want to keep the `detect()` function separate. It seems cleaner to me;
otherwise we would have to have a bunch of `if` conditions within this `foreach`, one for each
of the predefined resource types. How does this look instead?

```
  const hashset<string> predefined = {"cpus", "mem", "disk", "ports", "gpus"};

  foreach (const string& name, predefined) {
#ifdef __linux__
    // GPU resource is handled separately.
    if (name == "gpus") {
      Try<Resources> gpus = NvidiaGpuAllocator::resources(flags);
      if (gpus.isError()) {
        return Error("Failed to obtain GPU resources: " + gpus.error());
      }

      foreach (const Resource& gpu, gpus.get()) {
        result.push_back(gpu);
      }
    }
#endif

    if (name != "gpus") {
      Try<vector<Resource>> _resources = detect(name, flags, parsed.get());
      if (_resources.isError()) {
        return Error(
            "Failed to obtain " + name + " resources: " + _resources.error());
      }

      result.insert(
          result.end(), _resources.get().begin(), _resources.get().end());
    }
  }

  // Custom resources.
  foreach (const Resource& resource, parsed.get()) {
    if (!predefined.contains(resource.name())) {
      result.push_back(resource);
    }
  }
```


> On Dec. 16, 2016, 7:58 a.m., Jiang Yan Xu wrote:
> > src/slave/containerizer/mesos/isolators/gpu/allocator.cpp, lines 192-200
> > <https://reviews.apache.org/r/51879/diff/11/?file=1581275#file1581275line192>
> >
> >     It's the same logic to support GPUs that are missing values, right? Can we add it (fine to be in a separate review)? Otherwise we have to separately document that we don't support GPUs, and the inconsistency looks bad.

Seems to be a bit more complicated. When we iterate through `parsed` to retrieve `gpus` and
we find a `Resource` with no value, we can add a `Resource` object (of type `gpus`) with a
size of `available` retrieved from `nvml::deviceGetCount()`. But if there are additional `gpus`
resources (with or without value), then we will fail here eventually:
`if (resources.gpus().get() > available.get())`

I think a better approach might be to account for all `gpus` resources which have values,
and then give the `gpus` resource with no value an amount equal to `available - (sum
of all other gpus resources)`. If there is more than one `gpus` resource with no value,
we will still fail at the same point (unless we divide the remaining amount of `gpus`
equally among the n `gpus` resources that have no value).
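
To make the accounting concrete, here is a minimal standalone sketch of that idea. It is
not Mesos code: `GpuEntry` and `fillMissingGpus` are hypothetical names used only for
illustration, and the real change would operate on `Resource` objects and return a `Try`
rather than throw. It assumes at most one `gpus` resource without a value is allowed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one `gpus` entry from the parsed resources.
// `hasValue == false` means the entry was specified without a value.
struct GpuEntry {
  bool hasValue;
  double value;
};

// Returns the amount for each entry, giving the single valueless entry
// `available - (sum of all specified amounts)`.
// Throws if more than one entry has no value, or if the specified
// amounts already exceed `available`.
std::vector<double> fillMissingGpus(
    const std::vector<GpuEntry>& entries, double available)
{
  double specified = 0.0;
  std::size_t missing = 0;

  for (const GpuEntry& entry : entries) {
    if (entry.hasValue) {
      specified += entry.value;
    } else {
      ++missing;
    }
  }

  if (missing > 1) {
    throw std::invalid_argument("More than one gpus resource has no value");
  }

  if (specified > available) {
    throw std::invalid_argument("Specified gpus exceed available gpus");
  }

  std::vector<double> result;
  for (const GpuEntry& entry : entries) {
    result.push_back(entry.hasValue ? entry.value : available - specified);
  }

  return result;
}
```

With 8 GPUs available and entries of 1, (no value) and 2, the valueless entry would get 5.
Dividing the remainder among several valueless entries would replace the `missing > 1`
check with a division by `missing`.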

Thoughts?


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51879/#review159311
-----------------------------------------------------------


On Dec. 10, 2016, 1:19 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51879/
> -----------------------------------------------------------
> 
> (Updated Dec. 10, 2016, 1:19 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6062
>     https://issues.apache.org/jira/browse/MESOS-6062
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When static resources indicate resources with a positive size, we use
> that for the resources on the agent. However, --resources can include
> resources with no size, which indicates that the Mesos agent should
> detect the size of those resources itself and use that information. Note
> that auto-detection of resources is allowed for all known resource
> types except "gpus" (i.e. "cpus", "mem", "disk" and "ports") when
> represented in JSON format only. Auto-detection is not done when the
> resources are represented in text format.
> 
> With this change, JSON representation for disk resources that do not
> specify any value would not result in an error, but those resources
> will not be accounted for until a valid size is determined for such
> resources. A scalar value of -1 still results in an invalid resource.
> 
> 
> Diffs
> -----
> 
>   include/mesos/resources.hpp f569c931ff7db8d51dfd7c96f4f2addab05df85d 
>   include/mesos/v1/resources.hpp f60ab794a0c7c24885c49cc47b798c363e3279e7 
>   src/common/resources.cpp 4bb9beffcb3509f4226b4985e05eccec01412d0f 
>   src/slave/containerizer/containerizer.cpp d46882baa904fd439bffb23c324828b777228f1c 
>   src/slave/containerizer/mesos/isolators/gpu/allocator.cpp 2e722691475c84afae14009014ea70cc0fdd0e65 
>   src/v1/resources.cpp 46cc00f2f453f5eb4ddc4b0b9b89be2bd89f05d9 
> 
> Diff: https://reviews.apache.org/r/51879/diff/
> 
> 
> Testing
> -------
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>

