mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Neil Conway <neil.con...@gmail.com>
Subject Re: Review Request 59355: Optimized sorter performance with many inactive clients.
Date Tue, 23 May 2017 17:28:18 GMT


> On May 23, 2017, 4:56 p.m., James Peach wrote:
> > src/master/allocator/sorter/drf/sorter.hpp
> > Lines 316 (patched)
> > <https://reviews.apache.org/r/59355/diff/3/?file=1727589#file1727589line316>
> >
> >     Did you consider keeping separate vectors for active and inactive children?
That would avoid the copying due to the insert() here, and I think it would simplify the sorting
code later on.

Yeah, I considered that -- it is definitely a bit awkward to maintain the ordering invariant
over a single `children` vector. The downside to keeping two vectors is that the code for
inserting, removing, and updating clients becomes more complicated, because we need to search
two vectors, check for duplicates in both vectors, etc. My initial preference was for a single
vector, but you could definitely argue for two vectors instead.


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175812
-----------------------------------------------------------


On May 23, 2017, 6:30 a.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59355/
> -----------------------------------------------------------
> 
> (Updated May 23, 2017, 6:30 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7521
>     https://issues.apache.org/jira/browse/MESOS-7521
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the
> same data structure used to store active clients. This resulted in a
> significant performance regression when the vast majority of sorter
> clients are inactive: when sorting clients and producing the sorted
> order, we iterate over ALL clients (active and inactive), which could be
> much slower than the old active-only implementation.
> 
> This commit revises the sorter to ensure that inactive leaf nodes are
> always stored at the end of their parent's list of child nodes. This
> allows the sorter to stop early (at the first inactive leaf) when
> iterating over a node's children, if we're only interested in applying
> an operation to each active leaf or internal node. This change fixes the
> observed performance regression relative to Mesos 1.2.0.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff

>   src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7

> 
> 
> Diff: https://reviews.apache.org/r/59355/diff/3/
> 
> 
> Testing
> -------
> 
> Initial perf testing:
> 
> MESOS 1.2.0:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 90.61248ms
> Added 5000 agents in 38.788639509secs
> allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
(48234 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
(48235 ms total)
> ```
> 
> MESOS in master branch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 295.603058ms
> Added 5000 agents in 16.676030553secs
> allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks
suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
(102353 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
(102353 ms total)
> ```
> 
> MESOS in master branch + this patch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 325.582494ms
> Added 5000 agents in 16.311040922secs
> allocate() took 720.776578ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> allocate() took 822.521925ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> allocate() took 846.866828ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> allocate() took 857.232458ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> allocate() took 880.696907ms to make 5000 offers with 5940 out of 6000 frameworks suppressing
offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
(24664 ms)
> ```
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message