mesos-reviews mailing list archives

From Benjamin Bannier <benjamin.bann...@mesosphere.io>
Subject Re: Review Request 66165: Re-fixed many master allocator tests.
Date Wed, 21 Mar 2018 10:52:49 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66165/#review199649
-----------------------------------------------------------




src/tests/master_allocator_tests.cpp
Line 759 (original), 748 (patched)
<https://reviews.apache.org/r/66165/#comment279994>

    This test seems to have become flaky for me with this patch. Could you please confirm
that it works under load (e.g., using `stress` or some actual workload)? I haven't verified
all touched tests; please do.
    
        [ RUN      ] MasterAllocatorTest/0.SlaveLost
        ../src/tests/master_allocator_tests.cpp:838: Failure
        Mock function called more times than expected - taking default action specified at:
        ../src/tests/allocator.hpp:273:
            Function call: addSlave(@0x7f2414006ab8 6d430237-e4d5-4852-8459-2020f598449f-S2,
@0x7f2414006ad8 hostname: "gru1.hw.ca1.mesosphere.com"
        resources {
          name: "cpus"
          type: SCALAR
          scalar {
            value: 3
          }
        }
        resources {
          name: "mem"
          type: SCALAR
          scalar {
            value: 256
          }
        }
        resources {
          name: "disk"
          type: SCALAR
          scalar {
            value: 1024
          }
        }
        resources {
          name: "ports"
          type: RANGES
          ranges {
            range {
              begin: 31000
              end: 32000
            }
          }
        }
        id {
          value: "6d430237-e4d5-4852-8459-2020f598449f-S2"
        }
        checkpoint: true
        port: 39521
        , @0x7f2423e76c28 { 32-byte object <78-A9 BC-2B 24-7F 00-00 00-00 00-00 00-00 00-00
01-00 00-00 00-00 00-00 01-00 00-00 24-7F 00-00>, 32-byte object <78-A9 BC-2B 24-7F
00-00 00-00 00-00 00-00 00-00 01-00 00-00 00-00 00-00 02-00 00-00 24-7F 00-00>, 32-byte
object <78-A9 BC-2B 24-7F 00-00 00-00 00-00 00-00 00-00 01-00 00-00 00-00 00-00 03-00 00-00
00-00 00-00> }, @0x7f2423e76f20 48-byte object <01-00 00-00 24-7F 00-00 00-00 00-00
00-00 00-00 BF-83 8E-4D FE-7F 00-00 C0-89 E7-23 24-7F 00-00 00-87 E7-23 24-7F 00-00 8C-52
15-29 24-7F 00-00>, @0x7f2414006e98 { cpus:3, mem:256, disk:1024, ports:[31000-32000] }, @0x7f2414006e30 {})
                 Expected: to be called once
                   Actual: called twice - over-saturated and active
        *** Aborted at 1521624413 (unix time) try "date -d @1521624413" if you are using GNU date ***
        PC: @          0x2cb968b testing::UnitTest::AddTestPartResult()
        *** SIGSEGV (@0x0) received by PID 14803 (TID 0x7f2423e78700) from PID 0; stack trace: ***
            @     0x7f242cba25d0 (unknown)
            @          0x2cb968b testing::UnitTest::AddTestPartResult()
            @          0x2cb9219 testing::internal::AssertHelper::operator=()
            @          0x2cfc809 testing::internal::GoogleTestFailureReporter::ReportFailure()
            @           0xe36438 testing::internal::Expect()
            @          0x2cf6ef4 testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith()
            @          0x135367a _ZN7testing8internal18FunctionMockerBaseIFvRKN5mesos7SlaveIDERKNS2_9SlaveInfoERKSt6vectorINS2_20SlaveInfo_CapabilityESaISA_EERK6OptionINS2_14UnavailabilityEERKNS2_9ResourcesERK7hashmapINS2_11FrameworkIDESK_St4hashISO_ESt8equal_toISO_EEEE10InvokeWithERKSt5tupleIJS5_S8_SE_SJ_SM_SV_EE
            @          0x135362b testing::internal::FunctionMocker<>::Invoke()
            @          0x12ebc75 mesos::internal::tests::TestAllocator<>::addSlave()
            @     0x7f2433f04cad mesos::internal::master::Master::addSlave()
            @     0x7f2433f030e6 mesos::internal::master::Master::__registerSlave()
            @     0x7f243402d3b3 _ZZN7process8dispatchIN5mesos8internal6master6MasterERKNS_4UPIDEONS2_20RegisterSlaveMessageERKNS_6FutureIbEES7_S8_SD_EEvRKNS_3PIDIT_EEMSF_FvT0_T1_T2_EOT3_OT4_OT5_ENKUlOS5_S9_OSB_PNS_11ProcessBaseEE_clESU_S9_SV_SX_
            @     0x7f243402cfa1 _ZN5cpp176invokeIZN7process8dispatchIN5mesos8internal6master6MasterERKNS1_4UPIDEONS4_20RegisterSlaveMessageERKNS1_6FutureIbEES9_SA_SF_EEvRKNS1_3PIDIT_EEMSH_FvT0_T1_T2_EOT3_OT4_OT5_EUlOS7_SB_OSD_PNS1_11ProcessBaseEE_JS7_SA_SD_SZ_EEEDTclclsr3stdE7forwardISH_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSH_DpOS11_
            @     0x7f243402cf0d _ZN6lambda8internal7PartialIZN7process8dispatchIN5mesos8internal6master6MasterERKNS2_4UPIDEONS5_20RegisterSlaveMessageERKNS2_6FutureIbEESA_SB_SG_EEvRKNS2_3PIDIT_EEMSI_FvT0_T1_T2_EOT3_OT4_OT5_EUlOS8_SC_OSE_PNS2_11ProcessBaseEE_JS8_SB_SE_St12_PlaceholderILi1EEEE13invoke_expandIS11_St5tupleIJS8_SB_SE_S13_EES16_IJOS10_EEJLm0ELm1ELm2ELm3EEEEDTclsr5cpp17E6invokeclsr3stdE7forwardISI_Efp_Espcl6expandclsr3stdE3getIXT2_EEclsr3stdE7forwardISM_Efp0_EEclsr3stdE7forwardISN_Efp2_EEEEOSI_OSM_N5cpp1416integer_sequenceImJXspT2_EEEEOSN_
            @     0x7f243402cdf2 _ZNO6lambda8internal7PartialIZN7process8dispatchIN5mesos8internal6master6MasterERKNS2_4UPIDEONS5_20RegisterSlaveMessageERKNS2_6FutureIbEESA_SB_SG_EEvRKNS2_3PIDIT_EEMSI_FvT0_T1_T2_EOT3_OT4_OT5_EUlOS8_SC_OSE_PNS2_11ProcessBaseEE_JS8_SB_SE_St12_PlaceholderILi1EEEEclIJS10_EEEDTcl13invoke_expandclL_ZSt4moveIRS11_EONSt16remove_referenceISI_E4typeEOSI_EdtdefpT1fEclL_ZS16_IRSt5tupleIJS8_SB_SE_S13_EEES1B_S1C_EdtdefpT10bound_argsEcvN5cpp1416integer_sequenceImJLm0ELm1ELm2ELm3EEEE_Eclsr3stdE16forward_as_tuplespclsr3stdE7forwardIT_Efp_EEEEDpOS1J_
            @     0x7f243402cd72 _ZN5cpp176invokeIN6lambda8internal7PartialIZN7process8dispatchIN5mesos8internal6master6MasterERKNS4_4UPIDEONS7_20RegisterSlaveMessageERKNS4_6FutureIbEESC_SD_SI_EEvRKNS4_3PIDIT_EEMSK_FvT0_T1_T2_EOT3_OT4_OT5_EUlOSA_SE_OSG_PNS4_11ProcessBaseEE_JSA_SD_SG_St12_PlaceholderILi1EEEEEJS12_EEEDTclclsr3stdE7forwardISK_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSK_DpOS17_
            @     0x7f243402cd36 _ZN6lambda8internal6InvokeIvEclINS0_7PartialIZN7process8dispatchIN5mesos8internal6master6MasterERKNS5_4UPIDEONS8_20RegisterSlaveMessageERKNS5_6FutureIbEESD_SE_SJ_EEvRKNS5_3PIDIT_EEMSL_FvT0_T1_T2_EOT3_OT4_OT5_EUlOSB_SF_OSH_PNS5_11ProcessBaseEE_JSB_SE_SH_St12_PlaceholderILi1EEEEEJS13_EEEvOSL_DpOT0_
            @     0x7f243402cafa _ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8dispatchIN5mesos8internal6master6MasterERKNS1_4UPIDEONSB_20RegisterSlaveMessageERKNS1_6FutureIbEESG_SH_SM_EEvRKNS1_3PIDIT_EEMSO_FvT0_T1_T2_EOT3_OT4_OT5_EUlOSE_SI_OSK_S3_E_JSE_SH_SK_St12_PlaceholderILi1EEEEEEclEOS3_
            @     0x7f242dfcc55d _ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEEclES3_
            @     0x7f242dfae809 process::ProcessBase::consume()
            @     0x7f242e032549 _ZNO7process13DispatchEvent7consumeEPNS_13EventConsumerE
            @           0xdda4d6 process::ProcessBase::serve()
            @     0x7f242dfab2bd process::ProcessManager::resume()
            @     0x7f242dfb4d3e process::ProcessManager::init_threads()::$_1::operator()()
            @     0x7f242dfb4be5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_1vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
            @     0x7f242dfb4bb5 std::_Bind_simple<>::operator()()
            @     0x7f242dfb4aa9 std::thread::_State_impl<>::_M_run()
            @     0x7f2429a6e90f execute_native_thread_routine
            @     0x7f242cb9873a start_thread
            @     0x7f24291d6e7f __GI___clone
        [2]    14803 segmentation fault (core dumped)  ./src/mesos-tests --gtest_filter='*MasterAllocatorTest/0*' --gtest_repeat=-1
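
For illustration only, here is a minimal, self-contained sketch of how an "expected once, called twice" result can depend purely on timing. It is not Mesos, libprocess, or gmock code: `CountingAllocator`, `ManualClock`, and `simulate` are hypothetical names, and the 100 ms retry delay is an assumed value. It only models the general pattern of a scheduled retry landing before teardown, which is the kind of interleaving that load can make more likely.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mocked allocator: it only counts calls.
struct CountingAllocator {
  int addSlaveCalls = 0;
  void addSlave() { ++addSlaveCalls; }
};

// Hypothetical manual clock: a timer fires only once advance() has moved
// the current time to or past its deadline.
class ManualClock {
public:
  void schedule(int delayMs, std::function<void()> callback) {
    assert(delayMs >= 0);
    timers.push_back(Timer{now + delayMs, std::move(callback)});
  }

  void advance(int ms) {
    now += ms;
    std::vector<Timer> pending;
    pending.swap(timers);  // Callbacks may schedule new timers safely.
    for (Timer& timer : pending) {
      if (timer.deadline <= now) {
        timer.callback();
      } else {
        timers.push_back(std::move(timer));
      }
    }
  }

private:
  struct Timer {
    int deadline;
    std::function<void()> callback;
  };

  int now = 0;
  std::vector<Timer> timers;
};

// Simulates one registration plus one scheduled retry (assumed to fire
// after 100 ms), then "tears down" after teardownAfterMs. Returns how
// often the allocator saw addSlave().
int simulate(int teardownAfterMs) {
  CountingAllocator allocator;
  ManualClock clock;

  allocator.addSlave();  // The registration the test expects.
  clock.schedule(100, [&allocator]() { allocator.addSlave(); });  // Retry.

  clock.advance(teardownAfterMs);  // Time that passes before teardown.
  return allocator.addSlaveCalls;
}
```

In this sketch `simulate(50)` yields one call and `simulate(150)` yields two, so an expectation of exactly one call holds only if teardown beats the retry.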


- Benjamin Bannier


On March 20, 2018, 9:36 p.m., Till Toenshoff wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66165/
> -----------------------------------------------------------
> 
> (Updated March 20, 2018, 9:36 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Benjamin Bannier.
> 
> 
> Bugs: MESOS-8613
>     https://issues.apache.org/jira/browse/MESOS-8613
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When the slave has a very short lifetime, its scheduled registration
> retry might fire while the test is tearing down. These unexpected
> registrations in turn cause additional invocations of `addSlave` on
> the allocator.
> Additionally, this reverts the newly introduced Clock pauses, as
> they have proven to be problematic.
> 
> 
> Diffs
> -----
> 
>   src/tests/master_allocator_tests.cpp 1ceb8e8a57ab300a957931d5ad3d54904e555597 
> 
> 
> Diff: https://reviews.apache.org/r/66165/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> Ran the MasterAllocatorTests 10k times without any hiccups.
> 
> 
> Thanks,
> 
> Till Toenshoff
> 
>

