[argobots-discuss] question about ABT_mutex and ULT scheduling

Phil Carns carns at mcs.anl.gov
Wed Jan 6 15:36:58 CST 2021


Oh that's interesting!  I'm glad I asked about this then :)


I'm reproducing this on Summit (POWER arch) with Argobots 1.0 and gcc 
(everything built using spack).


I cannot reproduce it on my laptop (x86_64 arch) with Argobots 1.0 and 
gcc, but there are so many differences between my laptop and Summit I 
wasn't sure where to start :)


I'll try using the most recent git revision on Summit and see what that 
does.


thanks,

-Phil


On 1/6/21 2:06 PM, Iwasaki, Shintaro wrote:
> Hi Phil,
>
> Thank you for a good question! I created an issue: 
> https://github.com/pmodels/argobots/issues/287
>
> Yes, what you expect is correct.  A ULT calling lock/unlock should not 
> yield if there is no contention.  We guarantee this and will make it 
> clear in the specification.
>
> The current Argobots (assuming the current master) should work as you 
> expect; ULT A should never yield in your case.
>
> In the case of Argobots 1.0 or 1.0.1, a ULT may yield because of the 
> following possible reasons, both of which are fixed in the current master:
> 1. The lock is not acquired with a "strong" atomic operation on 
> architectures with weak atomics (e.g., ARM and POWER) (fixed by 
> https://github.com/pmodels/argobots/pull/223)
> 2. If you explicitly pass `--disable-simple-mutex` at configuration 
> time, the previous mutex-handover mechanism may have this issue (fixed 
> by https://github.com/pmodels/argobots/pull/268)
>
> Regarding 1., some atomic instructions can fail spuriously ("weak"; see 
> https://en.cppreference.com/w/c/atomic/atomic_compare_exchange), so 
> the current spinlock implementation in Argobots may cause this issue 
> if you are using non-Intel hardware.  I'd be happy if you could let us 
> know what combination of hardware and compiler (with the compiler 
> version) you are using.  If you are using Intel hardware, I believe 
> the current Argobots master works correctly unless you use a 
> not-so-common compiler (e.g., PGI), but I will check.
>
> (Note that a priority lock/unlock is just a hint, so it will not help.)
>
> Anyway, I should make this point clearer in the specification.  At the 
> same time, I will add a test to see if this is really the case.  If 
> the current mechanism is broken, I will fix it.  Please expect that 
> this clarification and fix (if possible) will come this week.
>
> Thanks,
> Shintaro Iwasaki
>
> ------------------------------------------------------------------------
> *From:* Phil Carns via discuss <discuss at lists.argobots.org>
> *Sent:* Wednesday, January 6, 2021 12:27 PM
> *To:* discuss at lists.argobots.org <discuss at lists.argobots.org>
> *Cc:* Carns, Philip H. <carns at mcs.anl.gov>
> *Subject:* [argobots-discuss] question about ABT_mutex and ULT scheduling
> Hi all,
>
> We've isolated a situation where the ABT_mutex construct is behaving a
> little differently than I expected.  We have two ULTs running on a
> single ES.  The ULTs are using ABT_mutex_lock/unlock() to protect a shared
> data structure.  This specific configuration will never have lock
> contention (the mutex is really there to protect more complex
> configurations where there are more ESs and ULTs participating than what
> I described above).
>
> Here is what puzzles me: I'm not 100% sure, but it really looks like ULT
> A yields to ULT B when attempting to lock the mutex sometimes, even
> though there is no contention.
>
> This is a performance bug for us; ULT B is only supposed to execute when
> ULT A is idle in this configuration.  We don't really want to give up
> execution when acquiring an uncontested mutex if we don't have to.
>
> I'm sure we could work around it (restructuring our code, or using a
> spinlock or priority mutex or something), but I wanted to ask on the
> list first: is the behavior I described above (a ULT yielding on a mutex
> lock, even if the mutex is available) expected?  Or is it an indication
> that we are doing something wrong somewhere?  I want to make sure that I
> understand the problem before altering the code.
>
> thanks!
>
> -Phil
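
For readers following point 1 in Shintaro's reply, here is a minimal C11 sketch of how a "weak" compare-and-exchange could make a lock attempt fail even though nobody holds the lock. This is not the Argobots implementation; demo_lock and the demo_* functions are hypothetical names made up for illustration. The assumption is that a mutex which yields whenever a single try-lock attempt fails would then yield without real contention on architectures where the weak form can fail spuriously (the thread mentions ARM and POWER).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only (not Argobots code). */
typedef struct {
    atomic_int state; /* 0 = unlocked, 1 = locked */
} demo_lock;

/* One attempt with the weak form.  The C standard allows this to return
 * false even when state == 0, so "false" does not prove contention. */
bool demo_trylock_weak(demo_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_weak_explicit(&l->state, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed);
}

/* One attempt with the strong form.  This returns false only when the
 * lock was actually observed as held. */
bool demo_trylock_strong(demo_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(&l->state, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Same guarantee built from the weak form: retry while the value we
 * observed is still "unlocked", i.e. while the failure was spurious. */
bool demo_trylock_weak_retry(demo_lock *l)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&l->state, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        if (expected != 0)
            return false; /* the lock is genuinely held by someone else */
        /* expected is still 0: spurious failure, so try again */
    }
    return true;
}

/* Release the lock; the caller must currently hold it. */
void demo_unlock(demo_lock *l)
{
    assert(atomic_load_explicit(&l->state, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

In this sketch, a caller that treats a false return from demo_trylock_weak() as "the mutex is busy, yield" can give up the ES on an uncontended mutex, while demo_trylock_strong() and demo_trylock_weak_retry() report failure only when the lock is really held.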