[argobots-discuss] question about ABT_mutex and ULT scheduling

Iwasaki, Shintaro siwasaki at anl.gov
Mon Jan 11 10:25:24 CST 2021


Hi Phil,

Thank you for your report. https://github.com/pmodels/argobots/pull/288 clarified that behavior and added a test to check it.
As far as I checked on POWER 9 (Summit-like configuration) and ARM-v8, the current main branch is working correctly (https://www.argobots.org/tests/).
Argobots 1.1 will include this fix.

Thanks,
Shintaro


________________________________
From: Phil Carns via discuss <discuss at lists.argobots.org>
Sent: Thursday, January 7, 2021 8:27 AM
To: discuss at lists.argobots.org <discuss at lists.argobots.org>
Cc: Carns, Philip H. <carns at mcs.anl.gov>
Subject: Re: [argobots-discuss] question about ABT_mutex and ULT scheduling


Confirmed.  Thanks again Shintaro!


I repeated my test case on Summit this morning using an argobots at main spack build.  The suspicious mutex behavior is gone and the benchmark is hitting its performance target.  I'm optimistic that this will fix some other confusing performance problems that we've seen there :)


We'll adjust our documentation/recommendations for building our stack on Summit accordingly until there is a numbered spack version with the fix.


-Phil


On 1/6/21 4:36 PM, Phil Carns via discuss wrote:

Oh that's interesting!  I'm glad I asked about this then :)


I'm reproducing this on Summit (POWER arch) with Argobots 1.0 and gcc (everything built using spack).


I cannot reproduce it on my laptop (x86_64 arch) with Argobots 1.0 and gcc, but there are so many differences between my laptop and Summit I wasn't sure where to start :)


I'll try using the most recent git revision on Summit and see what that does.


thanks,

-Phil


On 1/6/21 2:06 PM, Iwasaki, Shintaro wrote:
Hi Phil,

Thank you for a good question! I created an issue: https://github.com/pmodels/argobots/issues/287

Yes, what you expect is correct.  A ULT calling lock/unlock does not yield if there is no contention.  We guarantee this and will make it clear in the specification.

The current Argobots (assuming the current master) should work as you expect; ULT A should never yield in your case.

In the case of Argobots 1.0 or 1.0.1, a ULT may yield for either of the following reasons, both of which are fixed in the current master:
1. The lock is not acquired with a "strong" atomic operation on architectures that have weak atomics (e.g., ARM and POWER) (fixed by https://github.com/pmodels/argobots/pull/223)
2. If you explicitly pass `--disable-simple-mutex` at configuration time, the previous mutex-handover mechanism may have this issue (fixed by https://github.com/pmodels/argobots/pull/268)

Regarding 1., because some atomic instructions spuriously fail ("weak" https://en.cppreference.com/w/c/atomic/atomic_compare_exchange), the current spinlock implementation in Argobots may cause this issue if you are using non-Intel hardware.  I'd be happy if you could let us know what combination of hardware and compiler (with a compiler version) you are using.  If you are using Intel hardware, I believe the current Argobots master works correctly unless you use a not-so-common compiler (e.g., PGI), but I will check.
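
To illustrate the weak/strong distinction above, here is a minimal C11 sketch.  It is not the Argobots implementation: toy_lock, the toy_* functions, and count_uncontended_failures are hypothetical names made up for this example.  It shows how a one-shot lock attempt built on a weak compare-exchange can report failure even though nobody holds the lock, while the strong variant cannot.  A mutex that treats such a failure as contention and yields would then yield without any real contention.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration only; not Argobots code. */
typedef struct {
    atomic_int locked; /* 0 = free, 1 = held */
} toy_lock;

static void toy_lock_init(toy_lock *l)
{
    atomic_init(&l->locked, 0);
}

/* One-shot attempt with a weak CAS.  The C standard allows this to
 * return false even when the lock is free (a spurious failure). */
static bool toy_trylock_weak(toy_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_weak_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* One-shot attempt with a strong CAS.  This fails only if the stored
 * value really differs from `expected`, i.e. the lock is held. */
static bool toy_trylock_strong(toy_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void toy_unlock(toy_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Single-threaded loop, so the lock is never contended.  Returns how
 * many lock attempts failed anyway. */
static int count_uncontended_failures(bool use_strong, int iters)
{
    toy_lock l;
    int failures = 0;

    toy_lock_init(&l);
    for (int i = 0; i < iters; i++) {
        bool ok = use_strong ? toy_trylock_strong(&l)
                             : toy_trylock_weak(&l);
        if (ok)
            toy_unlock(&l);
        else
            failures++; /* a lock that yields on failure would yield here */
    }
    return failures;
}
```

Calling count_uncontended_failures(true, n) always returns 0.  With false, the result may be nonzero on architectures such as ARM and POWER; on x86_64 it is typically 0, which would be consistent with the problem not showing up there.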

(Note that a priority lock/unlock is just a hint, so it will not help.)

Anyway, I should make this point clearer in the specification.  At the same time, I will add a test to see if this is really the case.  If the current mechanism is broken, I will fix it.  Please expect this clarification and fix (if possible) to come this week.

Thanks,
Shintaro Iwasaki

________________________________
From: Phil Carns via discuss <discuss at lists.argobots.org>
Sent: Wednesday, January 6, 2021 12:27 PM
To: discuss at lists.argobots.org <discuss at lists.argobots.org>
Cc: Carns, Philip H. <carns at mcs.anl.gov>
Subject: [argobots-discuss] question about ABT_mutex and ULT scheduling

Hi all,

We've isolated a situation where the ABT_mutex construct is behaving a
little differently than I expected.  We have two ULTs running on a
single ES.  The ULTs are using ABT_mutex_lock/unlock() to protect a shared
data structure.  This specific configuration will never have lock
contention (the mutex is really there to protect more complex
configurations where there are more ESs and ULTs participating than what
I described above).

Here is what puzzles me: I'm not 100% sure, but it really looks like ULT
A yields to ULT B when attempting to lock the mutex sometimes, even
though there is no contention.

This is a performance bug for us; ULT B is only supposed to execute when
ULT A is idle in this configuration.  We don't really want to give up
execution when acquiring an uncontested mutex if we don't have to.

I'm sure we could work around it (restructuring our code, or using a
spinlock or priority mutex or something), but I wanted to ask on the
list first: is the behavior I described above (a ULT yielding on a mutex
lock, even if the mutex is available) expected?  Or is it an indication
that we are doing something wrong somewhere?  I want to make sure that I
understand the problem before altering the code.

thanks!

-Phil


_______________________________________________
discuss mailing list
discuss at lists.argobots.org
https://lists.argobots.org/mailman/listinfo/discuss


