[argobots-discuss] question about ABT_mutex and ULT scheduling
Phil Carns
carns at mcs.anl.gov
Thu Jan 7 08:27:59 CST 2021
Confirmed. Thanks again Shintaro!
I repeated my test case on Summit this morning using an argobots@main
spack build. The suspicious mutex behavior is gone and the benchmark is
hitting its performance target. I'm optimistic that this will fix some
other confusing performance problems that we've seen there :)
We'll adjust our documentation/recommendations for building our stack on
Summit accordingly until there is a numbered spack version with the fix.
-Phil
On 1/6/21 4:36 PM, Phil Carns via discuss wrote:
>
> Oh that's interesting! I'm glad I asked about this then :)
>
>
> I'm reproducing this on Summit (POWER arch) with Argobots 1.0 and gcc
> (everything built using spack).
>
>
> I cannot reproduce it on my laptop (x86_64 arch) with Argobots 1.0 and
> gcc, but there are so many differences between my laptop and Summit I
> wasn't sure where to start :)
>
>
> I'll try using the most recent git revision on Summit and see what
> that does.
>
>
> thanks,
>
> -Phil
>
>
> On 1/6/21 2:06 PM, Iwasaki, Shintaro wrote:
>> Hi Phil,
>>
>> Thank you for a good question! I created an issue:
>> https://github.com/pmodels/argobots/issues/287
>>
>> Yes, what you expect is correct. A ULT calling lock/unlock will not
>> yield if there is no contention. We guarantee this and will make it
>> clear in the specification.
>>
>> The current Argobots (assuming the current master) should work as you
>> expect; ULT A should never yield in your case.
>>
>> In the case of Argobots 1.0 or 1.0.1, a ULT may yield because of the
>> following possible reasons, both of which are fixed in the current
>> master:
>> 1. The lock is not acquired with a "strong" atomic operation, which
>> matters on architectures whose atomics are weak (e.g., ARM and POWER)
>> (fixed by https://github.com/pmodels/argobots/pull/223)
>> 2. If you explicitly pass `--disable-simple-mutex` at configuration
>> time, the previous mutex-handover mechanism may have this issue
>> (fixed by https://github.com/pmodels/argobots/pull/268)
>>
>> Regarding 1., because some atomic instructions spuriously fail
>> ("weak"
>> https://en.cppreference.com/w/c/atomic/atomic_compare_exchange),
>> maybe the current spinlock implementation in Argobots causes this
>> issue if you are using non-Intel hardware. I'd be happy if you could
>> let us know what combination of hardware and compiler (with a
>> compiler version) you are using. If you are using Intel hardware, I
>> believe the current Argobots master works correctly unless you use a
>> not-so-common compiler (e.g., PGI), but I will check.
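
[Illustration of reason 1 above. This is a minimal sketch using C11
<stdatomic.h>, not the actual Argobots spinlock code; every name
prefixed with demo_ is hypothetical. It contrasts a try-lock built on a
weak compare-exchange, which the C standard allows to fail spuriously
even when the lock is free, with one built on a strong compare-exchange,
which fails only when the lock is really held. If a mutex falls back to
yielding whenever its try-lock fails, the weak variant could explain a
yield on an uncontended lock.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} demo_lock_t;

/* Try-lock using a WEAK compare-exchange. It may return false even
 * though the lock is free (spurious failure), typically on LL/SC
 * architectures such as ARM and POWER. */
static bool demo_trylock_weak(demo_lock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_weak_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Try-lock using a STRONG compare-exchange. It returns false only if
 * the lock word was not 0, i.e. the lock was actually held. */
static bool demo_trylock_strong(demo_lock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void demo_unlock(demo_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Single-threaded loop with no contention at all: counts how often the
 * try-lock fails anyway. Each failure stands for a point where a mutex
 * that yields on a failed try-lock would give up the execution stream. */
static int demo_count_uncontended_failures(bool use_strong, int iters)
{
    demo_lock_t l;
    int failures = 0;

    assert(iters >= 0);
    atomic_init(&l.locked, 0);
    for (int i = 0; i < iters; i++) {
        bool ok = use_strong ? demo_trylock_strong(&l)
                             : demo_trylock_weak(&l);
        if (ok)
            demo_unlock(&l);
        else
            failures++;
    }
    return failures;
}
```

[Calling demo_count_uncontended_failures(true, n) always returns 0. With
false as the first argument the result depends on the hardware and
compiler: it is allowed to be nonzero, and whether it ever is on a given
machine has to be measured rather than assumed.]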
>>
>> (Note that a priority lock/unlock is just a hint, so it will not help.)
>>
>> Anyway, I should make this point clearer in the specification. At
>> the same time, I will add a test to see if this is really the case.
>> If the current mechanism is broken, I will fix it. Please expect
>> that this clarification and fix (if possible) will come this week.
>>
>> Thanks,
>> Shintaro Iwasaki
>>
>> ------------------------------------------------------------------------
>> *From:* Phil Carns via discuss <discuss at lists.argobots.org>
>> *Sent:* Wednesday, January 6, 2021 12:27 PM
>> *To:* discuss at lists.argobots.org <discuss at lists.argobots.org>
>> *Cc:* Carns, Philip H. <carns at mcs.anl.gov>
>> *Subject:* [argobots-discuss] question about ABT_mutex and ULT
>> scheduling
>> Hi all,
>>
>> We've isolated a situation where the ABT_mutex construct is behaving a
>> little differently than I expected. We have two ULTs running on a
>> single ES. The ULTs are using ABT_mutex_lock/unlock() to protect a shared
>> data structure. This specific configuration will never have lock
>> contention (the mutex is really there to protect more complex
>> configurations where there are more ESs and ULTs participating than what
>> I described above).
>>
>> Here is what puzzles me: I'm not 100% sure, but it really looks like ULT
>> A yields to ULT B when attempting to lock the mutex sometimes, even
>> though there is no contention.
>>
>> This is a performance bug for us; ULT B is only supposed to execute when
>> ULT A is idle in this configuration. We don't really want to give up
>> execution when acquiring an uncontested mutex if we don't have to.
>>
>> I'm sure we could work around it (restructuring our code, or using a
>> spinlock or priority mutex or something), but I wanted to ask on the
>> list first: is the behavior I described above (a ULT yielding on a mutex
>> lock, even if the mutex is available) expected? Or is it an indication
>> that we are doing something wrong somewhere? I want to make sure that I
>> understand the problem before altering the code.
>>
>> thanks!
>>
>> -Phil
>>
>>
>> _______________________________________________
>> discuss mailing list
>> discuss at lists.argobots.org
>> https://lists.argobots.org/mailman/listinfo/discuss