[argobots-discuss] Argobots ABT_eventual_set too slow

Houjun Tang htang4 at lbl.gov
Wed Apr 24 13:59:43 CDT 2019


Hi Pavan,

OK, I will try that, and will send an update on that later.


Thanks,
Houjun Tang

On Wed, Apr 24, 2019 at 11:51 AM Balaji, Pavan <balaji at anl.gov> wrote:

> Houjun,
>
> You could try BOLT, which allows you to have OpenMP internally use
> Argobots too.  So they won't conflict with each other.
>
>   -- Pavan
>
> > On Apr 24, 2019, at 1:49 PM, Houjun Tang via discuss <
> discuss at lists.argobots.org> wrote:
> >
> > Hi,
> >
> > I think I found what is causing the performance slowdown. My
> previous experiments were using the default application configuration, with
> OpenMP enabled. I've just tried to compile the application without OpenMP,
> and the performance gets much better: ABT_eventual_set takes less than
> 0.00002 seconds in all operations. So it looks like running Argobots with
> OpenMP may sometimes cause a slowdown. Any idea on how to resolve this?
> >
> >
> > Thanks,
> > Houjun Tang
> >
> > On Tue, Apr 23, 2019 at 4:43 PM Houjun Tang <htang4 at lbl.gov> wrote:
> > Hi Phil,
> >
> > Thanks for the suggestion. I'm using the basic scheduler, and have
> changed to ABT_SCHED_BASIC as you mentioned, but the issue remains.
> >
> > I talked to Shintaro earlier this afternoon and sent him my code to
> run the experiments; hopefully we can figure out what is going on.
> >
> >
> > Thanks,
> > Houjun Tang
> >
> > On Tue, Apr 23, 2019 at 8:16 AM Carns, Philip H. <carns at mcs.anl.gov>
> wrote:
> > Hi Houjun,
> >
> > Which scheduler are you using with Argobots?
> >
> > If you can reproduce this easily it might be worth toggling between
> ABT_SCHED_BASIC and ABT_SCHED_BASIC_WAIT (whichever one you are not using)
> to narrow down a possible issue there.  I don't expect a problem there, but
> something unusual must be going on.  In my experience eventual_set() is an
> inexpensive call.
> >
> > thanks,
> > -Phil
> > From: Iwasaki, Shintaro via discuss <discuss at lists.argobots.org>
> > Sent: Monday, April 22, 2019 9:45 AM
> > To: Houjun Tang
> > Cc: Iwasaki, Shintaro; discuss at lists.argobots.org
> > Subject: Re: [argobots-discuss] Argobots ABT_eventual_set too slow
> >
> > Hi, Houjun,
> >
> > Thank you for your detailed explanation! There are two possible issues:
> >
> > 1. Because tasklet (= task) is nonpreemptive, the current Argobots-aware
> HDF5 code might fail to overlap communications. For example,
> ABT_eventual_set uses busy-wait based synchronization (
> https://github.com/pmodels/argobots/blob/master/src/eventual.c#L260) if
> tasklet is used. If ULT (= thread) is used, it does user-level
> context-switch based synchronization (
> https://github.com/pmodels/argobots/blob/master/src/eventual.c#L257).
> However, ultimately it depends on how the code is written. This problem
> should be addressed by changing the HDF5 implementation (e.g., by using
> ULTs properly), if this is the case.
> >
> > 2. Currently the Argobots eventual uses a simple linked list, which
> performs poorly if many threads and tasks are waiting. This performance
> should be improved by more advanced management of waiting threads in the
> Argobots runtime.
> >
> > From your explanation, I cannot tell which problem is more
> significant; intuitively, this operation itself does not seem to consume
> 0.1 seconds or so (though it depends on how many tasks are in the linked
> list).
> > I am happy to examine your code if it is okay, but I would really
> appreciate it if you would give me a simplified code that reproduces your
> performance issue.
> >
> > Thank You,
> > Shintaro Iwasaki
> >
> > On Fri, Apr 19, 2019 at 5:57 PM Houjun Tang <htang4 at lbl.gov> wrote:
> > Hi Shintaro,
> >
> > Thanks for the quick reply, here is a brief summary of what I have been
> working on.
> > I'm adding the asynchronous I/O feature to the HDF5 library using
> Argobots as the background thread execution engine. So whenever there is an
> HDF5 I/O call, the main thread will create an Argobots task (with
> ABT_task_create) and have it executed by Argobots in the background. Only
> one Argobots pool is used, so consecutive tasks are executed sequentially.
> The application code linked to the async HDF5 library is creating and
> writing a lot of HDF5 attributes. As seen in the figure I sent previously,
> the time taken by ABT_eventual_set varies greatly, from 2 us to 0.45 s, with
> an average of ~0.07 s.
> >
> > Is there any additional information you would like to know? Do you want the code?
> >
> >
> > Regards,
> > Houjun Tang
> >
> > On Fri, Apr 19, 2019 at 2:11 PM Iwasaki, Shintaro via discuss <
> discuss at lists.argobots.org> wrote:
> > Hello Houjun,
> >
> > Thank you for reporting a performance issue with data!
> > Unfortunately, I haven't experienced this issue. I checked the code, but
> it is hard to judge if the implementation of ABT_eventual_set is bad or
> not. As far as I checked the implementation of ABT_eventual_set, this
> function does not look very optimized (I mean, it uses a naive spinlock),
> but does not seem very slow (I mean, it does not allocate memory every time).
> > https://github.com/pmodels/argobots/blob/master/src/eventual.c#L229
> >
> > In any case, this single operation should be finished within 1us or less
> (under no contention). I guess it might be caused by a scheduling issue or
> an affinity issue, but since the performance of this function has not been
> fully examined, the current implementation might have some performance
> bugs. I could diagnose this problem more if you would give me more details.
> >
> > Thank You,
> > Shintaro Iwasaki
> >
> > On Fri, Apr 19, 2019 at 2:56 PM Houjun Tang via discuss <
> discuss at lists.argobots.org> wrote:
> > Hi,
> >
> > I'm using Argobots as the engine for executing asynchronous I/O
> operations in the background of an HDF5 application, but found it to be
> slow in some operations. With profiling, the slowdown comes mostly from
> ABT_eventual_set. Below is a boxplot of the ABT_eventual_set time (measured
> by calling gettimeofday before and after it) from 385 operations, running
> with one process and one Argobots thread. The *_fn are different functions
> executed by Argobots. In most cases it's below 0.1s, but there are several
> cases that are taking more than 0.25 seconds. As these HDF5 operations take
> less than 0.1 seconds, the overhead of ABT_eventual_set becomes dominant.
> >
> > Any idea what could have caused this?
> >
> > Thanks,
> > Houjun Tang
> >
> >
> > _______________________________________________
> > discuss mailing list
> > discuss at lists.argobots.org
> > https://lists.argobots.org/mailman/listinfo/discuss
>
>