<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span>Hello Phil,</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span><br>
</span>
<div>Thank you. I understand the situation better now. <span style="font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">
In my understanding, a</span>ll of the following options can be implemented in the current Argobots. Some are less invasive, while others are easier to implement.</div>
<div><br>
</div>
<div>- Single <span style="margin: 0px; font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">execution stream + <span>one </span></span>pool + ABT_pool_pop_timedwait()</div>
<blockquote style="border-left: 3px solid rgb(200, 200, 200); border-top-color: rgb(200, 200, 200); border-right-color: rgb(200, 200, 200); border-bottom-color: rgb(200, 200, 200); padding-left: 1ex; margin-left: 0.8ex; color: rgb(102, 102, 102);">
<div>For that to work, though, the pool implementation would need to be able to inspect the ABT_unit at push() time and tell whether it is a newly created thread or a resumed thread so that it could track them separately.</div>
</blockquote>
<div>Without https://github.com/pmodels/argobots/issues/154, one can use a flag stored in a user-created descriptor corresponding to the thread to check whether that thread has already been executed. A hash table is a general solution, but it would be heavy.
In some applications, such a descriptor can be obtained via ABT_thread_get_arg().<br>
</div>
<div><br>
</div>
<div>Another way is to use a ULT-specific value (e.g., ABT_thread_get_specific()) to manage such a flag. A quick hack is using `ABT_thread_set_arg()` and `ABT_thread_get_arg()` to manage an execution flag, which may be faster than ABT_thread_set_specific()
and ABT_thread_get_specific() in the current Argobots implementation (related to https://github.com/pmodels/argobots/issues/159).</div>
<div><br>
</div>
<div>This idea is less invasive, but implementing a correct and reasonably scalable pool with flag management might not be an easy task.</div>
<div><br>
</div>
<div>
<div style="margin: 0px">
<div style="margin: 0px; font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255)">
- <span style="font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">
Single </span><span style="margin: 0px; font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">execution stream + multiple </span><span style="font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">pools</span>
+ ABT_pool_pop_timedwait()</div>
<div style="margin: 0px"><br>
</div>
Even if you use two pools (for example, with the scheduler I suggested in the previous mail), it should work well if these two pools (old-thread-pool and new-thread-pool) share the same Pthreads mutex/condition variable. The required change to the pool implementation
can be minimal.<br>
</div>
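<div style="margin: 0px"><br>
</div>
<div style="margin: 0px">A minimal sketch of the shared mutex/condition-variable idea, written in plain C with Pthreads rather than against the Argobots pool interface. All names here (unit_t, dual_pool_t, dual_pool_push(), dual_pool_pop_timedwait()) are hypothetical, unit_t is only a stand-in for ABT_unit, and the caller is assumed to already know whether a unit is resumed or new:</div>
<pre>
```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a work unit (a real pool would hold ABT_unit). */
typedef struct unit {
    struct unit *next;
    int id;
} unit_t;

typedef struct {
    unit_t *head, *tail;
} queue_t;

/* Two FIFO queues guarded by ONE mutex/condition variable, so a single
 * timed wait is woken by a push to either queue. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    queue_t resumed; /* plays the role of old-thread-pool */
    queue_t fresh;   /* plays the role of new-thread-pool */
} dual_pool_t;

static void queue_push(queue_t *q, unit_t *u) {
    u->next = NULL;
    if (q->tail) q->tail->next = u;
    else q->head = u;
    q->tail = u;
}

static unit_t *queue_pop(queue_t *q) {
    unit_t *u = q->head;
    if (u) {
        q->head = u->next;
        if (!q->head) q->tail = NULL;
    }
    return u;
}

/* Resumed units always win over fresh ones. Caller holds the mutex. */
static unit_t *pop_prioritized(dual_pool_t *p) {
    unit_t *u = queue_pop(&p->resumed);
    return u ? u : queue_pop(&p->fresh);
}

void dual_pool_init(dual_pool_t *p) {
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
    p->resumed.head = p->resumed.tail = NULL;
    p->fresh.head = p->fresh.tail = NULL;
}

void dual_pool_push(dual_pool_t *p, unit_t *u, int is_resumed) {
    assert(u != NULL);
    pthread_mutex_lock(&p->mutex);
    queue_push(is_resumed ? &p->resumed : &p->fresh, u);
    pthread_cond_signal(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

/* Wait up to timeout_ms for work in either queue; NULL on timeout. */
unit_t *dual_pool_pop_timedwait(dual_pool_t *p, long timeout_ms) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    pthread_mutex_lock(&p->mutex);
    unit_t *u = pop_prioritized(p);
    while (!u) {
        int rc = pthread_cond_timedwait(&p->cond, &p->mutex, &deadline);
        u = pop_prioritized(p); /* re-check after every wakeup */
        if (rc != 0) break;     /* timed out (or failed): give up */
    }
    pthread_mutex_unlock(&p->mutex);
    return u;
}
```
</pre>
<div style="margin: 0px">Because both queues signal the same condition variable, the waiter cannot sleep on one queue while the other has work, which is the problem with timed-waiting on two independent pools.</div>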
<div style="margin: 0px"><br>
</div>
</div>
<div style="margin: 0px">
<div style="margin: 0px; font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255)">
- Multiple execution streams + each has one pool + ABT_pool_pop_timedwait()</div>
<br>
</div>
<div style="margin: 0px">The easiest way that does not change the pool implementation is to use multiple execution streams: some for newly created threads (<span style="font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">these
execution streams</span> only check new-thread-pool) and the others for suspended threads (these execution streams only check old-thread-pool). The oversubscription cost should not be very high if these execution streams go to sleep immediately when no work
is available. This approach requires no change to the pool implementation but is very invasive.</div>
<div style="margin: 0px"><br>
</div>
<div>I would also like to note that <span style="font-family: Calibri, Arial, Helvetica, sans-serif; background-color: rgb(255, 255, 255); display: inline !important">
presently </span>there is no good example of a custom pool implementation in Argobots, so it is hard to show how to implement one in a reasonably scalable way. I will add a reasonable example in one to two days, which might be helpful.</div>
<div><br>
</div>
<div>Thanks,<br>
</div>
<div>Shintaro</div>
</div>
<div id="appendonsend"></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Carns, Philip H. &lt;carns@mcs.anl.gov&gt;<br>
<b>Sent:</b> Wednesday, July 8, 2020 7:59 AM<br>
<b>To:</b> Iwasaki, Shintaro &lt;siwasaki@anl.gov&gt;; discuss@lists.argobots.org &lt;discuss@lists.argobots.org&gt;<br>
<b>Subject:</b> Re: scheduling with priority for resumed ULTs</font>
<div> </div>
</div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
That's interesting.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
For us the issue of how to block on two pools would be a problem. I don't think we have any application-specific rules that would help; either pool could receive new work while the scheduler is blocked on a pop.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
Something along the lines of #154 that would allow some control over what happens within the pool data structure would be helpful, but it's not a high priority.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
In the meantime (since our use case is so simple) I wonder if we could do something within the confines of the current pool interface. The linked list pointers are not exposed to the caller (right?), so nothing is stopping a pool from maintaining multiple
linked lists internally if it wants to. Multiple work unit queues within a single pool could share a single internal condition/signalling mechanism for blocking pop calls.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
For that to work, though, the pool implementation would need to be able to inspect the ABT_unit at push() time and tell whether it is a newly created thread or a resumed thread so that it could track them separately.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
Is there any way to do that? It might not be a great idea from a software engineering perspective for a pool to dig too deep into the unit or thread data structures, but if there were something in there that could indicate if a thread had ever been run or
not, then we could hack it as a proof of concept to see if it makes a performance difference before spending time on something more invasive.<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
thanks,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
-Phil<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div id="x_appendonsend"></div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Iwasaki, Shintaro &lt;siwasaki@anl.gov&gt;<br>
<b>Sent:</b> Tuesday, July 7, 2020 6:44 PM<br>
<b>To:</b> discuss@lists.argobots.org &lt;discuss@lists.argobots.org&gt;<br>
<b>Cc:</b> Carns, Philip H. &lt;carns@mcs.anl.gov&gt;<br>
<b>Subject:</b> Re: scheduling with priority for resumed ULTs</font>
<div> </div>
</div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<span></span><span>Hello, Phil,<br>
</span></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<div><br>
</div>
<div>Thank you for your excellent question. The current Argobots does not provide a very straightforward way to do this.<br>
</div>
<div><br>
</div>
<div>1. The simplest idea<br>
</div>
<div><br>
</div>
<div>In my opinion, the easiest way should be one that uses two pools, new-thread-pool and old-thread-pool.<br>
</div>
<div>New threads/tasklets are pushed to one of the new-thread-pools. The user-defined scheduler looks like the following:<br>
</div>
<div><br>
</div>
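<div>sched_run():<br>
</div>
<div>&nbsp; ABT_unit unit;<br>
</div>
<div>&nbsp; while (1) {<br>
</div>
<div>&nbsp;&nbsp;&nbsp; /* ABT_pool_pop() returns the unit through its second argument. */<br>
</div>
<div>&nbsp;&nbsp;&nbsp; ABT_pool_pop(old_thread_pool, &amp;unit);<br>
</div>
<div>&nbsp;&nbsp;&nbsp; if (unit != ABT_UNIT_NULL) {<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* Prioritize resumed/yielded threads */<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ABT_xstream_run_unit(unit, old_thread_pool);<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; continue;<br>
</div>
<div>&nbsp;&nbsp;&nbsp; }<br>
</div>
<div>&nbsp;&nbsp;&nbsp; ABT_pool_pop(new_thread_pool, &amp;unit);<br>
</div>
<div>&nbsp;&nbsp;&nbsp; if (unit != ABT_UNIT_NULL) {<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ABT_unit_set_associated_pool(unit, old_thread_pool);<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* Threads are moved to old_thread_pool, so if this "unit" suspends or yields, it is<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; * pushed to old_thread_pool, which will be prioritized over new threads. */<br>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ABT_xstream_run_unit(unit, old_thread_pool);<br>
</div>
<div>&nbsp;&nbsp;&nbsp; }<br>
</div>
<div>&nbsp; }<br>
</div>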
<div><br>
</div>
<div>However, this scheduler may cause a deadlock with certain dependencies. In the following example, thread2 is never scheduled because thread1 always stays in old_thread_pool.<br>
</div>
<div><br>
</div>
<div>g_flag = 0;<br>
</div>
<div>void thread1() {<br>
</div>
<div> ABT_thread_create(thread2, ... new_thread_pool); /* newly created thread is pushed to new_thread_pool */<br>
</div>
<div> while (g_flag == 0)<br>
</div>
<div> ABT_thread_yield(); /* thread1 was associated with old_thread_pool when thread1 was scheduled for the first time. */<br>
</div>
<div>}<br>
</div>
<div>void thread2() {<br>
</div>
<div> g_flag = 1;<br>
</div>
<div>}<br>
</div>
<div><br>
</div>
<div>To avoid this, the scheduler can sometimes check and run threads in new_thread_pool (for example, every N iterations).<br>
</div>
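<div><br>
</div>
<div>One possible way to express the "every N iterations" rule is sketched below in plain C. The helper pick_pool() and its parameters are hypothetical and not part of Argobots; old_ready and new_ready stand for whatever emptiness check the scheduler uses on each pool:</div>
<pre>
```c
/* Which pool the scheduler should pop from in this iteration. */
typedef enum { PICK_NONE, PICK_OLD, PICK_NEW } pick_t;

/* Hypothetical helper: old_ready/new_ready say whether each pool currently
 * has a unit; period is the "N" above (0 disables the periodic check). */
pick_t pick_pool(unsigned long iter, unsigned long period,
                 int old_ready, int new_ready) {
    /* Every period-th iteration, give new_thread_pool the first chance so
     * that a new thread cannot be starved by threads that keep yielding. */
    int favor_new = (period != 0) ? (iter % period == period - 1) : 0;
    if (favor_new && new_ready) return PICK_NEW;
    if (old_ready) return PICK_OLD;
    if (new_ready) return PICK_NEW;
    return PICK_NONE;
}
```
</pre>
<div>With a nonzero period, thread2 in the example above would be picked within N scheduler iterations even while thread1 keeps yielding, so g_flag would eventually be set.</div>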
<div><br>
</div>
<div><br>
</div>
<div>2. Does it work with ABT_pool_pop_timedwait() (i.e., ABT_POOL_FIFO_WAIT)?<br>
</div>
<div><br>
</div>
<div>ABT_pool_pop_timedwait() takes only a single pool; users cannot timed-wait on multiple pools. Consider using ABT_pool_pop_timedwait() instead of ABT_pool_pop() in the scheduler I mentioned above. In general, the scheduler can end up timed-waiting (= sleeping) on
<span style="font-family:Calibri,Arial,Helvetica,sans-serif; background-color:rgb(255,255,255); display:inline!important">
either old_thread_pool or new_thread_pool</span> even though the other pool has threads. If there is application-specific knowledge (e.g., old_thread_pool can be empty only when new_thread_pool is empty), ABT_pool_pop_timedwait() + the scheduling strategy
above is a good idea, though.<br>
</div>
<div><br>
</div>
<div>For now, there is no general solution. One idea is to use more execution streams: some ESs are dedicated to new-thread-pool and the other ESs to old-thread-pool. If they sleep in ABT_pool_pop_timedwait(), the performance penalty of oversubscription
should be small.<br>
</div>
<div><br>
</div>
<div>Creating a customized pool is another way (e.g., marking a thread when it is scheduled for the first time and managing newly created threads and suspended threads separately in a pool), but it is complicated.<br>
</div>
<div><br>
</div>
<div>The fundamental solution should be to allow different pool operations for yield/create/suspend/... (e.g., pushing to the head of the list on creation but to the tail of the list on suspension: https://github.com/pmodels/argobots/issues/154),
but it is under development. If this option is the most promising, I will prioritize it.<br>
</div>
<div><br>
</div>
<div>If you have any questions, please let us know.<br>
</div>
<div><br>
</div>
<div>Thanks,<br>
</div>
<div>Shintaro<br>
</div>
<span></span><span></span><br>
</div>
<div id="x_x_appendonsend"></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Carns, Philip H. via discuss &lt;discuss@lists.argobots.org&gt;<br>
<b>Sent:</b> Tuesday, July 7, 2020 4:40 PM<br>
<b>To:</b> discuss@lists.argobots.org &lt;discuss@lists.argobots.org&gt;<br>
<b>Cc:</b> Carns, Philip H. &lt;carns@mcs.anl.gov&gt;<br>
<b>Subject:</b> [argobots-discuss] scheduling with priority for resumed ULTs</font>
<div> </div>
</div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
Hi all,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
I thought this question may be of general interest so I am asking on the mailing list.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
My understanding is that the default pool/scheduler combination uses FIFO ordering. Suppose we wanted to try a slight variation: FIFO ordering, but with resumed ULTs always taking priority over new ULTs that have not yet begun execution.<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
The use case for this would be for a data service to expedite requests that are already in progress (and were suspended while waiting on disk or network activity) to try to get them out of the system before starting to process new requests, assuming that there
is work available in either category. We create a new ULT for every incoming request. Under heavy client process load it is plausible that the final step(s) of servicing an existing request could be delayed behind newly incoming requests, but we haven't
confirmed this empirically yet.<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
What would be the easiest way to accomplish this? I think I can find a way to do it, but it probably would not be the cleverest solution
<span id="x_x_x_🙂">🙂</span><br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
FWIW we usually use ABT_POOL_FIFO_WAIT and ABT_SCHED_BASIC_WAIT rather than the default pool and scheduler, but I don't think that should change anything. They are based on the default pool and scheduler and only differ in terms of their idle behavior.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
thanks!</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)">
-Phil<br>
</div>
</div>
</div>
</div>
</body>
</html>