<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
code
{mso-style-priority:99;
font-family:"Courier New";}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="FR" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi Phil,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I think we hit the same issue recently on the DAOS side and had to bump the stack size as well. Wangdi &amp; Xuezhao should know more.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Maybe a regression in ABT?<br>
<br>
Cheers,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Johann<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-left:36.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">"Carns, Philip H. via discuss" &lt;discuss@lists.argobots.org&gt;<br>
<b>Reply-To: </b>"discuss@lists.argobots.org" &lt;discuss@lists.argobots.org&gt;<br>
<b>Date: </b>Thursday, 21 February 2019 at 15:50<br>
<b>To: </b>"discuss@lists.argobots.org" &lt;discuss@lists.argobots.org&gt;<br>
<b>Cc: </b>"Carns, Philip H." &lt;carns@mcs.anl.gov&gt;<br>
<b>Subject: </b>Re: [argobots-discuss] how to debug a stack overrun in Argobots<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><br>
Just to follow up a little bit: I realized from looking at README.envvar just now that the default value of ABT_THREAD_STACKSIZE is 16K. That's almost certainly too low for us because we have ULTs that make calls into a variety of system libraries (including
fairly big things like libfabric) that are beyond our control.<br>
<br>
It seems likely that we will have to run with a larger stack size, but I would still like to have a better understanding of where the problem paths are, and how much headroom we really need, if anyone has suggestions.<br>
<br>
thanks!<br>
-Phil<o:p></o:p></p>
</div>
</div>
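<p class="MsoNormal" style="margin-left:36.0pt">One generic way to estimate how much headroom a code path needs is stack painting: fill the stack with a known byte pattern, run the code, then scan for the deepest byte that was overwritten. The sketch below is not an Argobots API. It assumes Linux with glibc's <code>ucontext</code> functions and a downward-growing stack, and the names <code>sample_workload</code> and <code>measure_stack_usage</code> are hypothetical. The same idea could in principle be applied to a ULT stack if the application allocates that stack itself.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

#define PAINT_BYTE 0xA5 /* arbitrary sentinel written over the whole stack */

static ucontext_t caller_ctx, probe_ctx;

/* Hypothetical stand-in for a ULT body: touches roughly 8 KiB of stack. */
static void sample_workload(void)
{
    volatile char scratch[8192];
    for (size_t i = 0; i < sizeof(scratch); i++)
        scratch[i] = (char)(i & 0x7f); /* volatile writes are not elided */
}

/* Run fn on a freshly painted private stack and return the number of bytes
 * between the deepest overwritten byte and the top of the stack.
 * Returns 0 on allocation or context setup failure. A result equal to
 * stack_size means the paint was consumed entirely (likely overflow). */
static size_t measure_stack_usage(void (*fn)(void), size_t stack_size)
{
    assert(fn != NULL && stack_size > 0);

    unsigned char *stack = malloc(stack_size);
    if (stack == NULL)
        return 0;
    memset(stack, PAINT_BYTE, stack_size);

    if (getcontext(&probe_ctx) != 0) {
        free(stack);
        return 0;
    }
    probe_ctx.uc_stack.ss_sp = stack;
    probe_ctx.uc_stack.ss_size = stack_size;
    probe_ctx.uc_link = &caller_ctx; /* resume the caller when fn returns */
    makecontext(&probe_ctx, fn, 0);
    if (swapcontext(&caller_ctx, &probe_ctx) != 0) {
        free(stack);
        return 0;
    }

    /* The stack grows downward, so untouched paint remains at the low
     * addresses; scan upward until the first byte that was overwritten. */
    size_t untouched = 0;
    while (untouched < stack_size && stack[untouched] == PAINT_BYTE)
        untouched++;

    free(stack);
    return stack_size - untouched;
}
```

<p class="MsoNormal" style="margin-left:36.0pt">Calling <code>measure_stack_usage(sample_workload, 64 * 1024)</code> should report a little over 8 KiB. The figure is a lower bound on the real requirement: a frame that reserves space without writing to it leaves the paint intact, and a data-dependent path that was not exercised is not counted.</p>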
<p class="MsoNormal" style="margin-left:36.0pt"> <o:p></o:p></p>
<div>
<p style="margin-left:36.0pt">On 2019-02-21 15:31:53-05:00 Carns, Philip H. via discuss wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 8.0pt;margin-left:0cm;margin-right:0cm">
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt">Hi all, <br>
<br>
There is a little bit of back story on <a href="https://github.com/pmodels/argobots/issues/93">https://github.com/pmodels/argobots/issues/93</a>, but to make a long story short, we have realized that some of our code is overflowing the stack in Argobots.
Many thanks to Shintaro for his help and insight, or we might never have figured this out. We can work around the problem with <code><span style="font-size:10.0pt">export ABT_THREAD_STACKSIZE=$((1024 * 1024))</span></code>. This not only fixes a Power8 test case for us,
but also appears to solve a different frustrating, nonsensical segmentation fault that we've been chasing with a different code permutation on x86_64.<br>
<br>
Any suggestions on how to track down what's triggering this in our code, or how to get a better idea of how much stack we need? We are using a considerable number of libraries, many of which are not maintained
by us, so I don't even know where to start looking yet. My usual go-to tool for this would be ASan in gcc or clang, but I don't think that will work correctly with Argobots, and maybe there is a better solution anyway. <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:36.0pt"><br>
thanks,<br>
-Phil<o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
</div>
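<p class="MsoNormal" style="margin-left:36.0pt">For locating the code that overruns, a common generic technique is to place an inaccessible guard page directly below the stack, so that an overrun faults at the offending instruction instead of silently corrupting neighbouring memory. The sketch below illustrates the mechanism only and is not taken from Argobots. It assumes Linux, and the helper names <code>alloc_guarded_stack</code>, <code>free_guarded_stack</code> and <code>probe_guard_page</code> are hypothetical. The overrun is simulated in a forked child so that the parent can observe the resulting signal.</p>

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map `usable` bytes of stack with one inaccessible guard page directly
 * below. On success returns the lowest usable address and stores the total
 * mapping length in *map_len; returns NULL on failure. */
static char *alloc_guarded_stack(size_t usable, size_t *map_len)
{
    assert(usable > 0 && map_len != NULL);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = usable + page;
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base, page, PROT_NONE) != 0) { /* revoke all access */
        munmap(base, len);
        return NULL;
    }
    *map_len = len;
    return base + page;
}

/* Release a stack obtained from alloc_guarded_stack, guard page included. */
static void free_guarded_stack(char *stack, size_t map_len)
{
    munmap(stack - (size_t)sysconf(_SC_PAGESIZE), map_len);
}

/* Simulate an overrun in a child process by writing one byte just below the
 * usable region. Returns the signal that killed the child, 0 if it exited
 * normally, or -1 on setup failure. */
static int probe_guard_page(size_t usable)
{
    size_t map_len = 0;
    char *stack = alloc_guarded_stack(usable, &map_len);
    if (stack == NULL)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        free_guarded_stack(stack, map_len);
        return -1;
    }
    if (pid == 0) {
        volatile char *below = stack - 1;
        *below = 1; /* lands in the guard page */
        _exit(0);   /* reached only if the guard page did not trap */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free_guarded_stack(stack, map_len);
    if (waited != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

<p class="MsoNormal" style="margin-left:36.0pt">With a 16K usable region, matching the default stack size mentioned in this thread, the child is expected to be killed by SIGSEGV on Linux. In a real run, a debugger or core dump taken at that fault would show the call chain that ran off the end of the stack. The cost is one extra page per stack plus the <code>mprotect</code> call.</p>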
</div>
</body>
</html>