Commit ce9df20

author
MFC Action
committed
Docs @ 8a24386
1 parent 90e855c commit ce9df20

1 file changed

+2
-2
lines changed

documentation/md_running.html

Lines changed: 2 additions & 2 deletions
@@ -189,8 +189,8 @@ <h3><a class="anchor" id="autotoc_md111"></a>
 <h3><a class="anchor" id="autotoc_md112"></a>
 AMD GPUs</h3>
 <ul>
-<li>Rocprof (ROC): <code>./mfc.sh run ... -t simulation --roc --hip-trace [rocprof flags]</code> allows one to visualize MFC's system-wide performance with <a href="https://ui.perfetto.dev/">Perfetto UI</a>. When used, <code>--roc</code> will run the simulation and generate files in the case directory for all targets. <code>results.json</code> can then be imported in <a href="https://ui.perfetto.dev/">Perfetto's UI</a>. Learn more about AMD Rocprof <a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-5.5.1/rocprof.html">here</a> It is best to run case files with few timesteps to keep the report file sizes manageable.</li>
-<li>Omniperf (OMNI): <code>./mfc.sh run ... -t simulation --omni [omniperf flags]</code> allows one to conduct kernel-level profiling with <a href="https://rocm.docs.amd.com/projects/omniperf/en/latest/index.html">AMD's Omniperf</a>. When used, <code>--omni</code> will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.</li>
+<li>Rocprof Systems (RSYS): <code>./mfc.sh run ... -t simulation --rsys --hip-trace [rocprof flags]</code> allows one to visualize MFC's system-wide performance with <a href="https://ui.perfetto.dev/">Perfetto UI</a>. When used, <code>--roc</code> will run the simulation and generate files in the case directory for all targets. <code>results.json</code> can then be imported in <a href="https://ui.perfetto.dev/">Perfetto's UI</a>. Learn more about AMD Rocprof <a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-5.5.1/rocprof.html">here</a> It is best to run case files with few timesteps to keep the report file sizes manageable.</li>
+<li>Rocprof Compute (RCU): <code>./mfc.sh run ... -t simulation --rcu -n &lt;name&gt; [rocprof-compute flags]</code> allows one to conduct kernel-level profiling with <a href="https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/what-is-rocprof-compute.html">ROCm Compute Profiler</a>. When used, <code>--rcu</code> will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.</li>
 </ul>
 <p><a class="anchor" id="restarting-cases"></a> </p>
 <h2><a class="anchor" id="autotoc_md113"></a>
