Docs @ c78933f

MFC Action · MFC Action · commit ba1a3132b4af · 2024-03-09T22:39:50.000Z
diff --git a/documentation/md_expectedPerformance.html b/documentation/md_expectedPerformance.html
@@ -137,40 +137,24 @@
 <div class="textblock"><p><a class="anchor" id="autotoc_md53"></a> MFC has been benchmarked on several CPUs and GPU devices. This page shows a summary of these results.</p>
 <h1><a class="anchor" id="autotoc_md54"></a>
 Expected time-steps/hour</h1>
-<p>The following table outlines expected performance in terms of the number of time steps per hour, rounded to the nearest hundred (higher is better). A 3D inviscid, 6-equation problem is solved for various problem sizes (grid cells) and hardware. A 3rd order (3-stage) Runge-Kutta time-stepper is used. CPU results utilize an entire processor die.</p>
+<p>The following table reports observed performance in nanoseconds per grid point (ns/GP) per right-hand side (RHS) evaluation (lower is better). The test case is a 3D, inviscid, 5-equation model problem with two advected species (8 PDEs in total). The numerics are WENO5 reconstruction and the HLLC approximate Riemann solver. Results are given for several hardware platforms and for various numbers of grid points per CPU die (or GPU device).</p>
 <table class="markdownTable">
 <tr class="markdownTableHead">
-<th class="markdownTableHeadRight">Hardware   </th><th class="markdownTableHeadCenter"># Cores   </th><th class="markdownTableHeadCenter">Steps/Hr (1M pts)   </th><th class="markdownTableHeadCenter">Steps/Hr (4M pts)   </th><th class="markdownTableHeadCenter">Steps/Hr (8M pts)   </th><th class="markdownTableHeadCenter">Compiler   </th><th class="markdownTableHeadLeft">Computer    </th></tr>
+<th class="markdownTableHeadRight">Hardware   </th><th class="markdownTableHeadCenter">Compute units   </th><th class="markdownTableHeadCenter">1M GPs   </th><th class="markdownTableHeadCenter">4M GPs   </th><th class="markdownTableHeadCenter">8M GPs   </th><th class="markdownTableHeadCenter">Compiler   </th><th class="markdownTableHeadLeft">Computer    </th></tr>
 <tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">88.5k   </td><td class="markdownTableBodyCenter">18.7k   </td><td class="markdownTableBodyCenter">N/A   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
+<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 device   </td><td class="markdownTableBodyCenter">96   </td><td class="markdownTableBodyCenter">104   </td><td class="markdownTableBodyCenter">104   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
 <tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">78.8k   </td><td class="markdownTableBodyCenter">18.8k   </td><td class="markdownTableBodyCenter">N/A   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">OLCF Summit    </td></tr>
+<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 device   </td><td class="markdownTableBodyCenter">101   </td><td class="markdownTableBodyCenter">104   </td><td class="markdownTableBodyCenter">104   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">OLCF Summit    </td></tr>
 <tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">NVIDIA A100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">114.4k   </td><td class="markdownTableBodyCenter">34.6k   </td><td class="markdownTableBodyCenter">16.5k   </td><td class="markdownTableBodyCenter">NVHPC 23.5   </td><td class="markdownTableBodyLeft">Wingtip    </td></tr>
+<td class="markdownTableBodyRight">NVIDIA A100   </td><td class="markdownTableBodyCenter">1 device   </td><td class="markdownTableBodyCenter">71   </td><td class="markdownTableBodyCenter">56   </td><td class="markdownTableBodyCenter">59   </td><td class="markdownTableBodyCenter">NVHPC 23.5   </td><td class="markdownTableBodyLeft">Wingtip    </td></tr>
 <tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">AMD MI250X   </td><td class="markdownTableBodyCenter">1 (GCD)   </td><td class="markdownTableBodyCenter">77.5k   </td><td class="markdownTableBodyCenter">22.3k   </td><td class="markdownTableBodyCenter">11.2k   </td><td class="markdownTableBodyCenter">CCE 16.0.1   </td><td class="markdownTableBodyLeft">OLCF Frontier    </td></tr>
+<td class="markdownTableBodyRight">AMD MI250X   </td><td class="markdownTableBodyCenter">1 GCD   </td><td class="markdownTableBodyCenter">108   </td><td class="markdownTableBodyCenter">90   </td><td class="markdownTableBodyCenter">96   </td><td class="markdownTableBodyCenter">CCE 16.0.1   </td><td class="markdownTableBodyLeft">OLCF Frontier    </td></tr>
 <tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">Intel Xeon Gold 6226   </td><td class="markdownTableBodyCenter">12 (cores)   </td><td class="markdownTableBodyCenter">2.5k   </td><td class="markdownTableBodyCenter">0.7k   </td><td class="markdownTableBodyCenter">0.4k   </td><td class="markdownTableBodyCenter">GNU 10.3.0   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
+<td class="markdownTableBodyRight">Intel Xeon Gold 6226   </td><td class="markdownTableBodyCenter">12 cores   </td><td class="markdownTableBodyCenter">1963   </td><td class="markdownTableBodyCenter">1688   </td><td class="markdownTableBodyCenter">1686   </td><td class="markdownTableBodyCenter">GNU 10.3.0   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
 <tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">Apple Silicon M2   </td><td class="markdownTableBodyCenter">6 (cores)   </td><td class="markdownTableBodyCenter">2.8k   </td><td class="markdownTableBodyCenter">0.6k   </td><td class="markdownTableBodyCenter">0.2k   </td><td class="markdownTableBodyCenter">GNU 13.2.0   </td><td class="markdownTableBodyLeft">N/A   </td></tr>
-</table>
-<p>We also show the expected performance of MFC for the same problem as above, except for the 5-equation model used, in the table below. It is presented in the same manner as the one above.</p>
-<table class="markdownTable">
-<tr class="markdownTableHead">
-<th class="markdownTableHeadRight">Hardware   </th><th class="markdownTableHeadCenter"># Cores   </th><th class="markdownTableHeadCenter">Steps/Hr (1M pts)   </th><th class="markdownTableHeadCenter">Steps/Hr (4M pts)   </th><th class="markdownTableHeadCenter">Steps/Hr (8M pts)   </th><th class="markdownTableHeadCenter">Compiler   </th><th class="markdownTableHeadLeft">Computer    </th></tr>
-<tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">113.4k   </td><td class="markdownTableBodyCenter">26.2k   </td><td class="markdownTableBodyCenter">13.0k   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
-<tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">NVIDIA V100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">107.7k   </td><td class="markdownTableBodyCenter">26.3k   </td><td class="markdownTableBodyCenter">13.1k   </td><td class="markdownTableBodyCenter">NVHPC 22.11   </td><td class="markdownTableBodyLeft">OLCF Summit    </td></tr>
-<tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">NVIDIA A100   </td><td class="markdownTableBodyCenter">1 (device)   </td><td class="markdownTableBodyCenter">153.5k   </td><td class="markdownTableBodyCenter">48.0k   </td><td class="markdownTableBodyCenter">22.5k   </td><td class="markdownTableBodyCenter">NVHPC 23.5   </td><td class="markdownTableBodyLeft">Wingtip    </td></tr>
-<tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">AMD MI250X   </td><td class="markdownTableBodyCenter">1 (GCD)   </td><td class="markdownTableBodyCenter">104.2k   </td><td class="markdownTableBodyCenter">31.0k   </td><td class="markdownTableBodyCenter">14.8k   </td><td class="markdownTableBodyCenter">CCE 16.0.1   </td><td class="markdownTableBodyLeft">OLCF Frontier    </td></tr>
-<tr class="markdownTableRowOdd">
-<td class="markdownTableBodyRight">Intel Xeon Gold 6226   </td><td class="markdownTableBodyCenter">12 (cores)   </td><td class="markdownTableBodyCenter">5.4k   </td><td class="markdownTableBodyCenter">1.6k   </td><td class="markdownTableBodyCenter">0.8k   </td><td class="markdownTableBodyCenter">GNU 10.3.0   </td><td class="markdownTableBodyLeft">PACE Phoenix    </td></tr>
-<tr class="markdownTableRowEven">
-<td class="markdownTableBodyRight">Apple Silicon M2   </td><td class="markdownTableBodyCenter">6 (cores)   </td><td class="markdownTableBodyCenter">3.7k   </td><td class="markdownTableBodyCenter">11.0k   </td><td class="markdownTableBodyCenter">0.3k   </td><td class="markdownTableBodyCenter">GNU 13.2.0   </td><td class="markdownTableBodyLeft">N/A   </td></tr>
+<td class="markdownTableBodyRight">Apple M2   </td><td class="markdownTableBodyCenter">6 cores   </td><td class="markdownTableBodyCenter">2919   </td><td class="markdownTableBodyCenter">245   </td><td class="markdownTableBodyCenter">4500   </td><td class="markdownTableBodyCenter">GNU 13.2.0   </td><td class="markdownTableBodyLeft">N/A   </td></tr>
 </table>
+<p><b>All results are in nanoseconds (ns) per grid point (GP) per right-hand side (RHS) evaluation. Lower is better.</b></p>
 <h1><a class="anchor" id="autotoc_md55"></a>
 Weak scaling</h1>
 <p>Weak scaling results are obtained by increasing the problem size with the number of processes so that work per process remains constant.</p>
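The ns/GP metric introduced in the table above can be related to wall-clock time with a little arithmetic. The sketch below is illustrative only: `grindTimeNs` and `secondsPerRhs` are hypothetical helper names, not part of MFC, and "8M GPs" is assumed to mean exactly 8e6 grid points. How many RHS evaluations make up one time step depends on the time-stepper, which the new text does not specify, so the caller supplies that count.

```javascript
// Hypothetical helpers for the ns/GP/RHS metric; names are illustrative, not MFC APIs.

// Grind time: wall-clock nanoseconds per grid point per RHS evaluation.
function grindTimeNs(wallSeconds, gridPoints, rhsEvaluations) {
    if (gridPoints <= 0 || rhsEvaluations <= 0) {
        throw new RangeError("gridPoints and rhsEvaluations must be positive");
    }
    return (wallSeconds * 1e9) / (gridPoints * rhsEvaluations);
}

// Inverse: estimated wall-clock seconds for a single RHS evaluation.
function secondsPerRhs(nsPerGridPoint, gridPoints) {
    return (nsPerGridPoint * gridPoints) / 1e9;
}

// A100 row, 8M column of the table: 59 ns/GP, assuming 8e6 grid points.
const a100Seconds = secondsPerRhs(59, 8e6);
console.log(a100Seconds.toFixed(3) + " s per RHS evaluation");
```

Multiplying the result of `secondsPerRhs` by the number of RHS evaluations per time step gives an estimate of the wall time per step on one device.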
diff --git a/index.html b/index.html
@@ -72,12 +72,12 @@
         };
     
         const sims = [
-            new FS("Shedding water droplet - Vorticity", "res/simulations/a.png", "Summit", "960 V100s", "4h", "https://doi.org/10.48550/arXiv.2305.09163"),
-            new FS("Inviscid flow over an airfoil - Vorticity", "res/simulations/g.png", "Delta", "128 A100s", "19h", "https://doi.org/10.48550/arXiv.2305.09163"),
-            new FS("Kidney stone near a collapsing bubble cloud - Maximum principal stresses", "res/simulations/d.png", "Summit", "576 V100s", "30 min", "https://doi.org/10.48550/arXiv.2305.09163"),
+            new FS("Shedding water droplet", "res/simulations/a.png", "Summit", "960 V100s", "4h", "https://doi.org/10.48550/arXiv.2305.09163"),
+            new FS("Flow over an airfoil (vorticity)", "res/simulations/g.png", "Delta", "128 A100s", "19h", "https://vimeo.com/917305340/c05fd414c8?share=copy"),
+            new FS("Cavitating bubbles fragment kidney stone", "res/simulations/d.png", "Summit", "576 V100s", "30 min", "https://doi.org/10.48550/arXiv.2305.09163"),
             new FS("Breakup of vibrated interface", "res/simulations/f.png", "Summit", "128 V100s", "4h","https://youtu.be/qQV2ZRDpf2M"), 
-            new FS("Cavitating bubble cloud - Wall pressure", "res/simulations/b.png", "Summit", "216 V100s", "3h", "https://doi.org/10.48550/arXiv.2305.09163"),
-            new FS("Cavitating bubble cloud - Streamlines", "res/simulations/c.png", "Summit", "216 V100s", "3h", "https://doi.org/10.48550/arXiv.2305.09163"),
+            new FS("Collapsing bubbles (pressure)", "res/simulations/b.png", "Summit", "216 V100s", "3h", "https://doi.org/10.48550/arXiv.2305.09163"),
+            new FS("Collapsing bubbles (streamlines)", "res/simulations/c.png", "Summit", "216 V100s", "3h", "https://doi.org/10.48550/arXiv.2305.09163"),
         ];
 
         /*