DOI: 10.1145/3416315.3416323
research-article
Open access

MPI Detach - Asynchronous Local Completion

Published: 07 October 2020

Abstract

When aiming for large-scale parallel computing, waiting time due to network latency, synchronization, and load imbalance is the primary obstacle to high parallel efficiency. A common approach to hiding latency behind computation is the use of non-blocking communication. In the presence of a consistent load imbalance, synchronization cost is just the visible symptom of that imbalance. Tasking approaches such as OpenMP, TBB, OmpSs, or C++20 coroutines promise to expose a higher degree of concurrency, which can be distributed across the available execution units and significantly improve load balance. The available MPI non-blocking functionality does not integrate seamlessly into such task-based parallelization. In this work, we present a slim extension of the MPI interface that allows seamless integration of non-blocking communication with the existing concepts of asynchronous execution in OpenMP and C++.


Published In

EuroMPI/USA '20: Proceedings of the 27th European MPI Users' Group Meeting
September 2020
88 pages
ISBN:9781450388801
DOI:10.1145/3416315
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • European Union's Horizon 2020

Conference

EuroMPI/USA '20: 27th European MPI Users' Group Meeting
September 21 - 24, 2020
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 66 of 139 submissions, 47%

Cited By

  • (2024) An Overview on Mixing MPI and OpenMP Dependent Tasking on A64FX. Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops, 7-16. https://doi.org/10.1145/3636480.3637094. Online publication date: 11-Jan-2024
  • (2023) A Shim Layer for Transparently Adding Meta Data to MPI Handles. Proceedings of the 30th European MPI Users' Group Meeting, 1-9. https://doi.org/10.1145/3615318.3615324. Online publication date: 11-Sep-2023
  • (2023) Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct. OpenMP: Advanced Task-Based, Device and Compiler Programming, 66-80. https://doi.org/10.1007/978-3-031-40744-4_5. Online publication date: 1-Sep-2023
  • (2022) Enhancing MPI+OpenMP Task Based Applications for Heterogeneous Architectures with GPU Support. OpenMP in a Modern World: From Multi-device Support to Meta Programming, 3-16. https://doi.org/10.1007/978-3-031-15922-0_1. Online publication date: 20-Sep-2022
  • (2021) Communication-Aware Task Scheduling Strategy in Hybrid MPI+OpenMP Applications. OpenMP: Enabling Massive Node-Level Parallelism, 197-210. https://doi.org/10.1007/978-3-030-85262-7_14. Online publication date: 14-Sep-2021
