Abstract
OpenMP is widely accepted as a de facto standard for shared memory parallel programming in Fortran, C and C++. Nested parallelization was included in the first OpenMP specification, but it took a few years until the first commercially available compilers supported this optional part of the specification. We employed nested parallelization using OpenMP in three production codes: a C++ code for content-based image retrieval, a C++ code for the computation of critical points in multi-block CFD datasets, and a multi-block Navier-Stokes solver written in Fortran 90. In this paper we discuss the opportunities as well as the deficiencies of the nested parallelization support in OpenMP.
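To make the term concrete, the following is a minimal sketch of nested parallelization in C++, not code taken from any of the three production codes. It assumes a hypothetical multi-block dataset stored as a vector of arrays: an outer parallel loop distributes the blocks, and an inner parallel loop distributes the cells of each block. The names `Block` and `nested_block_sum` are illustrative assumptions. The sketch enables the second level with `omp_set_max_active_levels`, which requires OpenMP 3.0 or later; under the OpenMP 2.5 specification the corresponding switch was `omp_set_nested` or the `OMP_NESTED` environment variable.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for a multi-block dataset:
// each block is a flat array of cell values.
using Block = std::vector<double>;

// Sums all cells of all blocks with two levels of parallelism.
// outer_threads work on different blocks; each of them may start
// an inner team of inner_threads for the cells of its block.
double nested_block_sum(const std::vector<Block>& blocks,
                        int outer_threads, int inner_threads) {
    assert(outer_threads > 0 && inner_threads > 0);
#ifdef _OPENMP
    // Allow two active levels; without this the inner region
    // may be serialized by the runtime (OpenMP 3.0 or later).
    omp_set_max_active_levels(2);
#endif
    double total = 0.0;
    const long nblocks = static_cast<long>(blocks.size());

    // Outer level: one iteration per block. Dynamic scheduling is an
    // assumption here, chosen because blocks can differ in size.
    #pragma omp parallel for num_threads(outer_threads) \
        reduction(+:total) schedule(dynamic)
    for (long b = 0; b < nblocks; ++b) {
        const Block& blk = blocks[b];
        const long ncells = static_cast<long>(blk.size());
        double block_sum = 0.0;

        // Inner level: the cells of a single block.
        #pragma omp parallel for num_threads(inner_threads) \
            reduction(+:block_sum)
        for (long i = 0; i < ncells; ++i) {
            block_sum += blk[i];
        }
        total += block_sum;
    }
    return total;
}
```

Calling `nested_block_sum(blocks, 2, 4)` requests up to 2 x 4 threads. Compiled without OpenMP support (for example without `-fopenmp` on GCC), the pragmas are ignored and the function runs serially with the same result, apart from possible floating-point rounding differences due to summation order.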
Cite this article
an Mey, D., Sarholz, S. & Terboven, C. Nested Parallelization with OpenMP. Int J Parallel Prog 35, 459–476 (2007). https://doi.org/10.1007/s10766-007-0054-1
DOI: https://doi.org/10.1007/s10766-007-0054-1