Commit e9f9dfa

[libomptarget] Change nvcc compilation to use a unity build

Summary:
This allows nvcc to inline functions between what would otherwise be distinct translation units, which in turn removes any runtime cost from implementing functions in source files (as opposed to inline in headers).

This will then allow the circular dependencies in deviceRTL to be readily broken and individual components more easily shared between architectures.

Reviewers: ABataev, jdoerfert, grokos, RaviNarayanaswamy, hfinkel, ronlieb, gregrodgers

Reviewed By: jdoerfert

Subscribers: mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D69489

1 parent 0be9cf2 commit e9f9dfa

2 files changed: +26 -1 lines changed

openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,7 @@ if(LIBOMPTARGET_DEP_CUDA_FOUND)
 set(BUILD_SHARED_LIBS OFF)
 set(CUDA_SEPARABLE_COMPILATION ON)
 list(APPEND CUDA_NVCC_FLAGS -I${devicertl_base_directory})
-cuda_add_library(omptarget-nvptx STATIC ${cuda_src_files} ${omp_data_objects}
+cuda_add_library(omptarget-nvptx STATIC unity.cu
 OPTIONS ${CUDA_ARCH} ${CUDA_DEBUG})

 # Install device RTL under the lib destination folder.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+//===------ unity.cu - Unity build of NVPTX deviceRTL ------------ CUDA -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Support compilers, specifically NVCC, which have not implemented link time
+// optimisation. This removes the runtime cost of moving inline functions into
+// source files in exchange for preventing efficient incremental builds.
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/cancel.cu"
+#include "src/critical.cu"
+#include "src/data_sharing.cu"
+#include "src/libcall.cu"
+#include "src/loop.cu"
+#include "src/omp_data.cu"
+#include "src/omptarget-nvptx.cu"
+#include "src/parallel.cu"
+#include "src/reduction.cu"
+#include "src/sync.cu"
+#include "src/task.cu"
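
The header comment in unity.cu trades incremental builds for cross-file inlining. As a rough illustration of why that works, the sketch below shows roughly what a unity translation unit looks like to the compiler once the `#include` lines have been expanded. It is plain host C++ standing in for the CUDA sources, and every name in it (`next_id`, `advance_twice`, and the file names in the comments) is hypothetical and not taken from deviceRTL.

```cpp
// Minimal sketch of a unity translation unit after preprocessing.
// All names are hypothetical; nothing here comes from the deviceRTL sources.
#include <cassert>

// --- would normally live in a shared header (hypothetical "ids.h") ---
// Only a declaration: the body is NOT inline in the header.
int next_id(int current);

// --- would normally live in its own source file (hypothetical "ids.cu") ---
int next_id(int current) {
  assert(current >= 0 && "ids are non-negative in this sketch");
  return current + 1;
}

// --- would normally live in another source file (hypothetical "user.cu") ---
// Compiled separately and without link time optimisation, this caller would
// see only the declaration of next_id and would have to emit real calls.
// In a unity build the definition above is in the same translation unit,
// so the compiler is free to inline it here.
int advance_twice(int start) {
  return next_id(next_id(start));
}
```

Whether the calls are actually inlined remains the compiler's decision; the unity build only makes the definition visible at the call site. The cost noted in the header comment follows from the same structure: touching any one included file forces the whole unity file to be recompiled.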
