[flang] Create TBAA subtree for COMMON block variables. #153918
In order to help LLVM disambiguate accesses to the COMMON block variables, this patch creates a TBAA sub-tree for each COMMON block, and then places all variables belonging to this COMMON block into this sub-tree. The structure looks like this:

```
common /blk/ a, b, c

"global data"
|
|- "blk_"
   |
   |- "blk_/a"
   |- "blk_/b"
   |- "blk_/c"
```

The TBAA tag for "a" is created in "blk_/a" root, etc. If, for some reason, we cannot identify a variable's name, but we know that it belongs to COMMON "blk", the TBAA tag will be created in "blk_" root. This tag indicates that the access can overlap with any accesses of a/b/c.

I measured 10% speed-up on 434.zeusmp and 20% speed-up on 200.sixtrack on Zen4. I expect around the same speed-ups on ARM.
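As a rough illustration of why this layout helps, the sketch below models the tree from the description with plain strings. It assumes the usual scalar TBAA rule that two tags may alias only when one node is an ancestor of (or equal to) the other. `ToyTbaaTree` and `buildCommonBlockExample` are hypothetical names made up for this example; they are not part of flang, MLIR, or LLVM.

```cpp
#include <map>
#include <string>

// Toy model of a TBAA type tree: each node id maps to its parent id.
// This is NOT the MLIR/LLVM API, only an illustration of the hierarchy.
struct ToyTbaaTree {
  std::map<std::string, std::string> parent; // child id -> parent id

  void addNode(const std::string &id, const std::string &parentId) {
    parent[id] = parentId;
  }

  // True if `anc` is `node` itself or one of its ancestors.
  bool isAncestorOrSelf(const std::string &anc, std::string node) const {
    for (;;) {
      if (node == anc)
        return true;
      auto it = parent.find(node);
      if (it == parent.end())
        return false; // reached the top of the toy tree
      node = it->second;
    }
  }

  // Assumed rule: two tags may alias only if one lies on the other's
  // path to the root.
  bool mayAlias(const std::string &a, const std::string &b) const {
    return isAncestorOrSelf(a, b) || isAncestorOrSelf(b, a);
  }
};

// Builds the tree shown in the description for `common /blk/ a, b, c`.
// "global data" is treated as the top node here for brevity.
ToyTbaaTree buildCommonBlockExample() {
  ToyTbaaTree tree;
  tree.addNode("blk_", "global data");
  tree.addNode("blk_/a", "blk_");
  tree.addNode("blk_/b", "blk_");
  tree.addNode("blk_/c", "blk_");
  return tree;
}
```

In this model, tags under "blk_/a" and "blk_/b" are reported as non-aliasing, while a tag placed directly in "blk_" (the fallback when the variable name is unknown) may alias every member of the block.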
Hi Tom, I would like to discuss the way to identify a variable as belonging to a COMMON block. I check for the linkage in this patch, but I do not think it is a solid solution. I can think of a bit flag on |
Thank you for proposing this. The general idea seems sensible to me.
fir::TBAATree::SubtreeState *subTree =
    &state.getMutableFuncTreeWithScope(func, scopeOp).globalDataTree;

// The COMMON blocks have their own sub-tree root under the "global data"
Can non-common block global data alias with common blocks? Maybe it depends on how confident we are that information about common blocks doesn't get lost during lowering.
If we are confident then we could express this by putting the common block tree one level up (as a sibling to the global data tree).
I believe the answer is no. There is a way to equivalence a non-common variable A with a common variable B, but then A effectively becomes a part of B's common block, so we think of A as just another section of the common block.
I think we can put the common block roots as siblings of the global data tree. I think it should not matter now, because we may never create tags attached to the root of the global data tree, since the global variable names are always present. I would like to keep the common block sub-trees under the global data tree for the following (artificial) reason: if we ever encounter a memory access that we know accesses global memory, but we do not know exactly which global, we should assume this access aliases with any global variable (including the COMMON ones). In this case, the common sub-trees have to be under the global data root.
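The difference between the two placements can be sketched with a toy parent map. This is an illustrative model only: it assumes the ancestor-or-self aliasing rule for TBAA tags, takes the "target data" parent of "global data" from the CHECK lines in the test file, and uses hypothetical helper names (`onPathToRoot`, `commonUnderGlobalData`, `commonAsSibling`) that do not exist in flang.

```cpp
#include <map>
#include <string>

using ParentMap = std::map<std::string, std::string>; // node id -> parent id

// True if `anc` is `node` or appears on the path from `node` to the root.
bool onPathToRoot(const ParentMap &parent, const std::string &anc,
                  std::string node) {
  for (;;) {
    if (node == anc)
      return true;
    auto it = parent.find(node);
    if (it == parent.end())
      return false;
    node = it->second;
  }
}

// Assumed rule: tags may alias only if one is an ancestor of the other.
bool mayAlias(const ParentMap &parent, const std::string &a,
              const std::string &b) {
  return onPathToRoot(parent, a, b) || onPathToRoot(parent, b, a);
}

// Layout kept in the patch: COMMON roots live under "global data".
ParentMap commonUnderGlobalData() {
  return {{"global data", "target data"},
          {"blk_", "global data"},
          {"blk_/a", "blk_"}};
}

// Alternative raised in review: COMMON roots as siblings of "global data".
ParentMap commonAsSibling() {
  return {{"global data", "target data"},
          {"blk_", "target data"},
          {"blk_/a", "blk_"}};
}
```

With the first layout, an access tagged only with "global data" (some unknown global) is still treated as possibly aliasing "blk_/a". With the sibling layout the same pair would be reported as non-aliasing, which is the case the reply above wants to avoid.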
// Should we have an attribute on [hl]fir.declare
// that specifies the name of the COMMON block the variable
// belongs to?
Yes I agree this would be better. In OpenMP we handle it all whilst we still have access to the symbol table, but I think with that gone it would be worth including that information here.
I figured out that my changes do not work properly for equivalence cases where two common block variables may [partially ]overlap while having different names. In order to make this right, I will need information about the starting offset of a variable within the COMMON block (their sizes are always known at compile time). So I am thinking about an attribute on [hl]fir.declare that will provide a pair of the COMMON block symbol ref and the integer offset.
I am going to check how best to represent [partially ]overlapping variables with TBAA. If I cannot communicate the exact offsets/sizes via TBAA, I will probably fall back to creating cliques of overlapping variables within a COMMON block and placing them under the same root/tag.
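One possible shape of that fallback, as a minimal sketch: given each variable's starting offset and size within the COMMON block, group together the variables whose byte ranges overlap, directly or transitively, so that each group could share one TBAA root. The `CommonVar` struct and `groupOverlapping` function are hypothetical and are not taken from the patch. Transitive grouping is a conservative simplification of true cliques.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a COMMON block member.
struct CommonVar {
  std::string name;
  std::uint64_t offset; // starting byte offset within the COMMON block
  std::uint64_t size;   // size in bytes
};

// Groups variables whose byte ranges [offset, offset + size) overlap,
// directly or through a chain of overlapping variables.
std::vector<std::vector<std::string>>
groupOverlapping(std::vector<CommonVar> vars) {
  // Sort by starting offset; stable to keep input order for equal offsets.
  std::stable_sort(vars.begin(), vars.end(),
                   [](const CommonVar &l, const CommonVar &r) {
                     return l.offset < r.offset;
                   });
  std::vector<std::vector<std::string>> groups;
  std::uint64_t groupEnd = 0; // one past the last byte of the current group
  for (const CommonVar &v : vars) {
    if (groups.empty() || v.offset >= groupEnd) {
      // No overlap with the current group: start a new one.
      groups.push_back({});
      groupEnd = v.offset + v.size;
    } else {
      // Overlaps the current group: extend the group's byte range.
      groupEnd = std::max(groupEnd, v.offset + v.size);
    }
    groups.back().push_back(v.name);
  }
  return groups;
}
```

Variables that end up alone in a group could keep their own root as before, while each multi-variable group would be placed under a single shared root or tag.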
I managed to resolve all the failures related to EQUIVALENCE of the COMMON block variables in this patch. I will create an RFC for adding an "attribute" to |
@llvm/pr-subscribers-flang-fir-hlfir
Author: Slava Zakharin (vzakhari)
Patch is 34.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/153918.diff
7 Files Affected:
diff --git a/flang/include/flang/Optimizer/Analysis/TBAAForest.h b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
index 4d2281642b43d..b4932594114a1 100644
--- a/flang/include/flang/Optimizer/Analysis/TBAAForest.h
+++ b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
@@ -46,6 +46,12 @@ struct TBAATree {
mlir::LLVM::TBAATypeDescriptorAttr getRoot() const { return parent; }
+ /// For the given name, get or create a subtree in the current
+ /// subtree. For example, this is used for creating subtrees
+ /// inside the "global data" subtree for the COMMON block variables
+ /// belonging to the same COMMON block.
+ SubtreeState &getOrCreateNamedSubtree(mlir::StringAttr name);
+
private:
SubtreeState(mlir::MLIRContext *ctx, std::string name,
mlir::LLVM::TBAANodeAttr grandParent)
@@ -57,6 +63,9 @@ struct TBAATree {
const std::string parentId;
mlir::MLIRContext *const context;
mlir::LLVM::TBAATypeDescriptorAttr parent;
+ // A map of named sub-trees, e.g. sub-trees of the COMMON blocks
+ // placed under the "global data" root.
+ llvm::DenseMap<mlir::StringAttr, SubtreeState> namedSubtrees;
};
/// A subtree for POINTER/TARGET variables data.
@@ -131,8 +140,8 @@ class TBAAForrest {
// responsibility to provide unique name for the scope.
// If the scope string is empty, returns the TBAA tree for the
// "root" scope of the given function.
- inline const TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
- llvm::StringRef scope) {
+ inline TBAATree &getMutableFuncTreeWithScope(mlir::func::FuncOp func,
+ llvm::StringRef scope) {
mlir::StringAttr name = func.getSymNameAttr();
if (!scope.empty())
name = mlir::StringAttr::get(name.getContext(),
@@ -140,13 +149,20 @@ class TBAAForrest {
return getFuncTree(name);
}
+ inline const TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
+ llvm::StringRef scope) {
+ return getMutableFuncTreeWithScope(func, scope);
+ }
+
private:
- const TBAATree &getFuncTree(mlir::StringAttr symName) {
+ TBAATree &getFuncTree(mlir::StringAttr symName) {
if (!separatePerFunction)
symName = mlir::StringAttr::get(symName.getContext(), "");
if (!trees.contains(symName))
trees.insert({symName, TBAATree::buildTree(symName)});
- return trees.at(symName);
+ auto it = trees.find(symName);
+ assert(it != trees.end());
+ return it->second;
}
// Should each function use a different tree?
diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
index e3a44f147b4cd..4b3087ed45788 100644
--- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h
+++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
@@ -365,7 +365,12 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {
// Linkage helpers (inline). The default linkage is external.
//===--------------------------------------------------------------------===//
- mlir::StringAttr createCommonLinkage() { return getStringAttr("common"); }
+ static mlir::StringAttr createCommonLinkage(mlir::MLIRContext *context) {
+ return mlir::StringAttr::get(context, "common");
+ }
+ mlir::StringAttr createCommonLinkage() {
+ return createCommonLinkage(getContext());
+ }
mlir::StringAttr createInternalLinkage() { return getStringAttr("internal"); }
diff --git a/flang/lib/Optimizer/Analysis/TBAAForest.cpp b/flang/lib/Optimizer/Analysis/TBAAForest.cpp
index cce50e0de1bc7..945c1804352b2 100644
--- a/flang/lib/Optimizer/Analysis/TBAAForest.cpp
+++ b/flang/lib/Optimizer/Analysis/TBAAForest.cpp
@@ -11,12 +11,21 @@
mlir::LLVM::TBAATagAttr
fir::TBAATree::SubtreeState::getTag(llvm::StringRef uniqueName) const {
- std::string id = (parentId + "/" + uniqueName).str();
+ std::string id = (parentId + '/' + uniqueName).str();
mlir::LLVM::TBAATypeDescriptorAttr type =
mlir::LLVM::TBAATypeDescriptorAttr::get(
context, id, mlir::LLVM::TBAAMemberAttr::get(parent, 0));
return mlir::LLVM::TBAATagAttr::get(type, type, 0);
- // return tag;
+}
+
+fir::TBAATree::SubtreeState &
+fir::TBAATree::SubtreeState::getOrCreateNamedSubtree(mlir::StringAttr name) {
+ if (!namedSubtrees.contains(name))
+ namedSubtrees.insert(
+ {name, SubtreeState(context, parentId + '/' + name.str(), parent)});
+ auto it = namedSubtrees.find(name);
+ assert(it != namedSubtrees.end());
+ return it->second;
}
mlir::LLVM::TBAATagAttr fir::TBAATree::SubtreeState::getTag() const {
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
index 85403ad257657..49df61d3e0bca 100644
--- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
+++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
@@ -14,6 +14,7 @@
#include "flang/Optimizer/Analysis/AliasAnalysis.h"
#include "flang/Optimizer/Analysis/TBAAForest.h"
+#include "flang/Optimizer/Builder/FIRBuilder.h"
#include "flang/Optimizer/Dialect/FIRDialect.h"
#include "flang/Optimizer/Dialect/FirAliasTagOpInterface.h"
#include "flang/Optimizer/Transforms/Passes.h"
@@ -61,8 +62,10 @@ namespace {
class PassState {
public:
PassState(mlir::DominanceInfo &domInfo,
- std::optional<unsigned> localAllocsThreshold)
- : domInfo(domInfo), localAllocsThreshold(localAllocsThreshold) {}
+ std::optional<unsigned> localAllocsThreshold,
+ const mlir::SymbolTable &symTab)
+ : domInfo(domInfo), localAllocsThreshold(localAllocsThreshold),
+ symTab(symTab) {}
/// memoised call to fir::AliasAnalysis::getSource
inline const fir::AliasAnalysis::Source &getSource(mlir::Value value) {
if (!analysisCache.contains(value))
@@ -72,13 +75,14 @@ class PassState {
}
/// get the per-function TBAATree for this function
- inline const fir::TBAATree &getFuncTree(mlir::func::FuncOp func) {
- return forrest[func];
+ inline fir::TBAATree &getMutableFuncTreeWithScope(mlir::func::FuncOp func,
+ fir::DummyScopeOp scope) {
+ auto &scopeMap = scopeNames.at(func);
+ return forrest.getMutableFuncTreeWithScope(func, scopeMap.lookup(scope));
}
inline const fir::TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
fir::DummyScopeOp scope) {
- auto &scopeMap = scopeNames.at(func);
- return forrest.getFuncTreeWithScope(func, scopeMap.lookup(scope));
+ return getMutableFuncTreeWithScope(func, scope);
}
void processFunctionScopes(mlir::func::FuncOp func);
@@ -98,8 +102,14 @@ class PassState {
// attachment.
bool attachLocalAllocTag();
+ fir::GlobalOp getGlobalDefiningOp(mlir::StringAttr name) const {
+ return symTab.lookup<fir::GlobalOp>(name);
+ }
+
private:
mlir::DominanceInfo &domInfo;
+ std::optional<unsigned> localAllocsThreshold;
+ const mlir::SymbolTable &symTab;
fir::AliasAnalysis analysis;
llvm::DenseMap<mlir::Value, fir::AliasAnalysis::Source> analysisCache;
fir::TBAAForrest forrest;
@@ -117,8 +127,6 @@ class PassState {
// Local pass cache for derived types that contain descriptor
// member(s), to avoid the cost of isRecordWithDescriptorMember().
llvm::DenseSet<mlir::Type> typesContainingDescriptors;
-
- std::optional<unsigned> localAllocsThreshold;
};
// Process fir.dummy_scope operations in the given func:
@@ -310,14 +318,72 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op,
source.kind == fir::AliasAnalysis::SourceKind::Global &&
!source.isBoxData()) {
mlir::SymbolRefAttr glbl = llvm::cast<mlir::SymbolRefAttr>(source.origin.u);
- const char *name = glbl.getRootReference().data();
- LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to global " << name
- << " at " << *op << "\n");
- if (source.isPointer())
+ mlir::StringAttr name = glbl.getRootReference();
+ LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to global "
+ << name.str() << " at " << *op << "\n");
+ if (source.isPointer()) {
tag = state.getFuncTreeWithScope(func, scopeOp).targetDataTree.getTag();
- else
- tag =
- state.getFuncTreeWithScope(func, scopeOp).globalDataTree.getTag(name);
+ } else {
+ // In general, place the tags under the "global data" root.
+ fir::TBAATree::SubtreeState *subTree =
+ &state.getMutableFuncTreeWithScope(func, scopeOp).globalDataTree;
+
+ // The COMMON blocks and global variables associated via EQUIVALENCE
+ // have their own sub-tree roots under the "global data"
+ // root, which is named after the name of the COMMON/EQUIVALENCE block.
+ // If we can identify the name of the member variable, then
+ // we create a sub-tree under the root of the COMMON/EQUIVALENCE block
+ // and place the tag there. If we cannot identify the name
+ // of the member variable (e.g. for whatever reason there is no
+ // fir.declare for it), then we place the tag under the root
+ // of the COMMON/EQUIVALENCE block.
+ if (auto globalOp = state.getGlobalDefiningOp(name)) {
+ // Get or create a sub-tree for the COMMON/EQUIVALENCE block
+ // or for a regular global variable.
+ subTree = &subTree->getOrCreateNamedSubtree(name);
+
+ auto declOp = mlir::dyn_cast_or_null<fir::DeclareOp>(
+ source.origin.instantiationPoint);
+ mlir::StringAttr varName;
+ if (declOp &&
+ // Equivalenced variables are defined by [hl]fir.declare
+ // operations with !fir.ptr<> results.
+ // TODO: the equivalenced variables belonging to a COMMON
+ // block, may alias each other, but the rest of the COMMON
+ // block variables may still be made non-aliasing with them.
+ // To implement that we need to know the sets of COMMON
+ // variables that alias between each other, then we can
+ // create separate sub-trees for each set.
+ !mlir::isa<fir::PointerType>(declOp.getType())) {
+ // The tag for the variable will be placed under its own
+ // root in the COMMON sub-tree.
+ if (auto declName = declOp.getUniqName())
+ if (declName != name) {
+ // The declaration name does not match the name of the global
+ // for all variables in COMMON blocks by lowering, so all COMMON
+ // variables with known names must end up here.
+ // Declaration name of an equivalenced variable may match
+ // the global's name, but the EQUIVALENCE variables are filtered
+ // above - their tags will be created under the EQUIVALENCE's
+ // named root.
+ // The name check here is avoiding the creation of redundant
+ // roots for regular global variables.
+ varName = declName;
+ tag = subTree->getTag(varName.str());
+ }
+ }
+
+ if (!varName)
+ tag = subTree->getTag();
+
+ LLVM_DEBUG(llvm::dbgs().indent(2)
+ << "Variable named '"
+ << (varName ? varName.str() : "<unknown>")
+ << "' is from COMMON block '" << name.str() << "'\n");
+ } else {
+ tag = subTree->getTag(name.str());
+ }
+ }
// TBAA for global variables with descriptors
} else if (enableDirect &&
@@ -401,11 +467,14 @@ void AddAliasTagsPass::runOnOperation() {
// thinks the pass operates on), then the real work of the pass is done in
// runOnAliasInterface
auto &domInfo = getAnalysis<mlir::DominanceInfo>();
- PassState state(domInfo, localAllocsThreshold.getPosition()
- ? std::optional<unsigned>(localAllocsThreshold)
- : std::nullopt);
-
mlir::ModuleOp mod = getOperation();
+ mlir::SymbolTable symTab(mod);
+ PassState state(domInfo,
+ localAllocsThreshold.getPosition()
+ ? std::optional<unsigned>(localAllocsThreshold)
+ : std::nullopt,
+ symTab);
+
mod.walk(
[&](fir::FirAliasTagOpInterface op) { runOnAliasInterface(op, state); });
diff --git a/flang/test/Transforms/tbaa-for-common-vars.fir b/flang/test/Transforms/tbaa-for-common-vars.fir
new file mode 100644
index 0000000000000..12331dcabac41
--- /dev/null
+++ b/flang/test/Transforms/tbaa-for-common-vars.fir
@@ -0,0 +1,237 @@
+// RUN: fir-opt --split-input-file --fir-add-alias-tags %s | FileCheck %s
+
+// Fortran source:
+// subroutine test1
+// real :: a, b
+// common /common1/ a, b
+// a = b
+// end subroutine test1
+fir.global common @common1_(dense<0> : vector<8xi8>) {alignment = 4 : i64} : !fir.array<8xi8>
+func.func @_QPtest1() {
+ %c4 = arith.constant 4 : index
+ %c0 = arith.constant 0 : index
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.address_of(@common1_) : !fir.ref<!fir.array<8xi8>>
+ %2 = fir.convert %1 : (!fir.ref<!fir.array<8xi8>>) -> !fir.ref<!fir.array<?xi8>>
+ %3 = fir.coordinate_of %2, %c0 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %4 = fir.convert %3 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %5 = fir.declare %4 {uniq_name = "_QFtest1Ea"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %6 = fir.coordinate_of %2, %c4 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %7 = fir.convert %6 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %8 = fir.declare %7 {uniq_name = "_QFtest1Eb"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %9 = fir.load %8 : !fir.ref<f32>
+ fir.store %9 to %5 : !fir.ref<f32>
+ return
+}
+// CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root<id = "Flang function root _QPtest1">
+// CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[$ATTR_0]], 0>}>
+// CHECK: #[[$ATTR_2:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[$ATTR_1]], 0>}>
+// CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_2]], 0>}>
+// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[$ATTR_3]], 0>}>
+// CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc<id = "global data/common1_", members = {<#[[$ATTR_4]], 0>}>
+// CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc<id = "global data/common1_/_QFtest1Eb", members = {<#[[$ATTR_5]], 0>}>
+// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc<id = "global data/common1_/_QFtest1Ea", members = {<#[[$ATTR_5]], 0>}>
+// CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_6]], access_type = #[[$ATTR_6]], offset = 0>
+// CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_7]], access_type = #[[$ATTR_7]], offset = 0>
+// CHECK-LABEL: func.func @_QPtest1() {
+// CHECK: fir.load{{.*}}{tbaa = [#[[$ATTR_8]]]} : !fir.ref<f32>
+// CHECK: fir.store{{.*}}{tbaa = [#[[$ATTR_9]]]} : !fir.ref<f32>
+
+// -----
+
+// Fortran source:
+// subroutine test2
+// real :: a, b
+// common /common2/ a, b
+// a = b
+// end subroutine test2
+fir.global common @common2_(dense<0> : vector<8xi8>) {alignment = 4 : i64} : !fir.array<8xi8>
+func.func @_QPtest2() {
+ %c4 = arith.constant 4 : index
+ %c0 = arith.constant 0 : index
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.address_of(@common2_) : !fir.ref<!fir.array<8xi8>>
+ %2 = fir.convert %1 : (!fir.ref<!fir.array<8xi8>>) -> !fir.ref<!fir.array<?xi8>>
+ %3 = fir.coordinate_of %2, %c0 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %4 = fir.convert %3 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %5 = fir.declare %4 {uniq_name = "_QFtest2Ea"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %6 = fir.coordinate_of %2, %c4 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %7 = fir.convert %6 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %8 = fir.declare %7 {uniq_name = "_QFtest2Eb"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %9 = fir.load %8 : !fir.ref<f32>
+ fir.store %9 to %5 : !fir.ref<f32>
+ return
+}
+// CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_root<id = "Flang function root _QPtest2">
+// CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[$ATTR_10]], 0>}>
+// CHECK: #[[$ATTR_12:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[$ATTR_11]], 0>}>
+// CHECK: #[[$ATTR_13:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[$ATTR_12]], 0>}>
+// CHECK: #[[$ATTR_14:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[$ATTR_13]], 0>}>
+// CHECK: #[[$ATTR_15:.+]] = #llvm.tbaa_type_desc<id = "global data/common2_", members = {<#[[$ATTR_14]], 0>}>
+// CHECK: #[[$ATTR_16:.+]] = #llvm.tbaa_type_desc<id = "global data/common2_/_QFtest2Eb", members = {<#[[$ATTR_15]], 0>}>
+// CHECK: #[[$ATTR_17:.+]] = #llvm.tbaa_type_desc<id = "global data/common2_/_QFtest2Ea", members = {<#[[$ATTR_15]], 0>}>
+// CHECK: #[[$ATTR_18:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_16]], access_type = #[[$ATTR_16]], offset = 0>
+// CHECK: #[[$ATTR_19:.+]] = #llvm.tbaa_tag<base_type = #[[$ATTR_17]], access_type = #[[$ATTR_17]], offset = 0>
+// CHECK-LABEL: func.func @_QPtest2() {
+// CHECK: fir.load{{.*}}{tbaa = [#[[$ATTR_18]]]} : !fir.ref<f32>
+// CHECK: fir.store{{.*}}{tbaa = [#[[$ATTR_19]]]} : !fir.ref<f32>
+
+// -----
+
+// Fortran source compiled with -mmlir -inline-all:
+// subroutine test3
+// real :: a, b
+// common /common3/ a, b
+// a = b
+// call inner(a, b)
+// contains
+// subroutine inner(c, d)
+// real :: c, d
+// c = d
+// end subroutine inner
+// end subroutine test3
+fir.global common @common3_(dense<0> : vector<8xi8>) {alignment = 4 : i64} : !fir.array<8xi8>
+func.func @_QPtest3() {
+ %c4 = arith.constant 4 : index
+ %c0 = arith.constant 0 : index
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.address_of(@common3_) : !fir.ref<!fir.array<8xi8>>
+ %2 = fir.convert %1 : (!fir.ref<!fir.array<8xi8>>) -> !fir.ref<!fir.array<?xi8>>
+ %3 = fir.coordinate_of %2, %c0 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %4 = fir.convert %3 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %5 = fir.declare %4 {uniq_name = "_QFtest3Ea"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %6 = fir.coordinate_of %2, %c4 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
+ %7 = fir.convert %6 : (!fir.ref<i8>) -> !fir.ref<f32>
+ %8 = fir.declare %7 {uniq_name = "_QFtest3Eb"} : (!fir.ref<f32>) -> !fir.ref<f32>
+ %9 = fir.load %8 : !fir.ref<f32>
+ fir.store %9 to %5 : !fir.ref<f32>
+ %10 = fir.dummy_scope : !fir.dscope
+ %11 = fir.declare %5 dummy_scope %10 {uniq_name = "_QFtest3FinnerEc"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32>
+ %12 = fir.declare %8 dummy_scope %10 {uniq_name = "_QFtest3FinnerEd"} : (!fir.ref<f32>, !fir.dscope) -> !fir.ref<f32>
+ %13 = fir.load %12 : !fir.ref<f32>
+ fir.store %13 to %11 : !fir.ref<f32>
+ return
+}
+// CHECK: #[[ROOT3:.+]] = #llvm.tbaa_root<id = "Flang function root _QPtest3">
+// CHECK: #[[ROOT3INNER:.+]] = #llvm.tbaa_root<id = "Flang function root _QPtest3 - Scope 1">
+// CHECK: #[[ANYACC3:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[ROOT3]], 0>}>
+// CHECK: #[[ANYACC3INNER:.+]] = #llvm.tbaa_type_desc<id = "any access", members = {<#[[ROOT3INNER]], 0>}>
+// CHECK: #[[ANYDATA3:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[ANYACC3]], 0>}>
+// CHECK: #[[ANYDATA3INNER:.+]] = #llvm.tbaa_type_desc<id = "any data access", members = {<#[[ANYACC3INNER]], 0>}>
+// CHECK: #[[TARGETDATA3:.+]] = #llvm.tbaa_type_desc<id = "target data", members = {<#[[ANYDATA3]], 0>}>
+// CHECK: #[[DUMMYARG3INNER:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data", members = {<#[[ANYDATA3INNER]], 0>}>
+// CHECK: #[[GLOBALDATA3:.+]] = #llvm.tbaa_type_desc<id = "global data", members = {<#[[TARGETDATA3]], 0>}>
+// CHECK: #[[DUMMYD:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QFtest3FinnerEd", members = {<#[[DUMMYARG3INNER]], 0>}>
+// CHECK: #[[DUMMYC:.+]] = #llvm.tbaa_type_desc<id = "dummy arg data/_QFtest3FinnerEc", members = {<#[[DUMMYARG3INNER]], 0>}>
+// CHECK: #[[DUMMYDTAG:.+]] = #llvm.tbaa_tag<base_type = #[[DUMMYD]], access_type = #[[DUMMYD]], offset = 0>
+// CHECK: #[[DUMMYCTAG:.+]] = #llvm.tbaa_tag<base_type = #[[DUMMYC]], access_type = #[[DUMMYC]...
[truncated]
These changes work incorrectly for the following case:
So in order to resolve this properly I need the information about the starting offsets within the physical storage: https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026
The changes so far look good to me (once the commit message is updated), but yes this will need that RFC.