Conversation

nikic
Contributor

@nikic nikic commented Aug 28, 2025

This fixes new warnings when building with Clang 21, encountered in #155627.

@llvmbot added the flang, flang:semantics, and flang:parser labels Aug 28, 2025
@nikic requested a review from klausler August 28, 2025 16:01
@llvmbot
Member

llvmbot commented Aug 28, 2025

@llvm/pr-subscribers-flang-parser

@llvm/pr-subscribers-flang-semantics

Author: Nikita Popov (nikic)

Changes

This fixes new warnings when building with Clang 21, encountered in #155627.


Full diff: https://github.com/llvm/llvm-project/pull/155873.diff

2 Files Affected:

  • (modified) flang/lib/Evaluate/fold-implementation.h (+1-1)
  • (modified) flang/lib/Parser/characters.cpp (+2-1)
diff --git a/flang/lib/Evaluate/fold-implementation.h b/flang/lib/Evaluate/fold-implementation.h
index d757ef6e62eb4..3fdf3a6f38848 100644
--- a/flang/lib/Evaluate/fold-implementation.h
+++ b/flang/lib/Evaluate/fold-implementation.h
@@ -1785,7 +1785,7 @@ common::IfNoLvalue<std::optional<TO>, FROM> ConvertString(FROM &&s) {
       if (static_cast<std::uint64_t>(*iter) > 127) {
         return std::nullopt;
       }
-      str.push_back(*iter);
+      str.push_back(static_cast<typename TO::value_type>(*iter));
     }
     return std::make_optional<TO>(std::move(str));
   }
diff --git a/flang/lib/Parser/characters.cpp b/flang/lib/Parser/characters.cpp
index f6ac777ea874c..1a00b16eefe9d 100644
--- a/flang/lib/Parser/characters.cpp
+++ b/flang/lib/Parser/characters.cpp
@@ -289,7 +289,8 @@ RESULT DecodeString(const std::string &s, bool backslashEscapes) {
         DecodeCharacter<ENCODING>(p, bytes, backslashEscapes)};
     if (decoded.bytes > 0) {
       if (static_cast<std::size_t>(decoded.bytes) <= bytes) {
-        result.append(1, decoded.codepoint);
+        result.append(
+            1, static_cast<typename RESULT::value_type>(decoded.codepoint));
         bytes -= decoded.bytes;
         p += decoded.bytes;
         continue;

@@ -1785,7 +1785,7 @@ common::IfNoLvalue<std::optional<TO>, FROM> ConvertString(FROM &&s) {
       if (static_cast<std::uint64_t>(*iter) > 127) {
         return std::nullopt;
       }
-      str.push_back(*iter);
+      str.push_back(static_cast<typename TO::value_type>(*iter));
Contributor Author

This one is safe due to the preceding check.
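
To illustrate why the preceding range check makes the cast lossless, here is a minimal sketch. The helpers `NarrowIfAscii` and `NarrowAsciiString` are hypothetical names introduced only for illustration and are not part of Flang; they mirror the shape of the patched loop under the assumption that the source is UTF-32 and the target is `std::string`.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper (not part of Flang): narrows one UTF-32 code unit to
// char, but only after proving that it lies in the 7-bit ASCII range.
constexpr std::optional<char> NarrowIfAscii(char32_t ch) {
  if (static_cast<std::uint64_t>(ch) > 127) {
    return std::nullopt; // value would not survive narrowing, so refuse
  }
  // The check above guarantees ch <= 127, so this cast cannot lose data.
  return static_cast<char>(ch);
}

// Hypothetical whole-string wrapper that follows the structure of the loop
// in the diff: bail out on the first non-ASCII character.
inline std::optional<std::string> NarrowAsciiString(const std::u32string &s) {
  std::string str;
  for (char32_t ch : s) {
    std::optional<char> narrowed{NarrowIfAscii(ch)};
    if (!narrowed) {
      return std::nullopt;
    }
    str.push_back(*narrowed);
  }
  return str;
}
```

Because every value that reaches the cast is at most 127, the explicit `static_cast` only silences the warning and does not change behavior.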

@@ -289,7 +289,8 @@ RESULT DecodeString(const std::string &s, bool backslashEscapes) {
         DecodeCharacter<ENCODING>(p, bytes, backslashEscapes)};
     if (decoded.bytes > 0) {
       if (static_cast<std::size_t>(decoded.bytes) <= bytes) {
-        result.append(1, decoded.codepoint);
+        result.append(
+            1, static_cast<typename RESULT::value_type>(decoded.codepoint));
Contributor Author

This one looks like it might truncate. Don't know whether this is somehow excluded by the broader context.

Member

There is the overload std::u16string DecodeString<std::u16string, Encoding::UTF_8>. A character outside the Unicode Basic Multilingual Plane needs more than 16 bits.

According to https://flang.llvm.org/docs/Character.html#kinds-and-character-sets, Fortran strings are expected to be UCS-2 and truncation is expected.

In particular, conversions between kinds are assumed to be simple zero-extensions or truncation, not table look-ups.
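
The truncation described above can be sketched as follows. `ToCodeUnit` is a hypothetical helper, not Flang's `DecodeString`; it only reproduces the cast to `RESULT::value_type` from the diff, and the expected values assume a platform where `char16_t` is exactly 16 bits wide.

```cpp
#include <string>

// Hypothetical helper (not part of Flang): converts a decoded code point to
// the character type of the result string, exactly as the cast in the diff.
template <typename RESULT>
constexpr typename RESULT::value_type ToCodeUnit(char32_t codepoint) {
  return static_cast<typename RESULT::value_type>(codepoint);
}

// U+00E9 lies inside the Basic Multilingual Plane and survives unchanged.
static_assert(ToCodeUnit<std::u16string>(char32_t{0x00E9}) == 0x00E9);

// U+1F600 needs 17 bits; a 16-bit char16_t keeps only the low 16 bits.
static_assert(ToCodeUnit<std::u16string>(char32_t{0x1F600}) == 0xF600);

// A std::u32string result has room for the full code point.
static_assert(ToCodeUnit<std::u32string>(char32_t{0x1F600}) == 0x1F600u);
```

The conversion to an unsigned character type is well defined (modular), so the explicit cast makes the existing truncating behavior visible without changing it.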

@klausler klausler removed their request for review August 28, 2025 16:10
Member

@Meinersbur Meinersbur left a comment

LGTM

@nikic nikic merged commit f65f60e into llvm:main Aug 29, 2025
13 checks passed
@nikic nikic deleted the flang-warning-fix branch August 29, 2025 14:34