Skip to content

Conversation

alejandro-alvarez-sonarsource
Copy link
Contributor

Signature:

size_t strxfrm(char *dest, const char *src, size_t n);

The modeling covers:

  • src can never be null
  • dest can be null only if n is 0, and then the return value is some unspecified positive integer
  • src and dest must not overlap
  • dest must have at least n bytes of capacity
  • The return value can either be:
    • < n, and the contents of the buffer pointed by dest is invalidated
    • >= n, and the contents of the buffer pointed by dest is marked as undefined

CPP-6854

Signature:

```c
size_t strxfrm(char *dest, const char *src, size_t n);
```

The modeling covers:

* `src` can never be null
* `dest` can be null only if n is 0, and then the return value is some
   unspecified positive integer
* `src` and `dest` must not overlap
* `dest` must have at least `n` bytes of capacity
* The return value can either be:
    - `< n`, and the contents of the buffer pointed by `dest`
      is invalidated
    - `>= n`, and the contents of the buffer pointed by `dest`
      is marked as undefined

CPP-6854
Copy link

github-actions bot commented Sep 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource)

Changes

Signature:

size_t strxfrm(char *dest, const char *src, size_t n);

The modeling covers:

  • src can never be null
  • dest can be null only if n is 0, and then the return value is some unspecified positive integer
  • src and dest must not overlap
  • dest must have at least n bytes of capacity
  • The return value can either be:
    • &lt; n, and the contents of the buffer pointed by dest is invalidated
    • &gt;= n, and the contents of the buffer pointed by dest is marked as undefined

CPP-6854


Full diff: https://github.com/llvm/llvm-project/pull/156507.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp (+99)
  • (modified) clang/test/Analysis/string.c (+57)
diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index cfc6d34a75ca2..6b305e408354d 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -163,6 +163,7 @@ class CStringChecker
       {{CDM::CLibrary, {"strcasecmp"}, 2}, &CStringChecker::evalStrcasecmp},
       {{CDM::CLibrary, {"strncasecmp"}, 3}, &CStringChecker::evalStrncasecmp},
       {{CDM::CLibrary, {"strsep"}, 2}, &CStringChecker::evalStrsep},
+      {{CDM::CLibrary, {"strxfrm"}, 3}, &CStringChecker::evalStrxfrm},
       {{CDM::CLibrary, {"bcopy"}, 3}, &CStringChecker::evalBcopy},
       {{CDM::CLibrary, {"bcmp"}, 3},
        std::bind(&CStringChecker::evalMemcmp, _1, _2, _3, CK_Regular)},
@@ -211,6 +212,8 @@ class CStringChecker
                         bool ReturnEnd, bool IsBounded, ConcatFnKind appendK,
                         bool returnPtr = true) const;
 
+  void evalStrxfrm(CheckerContext &C, const CallEvent &Call) const;
+
   void evalStrcat(CheckerContext &C, const CallEvent &Call) const;
   void evalStrncat(CheckerContext &C, const CallEvent &Call) const;
   void evalStrlcat(CheckerContext &C, const CallEvent &Call) const;
@@ -2243,6 +2246,102 @@ void CStringChecker::evalStrcpyCommon(CheckerContext &C, const CallEvent &Call,
   C.addTransition(state);
 }
 
+void CStringChecker::evalStrxfrm(CheckerContext &C,
+                                 const CallEvent &Call) const {
+  // size_t strxfrm(char *dest, const char *src, size_t n);
+  CurrentFunctionDescription = "locale transformation function";
+
+  ProgramStateRef State = C.getState();
+  const LocationContext *LCtx = C.getLocationContext();
+  SValBuilder &SVB = C.getSValBuilder();
+
+  // Get arguments
+  DestinationArgExpr Dest = {{Call.getArgExpr(0), 0}};
+  SourceArgExpr Source = {{Call.getArgExpr(1), 1}};
+  SizeArgExpr Size = {{Call.getArgExpr(2), 2}};
+
+  // `src` can never be null
+  SVal SrcVal = State->getSVal(Source.Expression, LCtx);
+  State = checkNonNull(C, State, Source, SrcVal);
+  if (!State)
+    return;
+
+  // Check overlaps
+  State = CheckOverlap(C, State, Size, Dest, Source, CK_Regular);
+  if (!State)
+    return;
+
+  // The function returns an implementation-defined length needed for
+  // transformation
+  SVal RetVal = SVB.conjureSymbolVal(Call, C.blockCount());
+
+  State = State->BindExpr(Call.getOriginExpr(), LCtx, RetVal);
+
+  // Check if size is zero
+  SVal SizeVal = State->getSVal(Size.Expression, LCtx);
+  QualType SizeTy = Size.Expression->getType();
+
+  auto [StateZeroSize, StateSizeNonZero] =
+      assumeZero(C, State, SizeVal, SizeTy);
+
+  // If `n` is 0, we just return the implementation defined length
+  if (StateZeroSize && !StateSizeNonZero) {
+    C.addTransition(StateZeroSize);
+    return;
+  }
+
+  if (!StateSizeNonZero)
+    return;
+
+  // If `n` is not 0, `dest` can not be null.
+  SVal DestVal = State->getSVal(Dest.Expression, LCtx);
+  StateSizeNonZero = checkNonNull(C, StateSizeNonZero, Dest, DestVal);
+  if (!StateSizeNonZero)
+    return;
+
+  // Check that we can write to the destination buffer
+  StateSizeNonZero = CheckBufferAccess(C, StateSizeNonZero, Dest, Size,
+                                       AccessKind::write, CK_Regular);
+  if (!StateSizeNonZero)
+    return;
+
+  // Success: return value < `n`
+  // Failure: return value >= `n`
+  auto ComparisonVal = SVB.evalBinOp(StateSizeNonZero, BO_LT, RetVal, SizeVal,
+                                     SVB.getConditionType())
+                           .getAs<DefinedOrUnknownSVal>();
+
+  if (ComparisonVal) {
+    auto [StateSuccess, StateFailure] =
+        StateSizeNonZero->assume(*ComparisonVal);
+
+    if (StateSuccess) {
+      // In this case, the transformation invalidated the buffer.
+      StateSuccess = invalidateDestinationBufferBySize(
+          C, StateSuccess, Dest.Expression, Call.getCFGElementRef(), DestVal,
+          SizeVal, Size.Expression->getType());
+
+      C.addTransition(StateSuccess);
+    }
+
+    if (StateFailure) {
+      // In this case, dest buffer content is undefined
+      if (std::optional<Loc> DestLoc = DestVal.getAs<Loc>()) {
+        StateFailure = StateFailure->bindLoc(*DestLoc, UndefinedVal{}, LCtx);
+      }
+
+      C.addTransition(StateFailure);
+    }
+  } else {
+    // Fallback: invalidate the buffer.
+    StateSizeNonZero = invalidateDestinationBufferBySize(
+        C, StateSizeNonZero, Dest.Expression, Call.getCFGElementRef(), DestVal,
+        SizeVal, Size.Expression->getType());
+
+    C.addTransition(StateSizeNonZero);
+  }
+}
+
 void CStringChecker::evalStrcmp(CheckerContext &C,
                                 const CallEvent &Call) const {
   //int strcmp(const char *s1, const char *s2);
diff --git a/clang/test/Analysis/string.c b/clang/test/Analysis/string.c
index e017aff3b4a1d..427b99bfdf295 100644
--- a/clang/test/Analysis/string.c
+++ b/clang/test/Analysis/string.c
@@ -1789,3 +1789,60 @@ void CWE124_Buffer_Underwrite__malloc_char_memcpy() {
   free(dataBuffer);
 }
 #endif
+
+//===----------------------------------------------------------------------===
+// strxfrm()
+// It is not a built-in.
+//===----------------------------------------------------------------------===
+
+size_t strxfrm(char *dest, const char *src, size_t n);
+
+void strxfrm_null_dest(const char *src) {
+  strxfrm(NULL, src, 0); // no warning
+  strxfrm(NULL, src, 10); // expected-warning {{Null pointer passed as 1st argument}}
+}
+
+void strxfrm_null_source(char *dest) {
+  strxfrm(dest, NULL, 0); // expected-warning {{Null pointer passed as 2nd argument}}
+}
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void strxfrm_overflow(const char *src) {
+  char dest[10];
+  strxfrm(dest, src, 55); // expected-warning {{Locale transformation function overflows the destination buffer}}
+}
+#endif
+
+void strxfrm_source_smaller() {
+  char dest[10];
+  char source[5];
+  strxfrm(dest, source, 10);
+}
+
+void strxfrm_overlap(char *dest) {
+  strxfrm(dest, dest, 10); // expected-warning {{Arguments must not be overlapping buffers}}
+}
+
+void strxfrm_regular(const char *src) {
+  size_t n = strxfrm(NULL, src, 0);
+  char *dest = (char*)malloc(n + 1);
+  strxfrm(dest, src, n);
+  free(dest);
+}
+
+int strxfrm_dest_undef(const char *src) {
+  char dest[10] = {0};
+  size_t n = strxfrm(dest, src, sizeof(dest));
+
+  int c = 0;
+  if (n >= sizeof(dest)) {
+    for (int i = 0; i < sizeof(dest); ++i) {
+      c += dest[i]; // expected-warning {{Assigned value is uninitialized}}
+    }
+  } else {
+    for (int i = 0; i < sizeof(dest); ++i) {
+      c += dest[i]; // no-warning
+    }
+  }
+  return c;
+}

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 4, 2025
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks great. So my instinct was good about the prev issues ;)

@alejandro-alvarez-sonarsource
Copy link
Contributor Author

Thanks! Do you mind merging? I have no rights to do so.

@steakhal steakhal merged commit b85d0c5 into llvm:main Sep 5, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants