Skip to content

Conversation

kaadam
Copy link
Contributor

@kaadam kaadam commented Sep 3, 2025

Previously, llvm-readelf dumped hex format values in different ways. Some of them were printed in upper-case, while the others were in lower-case format. This change switches the format to lower-case in all cases.

Why is this useful? As an example, FileCheck comparisons are case-sensitive by default. This change means it's easier to compare those values, because they have the same format.

Currently llvm-readobj prints e_entry address in uppercase mode,
however all other places such as symbols, relocs the addresses
are printed in lowercase mode. This PR aims to print the addresses
in uniform manner.

Other use-case:
FileCheck comparison is case-sensitive by default, so if you would
like to compare e_entry address with the same address in the symbol
table it won't match by default.
@kaadam kaadam marked this pull request as ready for review September 3, 2025 14:44
@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-llvm-binary-utilities

Author: Ádám Kallai (kaadam)

Changes

Currently llvm-readobj prints e_entry address in uppercase mode, however all other places such as 'symbols', 'relocs', the addresses are printed in lowercase mode. This PR aims to print the addresses in uniform manner.

Other use-case:
FileCheck comparison is case-sensitive by default, so if you would like to compare e_entry address with the same address in the symbol table it won't match by default.


Full diff: https://github.com/llvm/llvm-project/pull/156683.diff

1 Files Affected:

  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+1-1)
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index a440bad130f4c..79a6ce1fd88b8 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -3640,7 +3640,7 @@ template <class ELFT> void GNUELFDumper<ELFT>::printFileHeaders() {
   printFields(OS, "Machine:", Str);
   Str = "0x" + utohexstr(e.e_version);
   printFields(OS, "Version:", Str);
-  Str = "0x" + utohexstr(e.e_entry);
+  Str = "0x" + utohexstr(e.e_entry, /*LowerCase=*/true);
   printFields(OS, "Entry point address:", Str);
   Str = to_string(e.e_phoff) + " (bytes into file)";
   printFields(OS, "Start of program headers:", Str);

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! In principle, I think this makes sense. The majority of our hex printing is via Twine::utohexstr, which is always lower-case. I think the missing lower case argument in the calls to llvm::utohexstr are likely just oversights and should be updated.

There are several other places in this file that use llvm::utohexstr, some of which don't use the lower-case argument. Please could you update them all (and any tests that fail as a result of course). I note that in some cases (e.g. e_version), it's unlikely to matter due to the range of values that could be printed, but it's better to be consistent.

Adjusted the related tests to the newer format.
@kaadam
Copy link
Contributor Author

kaadam commented Sep 4, 2025

@jh7370 Thanks for your review. I've updated the utohexstr function call to use lower-case format. The corresponding tests are adjusted as well.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llvm::utohexstr defaults to upper-case while Twine::utohexstr defaults to lower-case. I think lower-case is more popular and llvm-readelf always uses lower-case.

Using more lower-case for llvm-readobj makes sense. Unfortunately, changing some fields (relocation offsets and addends) for llvm-readobj might cause a lot of changes.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the PR title and description given the format change.

It may be worth updating the release notes to mention this, due to the potential for it to break people's automated scripts.

@kaadam kaadam changed the title [llvm-readobj][ELF] Print ELF e_entry in lowercase mode [llvm-readobj][ELF] Prints hex format addresses in lower-case mode Sep 5, 2025
@kaadam
Copy link
Contributor Author

kaadam commented Sep 5, 2025

t may be worth updating the release notes to mention this, due to the potential for it to break people's automated scripts.
@MaskRay @jh7370 Thanks for the review both of you. Updated the title and the description. I also added a description about this modification to the release notes. Hope I did it right. Could you take a look at it please?

Comment on lines 154 to 155
* `llvm-readelf` now dumps all hex format addresses in lower-case mode generally.
Therefore the corresponding tests have been udpated as well.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to mention the tests in the release notes.

Also, this isn't specific to addresses any longer (e.g. you've made changes that impact printing of flags, version numbers and so on), so please drop references to them from this release note and the PR title and description. Something generic like "hex format values" would be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@kaadam kaadam changed the title [llvm-readobj][ELF] Prints hex format addresses in lower-case mode [llvm-readobj][ELF] Prints hex format values in lower-case mode Sep 5, 2025
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest some slight grammar changes to the PR description:

Previously, llvm-readelf dumped hex format values in different ways. Some of them were printed in upper-case, while the others were in lower-case format. This change switches the format to lower-case in all cases.

Why is this useful? As an example, FileCheck comparisons are case-sensitive by default. This change means it's easier to compare those values, because they have the same format.

I've dropped the "where it is possible" bit. There aren't any within the llvm-readelf code that aren't possible, right?

You can also drop the "the corresponding tests have been updated as well" since that's normal procedure.

With those tweaks and the one nit in the release note, LGTM. Do you need assistance merging, or do you have commit access already?

@@ -151,6 +151,8 @@ Changes to the Debug Info
Changes to the LLVM tools
---------------------------------

* `llvm-readelf` now dumps all hex format values in lower-case mode generally.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop "generally".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped.

@kaadam
Copy link
Contributor Author

kaadam commented Sep 5, 2025

@jh7370 Thank you for your thorough review and the suggestions. Updated the description. I have no commit access yet. May I ask you to merge it?

@jh7370 jh7370 merged commit e110928 into llvm:main Sep 5, 2025
10 checks passed
@jh7370
Copy link
Collaborator

jh7370 commented Sep 5, 2025

Done, please keep a close eye out for build bot failure notifications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants