Skip to content

ggml: check if non-native endian model is being loaded #13943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

taronaeo
Copy link
Contributor

@taronaeo taronaeo commented May 31, 2025

This PR adds more descriptive error messages for when a non-native endian model is being loaded on the host system.

Verification

To ensure that this implementation did not break anything, this PR has been tested on the following systems:

  1. IBM z15 Mainframe (16 IFLs / 160 RAIM / NOSMT / LPAR)
  2. M3 MacBook Air (8 Cores / 16 GB / SMT)
  3. Kindly request additional systems to be tested in this PR
System Granite 3.1 2B Instruct LE Model Granite 3.1 2B Instruct BE Model
IBM z15 Mainframe (BE System) ❌ Does not load (Expected) ✅ Loads (Expected)
M3 MacBook Air (LE System) ✅ Loads (Expected) ❌ Does not load (Expected)

taronaeo added 2 commits May 31, 2025 20:59
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label May 31, 2025
taronaeo added 2 commits June 1, 2025 17:12
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
@taronaeo
Copy link
Contributor Author

taronaeo commented Jun 1, 2025

@JohannesGaessler applied your suggested changes. I've moved the GGML_ASSERT just after the endianness check, otherwise it will trigger the assert before the check could produce any useful information for the end-user.

PTAL again.

P.S., GGML_ASSERT fails because the version field in non-native endian models are read as 50331648.

Edit: Also re-tested the new changes on ARM64 and s390x. Both worked as intended.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Copy link
Collaborator

@JohannesGaessler JohannesGaessler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say with this error message it's fine to remove the assert again but either way is fine I think.

@taronaeo
Copy link
Contributor Author

taronaeo commented Jun 1, 2025

Unfortunately that GGML_ASSERT is still required because when I change the version to 0x0001 FFFF, the assert catches it via

llama.cpp-s390x/ggml/src/gguf.cpp:363: GGML_ASSERT(ctx->version > 0 && ctx->version <= 65535) failed

while the endianness check did not; as expected.

If the CI passes, feel free to merge it into master :)

@JohannesGaessler JohannesGaessler merged commit e57bb87 into ggml-org:master Jun 1, 2025
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants