[GPT Big Code] Fix attention scaling #40041
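For context on the title: "attention scaling" refers to the 1/sqrt(head_dim) factor applied to the query-key product before the softmax. Below is a minimal generic sketch of that step; it illustrates the technique only and is not the actual patch in this PR.

```python
import math

import torch


def scaled_dot_product_attention(query, key, value, scale=None):
    """Generic scaled dot-product attention (illustration, not the PR's code).

    query/key/value: (batch, num_heads, seq_len, head_dim)
    """
    head_dim = query.size(-1)
    if scale is None:
        # The scaling factor in question: it must be applied exactly once.
        scale = 1.0 / math.sqrt(head_dim)
    attn_weights = torch.matmul(query, key.transpose(-1, -2)) * scale
    attn_weights = torch.softmax(attn_weights, dim=-1)
    return torch.matmul(attn_weights, value)
```

If the factor is dropped (or applied twice) in one code path, the softmax becomes too sharp or too flat, which typically degrades generation quality rather than crashing, so it is easy to miss without a regression test.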
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Can you add a fast test / fast integration test to make sure this does not happen again?
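One way to satisfy this request, sketched under assumptions (the checkpoint name and tolerances below are placeholders, not taken from this PR): run the same input through the eager and SDPA attention paths and assert the logits match, since a scaling bug in one path shows up as a large mismatch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def test_eager_matches_sdpa():
    # Placeholder small checkpoint; any tiny GPT BigCode model would do.
    model_id = "bigcode/tiny_starcoder_py"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    inputs = tokenizer("def hello():", return_tensors="pt")

    # Load the same weights with two different attention implementations.
    eager = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="eager").eval()
    sdpa = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="sdpa").eval()

    with torch.no_grad():
        out_eager = eager(**inputs).logits
        out_sdpa = sdpa(**inputs).logits

    # A missing or doubled 1/sqrt(head_dim) factor in one path fails this check.
    torch.testing.assert_close(out_eager, out_sdpa, rtol=1e-3, atol=1e-3)
```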
run-slow: gpt_bigcode
This comment contains run-slow, running the specified jobs: models: ['models/gpt_bigcode']
[For maintainers] Suggested jobs to run (before merge): run-slow: gpt_bigcode
Only some parallel tests fail (as on main). I fixed the old regression tests and added a new one that produced only newlines before this fix; hopefully this is a "tougher" test, so small breaks are easier to detect. Will merge after CI is green.
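A sketch of the kind of regression test described here, under assumptions (the checkpoint, prompt, and generation length are illustrative placeholders): generate a short continuation greedily and assert it is not just newlines, which is the degenerate output reported before the fix.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def test_generation_not_degenerate():
    model_id = "bigcode/tiny_starcoder_py"  # placeholder checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).eval()

    inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)

    # Keep only the newly generated tokens, then decode them.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    text = tokenizer.decode(new_tokens)

    # A broken scaling path tends to collapse generation to newlines.
    assert text.strip("\n") != "", f"Degenerate generation: {text!r}"
```

In the actual PR the expected generation would be pinned to a reference string from a known-good run, which is stricter than this non-degeneracy check.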
Reported internally. cc @hmellor @zucchini-nlp