
feat: use proto streams to increase maximum module files payload #18268


Merged

merged 52 commits into main from stevenmasley/4mb on Jun 13, 2025

Conversation


@Emyrk Emyrk commented Jun 6, 2025

Increase maximum module files payload using protobuf streams

Problem

Coder has a 4MB protobuf message size limit that prevents templates with large Terraform modules from being processed. When dynamic parameters require module files from ~/.terraform/modules that exceed this limit, template creation fails.

Solution

This PR implements protobuf streaming to handle large module files by:

  1. Streaming large payloads: When module files exceed the 4MB limit, they're streamed in chunks using a new UploadFile RPC method (see the sketch after this list)
  2. Database storage: Streamed files are stored in the database and referenced by hash for deduplication
  3. Backward compatibility: Small module files continue using the existing direct payload method
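
A minimal sketch of the chunking step, assuming hypothetical names (the PR's real implementation is proto.BytesToDataUpload, visible in the review thread below): the payload is hashed once for deduplication, then split into pieces that each fit under the message limit.

    // Sketch: hash an oversized payload and split it into chunks that each
    // fit under the 4MB message limit. All names here are illustrative.
    package sketch

    import "crypto/sha256"

    const chunkSize = 3 << 20 // 3MiB, safely under the 4MB message limit

    type dataUpload struct {
        DataHash []byte // content hash, reused for deduplication
        FileSize int64
        Chunks   int32
    }

    type chunkPiece struct {
        Data         []byte
        FullDataHash []byte // ties each piece back to its dataUpload
    }

    func bytesToDataUpload(data []byte) (dataUpload, []chunkPiece) {
        hash := sha256.Sum256(data)
        var pieces []chunkPiece
        for off := 0; off < len(data); off += chunkSize {
            end := off + chunkSize
            if end > len(data) {
                end = len(data)
            }
            pieces = append(pieces, chunkPiece{Data: data[off:end], FullDataHash: hash[:]})
        }
        return dataUpload{
            DataHash: hash[:],
            FileSize: int64(len(data)),
            Chunks:   int32(len(pieces)),
        }, pieces
    }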

Key Changes

  • New protobuf streams: Added UploadFile RPC and DataUpload/ChunkPiece message types
  • Smart fallback: Automatically detects oversized payloads and switches to streaming
  • File deduplication: Module files are deduplicated per template version using content hashes (a sketch follows this list)
  • Testing: Added test coverage for the streaming upload functionality
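
The dedup step can be pictured as a hash-keyed lookup before insert. A minimal sketch, assuming a hypothetical fileStore interface rather than Coder's actual database API:

    // Sketch: deduplicate a streamed file by its content hash before inserting.
    // The fileStore interface is hypothetical, for illustration only.
    package sketch

    import (
        "context"
        "crypto/sha256"
        "encoding/hex"
    )

    type fileStore interface {
        GetFileIDByHash(ctx context.Context, hash string) (id string, ok bool, err error)
        InsertFile(ctx context.Context, hash string, data []byte) (id string, err error)
    }

    func upsertModuleFiles(ctx context.Context, db fileStore, data []byte) (string, error) {
        sum := sha256.Sum256(data)
        hash := hex.EncodeToString(sum[:])
        if id, ok, err := db.GetFileIDByHash(ctx, hash); err != nil {
            return "", err
        } else if ok {
            return id, nil // same content already stored; reference the existing file
        }
        return db.InsertFile(ctx, hash, data)
    }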

Architecture

  • Coderd ↔️ Provisionerd: Uses new streaming RPC when payload exceeds limits.

    // UploadFile streams files to be inserted into the database.
    // The file upload_type should be used to determine how to handle the file.
    rpc UploadFile(stream UploadFileRequest) returns (Empty);

  • Provisionerd ↔️ Provisioner: Extends the existing stream with new message types for in-memory communication (reassembly sketched below).
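
On the receiving end, the DataUpload header announces the hash and chunk count, and ChunkPiece messages are buffered until complete. A minimal sketch of that reassembly, reusing the hypothetical dataUpload/chunkPiece types from the sketch above:

    // Sketch: reassemble streamed chunks and verify the content hash before
    // storing the file. Uses the illustrative types from the earlier sketch.
    package sketch

    import (
        "bytes"
        "crypto/sha256"
        "fmt"
    )

    type uploadBuilder struct {
        hash     []byte
        expected int32
        received int32
        buf      bytes.Buffer
    }

    func newUploadBuilder(up dataUpload) *uploadBuilder {
        return &uploadBuilder{hash: up.DataHash, expected: up.Chunks}
    }

    // addPiece returns the full payload once all chunks have arrived.
    func (b *uploadBuilder) addPiece(piece chunkPiece) ([]byte, bool, error) {
        if !bytes.Equal(piece.FullDataHash, b.hash) {
            return nil, false, fmt.Errorf("chunk belongs to a different upload")
        }
        b.buf.Write(piece.Data)
        b.received++
        if b.received < b.expected {
            return nil, false, nil // still waiting on more chunks
        }
        data := b.buf.Bytes()
        if sum := sha256.Sum256(data); !bytes.Equal(sum[:], b.hash) {
            return nil, false, fmt.Errorf("hash mismatch after reassembly")
        }
        return data, true, nil
    }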

This change enables templates with arbitrarily large Terraform modules while maintaining performance through deduplication and preserving the existing fast path for smaller payloads.

Version bump to 1.7

Old versions (1.6) are compatible with the new Coderd, as the previous upload path still works when the module_files payload is <4MB.

Large modules are omitted in v1.6, and the template shows a warning on the Create Workspace page that modules are missing.

Version 1.7 is not backwards compatible: a 1.7 provisioner will send messages that an older Coderd does not support.

Manual QA

Old provisioner with a large module

The modules are simply not uploaded, which is the expected behavior.

(Screenshot from 2025-06-11 12:00:33)

@Emyrk Emyrk force-pushed the stevenmasley/4mb branch from 4807dbc to 124d366 Compare June 6, 2025 14:19
@Emyrk Emyrk changed the title from Stevenmasley/4mb to feat: increase maximum module files payload using proto streams Jun 6, 2025
@Emyrk Emyrk changed the title from feat: increase maximum module files payload using proto streams to feat: use proto streams to increase maximum module files payload Jun 9, 2025
@Emyrk Emyrk marked this pull request as ready for review June 11, 2025 14:48
@Emyrk Emyrk requested a review from johnstcn as a code owner June 11, 2025 14:48
Comment on lines +168 to +192
    if protobuf.Size(resp) > drpcsdk.MaxMessageSize {
        // It is likely the modules that are pushing the message size over the limit.
        // Send the modules over a stream of messages instead.
        s.Logger.Info(s.Context(), "plan response too large, sending modules as stream",
            slog.F("size_bytes", len(complete.ModuleFiles)),
        )
        dataUp, chunks := proto.BytesToDataUpload(proto.DataUploadType_UPLOAD_TYPE_MODULE_FILES, complete.ModuleFiles)

        complete.ModuleFiles = nil // sent over the stream
        complete.ModuleFilesHash = dataUp.DataHash
        resp.Type = &proto.Response_Plan{Plan: complete}

        err := s.stream.Send(&proto.Response{Type: &proto.Response_DataUpload{DataUpload: dataUp}})
        if err != nil {
            complete.Error = fmt.Sprintf("send data upload: %s", err.Error())
        } else {
            for i, chunk := range chunks {
                err := s.stream.Send(&proto.Response{Type: &proto.Response_ChunkPiece{ChunkPiece: chunk}})
                if err != nil {
                    complete.Error = fmt.Sprintf("send data piece upload %d/%d: %s", i, dataUp.Chunks, err.Error())
                    break
                }
            }
        }
    }
Member Author

This only triggers if the message size is too large. In my testing, most modules are quite small, so this is actually the edge case.

On dev.coder.com we did hit a module >4MB. We made some changes to get it under the message size limit; however, in the wild it's quite easy to make a module too large. You could, for example, have a README.md with a large uncompressed image.

Member

Out of scope of this PR, but I think this is something we need to remain careful of on the dev registry. Maybe we need some kind of linter that complains if a built module is over the 4MB limit, with some allowlisting for known chonky bois.

Member Author

I agree. I would like to build out more infra down the road to keep track of the total payload size of the template, even more than just the modules.

@Emyrk Emyrk requested a review from spikecurtis June 11, 2025 17:16
@Emyrk Emyrk requested a review from johnstcn June 12, 2025 19:28
@Emyrk Emyrk merged commit c1341cc into main Jun 13, 2025
37 checks passed
@Emyrk Emyrk deleted the stevenmasley/4mb branch June 13, 2025 17:46
@github-actions github-actions bot locked and limited conversation to collaborators Jun 13, 2025