Fix/use tempfiles for isolation #571

Draft: wants to merge 11 commits into base: main

CyMule (Contributor) commented Jul 30, 2025

No description provided.

CyMule added a commit that referenced this pull request Jul 31, 2025
## Problem

S3 downloads were sometimes failing with `NotADirectoryError` and
`FileExistsError` when S3 buckets contained objects with conflicting
naming patterns that cannot be represented in traditional filesystem
hierarchies.

**Example conflict:**
- S3 object: `foo` (file)
- S3 object: `foo/documents` (file requiring foo to be a directory)

This created a race condition: the order in which the objects were
downloaded determined whether the download succeeded or failed.
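
For reference, the conflict can be reproduced without S3. The sketch below is a simplified stand-in for the downloader, not the project's actual code: `naive_download` and `try_order` are hypothetical helpers that mirror each key directly onto a local path under one shared root. Which exception is raised depends on the order of the keys and on the platform.

```python
import os
import tempfile


def naive_download(root: str, key: str, body: bytes = b"") -> str:
    """Hypothetical helper: write `body` to root/key, mirroring the S3 key as a path."""
    dest = os.path.join(root, key)
    # Create parent directories for the key; this is where "foo" must be a directory.
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(dest, "wb") as fh:
        fh.write(body)
    return dest


def try_order(keys):
    """Download `keys` in order into one shared root.

    Returns the name of the OSError subclass raised, or None if every write succeeded.
    """
    with tempfile.TemporaryDirectory() as root:
        try:
            for key in keys:
                naive_download(root, key)
        except OSError as exc:
            return type(exc).__name__
    return None


if __name__ == "__main__":
    # "foo" is written as a file first, so it cannot become a directory afterwards.
    print(try_order(["foo", "foo/documents"]))
    # "foo" already exists as a directory, so it cannot be opened as a file.
    print(try_order(["foo/documents", "foo"]))
    # Keys that do not overlap download without errors.
    print(try_order(["foo", "bar/documents"]))
```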

## Solution

Use `tempfile` to create a unique download directory for each S3 object:

**Before:**
```
S3: "foo" → Local: /downloads/foo
S3: "foo/documents" → Local: /downloads/foo/documents
Conflict: foo cannot be both a file and a directory
```

**After:**
```
S3: "foo" → Local: /downloads/a1b2c3d4e5f6/foo
S3: "foo/documents" → Local: /downloads/9g8h7i6j5k4l/documents
No conflicts: each file gets a unique directory
```
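
A minimal sketch of the mapping shown above, assuming `tempfile.mkdtemp` is used to create the per-object directory (the commit message does not say which `tempfile` function the change relies on). `isolated_path` is a hypothetical helper name, and the actual S3 download call is left out; the returned path is where that call would write.

```python
import os
import posixpath
import tempfile


def isolated_path(download_root: str, key: str) -> str:
    """Hypothetical helper: return a local path for `key` inside its own unique directory."""
    os.makedirs(download_root, exist_ok=True)
    # mkdtemp creates a new, uniquely named directory under download_root.
    unique_dir = tempfile.mkdtemp(dir=download_root)
    # S3 keys use "/" as the separator; keep only the final component as the file name.
    # The "object" fallback for keys ending in "/" is an assumption of this sketch.
    name = posixpath.basename(key.rstrip("/")) or "object"
    return os.path.join(unique_dir, name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for key in ("foo", "foo/documents"):
            dest = isolated_path(root, key)
            # The real S3 download would write to `dest`; here an empty file stands in.
            with open(dest, "wb"):
                pass
            print(key, "->", dest)
```

Because each object lands in its own directory, `foo` and `foo/documents` never share a parent path, so neither has to be both a file and a directory. The trade-off is that the local layout no longer mirrors the key hierarchy, so callers need the returned path to find each file.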

## Future Work

This PR targets S3 downloads only. I think it would make sense to
use tempfiles for all downloads (as in [PR
#571](#571)),
but implementing that cleanly requires more extensive changes. This fix
provides immediate relief from the path conflicts while we work on
the more comprehensive tempfile solution.