Skip to content

Feat S3 Transfer Manager v2 UploadDirectory #3110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

wty-Bryant
Copy link
Contributor

Implement v2 s3 transfer manager's UploadDirectory api bound to single union client which mimics normal service client's initialization and api call. User could now upload a directory's files recursively/non-recursively to S3 with customized filtering and key naming in a single operation call.

Test: This PR passed unit tests for UploadDirectory which covers use cases with different config and folder structure to confirm this key-prefix uploader could find valid file location, generate correct key and pass upload requests to S3 concurrently. It also passed integration test which uploads source directory containing different sizes files to S3 via new directory uploader, then downloads objects back using S3 client and checks their content consistency

@wty-Bryant wty-Bryant requested a review from a team as a code owner June 9, 2025 17:48
@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 775db9a to 7f51d9c Compare June 9, 2025 22:13
postprocessFunc := func() error {
// this cleans up all possible symlinks regardless of
// whether or not it is successfully created
os.Remove(symlinkPath1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why do we need to create and delete symlinks in tests. I suspect there are some gotchas when using git across multiple platforms, but want to confirm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why I create and delete them dynamically in test code, different platforms have different file separator and location to put go v2, so a static symlink will be invalid in that case.


// The s3 delimeter contatenating each object key based on local file separator
// and file's relative path
S3Delimiter string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth mentioning that this delimiter cannot be included in another directory. It makes sense, but took me by surprise reading the tests

} else {
key = keyPrefix + u.in.S3Delimiter + filepath.Base(path)
}
fileInfo, err := os.Lstat(absPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we check this err?

if err != nil {
return []string{}, err
}
subFiles, err := f.Readdir(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docs on Readdir say

Most clients are better served by the more efficient ReadDir method.

link

However, there may be a reason why we're using this instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only need sub file name, change to more efficient one


func (u *directoryUploader) uploadDirectory(ctx context.Context) (*UploadDirectoryOutput, error) {
u.init()
ch := make(chan fileChunk)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're doing file chunking, so I would call this just fileChan or something similar.

break
}
if u.getErr() != nil {
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about the continue here. I think the idea is to continue uploading files if there's an error with the current file, but I don't see a place where we clear an error once it has been set, so wouldn't this always return non-nil once it has been set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed with break since an undrained channel will still be gc once closed and not referred by any struct

Delimiter string
KeyPrefix string
ExpectFilesUploaded int
ExpectFilesFailed int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have any test that uses this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now

@wty-Bryant wty-Bryant force-pushed the feat-keyprefix-upload branch from 7f51d9c to 8d95b62 Compare June 19, 2025 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants