-
Notifications
You must be signed in to change notification settings - Fork 2.6k
AWS, GCP: Fix double-checked-locking pattern in S3FileIO, GCSFileIO. #13276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS, GCP: Fix double-checked-locking pattern in S3FileIO, GCSFileIO. #13276
Conversation
…king pattern.
…checked-locking pattern.
@@ -434,9 +434,9 @@ private Map<String, PrefixedS3Client> clientByPrefix() { | |||
if (null == clientByPrefix) { | |||
synchronized (this) { | |||
if (null == clientByPrefix) { | |||
this.clientByPrefix = Maps.newHashMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow the issue on how there can be visibility issues since the field is volatile
and is assigned in a synchronized
block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The issue is not about the field visibility problem, it is about field being pre-maturely visible to all other threads before populating all the required entries in the map.
- Assuming there are 2 threads executing the method.
- First thread entered the synchronized block and initialized empty hashmap before populating the entries.
- It is possible that second thread don't even reach synchronized block as the first check
(null == clientByPrefix)
evaluates to false because the first thread assigned field to an empty hashmap. - Second thread proceeds its computation consuming the empty hashmap and could result in error.
- To cover this scenario, we should have field assignment as a last statement of synchronized block. This ensures the map field is visible to all other threads only after it is completely populated, until then other threads will be waiting at synchronized block as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation, yeah this makes sense now. It's possible that there are 2 threads executing clientsByPrefix()
; currently, the first thread will make clientsByPrefix
non-null and the second thread will incorrectly think everything is populated even though it isn't. I agree with the fix that we should only make clientsByPrefix
non-null at the end of population.
@@ -434,9 +434,9 @@ private Map<String, PrefixedS3Client> clientByPrefix() { | |||
if (null == clientByPrefix) { | |||
synchronized (this) { | |||
if (null == clientByPrefix) { | |||
this.clientByPrefix = Maps.newHashMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation, yeah this makes sense now. It's possible that there are 2 threads executing clientsByPrefix()
; currently, the first thread will make clientsByPrefix
non-null and the second thread will incorrectly think everything is populated even though it isn't. I agree with the fix that we should only make clientsByPrefix
non-null at the end of population.
Thanks @ChaladiMohanVamsi! |
Fix: double-checked-locking pattern in S3FileIO, GCSFileIO.
Ensure Thread-Safe Initialization of
clientByPrefix
andstorageByPrefix
This PR addresses a thread-safety issue during the lazy initialization of the
clientByPrefix
map. Previously, the shared map was being assigned before it was fully populated, which could lead to race conditions and visibility issues in a concurrent environment.Changes:
localClientByPrefix
inside the synchronized block.clientByPrefix
field.Benefits:
NullPointerException
due to premature assignment.