-
Notifications
You must be signed in to change notification settings - Fork 6.5k
Support GetFileSize API in FSRandomAccessFile #13676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@xingbowang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would include as many built-in implementations as you can in this PR (git grep 'public FSRandomAccessFile'
) and add a basic unit test in env_test that simply writes a file of some random size, up to ~100KB, opens it for reads and checks that the written size agrees with all the file size APIs.
We also need to override this function in all the wrapper classes. Basically, if you can't compile the full repository with this function pure virtual instead of defaulting to NotSupported, you should have an explanation for each of the internal cases you have not implemented in this PR.
include/rocksdb/file_system.h
Outdated
@@ -909,6 +909,13 @@ class FSRandomAccessFile { | |||
return IOStatus::NotSupported("Prefetch"); | |||
} | |||
|
|||
// Get the file size. The default implementation is not supported, as some of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say
The default implementation returns "not supported" so that user implementations
of FSRandomAccessFile do not need to immediately implement this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback. I will address them.
I also noticed another class RandomAccessFile, which had exact same functions as FSRandomAccessFile. Should I add the new API there as well. I recall you mentioned that it was from a legacy system used in env. The code seems indicate that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not be needed for the legacy Env-related class RandomAccessFile. Just adds to the confusion about what should be overridden. #9274
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RandomAccessFile is used in LegacyRandomAccessFileWrapper. To make the wrapper work, I had to add the new API GetFileSize in RandomAccessFile. Let me know if it is a concern.
Also, use |
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
@xingbowang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
@xingbowang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
@xingbowang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
Deprecate skip_checking_sst_file_sizes_on_db_open option. Summary: This change is used to address this issue facebook#13619 It supports file size validation in ReadFooterFromFile. In favor of this change, CheckConsistency function and skip_checking_sst_file_sizes_on_db_open flag are deprecated. The CheckConsistency function checks each file size matches what was recorded in manifest during DB open. Meantime, ReadFooterFromFile was called for each file in LoadTables function. Since ReadFooterFromFile always validates file size, the CheckConsistency is redundant. In addtion, CheckConsistency is ecxecuted in a single thread. This could slow down DB open when a network file system is used. Therefore, the flag skip_checking_sst_file_sizes_on_db_open was added to skip this check. After this change, ReadFooterFromFile was executed in parallel through multiple threads. Therefore, the concern of DB open slowness is eliminated, and the flag could be deprecated. There is 2 slight concerns of this change. *If max_open_files is set with smaller value, engine will not open all the files during DB open. This means if there is a corruption on file size, it will not be detected during DB open, but rather at a later time. Since the default is -1, which means open all the files, and it is rarely overridden and a lot of new features rely on it to be -1, the risk is very low. *If FIFO compaction is used, engine could fail to open DB unnecessarily on the corrupted files that would never be used again. However, this is a very rare case as well. The error could still be ignored by setting paranoid_checks operationally. The risk is very low. To remain backward compatibility. The public facing flag was kept and marked as no-op internally. Another change is required to fully remove the flag. Test Plan: make check A new unit test was added to validate file size check API works as expected.
@xingbowang has updated the pull request. You must reimport the pull request before landing. |
Summary:
This change is used to address this issue
#13619
It supports GetFileSize API in FSRandomAccessFile. This allows ReadFooterFromFile to quickly get the file size for file size validation.
Test Plan:
make check
Reviewers:
Peter Dillinger
Subscribers:
Tasks:
Tags: