-
Notifications
You must be signed in to change notification settings - Fork 6.6k
Moving predictor to WorkingArea to make it thread safe #13706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…omplete test code
Summary: Added compiler options in the stress test. Need to verify it is correct. Need to verify ldb is correct.
util/auto_skip_compressor.h
Outdated
@@ -33,6 +33,20 @@ class CompressionRejectionProbabilityPredictor { | |||
size_t window_size_; | |||
}; | |||
|
|||
class AutoSkipCompressionContext : public Compressor::WorkingArea { | |||
public: | |||
explicit AutoSkipCompressionContext(Compressor::ManagedWorkingArea&& wa) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can call this AutoSkipWorkingArea
util/auto_skip_compressor.cc
Outdated
if (prediction <= kProbabilityCutOff) { | ||
// decide to compress | ||
return CompressBlockAndRecord(uncompressed_data, compressed_output, | ||
out_compression_type, wa); | ||
} else { | ||
// decide to bypass compression | ||
// bypassed compression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you accidentally reversed some good fix such as this comment. Make sure to check other places too
util/auto_skip_compressor.h
Outdated
// Predict rejection probability using a moving window approach | ||
// This class is not thread safe | ||
// Predicts Rejection Probability using previous using past data of certain | ||
// window size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another accidental revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure this PR is rebased on the upstream main after it includes all your previous fixes to auto skip. This is to not accidentally revert your changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave @pdillinger for the final pass. For quick pass, overall design looks right to me except for some small stuff
util/auto_skip_compressor.cc
Outdated
@@ -65,27 +65,44 @@ Status AutoSkipCompressorWrapper::CompressBlock( | |||
return CompressBlockAndRecord(uncompressed_data, compressed_output, | |||
out_compression_type, wa); | |||
} else { | |||
auto prediction = predictor_->Predict(); | |||
auto predictor_ptr = | |||
static_cast<AutoSkipCompressionContext*>(wa->get())->predictor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the API commend on CompressBlock,
// The working area is optional and used to optimize repeated compression by
// a single thread. ManagedWorkingArea is provided rather than just
// WorkingArea so that it can be used only if the `owner` matches expectation.
// This could be useful for a Compressor wrapping more than one alternative
// underlying Compressor.
Specifically, the working area should only be used if the owner matches expectations. See BuiltinCompressorV2::CompressBlock checking owner() for example.
Also the working area is optional. It is typically provided by the RocksDB table builder but not guaranteed in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger If the working area is not provided or a working area with wrong owner is provided, does it mean they are not using AutoSkip? Since the predictor is in the working area, we don't have access to any prediction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it is OK to provide degraded functionality (always attempt compression) if the WorkingArea is not provided. I might reconsider the exact contract for this in the future.
Some detail: Right now the compressed secondary cache can't efficiently provide a working area because we cannot predict which threads might access the cache, and how many times they might access it. It is perhaps unlikely that AutoSkip would be used with the compressed secondary cache, but it's possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The owner() check part is still outstanding also.
util/auto_skip_compressor.cc
Outdated
Compressor::ManagedWorkingArea AutoSkipCompressorWrapper::ObtainWorkingArea() { | ||
auto wrap_wa = wrapped_->ObtainWorkingArea(); | ||
return ManagedWorkingArea( | ||
static_cast<WorkingArea*>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast should be unnecessary
util/auto_skip_compressor.cc
Outdated
Status AutoSkipCompressorWrapper::CompressBlockAndRecord( | ||
Slice uncompressed_data, std::string* compressed_output, | ||
CompressionType* out_compression_type, ManagedWorkingArea* wa) { | ||
auto wrap_wa = | ||
&(static_cast<AutoSkipCompressionContext*>(wa->get())->wrapped); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this function take AutoSkipCompressionContext instead of WorkingArea to avoid so many casts?
@shubhajeet has updated the pull request. You must reimport the pull request before landing. |
@shubhajeet has updated the pull request. You must reimport the pull request before landing. |
@shubhajeet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very close. Resolve the WorkingArea extra checks needed
util/auto_skip_compressor.cc
Outdated
@@ -62,14 +62,18 @@ Status AutoSkipCompressorWrapper::CompressBlock( | |||
"AutoSkipCompressorWrapper::CompressBlock::exploitOrExplore", | |||
&exploration); | |||
if (exploration) { | |||
auto autoskip_wa = static_cast<AutoSkipWorkingArea*>(wa->get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still have three ugly static_cast<AutoSkipWorkingArea*>
where I believe one should be sufficient.
util/auto_skip_compressor.cc
Outdated
@@ -65,27 +65,44 @@ Status AutoSkipCompressorWrapper::CompressBlock( | |||
return CompressBlockAndRecord(uncompressed_data, compressed_output, | |||
out_compression_type, wa); | |||
} else { | |||
auto prediction = predictor_->Predict(); | |||
auto predictor_ptr = | |||
static_cast<AutoSkipCompressionContext*>(wa->get())->predictor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The owner() check part is still outstanding also.
@shubhajeet has updated the pull request. You must reimport the pull request before landing. |
@shubhajeet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@shubhajeet has updated the pull request. You must reimport the pull request before landing. |
@shubhajeet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
@shubhajeet merged this pull request in 34d8f03. |
Summary: **Summary:** We need to move the Predictor to WorkingArea so that it is local to each thread and thus is thread safe. Pull Request resolved: facebook#13706 Test Plan: It should pass the test case written in ./compression_test. Reviewed By: pdillinger Differential Revision: D76836846 Pulled By: shubhajeet fbshipit-source-id: 0d0170baf65f4bb95ba107fec77151e66b8a4449
Summary:
We need to move the Predictor to WorkingArea so that it is local to each thread and thus is thread safe.
Test Plan:
It should pass the test case written in ./compression_test.