Add an optional min file size requirement for deletion triggered compaction #13707

Open · cbi42 wants to merge 4 commits into main from min-file-size-dtc

Conversation

@cbi42 (Member) commented Jun 17, 2025:

Summary: add a min_file_size parameter to CompactOnDeletionCollector. A file must be at least this size to qualify for deletion triggered compaction (DTC). This is useful when a user wants to specify a minimum file size requirement that is larger than the size constraint implied by the sliding window's deletion_trigger requirement.

Added comments explaining that the file_size provided to a table properties collector only includes data blocks and may not be up to date. This PR also updates DTC to consider SingleDelete and DeletionWithTimestamp entries as tombstones.
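For illustration, a minimal sketch of how the new knob might be wired into Options. The 4-argument form of NewCompactOnDeletionCollectorFactory (with min_file_size appended) is an assumption based on this summary, not a confirmed final signature:

#include "rocksdb/options.h"
#include "rocksdb/utilities/table_properties_collectors.h"

int main() {
  rocksdb::Options options;
  // Mark a file for compaction once 64 tombstones land in any window of
  // 8192 consecutive entries, but only if the file has reached 8 MB.
  // The min_file_size argument is the parameter this PR adds; its exact
  // position in the factory signature is assumed here.
  options.table_properties_collector_factories.emplace_back(
      rocksdb::NewCompactOnDeletionCollectorFactory(
          /*sliding_window_size=*/8192, /*deletion_trigger=*/64,
          /*deletion_ratio=*/0.0, /*min_file_size=*/8 << 20));
  return 0;
}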

Test plan:

  • new unit test for when min_file_size is specified (sketched below).
  • existing unit tests for when min_file_size is not specified.
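A hedged sketch of roughly what the new unit test could look like. It mirrors the collector calls quoted later in this conversation; the 4-argument factory call, the constants, and the test name are assumptions for illustration:

TEST(CompactOnDeletionCollector, MinFileSizeSketch) {
  constexpr uint64_t kMinFileSize = 4 << 20;
  auto factory = rocksdb::NewCompactOnDeletionCollectorFactory(
      /*sliding_window_size=*/128, /*deletion_trigger=*/64,
      /*deletion_ratio=*/0.0, /*min_file_size=*/kMinFileSize);
  for (uint64_t file_size : {kMinFileSize / 2, kMinFileSize * 2}) {
    std::unique_ptr<rocksdb::TablePropertiesCollector> collector(
        factory->CreateTablePropertiesCollector(
            rocksdb::TablePropertiesCollectorFactory::Context()));
    // Enough deletion entries to satisfy the sliding-window trigger.
    for (int i = 0; i < 64; ++i) {
      ASSERT_OK(collector->AddUserKey("key", "", rocksdb::kEntryDelete,
                                      /*seq=*/0, file_size));
    }
    ASSERT_OK(collector->Finish(nullptr));
    // Only the file at or above min_file_size should be marked.
    ASSERT_EQ(collector->NeedCompact(), file_size >= kMinFileSize);
  }
}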

@cbi42 marked this pull request as ready for review on June 17, 2025 17:57
@facebook-github-bot (Contributor):

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 requested review from pdillinger and hx235 on June 17, 2025 19:59
@@ -31,7 +31,8 @@ class CompactOnDeletionCollectorFactory
// based on deletion ratio.
@hx235 (Contributor) · Jun 18, 2025:

Maybe also expand this comment to include min_file_size. Edit: also note the limitation with parallel compression.

@hx235 (Contributor) · Jun 18, 2025:

accumulated_file_size would be a better name IMO (but not a big deal if we can't change it now due to public API tax)

Contributor:

nit: // entries >= deletion_ratio of the entire file - just to emphasize that this deletion_ratio applies to the whole file (unlike how D and N enforce a deletion ratio over a sliding portion of the file)

@cbi42 (Member Author):

Added some notes on file size and fixed the comment per the nit.

Comment on lines +95 to +98
// The file may qualify for compaction based on file size constraints,
// even if max_deletion_in_window_ is not updated.
if (max_deletion_in_window_ >= deletion_trigger_ &&
    file_size >= min_file_size_) {
  need_compaction_ = true;
}
@hx235 (Contributor) · Jun 18, 2025:

I'm still forming the right mental model of this. I feel like this is the same as, or replaceable by, checking the file size and max_deletion_in_window_ (the historical high) >= deletion_trigger_ once in Finish().

Did I miss anything why you have to do the check here?

Contributor:

Oh by the way, global_max_deletions_in_window seems to indicate the historical aspect slightly better.

@cbi42 (Member Author):

If we remember the file size, then it can be done in Finish(). But there may be cases where we want to know whether a file can be marked for compaction before the end of the file (e.g. for early file cutting).

Contributor:

Thanks! I thought "Marked For Compaction" is info that is only useful for a file that has already finished writing (so after Finish() is called). What do you mean by early file cutting?
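To make the trade-off in this thread concrete, a self-contained toy sketch (names are illustrative, not the actual collector members): the per-entry check keeps the answer available mid-file, while a Finish()-only check has to remember the last file size seen.

#include <cstdint>

struct ToyDeletionCollector {
  uint64_t deletion_trigger = 64;
  uint64_t min_file_size = 8 << 20;
  uint64_t max_deletion_in_window = 0;  // historical high across windows
  uint64_t last_seen_file_size = 0;
  bool need_compaction = false;

  // Per-entry check, as in this PR: the flag is usable before the file
  // finishes (e.g. to cut the file early).
  void AddEntry(uint64_t deletions_in_window, uint64_t file_size) {
    last_seen_file_size = file_size;
    if (deletions_in_window > max_deletion_in_window) {
      max_deletion_in_window = deletions_in_window;
    }
    if (max_deletion_in_window >= deletion_trigger &&
        file_size >= min_file_size) {
      need_compaction = true;
    }
  }

  // The Finish()-based alternative discussed above: a single check at
  // the end, which works only if the file size was remembered.
  void Finish() {
    if (max_deletion_in_window >= deletion_trigger &&
        last_seen_file_size >= min_file_size) {
      need_compaction = true;
    }
  }
};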

std::unique_ptr<TablePropertiesCollector> collector(
    factory->CreateTablePropertiesCollector(context));

// Add enough deletions to meet both sliding window and ratio triggers
Contributor:

By "and ratio triggers", I assume you don't mean deletion_ratio but the "deletion ratio" enforced in the sliding window. Can we clarify?

@cbi42 (Member Author):

Sorry, it's an outdated comment; this case is meant for the sliding window only.

// @params file_size the current file size. For BlockBasedTable, this
// includes all the data blocks written so far, up to but not including
// the current block being built. With parallel compression, data
// blocks are written asynchronously, so this value depends on the
// compression progress.
Contributor:

Can be a follow up work: add the CompactOnDeletionCollectorFactory to stress test

options.table_properties_collector_factories.emplace_back(
      std::make_shared<DbStressTablePropertiesCollectorFactory>());

@cbi42 (Member Author):

Thanks, that will be a good onboarding task.

@hx235 (Contributor) left a comment:

Looking great - the only real blocking comment is https://github.com/facebook/rocksdb/pull/13707/files#r2155144571 before I can approve.

@pdillinger (Contributor) left a comment:

LGTM

@@ -39,8 +42,10 @@ Status CompactOnDeletionCollector::AddUserKey(const Slice& /*key*/,
                                               const Slice& /*value*/,
                                               EntryType type,
                                               SequenceNumber /*seq*/,
-                                              uint64_t /*file_size*/) {
+                                              uint64_t file_size) {
 #ifndef NDEBUG
Contributor:

not strictly required because assert() is a macro

@cbi42 (Member Author):

Yes, I was hoping to reduce the memory needed, but it won't make a difference given the other existing class members.

file_size));

ASSERT_OK(collector->Finish(nullptr));
ASSERT_EQ(collector->NeedCompact(), file_size >= kMinFileSize);
Contributor:

Ideally there would be an integration test (as well as the unit test), but I'm not going to go crazy over that for this feature building on existing stuff.

@cbi42 force-pushed the min-file-size-dtc branch from 2c23bff to 9b3abb1 on June 19, 2025 02:25
@facebook-github-bot (Contributor):

@cbi42 has updated the pull request. You must reimport the pull request before landing.

@cbi42 (Member Author) left a comment:

Thanks for the review


@facebook-github-bot (Contributor):

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 (Contributor) left a comment:

LGTM, thanks!
