chore(dataobj): add ability to buffer pending sections to disk #18780
Conversation
Force-pushed from bc07f2a to 600f868
// TODO(rfratto): the memory footprint of [Encoder] can very slowly grow
// over time as [bufpool] is filled with buffers of increasing capacity:
// each encoding pass has a different number of elements, shuffling which
// elements of the hierarchy get which pooled buffers.
//
// This means that elements which require more bytes will grow the capacity
// of the buffer and put the buffer back into the pool. Even if further
// encoding passes don't need that many bytes, the buffer is kept alive with
// its larger footprint. Given enough time, all buffers in the pool will have
// a large capacity.
//
// The bufpool package provides bucketed pools as a solution to this, but
// using them requires knowing how many bytes are needed.
//
// Encoder can eventually be moved to the bucketed pools by calculating a
// rolling maximum of the encoding size used per element across usages of an
// Encoder instance. This would then allow larger buffers to be eventually
// reclaimed regardless of how often encoding is done.
no longer relevant with this change (even if using memoryScratchStore, since there's one buffer per section) 🎉
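For context, here is a minimal sketch (not code from this PR) of the ratcheting pattern the TODO described, assuming a plain sync.Pool of *bytes.Buffer:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a plain pool with no capacity bounds, mirroring the
// problem described in the TODO above.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func encode(payload []byte) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()        // Reset empties the buffer but keeps its capacity.
	buf.Write(payload) // A large payload grows the buffer's capacity.
	bufPool.Put(buf)   // The grown buffer goes back into the pool.
}

func main() {
	encode(make([]byte, 1<<20)) // one large element inflates a buffer

	// Later, small encoding passes may still be handed the inflated
	// buffer; its capacity never shrinks.
	buf := bufPool.Get().(*bytes.Buffer)
	fmt.Println(buf.Cap()) // likely 1<<20 or more
}
```

Bucketed pools avoid this by returning each buffer to a bucket keyed by its capacity, so an inflated buffer is only handed out to callers that actually ask for that much space.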
Force-pushed from 600f868 to 732c859
sectionScratchStore is an interface representing how completed sections are buffered while the rest of the data object is still being created. sectionScratchStore has two implementations:

- memoryScratchStore, which accumulates completed sections in memory
- diskScratchStore, which accumulates completed sections on disk

diskScratchStore is intended to be given an ephemeral storage location. Any existing files in the scratch filepath are ignored. To handle situations where a disk runs out of space, diskScratchStore falls back to in-memory storage if a write fails.
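As a rough sketch, the interface could look something like the following (the method names and signatures here are assumptions, not taken from the PR):

```go
package dataobj // hypothetical placement

import "io"

// sectionHandle identifies a buffered section (hypothetical type).
type sectionHandle int

// sectionScratchStore buffers completed sections while the rest of the
// data object is still being built. Sketch only; the real interface in
// the PR may differ.
type sectionScratchStore interface {
	// Put stores a completed section's data and metadata regions and
	// returns a handle for reading them back later.
	Put(data, metadata []byte) (sectionHandle, error)

	// Read returns readers for a previously stored section.
	Read(h sectionHandle) (data, metadata io.ReadCloser, err error)

	// Reset discards all buffered sections.
	Reset() error
}
```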
A `snapshot` is the representation of a complete data object constructed by the encoder, without requiring the complete object to be loaded into memory at once. Snapshots refer to a `sectionScratchStore` to read section data and metadata on demand.
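Continuing the hypothetical sketch above, a snapshot might stream the object by pulling each section from the store only as it is written out (again, an illustration rather than the PR's actual code):

```go
// snapshot streams a complete data object without holding all of it in
// memory at once. Hypothetical sketch building on the interface above.
type snapshot struct {
	store    sectionScratchStore
	sections []sectionHandle
}

// WriteTo copies every section to w, reading each region from the
// scratch store on demand. The metadata-before-data ordering here is an
// assumption, not the PR's actual object layout.
func (s *snapshot) WriteTo(w io.Writer) (int64, error) {
	var written int64
	for _, h := range s.sections {
		data, metadata, err := s.store.Read(h)
		if err != nil {
			return written, err
		}
		for _, r := range []io.ReadCloser{metadata, data} {
			n, err := io.Copy(w, r)
			written += n
			r.Close()
			if err != nil {
				return written, err
			}
		}
	}
	return written, nil
}
```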
Change the encoder to return a snapshot instead of writing data to a provided buffer. This allows the object returned by Builder.Flush to internally read from the sectionScratchStore.
NewBuilder now accepts an argument for where to store scratch data. When this argument is specified, a diskScratchStore is used for accumulating completed sections while the overall object is still in progress. Since creating a diskScratchStore can fail when given an invalid path, NewBuilder now returns an error.
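Putting the pieces together, usage might look roughly like this (NewBuilder, Builder.Flush, and the closer-deletes-files behaviour come from the PR text; every other name and the exact signatures are assumptions):

```go
// flushAndUpload is a hypothetical caller illustrating the new flow.
func flushAndUpload(cfg dataobj.BuilderConfig, dst io.Writer) error {
	// NewBuilder now takes a scratch path and can return an error,
	// e.g. when the path is invalid.
	builder, err := dataobj.NewBuilder(cfg, "/var/loki/scratch")
	if err != nil {
		return err
	}

	// ... append data and accumulate completed sections ...

	// Flush returns a snapshot that reads section data on demand,
	// plus an io.Closer that deletes the scratch files when closed.
	obj, closer, err := builder.Flush()
	if err != nil {
		return err
	}
	defer closer.Close()

	// Stream to object storage without buffering the whole object.
	_, err = io.Copy(dst, obj)
	return err
}
```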
section_scratch_path can now be used to tell data object builders that completed sections pending an object flush should be stored on disk to reduce peak memory usage of the builder.
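For illustration, the option might be set roughly like this in the Loki configuration (the surrounding structure and example path are assumptions; only the option name comes from the PR):

```yaml
# Hypothetical snippet; only section_scratch_path is from the PR, and
# where it nests in the config depends on Loki's actual layout.
section_scratch_path: /var/loki/dataobj-scratch
```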
Force-pushed from 732c859 to 287d6f2
I ran this for about three hours in a dev environment and things look stable. It cuts memory usage of consumers by about 50%. While I had expected the memory improvement to be more significant, the change has the bonus effect of making memory usage far more stable over time. I'm going to add some metrics for monitoring the cache, but I think this is good for review and merging.
Add metrics for tracking:

* The current number of sections in the scratch store
* The total number of bytes stored in the scratch store
* The time taken to perform operations in the scratch store
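The shape of these three trackers might look like the following with client_golang (the metric and variable names here are hypothetical stand-ins, not the PR's actual names):

```go
package scratch // hypothetical package

import "github.com/prometheus/client_golang/prometheus"

var (
	// Current number of sections held by the scratch store.
	sections = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "dataobj_scratch_store_sections",
		Help: "Current number of sections in the scratch store.",
	})

	// Total bytes currently stored across all sections.
	bytesStored = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "dataobj_scratch_store_bytes",
		Help: "Total number of bytes stored in the scratch store.",
	})

	// Latency of scratch store operations (put, read, delete).
	opDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "dataobj_scratch_store_operation_duration_seconds",
		Help: "Time taken to perform scratch store operations.",
	})
)

func init() {
	prometheus.MustRegister(sections, bytesStored, opDuration)
}
```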
While a data object is being constructed, completed sections accumulate in memory until the overall object flushes. When building large objects, a flush can take a long time to happen, leading to sustained high memory usage in the builder.

Since completed sections are not read again until the final flush, accumulating them in memory is wasteful.
This PR adds an abstraction so that users can optionally provide a path on disk for storing these pending sections. When specified, the path on disk must exist. The path to use for pending sections is intended to be ephemeral storage. Existing files in that path are ignored between restarts of Loki.
Completed sections are placed on disk using their SHA-224 hash as the filename (with a `.data` or `.metadata` extension depending on the region being stored). If for some reason a section fails to write to disk (such as the disk being full), in-memory buffers are used as a fallback.

When flushing a builder, the returned object reads from these section files on demand, allowing callers to upload objects to object storage without needing to buffer the entire object in memory. The section files are deleted after closing the `io.Closer` returned from `dataobj.Builder.Flush`.
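To make the write-with-fallback behaviour concrete, here is a minimal sketch (all type and function names are hypothetical; only the SHA-224 naming scheme and the in-memory fallback come from the PR text):

```go
package scratch // hypothetical package

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

// region records where a section region ended up (hypothetical type).
type region interface{ isRegion() }

type diskRegion struct{ path string }
type memoryRegion struct{ data []byte }

func (diskRegion) isRegion()   {}
func (memoryRegion) isRegion() {}

// diskScratchStore writes section regions into an ephemeral directory.
type diskScratchStore struct{ dir string }

// putData stores one section's data region under a filename derived
// from the SHA-224 of its contents. If the write fails (for example,
// because the disk is full), it falls back to an in-memory copy.
func (s *diskScratchStore) putData(b []byte) region {
	name := fmt.Sprintf("%x.data", sha256.Sum224(b))
	path := filepath.Join(s.dir, name)
	if err := os.WriteFile(path, b, 0o600); err != nil {
		// Fallback: keep a private copy in memory instead of failing.
		return memoryRegion{data: append([]byte(nil), b...)}
	}
	return diskRegion{path: path}
}
```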