Description
Jira Link: DB-18376
Currently, when appending a new WAL entry batch, we check whether the batch would cause the file size to exceed the preallocation size. If so, we trigger an async rollover to a new segment. However, the current batch, and any subsequent batches appended before the rollover completes, still go into the current WAL segment, so the file often grows beyond its preallocated size.
Code from Log::DoAppend:
Status Log::DoAppend(LogEntryBatch* entry_batch, SkipWalWrite skip_wal_write) {
  ...
  // If the size of this entry overflows the current segment, get a new one.
  if (allocation_state() == SegmentAllocationState::kAllocationNotStarted) {
    if (active_segment_->Size() + entry_batch_bytes + kEntryHeaderSize > cur_max_segment_size_) {
      LOG_WITH_PREFIX(INFO) << "Max segment size " << cur_max_segment_size_ << " reached. "
                            << "Starting new segment allocation.";
      RETURN_NOT_OK(AsyncAllocateSegment());
      if (!options_.async_preallocate_segments) {
        RETURN_NOT_OK(RollOver());
      }
    }
  } else if (allocation_state() == SegmentAllocationState::kAllocationFinished) {
    RETURN_NOT_OK(RollOver());
  } else {
    VLOG_WITH_PREFIX(1) << "Segment allocation already in progress...";
  }
  ... (append the batch)
}
Issue:
The intention of preallocation is to reduce filesystem overhead, but in practice most WAL segments end up overflowing anyway. This may reduce the benefits of preallocation and potentially add extra filesystem overhead.
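To make the overflow concrete, here is a minimal sketch that models the check above in isolation. It is not YugabyteDB code: AllocState, SimResult, SimulateAppends and the alloc_latency_appends parameter are all hypothetical names introduced for illustration, the entry header size is ignored, and the async path is assumed (allocation finishes some number of appends after it starts).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for SegmentAllocationState.
enum class AllocState { kNotStarted, kInProgress, kFinished };

struct SimResult {
  // Final size of every segment that was closed by a rollover.
  std::vector<uint64_t> closed_segment_sizes;
};

// Appends `num_batches` batches of `batch_bytes` each. Once allocation starts,
// it is assumed to finish after `alloc_latency_appends` further appends.
SimResult SimulateAppends(uint64_t max_segment_size, uint64_t batch_bytes,
                          int num_batches, int alloc_latency_appends) {
  assert(max_segment_size > 0 && batch_bytes > 0 && alloc_latency_appends >= 0);
  SimResult result;
  uint64_t active_size = 0;
  AllocState state = AllocState::kNotStarted;
  int appends_until_ready = 0;

  for (int i = 0; i < num_batches; ++i) {
    if (state == AllocState::kNotStarted) {
      // Same shape as the check in DoAppend: start allocation on overflow.
      if (active_size + batch_bytes > max_segment_size) {
        state = AllocState::kInProgress;
        appends_until_ready = alloc_latency_appends;
      }
    } else if (state == AllocState::kFinished) {
      // Roll over: close the active segment and start a fresh one.
      result.closed_segment_sizes.push_back(active_size);
      active_size = 0;
      state = AllocState::kNotStarted;
    }

    // The batch always lands in whichever segment is active at this point,
    // including the batch that triggered the allocation.
    active_size += batch_bytes;

    // Model the background allocation making progress.
    if (state == AllocState::kInProgress) {
      if (appends_until_ready == 0) {
        state = AllocState::kFinished;
      } else {
        --appends_until_ready;
      }
    }
  }
  return result;
}
```

In this model, SimulateAppends(100, 30, 10, 0) closes two segments of 120 bytes each, and raising the allocation latency to 2 appends closes one segment of 180 bytes. Every closed segment exceeds the 100-byte limit by at least the triggering batch, which matches the behavior described above.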
Open Questions:
- How much performance impact does this overflow typically have in practice?
- If the impact is significant, would it make sense to trigger rollover earlier (e.g., once the remaining space drops below a fixed reserve, in bytes, based on the average or maximum typical batch size), so that we reduce the chance of repeated overflows?
Next Steps:
- Measure the actual performance impact of these overflows.
- If the overhead is significant, consider a proactive rollover threshold (e.g., at ~80% of preallocation size).
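One possible shape for such a proactive threshold is sketched below. It is only an illustration under stated assumptions: RolloverPolicy, RolloverThreshold and ShouldStartAllocation are hypothetical names, neither knob is an existing YugabyteDB flag, and the 80% default simply mirrors the example figure above. The sketch combines the percentage idea with the fixed-reserve idea from the open questions and uses whichever threshold is lower.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical policy knobs, not existing flags.
struct RolloverPolicy {
  uint64_t fill_percent = 80;  // start allocation at this % of the preallocated size
  uint64_t reserve_bytes = 0;  // or once fewer than this many bytes would remain
};

// Segment size above which allocation of the next segment should start.
uint64_t RolloverThreshold(uint64_t max_segment_size, const RolloverPolicy& policy) {
  assert(policy.fill_percent > 0 && policy.fill_percent <= 100);
  // Divide first so the multiplication cannot overflow for large segment sizes.
  const uint64_t by_percent = max_segment_size / 100 * policy.fill_percent +
                              max_segment_size % 100 * policy.fill_percent / 100;
  const uint64_t by_reserve = max_segment_size > policy.reserve_bytes
                                  ? max_segment_size - policy.reserve_bytes
                                  : 0;
  return std::min(by_percent, by_reserve);
}

// Would replace the "size + batch > max" comparison in the overflow check.
bool ShouldStartAllocation(uint64_t active_size, uint64_t batch_bytes,
                           uint64_t max_segment_size, const RolloverPolicy& policy) {
  return active_size + batch_bytes > RolloverThreshold(max_segment_size, policy);
}
```

With the defaults and a 1000-byte segment, allocation would start once an append pushes the segment past 800 bytes, leaving 200 bytes of headroom for batches that arrive while the new segment is being allocated. Whether that headroom is enough depends on batch sizes and allocation latency, which is what the measurement step would need to establish.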
Issue Type
kind/enhancement