Garbage Collection
SlateDB’s garbage collector runs as a background task in the client process, periodically checking for obsolete files in the database storage.
The garbage collector has a configurable minimum age and interval for each file type (WAL SSTs, compacted SSTs, and manifests). Garbage collection for a file type can be disabled by setting its options to None. For each enabled file type, the collector runs once per interval and deletes files that are older than min_age and not referenced by any active manifest or checkpoint.
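The deletion rule described above can be sketched in a few lines of Rust. This is only an illustrative model, not SlateDB's actual API: the names `FileTypeGcOptions`, `StoredFile`, and `eligible_for_deletion` are hypothetical, and file ages are passed in directly instead of being read from object-store metadata.

```rust
use std::collections::HashSet;
use std::time::Duration;

/// Hypothetical per-file-type settings (not SlateDB's real option types).
#[derive(Debug, Clone, Copy)]
pub struct FileTypeGcOptions {
    /// How often the collector runs for this file type.
    pub interval: Duration,
    /// Files that are not older than this are never deleted.
    pub min_age: Duration,
}

/// Hypothetical description of one file in storage.
#[derive(Debug, Clone)]
pub struct StoredFile {
    pub id: u64,
    pub age: Duration,
}

/// Returns the IDs a single GC pass would delete for one file type.
/// `referenced` holds the IDs still used by an active manifest or checkpoint.
pub fn eligible_for_deletion(
    files: &[StoredFile],
    referenced: &HashSet<u64>,
    options: Option<FileTypeGcOptions>,
) -> Vec<u64> {
    // `None` means garbage collection is disabled for this file type.
    let opts = match options {
        Some(opts) => opts,
        None => return Vec::new(),
    };
    files
        .iter()
        // A file must be both older than min_age and unreferenced.
        .filter(|f| f.age > opts.min_age && !referenced.contains(&f.id))
        .map(|f| f.id)
        .collect()
}
```

In this model a file that is old but still referenced survives, as does a file that is unreferenced but younger than min_age. Only files that meet both conditions are returned.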
Below is a diagram illustrating the high-level flow of garbage collection in SlateDB:
flowchart TD
    A["Start GC Cycle (interval timer)"] --> B[Remove Expired Checkpoints from Manifests]
    B --> C[Run WAL SST GC Task]
    C --> C1[List WAL SSTs older than last compacted ID]
    C1 --> C2[Filter by min_age and active references]
    C2 --> C3[Delete eligible WAL SSTs]
    B --> D[Run Compacted SST GC Task]
    D --> D1[List all compacted and L0 SSTs]
    D1 --> D2[Gather active SST IDs from manifests]
    D2 --> D3[Delete SSTs not referenced and older than min_age]
    B --> E[Run Manifest GC Task]
    E --> E1["List all manifests (exclude latest)"]
    E1 --> E2[Gather active manifest IDs from checkpoints]
    E2 --> E3[Delete manifests not referenced and older than min_age]
    C3 & D3 & E3 --> G[Wait for Next Interval or Shutdown]
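To make the ordering in the diagram concrete, here is a rough in-memory sketch of one GC cycle in Rust. It rests on several assumptions and is not SlateDB's implementation. `Storage`, `Manifest`, `Checkpoint`, and `run_gc_cycle` are hypothetical names. A single min_age is shared by all three tasks, although the real settings are per file type. The three tasks run one after another. "Older than last compacted ID" is read as a strictly smaller WAL ID, taken across every active manifest.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::time::Duration;

/// Hypothetical manifest: the SSTs it references and its WAL compaction point.
#[derive(Debug, Clone)]
pub struct Manifest {
    pub age: Duration,
    pub sst_ids: BTreeSet<u64>,
    pub last_compacted_wal_id: u64,
}

/// Hypothetical checkpoint pinning one manifest.
#[derive(Debug, Clone)]
pub struct Checkpoint {
    pub manifest_id: u64,
    pub expired: bool,
}

/// Hypothetical stand-in for what is listed from object storage.
#[derive(Debug, Default)]
pub struct Storage {
    pub manifests: BTreeMap<u64, Manifest>,
    pub checkpoints: Vec<Checkpoint>,
    pub wal_ssts: BTreeMap<u64, Duration>,       // id -> age
    pub compacted_ssts: BTreeMap<u64, Duration>, // id -> age
}

pub fn run_gc_cycle(s: &mut Storage, min_age: Duration) {
    // Step B: remove expired checkpoints before computing what is still active.
    s.checkpoints.retain(|c| !c.expired);

    // Active manifests: the latest one plus any pinned by a remaining checkpoint.
    let latest = match s.manifests.keys().next_back() {
        Some(id) => *id,
        None => return, // nothing to collect against
    };
    let mut active: BTreeSet<u64> = s.checkpoints.iter().map(|c| c.manifest_id).collect();
    active.insert(latest);

    // WAL SST task: delete WAL SSTs below every active manifest's compaction
    // point, provided they are older than min_age.
    let wal_floor = active
        .iter()
        .filter_map(|id| s.manifests.get(id))
        .map(|m| m.last_compacted_wal_id)
        .min()
        .unwrap_or(0);
    s.wal_ssts.retain(|id, age| !(*id < wal_floor && *age > min_age));

    // Compacted SST task: keep SSTs referenced by an active manifest or too young.
    let live: BTreeSet<u64> = active
        .iter()
        .filter_map(|id| s.manifests.get(id))
        .flat_map(|m| m.sst_ids.iter().copied())
        .collect();
    s.compacted_ssts.retain(|id, age| live.contains(id) || *age <= min_age);

    // Manifest task: keep active manifests (including the latest) or young ones.
    s.manifests.retain(|id, m| active.contains(id) || m.age <= min_age);
}
```

Removing expired checkpoints first matters in this model: a manifest pinned only by an expired checkpoint drops out of the active set, so the manifest and the SSTs that only it references become collectable in the same cycle.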