
Metrics

SlateDB uses a recorder-based metrics system: you provide an implementation of the MetricsRecorder trait, and SlateDB pushes metrics to it. If no recorder is configured, all metric operations are silently discarded with almost no overhead.

Pass your recorder to DbBuilder:

use slatedb::Db;
use slatedb::object_store::memory::InMemory;
use slatedb_common::metrics::MetricsRecorder;
use std::sync::Arc;

let recorder: Arc<dyn MetricsRecorder> = /* your implementation */;
let db = Db::builder("my_db", Arc::new(InMemory::new()))
    .with_metrics_recorder(recorder)
    .build()
    .await
    .unwrap();

MetricsRecorder has four registration methods, one per metric type:

pub trait MetricsRecorder: Send + Sync {
    fn register_counter(&self, name: &str, description: &str, labels: &[(&str, &str)])
        -> Arc<dyn CounterFn>;
    fn register_gauge(&self, name: &str, description: &str, labels: &[(&str, &str)])
        -> Arc<dyn GaugeFn>;
    fn register_up_down_counter(&self, name: &str, description: &str, labels: &[(&str, &str)])
        -> Arc<dyn UpDownCounterFn>;
    fn register_histogram(&self, name: &str, description: &str, labels: &[(&str, &str)], boundaries: &[f64])
        -> Arc<dyn HistogramFn>;
}

Each method returns a handle that SlateDB calls on the hot path:

| Trait | Method | Value type | Use case |
|---|---|---|---|
| CounterFn | increment(u64) | Monotonic | Request counts, bytes written |
| GaugeFn | set(i64) | Absolute | Memory usage, queue depth |
| UpDownCounterFn | increment(i64) | Additive (positive or negative) | In-flight compactions |
| HistogramFn | record(f64) | Distribution | Latency, I/O sizes |

GaugeFn and UpDownCounterFn are separate traits, following OpenTelemetry semantics: a gauge represents a point-in-time snapshot (set), while an up-down counter tracks additive changes (increment with a positive or negative value).
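The difference between the two can be sketched with a pair of atomic-backed handles. The traits below are local stand-ins that only mirror the method shapes from the table above, and AtomicGauge and AtomicUpDownCounter are hypothetical names, not SlateDB types:

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Local stand-ins mirroring the method shapes listed in the table above.
// The real traits come from SlateDB's metrics module.
pub trait GaugeFn: Send + Sync {
    fn set(&self, value: i64);
}

pub trait UpDownCounterFn: Send + Sync {
    fn increment(&self, value: i64);
}

/// Hypothetical gauge handle: each call overwrites the previous value.
#[derive(Default)]
pub struct AtomicGauge(AtomicI64);

impl AtomicGauge {
    pub fn get(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

impl GaugeFn for AtomicGauge {
    fn set(&self, value: i64) {
        self.0.store(value, Ordering::Relaxed);
    }
}

/// Hypothetical up-down counter handle: each call adds a signed delta.
#[derive(Default)]
pub struct AtomicUpDownCounter(AtomicI64);

impl AtomicUpDownCounter {
    pub fn get(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

impl UpDownCounterFn for AtomicUpDownCounter {
    fn increment(&self, value: i64) {
        self.0.fetch_add(value, Ordering::Relaxed);
    }
}
```

Calling set(10) and then set(3) on the gauge leaves it at 3, whereas increment(2) followed by increment(-1) on the up-down counter leaves it at 1.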

Labels are fixed at registration time as &[(&str, &str)] key-value pairs. SlateDB uses labels to collapse related metrics into a single name (e.g. slatedb.db.request_count with an op label instead of separate counters per operation).
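One way a recorder might model this is to key each time series by its name plus its full label set. The sketch below is a standalone illustration under that assumption. SeriesRegistry is a hypothetical type, and returning the same handle when a series is registered twice is a choice made here, not documented SlateDB behavior:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// A time series is identified by its metric name plus its full label set.
type SeriesKey = (String, Vec<(String, String)>);

/// Hypothetical registry, not a SlateDB type.
#[derive(Default)]
pub struct SeriesRegistry {
    series: Mutex<HashMap<SeriesKey, Arc<AtomicU64>>>,
}

impl SeriesRegistry {
    /// Returns the handle for `name` + `labels`, creating it on first use.
    pub fn register_counter(&self, name: &str, labels: &[(&str, &str)]) -> Arc<AtomicU64> {
        let mut owned: Vec<(String, String)> = labels
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        // Sort so that label order does not create a distinct series.
        owned.sort();
        let mut series = self.series.lock().unwrap();
        series.entry((name.to_string(), owned)).or_default().clone()
    }

    /// Sums every series that shares `name`, regardless of labels.
    pub fn total(&self, name: &str) -> u64 {
        let series = self.series.lock().unwrap();
        series
            .iter()
            .filter(|((n, _), _)| n.as_str() == name)
            .map(|(_, v)| v.load(Ordering::Relaxed))
            .sum()
    }
}
```

Registering slatedb.db.request_count once with op=get and once with op=scan yields two independent handles under one name, and a backend can still aggregate across the op label.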

All metric names use dot-separated notation: slatedb.<subsystem>.<metric>.

Database (slatedb.db.*)

| Name | Type | Labels | Description |
|---|---|---|---|
| slatedb.db.request_count | counter | op: get, scan, flush | Number of DB requests |
| slatedb.db.write_ops | counter | | Write operations |
| slatedb.db.write_batch_count | counter | | Write batches |
| slatedb.db.backpressure_count | counter | | Backpressure events |
| slatedb.db.immutable_memtable_flushes | counter | | Immutable memtable flushes |
| slatedb.db.wal_buffer_flushes | counter | | WAL buffer flushes |
| slatedb.db.wal_buffer_flush_requests | counter | | WAL buffer flush requests |
| slatedb.db.wal_buffer_estimated_bytes | gauge | | Estimated WAL buffer size |
| slatedb.db.total_mem_size_bytes | gauge | | Total memory usage |
| slatedb.db.l0_sst_count | gauge | | L0 SST count |
| slatedb.db.sst_filter_false_positive_count | counter | | Bloom filter false positives |
| slatedb.db.sst_filter_positive_count | counter | | Bloom filter positives |
| slatedb.db.sst_filter_negative_count | counter | | Bloom filter negatives |

Database cache (slatedb.db_cache.*)

| Name | Type | Labels | Description |
|---|---|---|---|
| slatedb.db_cache.access_count | counter | entry_kind: filter, index, data_block, stats; result: hit, miss | Cache accesses |
| slatedb.db_cache.error_count | counter | | Cache errors |

Compactor (slatedb.compactor.*)

| Name | Type | Labels | Description |
|---|---|---|---|
| slatedb.compactor.bytes_compacted | counter | | Total bytes compacted |
| slatedb.compactor.last_compaction_timestamp_sec | gauge | | Last compaction time (epoch seconds) |
| slatedb.compactor.running_compactions | up_down_counter | | Currently running compactions |
| slatedb.compactor.total_bytes_being_compacted | gauge | | Bytes in active compactions |
| slatedb.compactor.total_throughput_bytes_per_sec | gauge | | Compaction throughput |

Garbage collector (slatedb.gc.*)

| Name | Type | Labels | Description |
|---|---|---|---|
| slatedb.gc.deleted_count | counter | resource: manifest, wal, compacted, compactions | Deleted resources |
| slatedb.gc.count | counter | | GC runs |

Object store (slatedb.object_store.*)

All object store metrics carry four labels:

| Label | Values |
|---|---|
| component | db, reader, gc, compactor |
| store_type | main, wal |
| op | get, put, delete |
| api | get, get_range, get_ranges, head, put, multipart_init, multipart_part, multipart_complete, delete |

| Name | Type | Description |
|---|---|---|
| slatedb.object_store.request_count | counter | Total API calls (success and error) |
| slatedb.object_store.error_count | counter | Failed API calls |
| slatedb.object_store.request_duration_seconds | histogram | Per-request latency |

The instrumented store sits beneath the retrying layer, so each retry attempt is counted separately. Cache hits that never reach the remote store are not counted.
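The effect of this layering on the counters can be illustrated with a toy model. CountingStore and get_with_retry below are hypothetical stand-ins for the instrumented store and the retrying layer, and they share no code with SlateDB's actual implementation:

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the instrumented store: it counts every call
/// it sees and fails the first `failures_left` calls.
pub struct CountingStore {
    pub request_count: Cell<u64>,
    pub error_count: Cell<u64>,
    failures_left: Cell<u32>,
}

impl CountingStore {
    pub fn failing_first(n: u32) -> Self {
        Self {
            request_count: Cell::new(0),
            error_count: Cell::new(0),
            failures_left: Cell::new(n),
        }
    }

    pub fn get(&self) -> Result<&'static str, &'static str> {
        // Every attempt is counted, whether it succeeds or not.
        self.request_count.set(self.request_count.get() + 1);
        if self.failures_left.get() > 0 {
            self.failures_left.set(self.failures_left.get() - 1);
            self.error_count.set(self.error_count.get() + 1);
            return Err("transient failure");
        }
        Ok("value")
    }
}

/// Hypothetical retry layer sitting above the counting store.
pub fn get_with_retry(
    store: &CountingStore,
    max_attempts: u32,
) -> Result<&'static str, &'static str> {
    let mut last = Err("no attempts made");
    for _ in 0..max_attempts {
        last = store.get();
        if last.is_ok() {
            break;
        }
    }
    last
}
```

In this model, one logical read that succeeds on its third attempt adds 3 to the request count and 2 to the error count, so error_count divided by request_count reflects per-attempt failures rather than failures seen by the caller.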

Object store cache (slatedb.object_store_cache.*)

| Name | Type | Labels | Description |
|---|---|---|---|
| slatedb.object_store_cache.part_hit_count | counter | | Cache part hits |
| slatedb.object_store_cache.part_access_count | counter | | Cache part accesses |
| slatedb.object_store_cache.cache_keys | gauge | | Cached keys |
| slatedb.object_store_cache.cache_bytes | gauge | | Cached bytes |
| slatedb.object_store_cache.evicted_keys | counter | | Evicted keys |
| slatedb.object_store_cache.evicted_bytes | counter | | Evicted bytes |
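A common derived value is the part hit ratio. The helper below is a small sketch that assumes part_access_count counts every access (hits and misses together), which the table suggests but does not state outright. The function name is hypothetical:

```rust
/// Fraction of part accesses served from the cache.
/// Returns None before any access has been recorded, to avoid dividing by zero.
/// Assumes `part_access_count` includes both hits and misses.
pub fn part_hit_ratio(part_hit_count: u64, part_access_count: u64) -> Option<f64> {
    if part_access_count == 0 {
        None
    } else {
        Some(part_hit_count as f64 / part_access_count as f64)
    }
}
```

Because both inputs are monotonic counters, a dashboard would typically apply this to the difference between two snapshots rather than to the lifetime totals.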

Histogram bucket boundaries are passed to register_histogram as the boundaries parameter.
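How a recorder uses the boundaries slice is up to the backend. As a minimal sketch, assuming ascending boundaries that act as inclusive upper bounds with one extra overflow bucket, a histogram handle could bucket values as follows. BucketedHistogram is a hypothetical type, and this sketch does not reproduce SlateDB's default boundary values:

```rust
/// Returns the index of the bucket that `value` falls into.
/// `boundaries` must be sorted ascending; each one is an inclusive upper bound.
/// Index `boundaries.len()` is the overflow bucket.
pub fn bucket_index(boundaries: &[f64], value: f64) -> usize {
    // partition_point finds the first boundary that is >= value.
    boundaries.partition_point(|&upper| upper < value)
}

/// Hypothetical histogram handle built from a boundaries slice.
pub struct BucketedHistogram {
    boundaries: Vec<f64>,
    counts: Vec<u64>,
}

impl BucketedHistogram {
    pub fn new(boundaries: &[f64]) -> Self {
        Self {
            boundaries: boundaries.to_vec(),
            // One bucket per boundary, plus one for values above the last boundary.
            counts: vec![0; boundaries.len() + 1],
        }
    }

    pub fn record(&mut self, value: f64) {
        let idx = bucket_index(&self.boundaries, value);
        self.counts[idx] += 1;
    }

    pub fn counts(&self) -> &[u64] {
        &self.counts
    }
}
```

A real handle would need interior mutability (atomics or a mutex), since SlateDB calls record through a shared reference. The mutable receiver here only keeps the sketch short.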

slatedb-common ships a DefaultMetricsRecorder backed by atomics. It’s useful in tests and in some production scenarios that don’t have a dedicated metrics backend (e.g. periodic logging):

use slatedb::Db;
use slatedb_common::metrics::DefaultMetricsRecorder;
use std::sync::Arc;

// `object_store` is any Arc'd object store, e.g. Arc::new(InMemory::new()).
let recorder = Arc::new(DefaultMetricsRecorder::new());
let db = Db::builder("test_db", object_store)
    .with_metrics_recorder(recorder.clone())
    .build()
    .await
    .unwrap();

// ... perform operations ...

// Read back everything recorded so far.
let snapshot = recorder.snapshot();
for metric in snapshot.all() {
    println!("{}: {:?} {:?}", metric.name, metric.labels, metric.value);
}

The metrics crate is a lightweight facade (similar to log for logging). Install any compatible exporter (e.g. metrics-exporter-prometheus) and wire it up with a thin adapter:

use metrics::{counter, describe_counter, describe_gauge, describe_histogram, gauge, histogram};
// CounterFn is assumed to be exported from the same module as MetricsRecorder.
use slatedb_common::metrics::{CounterFn, MetricsRecorder};
use std::sync::Arc;

// Bridges a metrics-crate Counter to SlateDB's CounterFn.
struct MetricsRsCounter(metrics::Counter);

impl CounterFn for MetricsRsCounter {
    fn increment(&self, value: u64) {
        self.0.increment(value);
    }
}

// Gauge, UpDownCounter, and Histogram wrappers follow the same pattern.

pub struct MetricsRsRecorder;

impl MetricsRecorder for MetricsRsRecorder {
    fn register_counter(
        &self, name: &str, desc: &str, labels: &[(&str, &str)]
    ) -> Arc<dyn CounterFn> {
        // The facade needs owned label strings.
        let labels: Vec<(String, String)> = labels.iter()
            .map(|(k, v)| (k.to_string(), v.to_string())).collect();
        describe_counter!(name.to_string(), desc.to_string());
        Arc::new(MetricsRsCounter(counter!(name.to_string(), &labels)))
    }
    // ... other methods follow the same pattern
}

MetricsRsRecorder is stateless since the metrics facade manages all state globally.

A Prometheus recorder maps each register_* call to a labeled time series within a Family:

use prometheus_client::metrics::counter::Counter as PromCounter;
use prometheus_client::metrics::family::Family;
use prometheus_client::registry::Registry;
// CounterFn is assumed to be exported from the same module as MetricsRecorder.
use slatedb_common::metrics::{CounterFn, MetricsRecorder};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type Labels = Vec<(String, String)>;

// Thin wrapper that bridges prometheus-client's Counter to SlateDB's CounterFn.
struct PromCounterHandle(PromCounter);

impl CounterFn for PromCounterHandle {
    fn increment(&self, value: u64) {
        self.0.inc_by(value);
    }
}

struct PrometheusRecorder {
    registry: Mutex<Registry>,
    counters: Mutex<HashMap<String, Family<Labels, PromCounter>>>,
    // ... gauges, histograms
}

impl MetricsRecorder for PrometheusRecorder {
    fn register_counter(
        &self, name: &str, desc: &str, labels: &[(&str, &str)]
    ) -> Arc<dyn CounterFn> {
        let mut families = self.counters.lock().unwrap();
        // Register one Family per metric name, the first time the name is seen.
        let family = families.entry(name.to_string()).or_insert_with(|| {
            let f = Family::<Labels, PromCounter>::default();
            self.registry.lock().unwrap().register(name, desc, f.clone());
            f
        });
        // Each distinct label set becomes one time series within the Family.
        let labels: Labels = labels.iter()
            .map(|(k, v)| (k.to_string(), v.to_string())).collect();
        let counter = family.get_or_create(&labels).clone();
        Arc::new(PromCounterHandle(counter))
    }
    // ... other methods follow the same pattern
}

An OpenTelemetry recorder maps each register_* call directly to an OpenTelemetry SDK instrument:

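use opentelemetry::metrics::{Counter, MeterProvider};
use opentelemetry::KeyValue;
use opentelemetry_sdk::metrics::SdkMeterProvider;
// CounterFn is assumed to be exported from the same module as MetricsRecorder.
use slatedb_common::metrics::{CounterFn, MetricsRecorder};
use std::sync::Arc;

// Wrapper constructed by register_counter below. Its definition is not shown
// in the original; this shape is inferred from how it is used.
struct OtelCounter {
    counter: Counter<u64>,
    attrs: Vec<KeyValue>,
}

impl CounterFn for OtelCounter {
    fn increment(&self, value: u64) {
        // Attach the labels fixed at registration time to every data point.
        self.counter.add(value, &self.attrs);
    }
}

struct OtelRecorder {
    meter: opentelemetry::metrics::Meter,
}

impl OtelRecorder {
    fn new(provider: &SdkMeterProvider) -> Self {
        Self { meter: provider.meter("slatedb") }
    }
}

impl MetricsRecorder for OtelRecorder {
    fn register_counter(
        &self, name: &str, desc: &str, labels: &[(&str, &str)]
    ) -> Arc<dyn CounterFn> {
        let attrs: Vec<KeyValue> = labels.iter()
            .map(|(k, v)| KeyValue::new(k.to_string(), v.to_string()))
            .collect();
        let counter = self.meter.u64_counter(name.to_string())
            .with_description(desc.to_string()).build();
        Arc::new(OtelCounter { counter, attrs })
    }
    // ... other methods follow the same pattern
}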