Vitastor 3.0.14 released
2026-06-21
How many (bugs!) I’ve knifed, how many I’ve slit! (more than 50)
General note: most bug fixes now include regression tests to ensure they don’t recur. Most bugs fixed in this release were detected using LLM analysis (Claude Opus/Fable, GPT 5.5).
OSD
- Fix OSD hanging with an infinite loop when setting autosync_interval to 0 at runtime
- (IMPORTANT) Fix EC PGs hanging in REPEERING when the last final commit/rollback in a batch completes with an error. Interestingly, the issue is related to a problem classified in 3.0.12 as dangerous memory corruption, but reproducing it in a regression test showed that, apparently, no memory corruption could actually have occurred there.
- Limit pg_size to 64 because peering doesn’t handle larger values with EC: they just lead to ‘incomplete’ objects
- Fix OSD crash with an “assertion failed” error on EIO retry in snapshot chain read (i.e. when some chunks belong to a corrupted replica with checksum mismatch)
- (IMPORTANT) Disable chunked PG count resharding due to possible interference with compaction changes (will be re-enabled after fixes)
- (IMPORTANT) Fix incorrect snapshot allocation bitmap recovery during EC chained read
- Add on-wire request size validation to prevent possible OOM/DoS/heap corruption on receiving invalid data from the network
- (IMPORTANT) Fix parity-less EC writes destroying snapshot allocation bitmaps (i.e. when all parity OSDs in a PG are missing)
- (IMPORTANT) Fix EC N+K (K>=2) recovery destroying snapshot allocation bitmaps of live parity chunks
- (IMPORTANT) Fix a possible OSD crash during EC misplaced object scrubbing
- (IMPORTANT) Verify object bitmap consistency during scrub (only data was checked previously)
- (IMPORTANT) Fix corrupted object chunks incorrectly marked as non-corrupted on the second scrub
- (IMPORTANT) Fix cached EC decoding of multiple stripes with ISA-L (ISA-L is the default)
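
The on-wire request size validation item in the list above follows a common pattern: bound every length field read from the network before it is used for an allocation or a copy. Below is a minimal Python sketch of that pattern. The frame layout (a 4-byte little-endian payload length followed by the payload), the `MAX_PAYLOAD` limit and the `parse_frame` name are assumptions made for illustration, not Vitastor's actual protocol or code.

```python
import struct

# Hypothetical limit and header layout, chosen only for illustration.
MAX_PAYLOAD = 128 * 1024 * 1024
HEADER = struct.Struct("<I")  # 4-byte little-endian payload length


def parse_frame(buf: bytes) -> bytes:
    """Return the payload of one frame, rejecting malformed lengths."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    (length,) = HEADER.unpack_from(buf, 0)
    # Validate the untrusted length before using it for any allocation.
    if length > MAX_PAYLOAD:
        raise ValueError(f"payload length {length} exceeds limit")
    if len(buf) - HEADER.size < length:
        raise ValueError("truncated payload")
    return buf[HEADER.size:HEADER.size + length]
```

Rejecting the frame before touching the payload is what keeps a hostile or corrupted peer from forcing a huge allocation or an out-of-bounds copy.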
New store
- Fix a theoretically possible OSD crash on startup when using the previously added workaround for the “double-claim” problem
- Remove theoretically possible incorrect metadata block writes during batch EC COMMITs restarted due to a full metadata area
- Fix incorrect compaction counter tracking after OSD restart (could probably lead to compaction not being resumed correctly after a restart)
- (IMPORTANT) Fix some parallel big_writes possibly not waiting for data fsync, and thus not being durable
- Fix possible OSD crash on sync retry when io_uring is full
- Fix a possible crash during startup on corrupted on-disk data with too small entry sizes
Old store
- Prevent loading extra garbage metadata entries from the last 4 MB of metadata area
- Fix read operations possibly crashing if a metadata read (with inmemory_metadata=false) was restarted due to a full io_uring
- Fix a possible memory leak of temporary buffers and bitmaps/checksums when a read was restarted due to a full io_uring (reproducible with either inmemory_metadata=false or block_size>256k)
- Fix a possible OSD crash during padded checksum reads if buffer count exceeded 1024 (IOV_MAX) (reproducible only with csum_block_size > 4k and block_size >= 4M)
- (IMPORTANT) Fix partial padded read journal checksum verification with csum_block_size > 4k
- Fix incorrect marking of corrupted objects as non-corrupted after flushing data from journal (with inmemory_journal=false)
- (IMPORTANT) Fix deferred freeing of a different block when a block was used by a parallel read
- Fix per-inode statistics not being disabled for FS and S3 pools correctly, leading to etcd overload with unneeded per-inode statistics, slower etcd operation, increased memory usage, and too many Prometheus statistics exported by the monitor
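
The padded checksum read crash in the list above comes from a general constraint: a single vectored I/O call accepts at most IOV_MAX buffers (1024, as quoted above). Below is a minimal sketch of splitting a long buffer list into batches that respect the limit. It illustrates the constraint only and is not Vitastor's actual fix; `split_iov` is a hypothetical helper name.

```python
from typing import Iterator, List, Sequence

# Value quoted in the note above; a real system can be queried
# with os.sysconf("SC_IOV_MAX").
IOV_MAX = 1024


def split_iov(buffers: Sequence[bytearray],
              iov_max: int = IOV_MAX) -> Iterator[List[bytearray]]:
    """Yield consecutive batches of at most iov_max buffers each."""
    if iov_max <= 0:
        raise ValueError("iov_max must be positive")
    for start in range(0, len(buffers), iov_max):
        yield list(buffers[start:start + iov_max])
```

Each batch could then be submitted as a separate vectored read (for example with `os.preadv`), advancing the file offset by the total size of the preceding batches.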
Both stores
- Fix garbage possibly left in the metadata area if the first OSD startup was interrupted: the metadata header is now written only after initializing the metadata
- Check for short reads during initialization (just in case, doesn’t happen in real life)
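
A short read is a read call that returns fewer bytes than requested. Below is a minimal sketch of handling one. The notes do not say whether Vitastor retries or aborts on a short read; this sketch, as an assumption, retries and fails only when no more data is available. `read_exact` and the `read_at` callback are hypothetical names.

```python
from typing import Callable


def read_exact(read_at: Callable[[int, int], bytes],
               offset: int, size: int) -> bytes:
    """Read exactly size bytes starting at offset, retrying after short reads."""
    chunks = []
    done = 0
    while done < size:
        chunk = read_at(offset + done, size - done)
        if not chunk:
            # Zero bytes means end of data: fail instead of using partial data.
            raise EOFError(f"short read: got {done} of {size} bytes")
        chunks.append(chunk)
        done += len(chunk)
    return b"".join(chunks)
```

With a file descriptor `fd`, the callback could be `lambda off, n: os.pread(fd, n, off)`.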
Clients
- Fix write-back queue item splitting when write-back is enabled at runtime
- Implement bdrv_detach_aio_context & bdrv_attach_aio_context in the QEMU driver (should fix migration with iothread)
- Do not crash on full io_uring in ublk server
- Fix missing --readonly option handling in NBD server
- Stop gracefully on NBD_CMD_DISC instead of just calling exit(0) in NBD server
- Fix writeback detection in ublk server for --image mode
- Limit the amount of incoming data for NFS clients to prevent choking on memory in async mount mode
Tools (vitastor-disk/vitastor-cli)
- Prevent vitastor-cli merge possibly exiting before completing the last sync/delete operations
- Fix vitastor-disk incorrect validation of too large small_write entry lengths
- Fix vitastor-cli merge ignoring input option validation errors
- Fix vitastor-cli rm-data always skipping the final fsync
- Fix vitastor-disk resize not moving the last used data block
- Fix vitastor-disk write-meta incorrectly importing new store small_write entries
- Fix vitastor-disk write-journal and write-meta importing old store data incorrectly when csum_block_size is > 4k
- Support --io option for vitastor-disk dump-journal/write-journal
- Fix vitastor-disk resize crash when converting from very old (0.5.x) metadata
- Fix vitastor-disk trim incorrectly rounding block ranges with --discard_granularity option explicitly set to a value > 4k, possibly leading to discarding live data
- Fix vitastor-disk write-meta importing new store metadata incorrectly with > 4 GB metadata area size
- (IMPORTANT) Fix vitastor-cli modify --resize to a smaller size clearing all image data O_o
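
The trim fix in the list above concerns rounding free ranges to the discard granularity. The generally safe rule is to round the start of a free range up and its end down, so that a discard never extends into neighbouring live data. Below is a minimal sketch of that rule, not the actual vitastor-disk code; `align_discard_range` is a hypothetical name and ranges are half-open byte ranges.

```python
from typing import Optional, Tuple


def align_discard_range(start: int, end: int,
                        granularity: int) -> Optional[Tuple[int, int]]:
    """Shrink the free byte range [start, end) to whole discard granules.

    Returns None when no complete granule fits inside the range.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    aligned_start = -(-start // granularity) * granularity  # round up
    aligned_end = (end // granularity) * granularity        # round down
    if aligned_start >= aligned_end:
        return None
    return aligned_start, aligned_end
```

For example, with a 64 KiB granularity the free range [4096, 200000) shrinks to [65536, 196608), and a free range smaller than one granule is skipped entirely.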
Other
- Do not crash with an uncaught exception when an invalid /osd/state/ key with a non-numeric suffix is present in etcd (in OSD and all client services)
- Fix possible crash in vitastor-kv when handling a corrupted DB due to a uint32 overflow
- Fix NFS-RDMA memory allocator crashing in some situations
- Fix small shared file extend-write potentially reading unallocated memory (NFS)
- Add bounds checks to prevent uint32 overflows in NFS/XDR
- Re-enable accidentally disabled safety checks (asserts) in files with included cpp-btree
- Fix too small memory allocation in NFS portmap
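
The /osd/state/ item in the list above is an instance of a general rule: values read from an external store should be validated rather than parsed with a call that throws. Below is a minimal sketch of that rule, not Vitastor's actual code; `parse_osd_num` is a hypothetical name, and the key is assumed to end with the OSD number.

```python
from typing import Optional

OSD_STATE_PREFIX = "/osd/state/"  # key fragment quoted in the note above


def parse_osd_num(key: str) -> Optional[int]:
    """Return the OSD number from a state key, or None if the key is invalid."""
    pos = key.find(OSD_STATE_PREFIX)
    if pos < 0:
        return None
    suffix = key[pos + len(OSD_STATE_PREFIX):]
    # Accept plain ASCII digits only; anything else is skipped, not raised.
    if not (suffix.isascii() and suffix.isdigit()):
        return None
    return int(suffix)
```

A caller can then ignore (and perhaps log) keys for which the function returns None instead of terminating.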
Links
- Git: https://git.yourcmc.ru/vitalif/vitastor/releases/tag/v3.0.14
- Installation instructions: https://vitastor.io/en/docs/installation/packages.html