Bcachefs Publishes Patches For Disk Accounting Rewrite

Written by Michael Larabel in Linux Storage on 25 February 2024 at 06:26 AM EST.
Kent Overstreet on Saturday evening posted a set of 21 patches to overhaul the disk accounting code for the Bcachefs file-system. The change breaks compatibility with the existing on-disk disk accounting format, so moving to the new version will require an on-disk format upgrade. The rewrite may land for Linux v6.9.

Overstreet has been working toward this Bcachefs disk accounting rewrite for a while and this weekend published the initial patch series. He explained the rewrite as follows:
The old disk accounting scheme was fast, but had some limitations:

- lack of scalability: it was based on percpu counters additionally sharded by outstanding journal buffer, and then just prior to journal write we'd roll up the counters and add them to the journal entry. But this meant that all counters were added to every journal write, which meant it'd never be able to support per-snapshot counters.

- it was a pain to extend: this was why, until now, we didn't have proper compressed accounting, and getting compression ratio required a full btree scan

In the new scheme:

- every set of counters is a bkey, a key in a btree (BTREE_ID_accounting); this means they aren't pinned in the journal

- the key has structure, and is extensible: disk_accounting_key is a tagged union, and it's just union'd over bpos

- counters are deltas, until flushed to the underlying btree; this means counter updates are normal btree updates, and the btree write buffer makes counter updates efficient.
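As a rough illustration of the delta idea in the list above (not Overstreet's code, and in Python rather than kernel C), the sketch below models accounting keys as tagged tuples and counter updates as deltas that sit in a write buffer until they are flushed into a key/value store standing in for the btree. The class name, the tag constants, and the key fields are all hypothetical; the article does not list the real disk_accounting_key types.

```python
from collections import defaultdict

# Hypothetical type tags standing in for the members of a tagged-union key.
TAG_REPLICAS = 1
TAG_COMPRESSION = 2


class AccountingSketch:
    """Toy model: counter updates are deltas, buffered before reaching the 'btree'."""

    def __init__(self):
        self.btree = {}                        # key -> flushed counter total
        self.write_buffer = defaultdict(int)   # key -> pending (unflushed) delta

    def update(self, key, delta):
        # An update only records a delta; nothing in the btree is read or modified.
        self.write_buffer[key] += delta

    def flush(self):
        # Apply every pending delta to the stored total, then empty the buffer.
        for key, delta in self.write_buffer.items():
            self.btree[key] = self.btree.get(key, 0) + delta
        self.write_buffer.clear()

    def read(self, key):
        # An up-to-date read from the btree needs a flush first,
        # which is the cost the quoted text describes as expensive.
        self.flush()
        return self.btree.get(key, 0)


acct = AccountingSketch()
key = (TAG_COMPRESSION, "lz4")   # (tag, field): a tuple playing the role of the key
acct.update(key, 8)
acct.update(key, -3)
total = acct.read(key)           # both deltas are applied during the flush
```

Because each update is only a delta, two updates to the same key never need to read the stored value first; they are merged when the buffer is flushed.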

Since reading counters from the btree would be expensive - it'd require a write buffer flush to get up-to-date counters - we also maintain a parallel set of accounting in memory, a bit like the old scheme but without the per-journal-buffer sharding. The in memory accounters [are] indexed in an eytzinger tree by disk_accounting_key/bpos, with the counters themselves being percpu u64s.
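For readers unfamiliar with the term, an Eytzinger layout stores a sorted set in an array in breadth-first (binary heap) order, so a lookup walks from index i to 2i or 2i+1 without following pointers. The sketch below is a generic Python illustration of that layout with per-CPU counter arrays alongside it. It is not the Bcachefs implementation: the 1-based indexing, the function and class names, and the tuple keys are assumptions made for the example.

```python
def eytzinger_build(sorted_keys):
    """Place sorted keys into a 1-indexed array in breadth-first (Eytzinger) order."""
    n = len(sorted_keys)
    tree = [None] * (n + 1)      # slot 0 is unused
    it = iter(sorted_keys)

    def fill(i):
        # An in-order walk of the implicit tree hands out keys in sorted order.
        if i <= n:
            fill(2 * i)
            tree[i] = next(it)
            fill(2 * i + 1)

    fill(1)
    return tree


def eytzinger_find(tree, key):
    """Return the slot holding key, or 0 if the key is absent."""
    i, n = 1, len(tree) - 1
    while i <= n:
        if tree[i] == key:
            return i
        # Go right (2i + 1) if the node is smaller than the key, else left (2i).
        i = 2 * i + (1 if tree[i] < key else 0)
    return 0


class InMemoryAccounting:
    """Toy model: one counter array per CPU, indexed by Eytzinger slot."""

    def __init__(self, keys, nr_cpus):
        self.tree = eytzinger_build(sorted(keys))
        self.percpu = [[0] * len(self.tree) for _ in range(nr_cpus)]

    def add(self, cpu, key, delta):
        slot = eytzinger_find(self.tree, key)
        if slot == 0:
            raise KeyError(key)
        self.percpu[cpu][slot] += delta   # each CPU touches only its own array

    def read(self, key):
        slot = eytzinger_find(self.tree, key)
        if slot == 0:
            raise KeyError(key)
        return sum(cpu_counters[slot] for cpu_counters in self.percpu)


mem = InMemoryAccounting([(1, 0), (1, 1), (2, 0), (2, 5), (3, 7)], nr_cpus=2)
mem.add(0, (2, 0), 10)
mem.add(1, (2, 0), 5)
combined = mem.read((2, 0))   # sums the per-CPU values for that key
```

Reading a counter here sums the per-CPU values and never touches the write buffer or the btree, which is the point of keeping a parallel in-memory copy.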

This breaks the on-disk format for the file-system, so accounting must be regenerated when upgrading (or downgrading) past this new version. This should happen transparently via kernel fsck, but there are some known limitations for Linux 6.7 users at the moment. The hope is to have this disk accounting rewrite ready for Bcachefs in Linux v6.9.

More details on this new code are available via the patch series.
About The Author
Michael Larabel

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.