mirror of
https://github.com/Zygo/bees.git
synced 2025-05-17 21:35:45 +02:00
docs: add some notes about interactions with balance
Prompted by discussion at https://github.com/Zygo/bees/issues/105

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
parent f41fd73760
commit e1de933f93
@@ -119,6 +119,50 @@ If bees is terminated with SIGKILL, only step #1 and #4 are performed (the
kernel performs these automatically if bees exits). This reduces the
shutdown time at the cost of increased startup time.

Balances
--------

A btrfs balance relocates data on disk by making a new copy of the
data, replacing all references to the old data with references to the
new copy, and deleting the old copy. To bees, this is the same as any
other combination of new and deleted data (e.g. from defrag, or ordinary
file operations): some new data has appeared (to be scanned) and some
old data has disappeared (to be removed from the hash table when it is
detected).

As bees scans the newly balanced data, it will get hits on the hash
table pointing to the old data (it's identical data, so it would look
like a duplicate). These old hash table entries will not be valid any
more, so when bees tries to compare new data with old data, it will not
be able to find the old data at the old address, and bees will delete
the hash table entries. If no other duplicates are found, bees will
then insert new hash table entries pointing to the new data locations.
The erase is performed before the insert, so the new data simply replaces
the old and there is (little or) no impact on hash table entry lifetimes
(depending on how overcommitted the hash table is). Each block is
processed one at a time, which can be slow if there are many of them.

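The erase-then-insert behaviour described above can be illustrated with a
toy model. This is a minimal sketch, not bees code: `scan_block`, `balance`,
and the dict-based `hash_table` and `disk` are hypothetical names, and the
model assumes one address per hash, which is simpler than the real table.

```python
"""Toy model of stale hash table entries after a balance (not bees code)."""


def scan_block(hash_table, disk, new_addr):
    """Scan one block and report what happened to its hash table entry.

    hash_table maps a content hash to a single block address (assumption).
    disk maps a block address to the data currently stored there.
    """
    data = disk[new_addr]
    key = hash(data)                  # stand-in for the real block hash
    old_addr = hash_table.get(key)

    if old_addr is None or old_addr == new_addr:
        hash_table[key] = new_addr    # nothing stale to clean up
        return "inserted"
    if disk.get(old_addr) == data:
        return "duplicate"            # entry is still valid
    del hash_table[key]               # stale entry: erase first...
    hash_table[key] = new_addr        # ...then insert the new location
    return "replaced"


def balance(disk, offset):
    """Relocate every block to a new address, as a full balance would."""
    return {addr + offset: data for addr, data in disk.items()}


disk = {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}
hash_table = {}
for addr in disk:
    scan_block(hash_table, disk, addr)

disk = balance(disk, offset=100)      # old addresses 0..2 no longer exist
results = [scan_block(hash_table, disk, addr) for addr in sorted(disk)]
```

In this model every block takes the "replaced" path after the balance, and
the table ends up the same size as before, which matches the claim that the
new entries simply take the place of the old ones.
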
Routine btrfs maintenance balances rarely need to relocate more than 0.1%
of the total filesystem data, so the impact on bees is small even after
taking into account the extra work bees has to do.

If the filesystem must undergo a full balance (e.g. because disks were
added or removed, or to change RAID profiles), then every data block on
the filesystem will be relocated to a new address, which invalidates all
the data in the bees hash table at once. bees and the full balance will
both work correctly if they are both allowed to run at the same time,
but it is quite slow. In such cases it is a good idea to:

1. Stop bees before the full balance starts,
2. Wipe the `$BEESHOME` directory (or delete and recreate `beeshash.dat`),
3. Restart bees after the full balance is finished.

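Step 2 could be scripted along the following lines. This is a sketch under
stated assumptions, not a bees tool: `reset_hash_table` is a hypothetical
helper, and it assumes that a zero-filled file of the same size as the old
`beeshash.dat` is an acceptable empty hash table. It must only be run while
bees is stopped.

```python
import os
from pathlib import Path


def reset_hash_table(bees_home, name="beeshash.dat"):
    """Delete and recreate the hash table file; return its size in bytes.

    Assumption: the new file should keep the size of the old one.
    """
    path = Path(bees_home) / name
    size = path.stat().st_size        # remember the existing size
    path.unlink()                     # drop every stale entry at once
    # Recreate with owner-only permissions, then extend to the old size.
    # The extended region reads back as zeros.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
    return size
```

The helper takes the directory that `$BEESHOME` points to as its first
argument. How bees itself is stopped and restarted (steps 1 and 3) depends
on how it was launched, so that part is left out.
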
bees will perform a full filesystem scan automatically after the balance
since all the data has "new" btrfs transids. bees won't waste any time
invalidating stale hash table data after the balance if the hash table
is empty. This can considerably improve the performance of both bees
(since it has no stale hash table entries to invalidate) and btrfs balance
(since it's not competing with bees for iops).

Snapshots
---------