mirror of https://github.com/Zygo/bees.git (synced 2025-05-17 13:25:45 +02:00)
docs: update the bug reporting and status instructions
Thread names have changed. Document some of the newer ones.

Don't jump immediately to blaming poor performance on qgroups or autodefrag. These do sometimes have kernel regressions but not all the time.

Emphasize advantage of controlling bees deferred work requests at the source, before btrfs gets stuck committing them.

Avoid asserting that it's OK for gdb to crash.

Remove mention of lower-layer block device issues wrt corruption.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
parent e457f502b7
commit 7fcde97b70
@@ -4,16 +4,13 @@ What to do when something goes wrong with bees
 Hangs and excessive slowness
 ----------------------------
 
-### Are you using qgroups or autodefrag?
-
-Read about [bad btrfs feature interactions](btrfs-other.md).
-
 ### Use load-throttling options
 
 If bees is just more aggressive than you would like, consider using
 [load throttling options](options.md). These are usually more effective
 than `ionice`, `schedtool`, and the `blkio` cgroup (though you can
-certainly use those too).
+certainly use those too) because they limit work that bees queues up
+for later execution inside btrfs.
 
 ### Check `$BEESSTATUS`
 
@@ -52,10 +49,6 @@ dst = 15 /run/bees/ede84fbd-cb59-0c60-9ea7-376fa4984887/data.new/home/builder/li
 
 Thread names of note:
 
-* `crawl_12345`: scan/dedupe worker threads (the number is the subvol
-ID which the thread is currently working on). These threads appear
-and disappear from the status dynamically according to the requirements
-of the work queue and loadavg throttling.
 * `bees`: main thread (doesn't do anything after startup, but its task execution time is that of the whole bees process)
 * `crawl_master`: task that finds new extents in the filesystem and populates the work queue
 * `crawl_transid`: btrfs transid (generation number) tracker and polling thread
@@ -64,6 +57,13 @@ dst = 15 /run/bees/ede84fbd-cb59-0c60-9ea7-376fa4984887/data.new/home/builder/li
 * `hash_writeback`: trickle-writes the hash table back to `beeshash.dat`
 * `hash_prefetch`: prefetches the hash table at startup and updates `beesstats.txt` hourly
 
+Most other threads have names that are derived from the current dedupe
+task that they are executing:
+
+* `ref_205ad76b1000_24K_50`: extent scan performing dedupe of btrfs extent bytenr `205ad76b1000`, which is 24 KiB long and has 50 references
+* `extent_250_32M_16E`: extent scan searching for extents between 32 MiB + 1 and 16 EiB bytes long, tracking scan position in virtual subvol `250`.
+* `crawl_378_18916`: subvol scan searching for extent refs in subvol `378`, inode `18916`.
+
 ### Dump kernel stacks of hung processes
 
 Check the kernel stacks of all blocked kernel processes:
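The task-derived thread names documented in the diff above follow a regular underscore-separated layout, so they can be split mechanically when skimming a status file. The following is a minimal Python sketch, not part of bees: the function name `classify_thread` and the field names in the returned dictionary are hypothetical, and the patterns are inferred only from the three example names in the documentation (including the assumption that the bytenr is lowercase hex). Names that fit none of these shapes, such as `crawl_master` or `hash_writeback`, are reported as `unknown`.

```python
import re

# Patterns inferred from the three documented examples only (an assumption);
# bees may emit other shapes, which fall through to "unknown".
_SIZE = r"\d+[KMGTPE]?"
_PATTERNS = [
    # ref_<bytenr>_<extent length>_<reference count>
    ("ref", re.compile(
        rf"^ref_(?P<bytenr>[0-9a-f]+)_(?P<length>{_SIZE})_(?P<refs>\d+)$")),
    # extent_<virtual subvol>_<size lower bound>_<size upper bound>
    ("extent", re.compile(
        rf"^extent_(?P<subvol>\d+)_(?P<size_above>{_SIZE})_(?P<size_up_to>{_SIZE})$")),
    # crawl_<subvol>_<inode>
    ("crawl", re.compile(r"^crawl_(?P<subvol>\d+)_(?P<inode>\d+)$")),
]


def classify_thread(name):
    """Return a dict describing a bees thread name, or kind 'unknown'."""
    for kind, pattern in _PATTERNS:
        match = pattern.match(name)
        if match:
            return {"kind": kind, **match.groupdict()}
    return {"kind": "unknown", "name": name}


if __name__ == "__main__":
    for example in ("ref_205ad76b1000_24K_50",
                    "extent_250_32M_16E",
                    "crawl_378_18916",
                    "crawl_master"):
        print(example, "->", classify_thread(example))
```

The fields are kept as strings on purpose: the size suffixes (`K`, `M`, `E`) and the bytenr are only labels here, and converting them to numbers would require assumptions about units that the documentation states only by example.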
@@ -91,7 +91,7 @@ bees Crashes
 (gdb) thread apply all bt full
 
 The last line generates megabytes of output and will often crash gdb.
-This is OK, submit whatever output gdb can produce.
+Submit whatever output gdb can produce.
 
 **Note that this output may include filenames or data from your
 filesystem.**
@@ -160,8 +160,7 @@ Kernel crashes, corruption, and filesystem damage
 -------------------------------------------------
 
 bees doesn't do anything that _should_ cause corruption or data loss;
-however, [btrfs has kernel bugs](btrfs-kernel.md) and [interacts poorly
-with some Linux block device layers](btrfs-other.md), so corruption is
+however, [btrfs has kernel bugs](btrfs-kernel.md), so corruption is
 not impossible.
 
 Issues with the btrfs filesystem kernel code or other block device layers