"Storm of softlockups" starts with a simple BUG_ON, but after the
BUG_ON, all cores that are waiting on spinlocks get stuck.
The _first_ kernel call trace is required to identify the bug.
At least two such bugs have been identified.
Add some notes about the conflict between LOGICAL_INO and balance,
and the recently added bees workaround.
Update the gotchas page for balances to point to the kernel bugs page.
Remove "bees and the full balance will both work correctly" as that
statement is not true.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Update the version ranges on the dependencies.
FIXME/TODO: start dropping early versions that don't work with current
code?
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
flushoncommit or noflushoncommit isn't really a bees matter--it's
a sysadmin's tradeoff between reliability and performance. bees does
not affect that tradeoff because all dedupe src extents are flushed, so
bees introduces no *new* data loss risks in the noflushoncommit
case--i.e. any data that you could lose while running bees, you'd also
lose when not running bees.
Note that the converse is not true: bees might trigger flushing on
data that would not normally have been flushed with noflushoncommit,
and improve data integrity after a crash as a side-effect of dedupe
operations. The risks of noflushoncommit might be reduced by running
bees. I don't have evidence based on experimental data to support that
conclusion, so I'll just leave this possibility as a rumor in a commit
log message.
lvmcache can be moved from the "bad" list to the "good" list now.
bcache remains in the "bad" list due to some non-data-losing failures
that only seem to happen with bcache.
Add a note about CPUs with strange endianness or page sizes, as nobody
seems to have tried those.
Remove "at great cost" from the btrfs send workaround. The cost is
the cost; there is no need to editorialize.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The existence of information about known data corruption bugs should be
visible from the top-level page.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
* comprehensive list of kernels with bees-triggered corruption bug fixes
* deadlock between dedupe and rename is now fixed (in some places)
* compressed data corruption is now fixed (in more places)
* btrfs send fix for one bug is now merged in 5.2-rc1, another bug remains
* retired the bcache/lvmcache bug (can't reproduce those bugs any more,
although I *can* reproduce an interesting non-destructive bcache bug)
* new minor bug entries for two harmless kernel warnings
* new entry for storm-of-soft-lockups
Fixes: https://github.com/Zygo/bees/issues/107
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This may help users understand some of the things that happen inside
bees...or it may just be horribly long and confusing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
libcrucible at one time in the distant past had to be a shared library
to force global C++ object initialization; however, this is no longer
required.
Make libcrucible static to solve various rpath and soname versioning
issues, especially when distros try (unwisely) to package the library
separately.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If /bin/sh is bash, the 'type' builtin produces a list of filenames
in $PATH that match its arguments.
If /bin/sh is dash, we get errors like:
/bin/sh: 1: P:: not found
Hopefully having a build-dep on bash is not controversial.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The two files are identical except README.md links to docs/* while
index.md links to *.
A sed script can do that transformation, so use sed to do it.
This does modify a file in git, but this is necessary to make all
the Github views work consistently.
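
As a rough illustration of the transformation described above (written
in Python rather than sed, and not the actual script used by the build),
the rewrite amounts to dropping the "docs/" prefix from relative link
targets. The function name and the exact pattern are assumptions:

```python
import re

def readme_to_index(readme_text):
    """Hypothetical helper: turn README.md content into index.md content.

    Assumes the only difference is that relative markdown link targets
    start with "docs/" in README.md and have no prefix in index.md.
    """
    # "](docs/foo.md)" becomes "](foo.md)"; absolute URLs are untouched
    # because their target does not begin with "docs/".
    return re.sub(r"\]\(docs/", "](", readme_text)
```

For example, readme_to_index("[Install](docs/install.md)") yields
"[Install](install.md)".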
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
https://github.com/Zygo/bees/issues/91 describes problems encountered
when running bees on systems with many CPU cores.
Limit the computed number of threads (using --thread-factor or the
default) to a maximum of 8 (i.e. the number of logical cores in a modern
laptop). Users can override the limit by using --thread-count.
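
A minimal sketch of the limit, in Python for illustration only. The
function and parameter names are hypothetical, and it assumes the
computed count is the number of CPU cores multiplied by the thread
factor, rounded down; the real rounding and defaults may differ:

```python
import os

MAX_COMPUTED_THREADS = 8  # the cap described in this commit message

def effective_thread_count(thread_count=None, thread_factor=1.0, cpu_count=None):
    """Hypothetical helper: choose the number of worker threads."""
    if thread_count is not None and thread_count > 0:
        # An explicit --thread-count bypasses the cap entirely.
        return thread_count
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Assumed formula: cores * factor, rounded down, at least one thread.
    computed = max(1, int(cpu_count * thread_factor))
    # Only the computed value is limited.
    return min(computed, MAX_COMPUTED_THREADS)
```

Under these assumptions a 64-core machine with default settings gets 8
threads, while an explicit thread count of 32 is used unchanged.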
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
options.md was a disorganized mess that markdown couldn't parse properly.
Break the options list down into sections by theme. Add the new
'--workaround-btrfs-send' option to the new 'Workarounds' section.
Clean up the rest of the text and fix some inconsistencies.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The 16MB hash table extent size did not serve any useful defragmentation
or compression purpose, and for very small filesystems (under 100GB),
16MB is much larger than necessary.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
systemd-coredump collects core files, and coredumpctl retrieves them
for later analysis with gdb. This is convenient if the keys you use to
encrypt /var/lib/systemd/coredump are the same as the keys you use to
encrypt the filesystem where you're running bees.
Add it to the documentation just before the hand-rolled version.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Standard crash backtrace collection, plus $BEESSTATUS for the high-level
overview of what bees is doing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Split the rather large README into smaller sections with a pitch and
a ToC at the top.
Move the sections into docs/ so that Github Pages can read them.
'make doc' produces a local HTML tree.
Update the kernel bugs and gotchas list.
Add some information that has been accumulating in Github comments.
Remove information about bugs in kernels earlier than 4.14.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>