mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

54 Commits

Author SHA1 Message Date
Zygo Blaxell
9cdeb608f5 bees: drop the balance/logical workaround that has been disabled for two years
Kernels that needed the balance workaround frankly are too buggy
to run bees at all.  The workaround also makes the locking stories
around logical_ino calls and process exit complicated, so get rid of
it completely.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-12-20 20:50:58 -05:00
Zygo Blaxell
a32cd5247f docs: update kernel bugs list for 5.18 ptvf fix
Also correct my own style for the fixed version column.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-08-17 13:04:06 -04:00
Zygo Blaxell
9c68f15474 README: update copyright year 2022
It has been some years since the copyright statement was updated.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-07-29 22:20:02 -04:00
Zygo Blaxell
5f3cb9b374 docs: update kernel bugs list for 2022-07-29
* RAID1 device count problems fixed
* log tree replay parent transid verify failure in 5.18 and 5.19 added, patches available but not upstream yet
* flushoncommit issues fixed, discussion section removed
* LOGICAL_INO vs dedupe hang added

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2022-07-29 22:07:26 -04:00
Zygo Blaxell
007067b83f docs: add missing 'adjust_offset_hit' counter
Reported by York-Simon Johannsen via github issue 208.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:10:02 -05:00
suorcd
bb5160987e docs: spell "snapshot" correctly
https://github.com/Zygo/bees/pull/209

Edited: regenerate docs for the downstream change in index.md.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-12-19 15:08:26 -05:00
Zygo Blaxell
670fce5be5 resolve: reword the too-many-duplicates exception message
For one thing, it should _say_ that there are too many duplicates.
We were making the user read the manual to find that out.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
7f67f55746 docs: remove some stray whitespace
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
eb2630dee6 docs: document resolve_overflow
In commit d9e3c0070b8e6b382b7956d286e43e0e6643f360 "context: stop creating
new refs when there are too many already" we added a new counter, but didn't
document it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
11fabd66a8 context: add experimental code for avoiding tiny extents
In the current architecture we can't directly measure the physical extent
size, and we can't make good decisions with the extent data (reference)
item alone.  If the early return is enabled here, there is a small speedup
and a large drop in dedupe hit rate, especially when extent splits occur.

Leave the early return commented for now, but collect the event statistics.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-11-29 21:27:48 -05:00
Zygo Blaxell
b436f8483b docs: add readahead_ event group
readahead and unreadahead have new event counters.  Document them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-10-04 20:44:25 -04:00
Zygo Blaxell
b083003cf7 docs: update kernel bugs table as of 5.12.3
Two new tree mod log bugs #5 and #6 (uncovered by the zoned IO work,
though #6 has been seen in the wild on 5.10.29).

Tweak the text of some of the workarounds.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:56:54 -04:00
Zygo Blaxell
592580369e docs: btrfs-kernel: add the extent ref hash bug
Fixed in 5.11 and 5.10 but _not_ 5.10 or 5.4 (yet).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
0bbaddd54c docs: finally concede that the consensus spelling is "dedupe"
Change documentation and comments to use the word "dedupe," not "dedup"
as found in circa-3.15 kernel sources.

No changes in code or program output--if they used "dedup" before, they
will continue to be spelled "dedup" now.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
Zygo Blaxell
fbd1091052 options: remove default 8 CPU thread limit
Higher CPU core counts became more common, and kernel bugs became less
common, since the arbitrary 8-thread limit was introduced.  We can remove
the limit now, and treat any remaining scaling inefficiency as a bug to
be removed.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-06-11 20:49:15 -04:00
SeerLite
3bf6db0354 install.md: Update Arch Linux instructions
bees is now available in the community repository.

Also changed AUR installation line to something more generic.
2021-06-11 13:21:41 -04:00
Zygo Blaxell
8a60850e32 docs: note that FIEMAP is also affected by backref performance issue
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
9d21e6b456 docs: drop incomplete build recipe for ubuntu 14.04
The kernel from such an old distro version likely has several unfixed
bugs.  Better not to support it at all.

Users who can upgrade the kernel are probably also sophisticated enough
to fix the build issues too.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:20:03 -04:00
Zygo Blaxell
bcf3e7de3e uuid: drop dependency on uuid.h
The weird things distros do to the path where uuid.h gets installed
have broken bees builds for the last time.

We were only using uuid to support a legacy feature that was removed
over four years ago.

Hypothetical users who are upgrading directly from bees v0.1 should
probably restart all the crawlers anyway--there were bugs.  Also, if any
such users exist, I respect their tremendous patience with the horrible
performance all these years--bees got about 30x faster since v0.1.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-23 08:16:50 -04:00
Zygo Blaxell
6465a7c37c docs: btrfs-kernel: update recommended kernels list, slow backrefs bug has been backported
The slow backrefs performance improvement is confirmed by reports from
multiple users:

	* Me (5.4.60 + backref patches, 5.7 to 5.11)

	* https://github.com/Zygo/bees/issues/161 (5.8)

	* https://github.com/Zygo/bees/issues/162 (5.8)

	* IRC user S0rin (5.4.88 + backref patches)

The issue still exists, but at a significantly reduced scale:  now about
2 ms of CPU per ref on a fast machine.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-04-04 14:01:55 -04:00
Zygo Blaxell
177f393ed6 docs: btrfs-kernel: add the 5.10 performance regression, the Ctrl-C on balance kernel crash has been fixed
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-23 17:37:51 -05:00
Zygo Blaxell
5f40f9edb0 docs: remove libbtrfs-dev as a build-time dependency
We no longer require ctree.h from libbtrfs-dev.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-22 20:07:06 -05:00
Zygo Blaxell
8e9b53b3fd stats: remove nonsense dedup_unique_bytes stat
A long time ago, when bees used dedicated threads to scan each subvol, the
calculation of the "dedup_unique_bytes" statistic was still wrong.

This stat can only be calculated when dedupe runs on extent data items
instead of extent reference items.  Remove the stat variable until
that happens.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-17 17:54:51 -05:00
Zygo Blaxell
1b9b437c11 docs: btrfs-kernel: 4.20 adds 32-bit single convert bug, tree mod log issue #4
There was a 4th tree mod log crash that showed up in testing.  It can
be reproduced or eliminated by applying or reverting d2311e698578
("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
to a 5.4.x kernel before 5.4.54.

Unfortunately, the test can only run if several other patches that
fixed other bugs in d2311e698578 are applied or removed at the same time.
Commit d2311e698578 introduces a bug which destroys filesystems under test
long before tree mod log failures can be reproduced in testing.  One of
those patches also fixes tree mod log issue #4.  I do not know which one,
but since kernels after 5.1 cannot run without all of those patches, I do
not think it matters.

Tree mod issue #4 is the reason why the tree mod workaround is still
required on all kernels before 5.4.  The issue still exists on older
LTS kernels, e.g. 4.9.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 21:25:23 -04:00
Zygo Blaxell
217f5c781b docs: expand the tree mod log issues
The fixes appear inconsistently in stable/LTS kernels, so they can't be
mashed into a single row.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 17:26:57 -04:00
Zygo Blaxell
dceea2ebbc docs: improve send workaround text, add references to backref commits, make grammar more good now
Rewrite the text related to 'btrfs send' to clarify that the send
workaround is no longer necessary to avoid kernel crashes, but still
useful because send and dedupe still do not work at the same time.

Replace "many backref code changes" with a specific commit reference,
and improve the grammar of some issue descriptions.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 16:25:29 -04:00
Zygo Blaxell
bb8b6d6c50 docs: fix table formatting for kernel bugs list
Apparently there's Github Flavored Markdown, and there's the markup
language that github uses, and they are distinct things.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 12:52:47 -04:00
Zygo Blaxell
6843846b97 docs: update kernel bug tracking for October 2020
Present known kernel bugs in table form with issue descriptions,
fixed and broken kernel versions, and references to fixes.

Update kernel version recommendations to include information on kernel
versions up to 5.8.14.

Reduce emphasis on data corruption bugs which are 1) two or more
years old now, and 2) much less bad than the bugs in kernel 5.1.

Add deprecation warning for kernels before 4.15.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 12:24:14 -04:00
Zygo Blaxell
d040bde2c9 docs: use Github Flavored Markdown with table extension
Prefer to use cmark-gfm with extension 'table' so we can use tables in
locally-generated HTML files.  If cmark-gfm is not installed then
fall back to some other Markdown implementation, but the tables will
be broken on every other implementation I have tried so far.

Also make the HTML output depend on the Makefile, since there may be
document translation options specified there (like '-e table' or an
entirely different Markdown implementation).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-10-09 12:24:14 -04:00
Zygo Blaxell
07e5e7bd1b docs: update known kernel bugs list
"Storm of softlockups" starts with a simple BUG_ON, but after the
BUG_ON, all cores that are waiting on spinlocks get stuck.
The _first_ kernel call trace is required to identify the bug.
At least two such bugs have been identified.

Add some notes about the conflict between LOGICAL_INO and balance,
and the recently added bees workaround.

Update the gotchas page for balances to point to the kernel bugs page.
Remove "bees and the full balance will both work correctly" as that
statement is not true.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-11-28 00:17:10 -05:00
Zygo Blaxell
b149528828 docs: tested build with btrfs-progs 4.20.2
Update the version ranges on the dependencies.

FIXME/TODO:  start dropping early versions that don't work with current
code?

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-06-12 22:48:05 -04:00
Zygo Blaxell
ce2521b407 docs: update btrfs feature interaction status for flushoncommit and SSD caching layers
flushoncommit or not-flushoncommit isn't really a bees matter--it's
a sysadmin's tradeoff between reliability and performance.  bees does
not affect that tradeoff because all dedupe src extents are flushed, so
bees introduces no *new* data loss risks in the noflushoncommit
case--i.e. any data that you could lose while running bees, you'd also
lose when not running bees.

Note that the converse is not true:  bees might trigger flushing on
data that would not normally have been flushed with noflushoncommit,
and improve data integrity after a crash as a side-effect of dedupe
operations.  The risks of noflushoncommit might be reduced by running
bees.  I don't have evidence based on experimental data to support that
conclusion, so I'll just leave this possibility as a rumor in a commit
log message.

lvmcache can be moved from the "bad" list to the "good" list now.

bcache remains in the "bad" list due to some non-data-losing failures
that only seem to happen with bcache.

Add a note about CPUs with strange endianness or page sizes, as nobody
seems to have tried those.

Remove "at great cost" from the btrfs send workaround.  The cost is
the cost, there is no need to editorialize.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-06-12 22:48:05 -04:00
Zygo Blaxell
17a75e61f8 README: highlight DATA CORRUPTION WARNING
The existence of information about known data corruption bugs should be
visible from the top-level page.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-06-12 22:48:05 -04:00
Zygo Blaxell
e1476260e1 docs: update kernel compatibility page, now recommending 5.0.4
* comprehensive list of kernels with bees-triggered corruption bug fixes
* deadlock between dedupe and rename is now fixed (in some places)
* compressed data corruption is now fixed (in more places)
* btrfs send fix for one bug is now merged in 5.2-rc1, another bug remains
* retired the bcache/lvmcache bug (can't reproduce those bugs any more,
  although I *can* reproduce an interesting non-destructive bcache bug)
* new minor bug entries for two harmless kernel warnings
* new entry for storm-of-soft-lockups
 * new entry for storm-of-soft-lockups

Fixes: https://github.com/Zygo/bees/issues/107
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-06-12 22:47:57 -04:00
Zygo Blaxell
7548d865a0 docs: event counter documentation
This may help users understand some of the things that happen inside
bees...or it may just be horribly long and confusing.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-01-07 22:48:16 -05:00
Zygo Blaxell
e1de933f93 docs: add some notes about interactions with balance
Prompted by discussion at https://github.com/Zygo/bees/issues/105

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-01-07 22:48:15 -05:00
Zygo Blaxell
f41fd73760 docs: add Gotcha for SIGTERM
This summarizes the discussion at:

	https://github.com/Zygo/bees/issues/100

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-01-06 01:54:57 -05:00
Zygo Blaxell
d583700962 docs: describe expected exceptions and impact of exception handling
Add some docs about the exceptions that are less easy to suppress
directly.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-01-06 01:54:57 -05:00
Zygo Blaxell
843f78c380 docs: bees can stop now
Remove the paragraph stating otherwise.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-12-10 19:56:08 -05:00
Zygo Blaxell
5f063dd752 docs: tested with GCC 6.3.0
Update the list of compiler versions tested.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-12-09 23:39:44 -05:00
Zygo Blaxell
7933ccb660 build: make libcrucible a static library
libcrucible at one time in the distant past had to be a shared library
to force global C++ object initialization; however, this is no longer
required.

Make libcrucible static to solve various rpath and soname versioning
issues, especially when distros try (unwisely) to package the library
separately.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-12-09 23:39:44 -05:00
Zygo Blaxell
f051d96d51 docs: dash more useful than previously believed
It turns out both dash and bash support `command -v` so let's use that.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-25 23:21:52 -05:00
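
For readers unfamiliar with the `command -v` idiom mentioned in f051d96d51, here is a minimal sketch of how it can be used to probe for a tool in a way that works under both dash and bash. The helper name `first_available` and the fallback tool names `cmark` and `markdown` are assumptions made for illustration; only `cmark-gfm` is named in this log, and this is not the code from the bees Makefile.

```sh
#!/bin/sh
# Hypothetical helper: print the first argument that `command -v` can
# resolve (an executable on $PATH, a builtin, or a shell function).
# Returns non-zero if none of the candidates are available.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer cmark-gfm; the other two names are illustrative fallbacks.
markdown_tool=$(first_available cmark-gfm cmark markdown) || markdown_tool=""
echo "using: ${markdown_tool:-none found}"
```

Unlike bash's `type -p`, `command -v` is specified by POSIX, so the same line behaves consistently whichever shell `/bin/sh` points to.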
Zygo Blaxell
ba5fda1605 docs: use bash "type -p" because dash isn't useful
If /bin/sh is bash, the 'type' builtin produces a list of filenames
that match the arguments to $PATH.

If /bin/sh is dash, we get errors like:

	/bin/sh: 1: P:: not found

Hopefully having a build-dep on bash is not controversial.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-22 21:37:09 -05:00
Zygo Blaxell
6cf16c4849 docs: add instructions for Ubuntu 18.10
As described in https://github.com/Zygo/bees/issues/88

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-22 21:36:39 -05:00
Zygo Blaxell
5a80ce5cd6 README: reintroduce new btrfs-send-compatibility workaround
Now it appears in both the github.io and github.com feature lists.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-22 21:22:10 -05:00
Zygo Blaxell
012219bbfb docs: derive docs/index.md from README.md
The two files are identical except README.md links to docs/* while
index.md links to *.

A sed script can do that transformation, so use sed to do it.

This does modify a file in git, but this is necessary to make all
the Github views work consistently.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-22 21:21:29 -05:00
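
The README-to-index transformation described in 012219bbfb can be sketched as a one-line sed filter. This is a minimal illustration under stated assumptions, not the script from the repository: the function name `strip_docs_prefix` and the exact pattern are hypothetical, and the sketch only handles inline Markdown links whose target begins with `docs/`.

```sh
#!/bin/sh
# Hypothetical filter: rewrite Markdown link targets "(docs/NAME)" to "(NAME)".
# In a basic regular expression an unescaped "(" is a literal character,
# so the pattern matches the opening parenthesis of an inline link target.
strip_docs_prefix() {
    sed -e 's#(docs/#(#g'
}

# Example: a README-style link becomes an index-style link.
printf '%s\n' '[Installation](docs/install.md)' | strip_docs_prefix
# prints: [Installation](install.md)
```

A rule along these lines would read README.md on standard input and write the result to docs/index.md, which is why the generated file ends up tracked in git.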
Zygo Blaxell
e0c8df6809 docs: working with btrfs send is kind of a feature
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-21 23:19:37 -05:00
Zygo Blaxell
34b04f4255 bees: soft-limit computed thread counts to 8
https://github.com/Zygo/bees/issues/91 describes problems encountered
when running bees on systems with many CPU cores.

Limit the computed number of threads (using --thread-factor or the
default) to a maximum of 8 (i.e. the number of logical cores in a modern
laptop).  Users can override the limit by using --thread-count.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-21 21:49:16 -05:00
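
The soft limit described in 34b04f4255 amounts to a small piece of arithmetic, sketched below. This is an illustration only, not the bees implementation: the function name `compute_threads` is hypothetical, and the thread factor is simplified to an integer because POSIX shell arithmetic has no floating point. Note that fbd1091052, further up this log, later removed the limit.

```sh
#!/bin/sh
# Hypothetical sketch of the soft limit:
#   compute_threads CORES FACTOR [EXPLICIT_COUNT]
# An explicit count (the --thread-count case) bypasses the cap; otherwise
# the computed value CORES * FACTOR is capped at 8.
compute_threads() {
    cores=$1
    factor=$2
    explicit=${3:-0}
    if [ "$explicit" -gt 0 ]; then
        printf '%s\n' "$explicit"
        return 0
    fi
    threads=$((cores * factor))
    if [ "$threads" -gt 8 ]; then
        threads=8
    fi
    printf '%s\n' "$threads"
}

compute_threads 32 1      # prints 8  (computed value capped)
compute_threads 4 1       # prints 4  (below the cap)
compute_threads 32 1 16   # prints 16 (explicit count overrides the cap)
```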
Zygo Blaxell
d9c788d30a docs: reorganize options, add workaround for btrfs send
options.md was a disorganized mess that markdown couldn't parse properly.

Break the options list down into sections by theme.  Add the new
'--workaround-btrfs-send' option to the new 'Workarounds' section.

Clean up the rest of the text and fix some inconsistencies.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-21 21:49:16 -05:00
Zygo Blaxell
19859b0a0d docs: toxic extents and btrfs send
Update documentation of toxic extent / slow backref workaround.

Add notes about btrfs send kernel bugs and incremental send failures.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-08 21:31:02 -05:00