1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

326 Commits

Author SHA1 Message Date
Kai Krakow
f2dec480a6
Makefile: mkdir .depends only when needed
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-11-08 02:56:48 +01:00
Kai Krakow
d4535901a5
Makefile: Use the jobserver properly
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-11-08 02:52:04 +01:00
Zygo Blaxell
8cbd6fc67a fs: support LOGICAL_INO_V2
Automatically fall back to LOGICAL_INO if LOGICAL_INO_V2 fails and no
_V2 flags are used.

Add methods to set the flags argument with build portability to older
headers.

Use thread_local storage for the somewhat large buffers used by
LOGICAL_INO_V2 (and other users of BtrfsDataContainer like INO_PATHS).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-05 21:12:36 -05:00
Zygo Blaxell
c2762740ef context: remove limit on the number of references to an extent
Better toxic extent detection means we can now handle extents with
many more references--easily hundreds of thousands.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-05 21:12:11 -05:00
rsjaffe
8bec9624da
systemd service replace deprecated parameters
Replace CPU shares and IO block weight by CPU weight and IO weight. Note that new parameters are roughly 1/100 of old one--I believe that's the right conversion. Also removed duplicate Nice parameter and alphabetized the parameters for ease of reading.
2018-11-05 12:35:17 -08:00
Zygo Blaxell
aa74a238b3 hash: remove preloaded toxic hash blacklist
Faster and more reliable toxic extent detection means we can now be much
less paranoid about creating toxic extents.

The paranoia has significant impact on dedupe hit rates because every
extent that contains even one toxic hash is abandoned.  The preloaded
toxic hashes were chosen because they occur more frequently than any
other block contents in typical filesystem data.  The combination of these
resulted in as much as 30% of duplicate extents being left untouched.

Remove the preloaded toxic extent blacklist, and rely on the new
kernel-CPU-usage-based workaround instead.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 23:03:01 -04:00
Zygo Blaxell
6e6b08ea0e scripts: put AL16M back to avoid breaking existing scripts
Leave AL16M defined in beesd to avoid breaking scripts based on
beesd.conf.sample which used this constant.

Use the absolute size in beesd.conf.sample to avoid any future problems.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 22:50:36 -04:00
Zygo Blaxell
542371684c context: better detection for toxic extents
We detect toxic extents by measuring how long the LOGICAL_INO ioctl takes
to run.  If it is above some threshold, we consider the extent toxic,
and blacklist it; otherwise, we process the extent normally.

The detector was using the execution time of the ioctl, which detects
toxic extents, but it also detects pauses of the bees process and
transaction commit latency due to load.  This leads to a significant
number of false positives.  The detection threshold was also very long,
burning a lot of kernel CPU before the detection was triggered.

Use the per-thread system CPU statistics to measure the kernel CPU usage
of the LOGICAL_INO call directly.  This is much more reliable because it
is not confounded by other threads, and it's faster because we can set
the time threshold two orders of magnitude lower.

Also remove the lock and mutex added in "context: serialize LOGICAL_INO
calls" because we theoretically no longer need it (but leave the code
there with #if 0 in case we do need it in practice).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 21:12:16 -04:00
Zygo Blaxell
9a97699dd9 roots: reimplement transid_max_nocache using extent tree root
ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE.  Every modification
(that we care about) to a btrfs must go through EXTENT_TREE, and must
modify the page in ROOT_TREE pointing to the root of EXTENT_TREE...
which makes that a very good source for the filesystem transid.

Remove the loop and the root lookups, and just look at one item for
max_transid.

Also note that every caller of transid_max_nocache() immediately
feeds the return value to m_transid_re.update(), so don't do that
inside transid_max_nocache().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-31 00:09:49 -04:00
Zygo Blaxell
0e8b591232 Revert "roots: simplify BeesRoots::transid_max_nocache"
It turns out that we do need to scan all the subvols in order
to find transid_max.

Keep the bug fix though.

This reverts commit bf6ae80eeec6afcbee505d22af8e62f60dc1c9a6.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 23:29:05 -04:00
Zygo Blaxell
bf6ae80eee roots: simplify BeesRoots::transid_max_nocache
BeesRoots::transid_max_nocache calls btrfs_get_root_transid() which
retrieves the transid of the root of the given Fd.  Since the FS_TREE
(subvol 5) is the root of the subvol hierarchy, it will always have
the highest transid on the filesystem, and we do not need to look at
any others.

Also fix a bug where we pass BTRFS_FS_TREE_OBJECTID instead of the
file descriptor root_fd() to btrfs_get_root_transid().  If BEESHOME
is somewhere on the same btrfs filesystem, and there are no leaked FDs
at bees startup, then BTRFS_FS_TREE_OBJECTID (5) usually has the same
integer value as a valid file descriptor of some object on the filesystem
that has a regularly increasing transid value.  If Fd 5 happens to be a
file in BEESHOME then bees itself drives the transid increments.  This,
combined with the search of all subvol roots, hides the bug (unless Fd
5 gets closed somehow).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:17 -04:00
Zygo Blaxell
1a51bb53bf context: cache result of home_fd()
BeesContext::home_fd() is supposed to open $BEESHOME once and cache
the Fd for later calls; however, instead it was reopening a new Fd each
time it was called, and _also_ holding that Fd in a BeesContext member.
Fds clean themselves up when they are forgotten, so it was not leaking
per se, but it certainly had more open Fds than it needed to.

Check to see if we have m_home_fd open, and return that if so.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:16 -04:00
Zygo Blaxell
35b21687bc bees: drop unused member m_uuid
There is a m_root_uuid which is used.  m_uuid is not, so drop it
and save a tiny amount of memory.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:16 -04:00
Zygo Blaxell
63ddbb9a4f context: serialize LOGICAL_INO calls
LOGICAL_INO can trip over the btrfs slow-backrefs bug, resulting in
some very long in-kernel runtimes.  If too many threads are executing
LOGICAL_INO then there may be no cores left on the system to run other
tasks.

Toxic extent detection is done by a very rudimentary algorithm which
can be confused by unrelated sources of latency within btrfs (especially
commit latency).  The algorithm can also be confused by other threads
executing the LOGICAL_INO ioctl.

These are two good reasons to prevent any two threads in a single bees
process instance from executing LOGICAL_INO at the same time, so let's
do that.

It is possible to limit the number of threads executing LOGICAL_INO with
the -c and -C options; however, this also limits the number of threads
which can perform any operation, while only LOGICAL_INO (*) has such a
profound effect on the rest of system operation.

Also make the status message clearer about exactly when LOGICAL_INO is
executed, as opposed to merely waiting to acquire a lock before executing
the ioctl.

(*) or maybe FILE_EXTENT_SAME.  The problem function that keeps showing
up in kernel stack traces is find_parent_nodes, which is called by both
the LOGICAL_INO and FILE_EXTENT_SAME ioctls.  We'll try this change
first and see if it prevents any recurrences of forced watchdog reboots;
if it does not, then we'll limit FILE_EXTENT_SAME the same way.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:16 -04:00
Zygo Blaxell
373b9ef038 roots: fix subvol scan rollover on subvols with empty transid range
The ordering function for BeesCrawlState did not consider

	root 292 inode 0 min_transid 2345 max_transid 3456

to be larger than

	root 292 inode 258 min_transid 2345 max_transid 2345

so when we attempted to update the end pointer for the crawl progress,
the new state was not considered newer than the old state because the
min_transid was equal, but the new crawl state's inode number was smaller.

Normally this is not a problem because subvol scans typically begin
and end in separate transactions (in part because we don't start a
subvol scan until at least two transactions are available); however,
the cleanup code for the aftermath of the recent transid_min() bug can
create crawlers with equal max_transid and min_transid records.

Fix this by ordering both transid fields before any others in the
crawl state.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:14 -04:00
Zygo Blaxell
866a35c7fb roots: do not accept 18446744073709551615 as max_transid in beescrawl.dat
Due to an earlier bug some beescrawl.dat files will contain uint64_t
max as max_transid.  This prevents any further scanning on the subvol
because there is no possibiity of having a real transid (or any other
uint64_t number) larger than uint64_t max.

If we detect a bad transid in beescrawl.dat, log a warning, then use
some more plausible value:  either min_transid to repeat the previous
incremental crawl, or 0 to restart the subvol scan from the beginning.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:14 -04:00
Zygo Blaxell
90132182fd roots: do not allow transid_min to be numeric_limits<uint64_t>::max()
On a few test machines max_transid on subvols is getting set to
18446744073709551615 (aka uint64_t max).

Prevent transid_min() from ever returning this value.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-30 21:12:14 -04:00
Zygo Blaxell
90f98250c2 hash: remove pointless copy
"saved" is used only during hash table correctness analysis, which is
normally not enabled at compile time, and requires source modification
to enable.

Remove the pointless copy and save a tiny bit of CPU.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
0c714cd55c scripts: use multiples (not power) of 128K
Adjust the scripts for the new smaller hash table extent size.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
924008603e hash: reduce hash table extent size to 128KB
The 16MB hash table extent size did not serve any useful defragmentation
or compression purpose, and for very small filesystems (under 100GB),
16MB is much larger than necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
c01f129eee src: add bees-version.new.c to .gitignore
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
5a49870fc9 docs: add coredumpctl
systemd-coredumpctl collects core files for later analysis
with gdb.  It's a convenient thing if the keys you use to encrypt
/var/lib/systemd/coredump are the same as the keys you use to encrypt
the filesystem where you're running bees.

Add it to the documentation just before the hand-rolled version.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-19 20:21:04 -04:00
Zygo Blaxell
14b35e3426 docs: add "what to do when something goes wrong" page
Standard crash backtrace collection, plus $BEESSTATUS for the high-level
overview of what bees is doing.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-04 20:54:08 -04:00
Zygo Blaxell
7bba096077 Merge remote-tracking branch 'nilninull/master' 2018-10-02 22:13:55 -04:00
nilninull
aa324de9ed
FIX: The systemd service file is always installed 2018-10-03 10:19:43 +09:00
Zygo Blaxell
e8298570ed README: split into sections, reformat for github.io
Split the rather large README into smaller sections with a pitch and
a ToC at the top.

Move the sections into docs/ so that Github Pages can read them.

'make doc' produces a local HTML tree.

Update the kernel bugs and gotchas list.

Add some information that has been accumulating in Github comments.

Remove information about bugs in kernels earlier than 4.14.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-10-02 03:41:31 -04:00
Kai Krakow
32d2739b0d
Makefile: Specify version when building from tarball
When package maintainers build from a tarball, the .git directory does
not exist to extract the version tag. Let's add a hack to work around
this issue and let them specify `BEES_VERSION="v0.y"` on the make
cmdline.

Github-Bug: https://github.com/Zygo/bees/issues/75
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-30 04:20:26 +02:00
Kai Krakow
faf11b1c0c
Update references to Gentoo
Gentoo has officially merged the ebuild into portage as of:
https://github.com/gentoo/gentoo/pull/9925

Let's update the readme and get rid of the `contrib/gentoo-bees`
directory, so we have no potentially outdated information in the future.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-29 22:26:56 +02:00
Kai Krakow
3504439d5c
contrib/gentoo: Update ebuild
Now that the packaging preparations were merged, we should update the
ebuild to reflect the upstream master branch.

Signed-off-by: Kai Krakow <kai@kaishome.de>
v0.6
2018-09-27 10:55:24 +02:00
Zygo Blaxell
d4b3836493 extentwalker: don't fetch absurd numbers of extents just to throw them away
ExtentWalker doesn't gain significant benefits from caching, and the
extra SEARCH_V2 ioctls were blamed for a 33% kernel CPU overhead by perf.

Reduce the number of extents to 16 in lieu of fixing the caching.

This gives a significant speed boost on CPU-bound workloads compared
to the original 1024--almost 40% faster on a single SSD with a filesystem
consisting of raw VM images mounted with compress=zstd.

This also seems to reduce LOGICAL_INO overhead.  Perhaps SEARCH_V2 and
LOGICAL_INO were trying to lock the same extents, and interfering with
each other?

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-26 23:29:56 -04:00
Kai Krakow
f053e0e1a7 beesd: Fix the wrapper not finding any config file
`grep -q something | grep -q something_else` will never find anything.
The for-loop is redundant anyways because `grep -l` can already work for
us. Let's replace this with a shorter and working version.

CC: Timofey Titovets <timofey.titovets@synesis.ru>
(fixes: commit 06d41fd "Rewrite beesd arg parser")
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-16 17:56:31 -04:00
Zygo Blaxell
bcfc3cf08b Merge https://github.com/Zygo/bees/pull/62 2018-09-15 00:09:46 -04:00
Zygo Blaxell
9dbe2d6fee bees: add -G/--thread-min option for minimum thread count
The -g option limits the number of worker threads when the target load
average is exceeded.  On some systems the load normally runs high, and
continuous bees operation is required to avoid running out of disk space.

Add a -G/--thread-min option to force at least some threads to continue
running.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:07 -04:00
Zygo Blaxell
dd3c32a43d README: spell 'available' correctly
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:07 -04:00
Zygo Blaxell
3d536ea6df roots: if queue is full run again
The task queue may already be full of tasks when the crawl task is
executed.  In this case simply reschedule the crawl task at the
end of the current queue.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:06 -04:00
Zygo Blaxell
e66086516f bees: dynamic thread pool size based on system load average
Add -g / --loadavg-target parameter to track system load and add or
remove bees worker threads dynamically to keep system load close to the
loadavg target.  Thread count may vary from zero to the maximum
specified by -c or -C, and is adjusted every 5 seconds.

This is better than implementing a similar load average scheme from
outside of the process (though that is still possible) because the
in-process load tracker does not disrupt the performance timing feedback
mechanisms as a freezer cgroup or SIGSTOP would when controlling bees
from outside.  The internal load average tracker can also adjust the
number of active threads while an external tracker can only choose from
the maximum or zero.

Also fix a bug where a Task could deadlock waiting for itself to exit
if it tries to insert a new Task after the number of worker threads has
been set to zero.

Also correct usage message for --scan-mode (values are 0..2) since
we are touching adjacent lines anyway.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:03 -04:00
Zygo Blaxell
96eb100ded bees: use readahead instead of posix_fadvise
Other btrfs utils use readahead() not posix_fadvise().

There does not appear to be a performance or correctness difference
between the three (none, posix_fadvise, or readahead()).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
041ad717a5 bees: configurable log verbosity
Log messages were already labelled with log levels, but there was no
way to filter by log level at run time.

Implement the filter inside the bees process so it can skip evaluation
of the BEESLOG* arguments if the log messages would not be emitted.

Fixes: https://github.com/Zygo/bees/issues/67

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
b22db12390 context: log dedups with single unbroken log message
When BEESLOGINFO is called multiple times it generates separate log
records that can be mixed up when multiple threads dedup.

Use a single BEESLOGINFO call for each dedup to prevent this.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
8938caa029 README.md: update build-deps
btrfs/ioctl.h has been moved to a different package.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:57 -04:00
Zygo Blaxell
8bc4bee8a3 crucible: progress: drop the set() method
set() was broken and redundant.  Calling hold() and discarding the
returned object has the correct effect.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:54 -04:00
Zygo Blaxell
1beb61fb78 crucible: error: record location of exception in what() message
Make the log show where the exception is thrown from.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:51 -04:00
Timofey Titovets
06d41fd518 Rewrite beesd arg parser
Signed-off-by: Timofey Titovets <timofey.titovets@synesis.ru>
2018-09-15 00:21:06 +03:00
Kai Krakow
788774731b Gentoo: Rework Gentoo ebuild into overlay
This commit squashes all the little changes from the previous
integration branch into one, adjusts to the new Makefile changes, and
introduces an overlay layout so that the contrib/gentoo-bees subtree
can be directly added as a Portage overlay to the system.

The following list contains the previous commit descriptions:

sys-fs/bees: Keyword tested architecture ~amd64

    Bees was tested on this platform.

sys-fs/bees: Add kernel version checks

    Add checking the kernel versions and write some info and/or warnings
    before building and installing the package. Running bees on older
    kernels may have some serious performance and stability impacts, let's
    tell the user about it.

    Closes #55

sys-fs/bees: Add metadata.xml

sys-fs/bees: There's no configure script

    So, there's no point in calling "default".

sys-fs/bees: Simplify src_configure()

sys-fs/bees: Don't depend on markdown

    It makes no sense to install both README.md and README.html, and we can
    get rid of one dependency.

Dependencies: btrfs-progs is no longer a buildtime-only dep

    It is actually needed by the bees service wrapper script, as pointed out
    by Gentoo QA review.

sys-fs/bees: DOCS is not needed

    "COPYING" is already covered by the licensing. The ebuild defaults
    already include README*

sys-fs/bees: Make warnings exclusive

    It was recommended by Gentoo QA to show only either one or another
    warning, and change the texts accordingly.

sys-fs/bees: RDEPEND is not implicit

    RDEPEND does not implicitly default to DEPEND. Let's explicitly set the
    variable.

sys-fs/bees: IUSE=test is only needed for explicit dependencies

    Thus, remove it.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 05:06:39 +02:00
Kai Krakow
679a327ac5 Makefile: Do not force optimizations by default
Make life easier for package maintainers by not forcing architecture or
compiler optimizations by default. E.g., Gentoo QA refuses to accept
both "-march=native" and "-O3". These are usually provided by the
package tooling.

Instead, we provide easily accessible templates in "makeflags".

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 04:05:15 +02:00
Kai Krakow
31b41bb3c2 Makefile: Do not force making README.html
This forces us to depend on markdown which would be otherwise optional.
Most of the time it is sufficient to let package managers just install
the README.md file.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:34:48 +02:00
Kai Krakow
d7e235c178 Makefile: "which" is not portable
It was pointed out by Gentoo QA that "type -P" is a better choice.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:14:18 +02:00
Kai Krakow
51108f839d Makefile: Due to VPATH, libcrucible links to hard-coded libuuid path
Due to VPATH and how make resolves source paths, libcrucible.so ends up
with a hard-coded path to link against libuuid.so. Let's fix it by
turning the general rule into an explicit rule for libcrucible.so.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:07:20 +02:00
Kai Krakow
8d102abf8b Makefile: create a template compiler
This creates a simple template compiler using sed in as a reusable
variable.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 02:59:54 +02:00
Kai Krakow
83e8f87dc9 Scripts: Don't prefix timestamps when running with systemd
Since systemd prefix it's own timestamps, we can unconditionally remove
timestamps when bees is executed by systemd.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 02:59:54 +02:00