1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

325 Commits

Author SHA1 Message Date
Zygo Blaxell
a466ccf2f1 build: include localconf everywhere
Overriding makeflags did not work from localconf in the src, lib, or
test directories.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.6.5
2021-02-08 12:52:45 -05:00
Zygo Blaxell
ba04fe1349 roots: make it build with clang
Remove an unnecessary cast that was breaking namespace lookup for clang.

Closes: #159

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:48 -05:00
Zygo Blaxell
830df63d4c chatter: make it build with clang
Silence the unused variable warning.  The compiler is correct, but we
may implement line-level debug at some point in the future, so we
want to keep the member and parameters.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:42 -05:00
Zygo Blaxell
20c9d2ff6a clang: fix struct/class declaration/definition mismatches
clang does not like a defined class to be declared as a struct.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:40 -05:00
Zygo Blaxell
7bbb4d14cb bees context: make it build with clang
Remove unused function getenv_or_die.  All of our environment variable
parameters are optional or have default values.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:38 -05:00
Zygo Blaxell
363c45b8cd bees: make it build with clang
Remove unused "addr check" functions.  We have ranged_cast for detecting
overflow bits.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:35 -05:00
Zygo Blaxell
4ec2b8ac16 task: make it build with clang
Remove unused closure captures.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:30 -05:00
Zygo Blaxell
26d31225fa extentwalker: make it build with clang
Remove unused MAX_OFFSET.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:27 -05:00
Zygo Blaxell
21ae937201 roots: reimplement transid_max_nocache using extent tree root
Commit 9a97699dd9045715d9943cf98ca5573708eb1a53 upstream.

This commit accidentally fixes a bug where we call btrfs_get_root_transid
with BTRFS_FS_TREE_OBJECTID instead of m_ctx->root_fd().  This leads
to storms of messages like this:

	crawl_transid[5334]: exception type std::system_error: BTRFS_IOC_INO_LOOKUP: rv = readlink(path.c_str(), buf, size + 1): No such file or directory at fs.cc:430: No such file or directory

The code was working before because BTRFS_FS_TREE_OBJECTID == 5.
bees is constantly opening files, and the Linux kernel fills in unused
fd numbers starting from 0, so it's quite likely that the process has fd
5 open to some existing file somewhere on the target btrfs filesystem
most of the time.  If fd 5 is closed, or if it is open to an orphan
file (one without an existing name), the ioctl in btrfs_get_root_id
(called by btrfs_get_root_transid) will fail and throw and exception.
The exception breaks out of the crawl_transid task before it can do any
scanning work, so bees will stop deduping until FD 5 is open again with
an existing file.  This can only happen if other threads are opening
files, so if bees is idle at the instant when this failure occurs,
it will never dedupe again until the process is terminated and restarted.

The remainder is the original commit message:

ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE.  Every modification
(that we care about) to a btrfs must go through EXTENT_TREE, and must
modify the page in ROOT_TREE pointing to the root of EXTENT_TREE...
which makes that a very good source for the filesystem transid.

Remove the loop and the root lookups, and just look at one item for
max_transid.

Also note that every caller of transid_max_nocache() immediately
feeds the return value to m_transid_re.update(), so don't do that
inside transid_max_nocache().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.6.4
2020-12-23 17:27:44 -05:00
Zygo Blaxell
7283126e5c bees: initialize context in the correct order
We cannot use BeesContext::roots() until after
BeesContext::set_root_path() has been called.
Save up the parameter settings until then.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.6.3
2020-08-31 22:39:51 -04:00
Zygo Blaxell
ac53e50d3e context: workaround to prevent LOGICAL_INO and btrfs balance from running concurrently
This avoids some kernel bugs.  One of them is fixed in 5.3.4 and later:

	efad8a853a "Btrfs: fix use-after-free when using the tree modification log"

There are apparently others in current kernels, so for now just put bees
on pause until the balance is done.

At some point we may want to provide an option to disable this
workaround; however, running bees and balance at the same time makes
neither particularly fast, so maybe we'll just leave it this way.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-11-28 11:32:30 +01:00
Zygo Blaxell
6e75857d71 main: single BeesContext instance per process
After weeks of testing I copied part of a change to main without copying
the rest of the change, leading to an immediate segfault on startup.

So here is the rest of the change:  limit the number of
BeesContexts per process to 1.  This change was discussed at
https://github.com/Zygo/bees/issues/54#issuecomment-360332529 but there
are more reasons to do it now:  the candidates to replace the current
hash table format are less forgiving of sharing hash tables, and it may
even become necessary to have more than one hash table per BeesContext
instance (e.g. to keep datasum and nodatasum data separate).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
v0.6.2
2019-10-30 00:07:59 -04:00
Zygo Blaxell
9a9dd89177 process: Fix gettid() ambiguity with glibc >= 2.30
In version 2.30 glibc added it's own gettid() function. This resulted in
"error: call of overloaded ‘gettid()’ is ambiguous" because gettid()
now exists in both namespace crucible and std.

For now, use explicit references to namespace crucible.  This continues
to work with new and old libc without having to test specific library
versions.

At some point, glibc gettid() will be deployed widely enough that we can
remove the crucible version entirely.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:34:36 -04:00
Zygo Blaxell
7e5c9b6bbf
lib: fix non-local lambda expression cannot have a capture-default
We got away with this because GCC 4.8 (and apparently every GCC prior
to 9) didn't notice or care, and because there is nothing referenced
inside the lambda function body that isn't accessible from any other
kind of function body (i.e. the capture wasn't needed at all).

GCC 9 now enforces what the C++ standard said all along:  there is
no need to allow capture-default in this case, so it is not.

Fix by removing the offending capture-default.

Fixes: https://github.com/Zygo/bees/issues/112
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:19:31 +01:00
Zygo Blaxell
5ee09ef9e8
tempfile: drop the fsync()
The deadlock seems to be fixed now (if there ever was one--there certainly
were deadlocks, but matching deadlocks to root causes is non-trivial
and a number of distinct deadlock cases have been fixed in recent years).

The benchmark data is inconclusive about whether it is better to fsync or
not to fsync.  A paranoia option might be useful here.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:19:31 +01:00
Zygo Blaxell
a5b9919d26
roots: quick fix for task scheduling bug leading to loss of crawl_master
The crawl_master task had a simple atomic variable that was supposed
to prevent duplicate crawl_master tasks from ending up in the queue;
however, this had a race condition that could lead to m_task_running
being set with no crawl_master task running to clear it.  This would in
turn prevent crawl_thread from scheduling any further crawl_master tasks,
and bees would eventually stop doing any more work.

A proper fix is to modify the Task class and its friends such that
Task::run() guarantees that 1) at most one instance of a Task is ever
scheduled or running at any time, and 2) if a Task is scheduled while
an instance of the Task is running, the scheduling is deferred until
after the current instance completes.  This is part of a fairly large
planned change set, but it's not ready to push now.

So instead, unconditionally push a new crawl_master Task into the queue
on every poll, then silently and quickly exit if the queue is too full
or the supply of new extents is empty.  Drop the scheduling-related
members of BeesRoots as they will not be needed when the proper fix lands.

Fixes: 4f0bc78a "crawl: don't block a Task waiting for new transids"
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:19:31 +01:00
Zygo Blaxell
04dbfd5bf1
bees: soft-limit computed thread counts to 8
https://github.com/Zygo/bees/issues/91 describes problems encountered
when running bees on systems with many CPU cores.

Limit the computed number of threads (using --thread-factor or the
default) to a maximum of 8 (i.e. the number of logical cores in a modern
laptop).  Users can override the limit by using --thread-count.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:14:35 +01:00
Zygo Blaxell
df640062e7
workarounds: add workaround for btrfs send
Introduce --workaround options which trade performance or effectiveness to
avoid triggering kernel bugs.

The first such option is --workaround-btrfs-send, which avoids making any
modification to read-only subvols to avoid btrfs send bugs.

Clean up usage message:  no tabs for formatting, split options into
sections by theme.

Make scan mode a non-static data member like all (most?) other options.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-10-29 23:14:35 +01:00
Kai Krakow
1d369a3c18
Makefile: Fix git usage for non-git source archive
We didn't take enough care to fix all invocations of git in this
scenario.

Fixes: 32d2739 ("Makefile: Specify version when building from tarball")
Signed-off-by: Kai Krakow <kai@kaishome.de>
2019-10-29 23:13:51 +01:00
Kai Krakow
14ccf88050
crucible: Try repairing a build failure around swap macro
Gentoo-Bug: https://bugs.gentoo.org/670606
Fixes: https://github.com/Zygo/bees/issues/85
Suggested-by: Zygo Blaxell <bees@furryterror.org>
Signed-off-by: Kai Krakow <kai@kaishome.de>
v0.6.1
2018-11-09 06:48:01 +01:00
rsjaffe
b6e4511446
systemd service replace deprecated parameters
Replace CPU shares and IO block weight by CPU weight and IO weight. Note that new parameters are roughly 1/100 of old one--I believe that's the right conversion. Also removed duplicate Nice parameter and alphabetized the parameters for ease of reading.
2018-11-09 06:48:01 +01:00
Zygo Blaxell
256da15ac1
context: cache result of home_fd()
BeesContext::home_fd() is supposed to open $BEESHOME once and cache
the Fd for later calls; however, instead it was reopening a new Fd each
time it was called, and _also_ holding that Fd in a BeesContext member.
Fds clean themselves up when they are forgotten, so it was not leaking
per se, but it certainly had more open Fds than it needed to.

Check to see if we have m_home_fd open, and return that if so.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-09 06:48:01 +01:00
Zygo Blaxell
c426794542
roots: fix subvol scan rollover on subvols with empty transid range
The ordering function for BeesCrawlState did not consider

	root 292 inode 0 min_transid 2345 max_transid 3456

to be larger than

	root 292 inode 258 min_transid 2345 max_transid 2345

so when we attempted to update the end pointer for the crawl progress,
the new state was not considered newer than the old state because the
min_transid was equal, but the new crawl state's inode number was smaller.

Normally this is not a problem because subvol scans typically begin
and end in separate transactions (in part because we don't start a
subvol scan until at least two transactions are available); however,
the cleanup code for the aftermath of the recent transid_min() bug can
create crawlers with equal max_transid and min_transid records.

Fix this by ordering both transid fields before any others in the
crawl state.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-08 01:04:14 +01:00
Zygo Blaxell
ce0c1ab629
roots: do not accept 18446744073709551615 as max_transid in beescrawl.dat
Due to an earlier bug some beescrawl.dat files will contain uint64_t
max as max_transid.  This prevents any further scanning on the subvol
because there is no possibiity of having a real transid (or any other
uint64_t number) larger than uint64_t max.

If we detect a bad transid in beescrawl.dat, log a warning, then use
some more plausible value:  either min_transid to repeat the previous
incremental crawl, or 0 to restart the subvol scan from the beginning.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-08 01:04:10 +01:00
Zygo Blaxell
d11906c4e8
roots: do not allow transid_min to be numeric_limits<uint64_t>::max()
On a few test machines max_transid on subvols is getting set to
18446744073709551615 (aka uint64_t max).

Prevent transid_min() from ever returning this value.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-08 01:04:08 +01:00
Zygo Blaxell
e7fbd0c732
src: add bees-version.new.c to .gitignore
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-11-08 01:02:03 +01:00
Kai Krakow
cf9d1d0b78
Makefile: Specify version when building from tarball
When package maintainers build from a tarball, the .git directory does
not exist to extract the version tag. Let's add a hack to work around
this issue and let them specify `BEES_VERSION="v0.y"` on the make
cmdline.

Github-Bug: https://github.com/Zygo/bees/issues/75
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-11-08 01:01:25 +01:00
Kai Krakow
3504439d5c
contrib/gentoo: Update ebuild
Now that the packaging preparations were merged, we should update the
ebuild to reflect the upstream master branch.

Signed-off-by: Kai Krakow <kai@kaishome.de>
v0.6
2018-09-27 10:55:24 +02:00
Zygo Blaxell
d4b3836493 extentwalker: don't fetch absurd numbers of extents just to throw them away
ExtentWalker doesn't gain significant benefits from caching, and the
extra SEARCH_V2 ioctls were blamed for a 33% kernel CPU overhead by perf.

Reduce the number of extents to 16 in lieu of fixing the caching.

This gives a significant speed boost on CPU-bound workloads compared
to the original 1024--almost 40% faster on a single SSD with a filesystem
consisting of raw VM images mounted with compress=zstd.

This also seems to reduce LOGICAL_INO overhead.  Perhaps SEARCH_V2 and
LOGICAL_INO were trying to lock the same extents, and interfering with
each other?

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-26 23:29:56 -04:00
Kai Krakow
f053e0e1a7 beesd: Fix the wrapper not finding any config file
`grep -q something | grep -q something_else` will never find anything.
The for-loop is redundant anyways because `grep -l` can already work for
us. Let's replace this with a shorter and working version.

CC: Timofey Titovets <timofey.titovets@synesis.ru>
(fixes: commit 06d41fd "Rewrite beesd arg parser")
Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-16 17:56:31 -04:00
Zygo Blaxell
bcfc3cf08b Merge https://github.com/Zygo/bees/pull/62 2018-09-15 00:09:46 -04:00
Zygo Blaxell
9dbe2d6fee bees: add -G/--thread-min option for minimum thread count
The -g option limits the number of worker threads when the target load
average is exceeded.  On some systems the load normally runs high, and
continuous bees operation is required to avoid running out of disk space.

Add a -G/--thread-min option to force at least some threads to continue
running.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:07 -04:00
Zygo Blaxell
dd3c32a43d README: spell 'available' correctly
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:07 -04:00
Zygo Blaxell
3d536ea6df roots: if queue is full run again
The task queue may already be full of tasks when the crawl task is
executed.  In this case simply reschedule the crawl task at the
end of the current queue.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:06 -04:00
Zygo Blaxell
e66086516f bees: dynamic thread pool size based on system load average
Add -g / --loadavg-target parameter to track system load and add or
remove bees worker threads dynamically to keep system load close to the
loadavg target.  Thread count may vary from zero to the maximum
specified by -c or -C, and is adjusted every 5 seconds.

This is better than implementing a similar load average scheme from
outside of the process (though that is still possible) because the
in-process load tracker does not disrupt the performance timing feedback
mechanisms as a freezer cgroup or SIGSTOP would when controlling bees
from outside.  The internal load average tracker can also adjust the
number of active threads while an external tracker can only choose from
the maximum or zero.

Also fix a bug where a Task could deadlock waiting for itself to exit
if it tries to insert a new Task after the number of worker threads has
been set to zero.

Also correct usage message for --scan-mode (values are 0..2) since
we are touching adjacent lines anyway.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:03 -04:00
Zygo Blaxell
96eb100ded bees: use readahead instead of posix_fadvise
Other btrfs utils use readahead() not posix_fadvise().

There does not appear to be a performance or correctness difference
between the three (none, posix_fadvise, or readahead()).

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
041ad717a5 bees: configurable log verbosity
Log messages were already labelled with log levels, but there was no
way to filter by log level at run time.

Implement the filter inside the bees process so it can skip evaluation
of the BEESLOG* arguments if the log messages would not be emitted.

Fixes: https://github.com/Zygo/bees/issues/67

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
b22db12390 context: log dedups with single unbroken log message
When BEESLOGINFO is called multiple times it generates separate log
records that can be mixed up when multiple threads dedup.

Use a single BEESLOGINFO call for each dedup to prevent this.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:50:00 -04:00
Zygo Blaxell
8938caa029 README.md: update build-deps
btrfs/ioctl.h has been moved to a different package.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:57 -04:00
Zygo Blaxell
8bc4bee8a3 crucible: progress: drop the set() method
set() was broken and redundant.  Calling hold() and discarding the
returned object has the correct effect.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:54 -04:00
Zygo Blaxell
1beb61fb78 crucible: error: record location of exception in what() message
Make the log show where the exception is thrown from.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2018-09-14 23:49:51 -04:00
Timofey Titovets
06d41fd518 Rewrite beesd arg parser
Signed-off-by: Timofey Titovets <timofey.titovets@synesis.ru>
2018-09-15 00:21:06 +03:00
Kai Krakow
788774731b Gentoo: Rework Gentoo ebuild into overlay
This commit squashes all the little changes from the previous
integration branch into one, adjusts to the new Makefile changes, and
introduces an overlay layout so that the contrib/gentoo-bees subtree
can be directly added as a Portage overlay to the system.

The following list contains the previous commit descriptions:

sys-fs/bees: Keyword tested architecture ~amd64

    Bees was tested on this platform.

sys-fs/bees: Add kernel version checks

    Add checking the kernel versions and write some info and/or warnings
    before building and installing the package. Running bees on older
    kernels may have some serious performance and stability impacts, let's
    tell the user about it.

    Closes #55

sys-fs/bees: Add metadata.xml

sys-fs/bees: There's no configure script

    So, there's no point in calling "default".

sys-fs/bees: Simplify src_configure()

sys-fs/bees: Don't depend on markdown

    It makes no sense to install both README.md and README.html, and we can
    get rid of one dependency.

Dependencies: btrfs-progs is no longer a buildtime-only dep

    It is actually needed by the bees service wrapper script, as pointed out
    by Gentoo QA review.

sys-fs/bees: DOCS is not needed

    "COPYING" is already covered by the licensing. The ebuild defaults
    already include README*

sys-fs/bees: Make warnings exclusive

    It was recommended by Gentoo QA to show only either one or another
    warning, and change the texts accordingly.

sys-fs/bees: RDEPEND is not implicit

    RDEPEND does not implicitly default to DEPEND. Let's explicitly set the
    variable.

sys-fs/bees: IUSE=test is only needed for explicit dependencies

    Thus, remove it.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 05:06:39 +02:00
Kai Krakow
679a327ac5 Makefile: Do not force optimizations by default
Make life easier for package maintainers by not forcing architecture or
compiler optimizations by default. E.g., Gentoo QA refuses to accept
both "-march=native" and "-O3". These are usually provided by the
package tooling.

Instead, we provide easily accessible templates in "makeflags".

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 04:05:15 +02:00
Kai Krakow
31b41bb3c2 Makefile: Do not force making README.html
This forces us to depend on markdown which would be otherwise optional.
Most of the time it is sufficient to let package managers just install
the README.md file.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:34:48 +02:00
Kai Krakow
d7e235c178 Makefile: "which" is not portable
It was pointed out by Gentoo QA that "type -P" is a better choice.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:14:18 +02:00
Kai Krakow
51108f839d Makefile: Due to VPATH, libcrucible links to hard-coded libuuid path
Due to VPATH and how make resolves source paths, libcrucible.so ends up
with a hard-coded path to link against libuuid.so. Let's fix it by
turning the general rule into an explicit rule for libcrucible.so.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 03:07:20 +02:00
Kai Krakow
8d102abf8b Makefile: create a template compiler
This creates a simple template compiler using sed in as a reusable
variable.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 02:59:54 +02:00
Kai Krakow
83e8f87dc9 Scripts: Don't prefix timestamps when running with systemd
Since systemd prefix it's own timestamps, we can unconditionally remove
timestamps when bees is executed by systemd.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 02:59:54 +02:00
Kai Krakow
4417b18d9e Makefile: .version.o is made from a generated file
We should probably not put it into the objects list. Let's instead
explicitly put it as a depend of libcrucible.so.

This allows us to not use *.cc as a depend for .version.cc which makes
more sense as CRUCIBLE_OBJS is also explicitly defined and not built
from wildcards.

Signed-off-by: Kai Krakow <kai@kaishome.de>
2018-09-08 02:59:54 +02:00