"saved" is used only during hash table correctness analysis, which is
normally not enabled at compile time, and requires source modification
to enable.
Remove the pointless copy and save a tiny bit of CPU.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The 16MB hash table extent size did not serve any useful defragmentation
or compression purpose, and for very small filesystems (under 100GB),
16MB is much larger than necessary.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
systemd-coredump collects core files for later analysis with gdb
(via coredumpctl). This is convenient if the keys you use to encrypt
/var/lib/systemd/coredump are the same as the keys you use to encrypt
the filesystem where you're running bees.
Add it to the documentation just before the hand-rolled version.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Standard crash backtrace collection, plus $BEESSTATUS for the high-level
overview of what bees is doing.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Split the rather large README into smaller sections with a pitch and
a ToC at the top.
Move the sections into docs/ so that Github Pages can read them.
'make doc' produces a local HTML tree.
Update the kernel bugs and gotchas list.
Add some information that has been accumulating in Github comments.
Remove information about bugs in kernels earlier than 4.14.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When package maintainers build from a tarball, the .git directory does
not exist to extract the version tag. Let's add a hack to work around
this issue and let them specify `BEES_VERSION="v0.y"` on the make
cmdline.
Github-Bug: https://github.com/Zygo/bees/issues/75
Signed-off-by: Kai Krakow <kai@kaishome.de>
Gentoo has officially merged the ebuild into portage as of:
https://github.com/gentoo/gentoo/pull/9925
Let's update the readme and get rid of the `contrib/gentoo-bees`
directory, so we have no potentially outdated information in the future.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Now that the packaging preparations were merged, we should update the
ebuild to reflect the upstream master branch.
Signed-off-by: Kai Krakow <kai@kaishome.de>
ExtentWalker doesn't gain significant benefits from caching, and the
extra SEARCH_V2 ioctls were blamed for a 33% kernel CPU overhead by perf.
Reduce the number of extents to 16 in lieu of fixing the caching.
This gives a significant speed boost on CPU-bound workloads compared
to the original 1024--almost 40% faster on a single SSD with a filesystem
consisting of raw VM images mounted with compress=zstd.
This also seems to reduce LOGICAL_INO overhead. Perhaps SEARCH_V2 and
LOGICAL_INO were trying to lock the same extents, and interfering with
each other?
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
`grep -q something | grep -q something_else` will never find anything
because the first `grep -q` prints nothing for the second one to read.
The for-loop is redundant anyway because `grep -l` already does the
work for us. Let's replace this with a shorter, working version.
CC: Timofey Titovets <timofey.titovets@synesis.ru>
(fixes: commit 06d41fd "Rewrite beesd arg parser")
Signed-off-by: Kai Krakow <kai@kaishome.de>
The -g option limits the number of worker threads when the target load
average is exceeded. On some systems the load normally runs high, and
continuous bees operation is required to avoid running out of disk space.
Add a -G/--thread-min option to force at least some threads to continue
running.
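A minimal sketch of the floor that -G imposes, assuming the load tracker
has already proposed a thread count. The name `clamp_thread_count` and
its parameters are hypothetical and are not taken from the bees source.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: keep the proposed worker count within
// [thread_min, thread_max]. thread_min stands for -G/--thread-min and
// thread_max for the limit set by -c or -C.
std::size_t clamp_thread_count(std::size_t proposed,
                               std::size_t thread_min,
                               std::size_t thread_max)
{
    // Assumption of this sketch: if the floor exceeds the ceiling,
    // the ceiling wins.
    const std::size_t lower = std::min(thread_min, thread_max);
    return std::max(lower, std::min(proposed, thread_max));
}
```

With a floor of 2 and a ceiling of 8, a proposal of 0 threads becomes 2,
so some work continues even while the load average stays high.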
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The task queue may already be full of tasks when the crawl task is
executed. In this case simply reschedule the crawl task at the
end of the current queue.
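A rough sketch of the rescheduling idea, assuming a simple FIFO of
callables. `Task`, `run_or_requeue`, and `queue_limit` are hypothetical
names for illustration and do not reflect the actual bees Task API.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

using Task = std::function<void()>;

// Hypothetical: run the crawl task only when the queue has room.
// If the queue is already full, put the crawl task back at the end so
// the queued work drains before more is generated.
// Returns true if the crawl task ran, false if it was rescheduled.
bool run_or_requeue(std::deque<Task> &queue, Task crawl, std::size_t queue_limit)
{
    if (queue.size() >= queue_limit) {
        queue.push_back(std::move(crawl));
        return false;
    }
    crawl();
    return true;
}
```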
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Add -g / --loadavg-target parameter to track system load and add or
remove bees worker threads dynamically to keep system load close to the
loadavg target. Thread count may vary from zero to the maximum
specified by -c or -C, and is adjusted every 5 seconds.
This is better than implementing a similar load average scheme from
outside of the process (though that is still possible) because the
in-process load tracker does not disrupt the performance timing feedback
mechanisms as a freezer cgroup or SIGSTOP would when controlling bees
from outside. The internal load average tracker can also adjust the
number of active threads while an external tracker can only choose from
the maximum or zero.
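The adjustment described above could look roughly like the following.
This is a sketch under assumptions: the function name, the parameters,
and the one-thread-per-step policy are hypothetical, and the real
tracker may use a different step size or smoothing.

```cpp
#include <cstddef>

// Hypothetical adjustment step, intended to be called periodically
// (the commit uses a 5 second interval). Returns the new worker thread
// count for the given current count and measured load average.
std::size_t adjust_thread_count(std::size_t current,
                                double loadavg,
                                double target,
                                std::size_t thread_max)
{
    if (loadavg > target && current > 0) {
        return current - 1;   // busier than the target: shed one worker
    }
    if (loadavg < target && current < thread_max) {
        return current + 1;   // headroom available: add one worker
    }
    return current;           // on target, or already at a limit
}
```

Because the count moves in small steps between zero and the maximum, the
process can settle at an intermediate thread count, which an external
all-or-nothing controller cannot do.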
Also fix a bug where a Task could deadlock waiting for itself to exit
if it tries to insert a new Task after the number of worker threads has
been set to zero.
Also correct usage message for --scan-mode (values are 0..2) since
we are touching adjacent lines anyway.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Other btrfs utils use readahead() not posix_fadvise().
There does not appear to be a performance or correctness difference
between the three options (none, posix_fadvise(), or readahead()).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Log messages were already labelled with log levels, but there was no
way to filter by log level at run time.
Implement the filter inside the bees process so it can skip evaluation
of the BEESLOG* arguments if the log messages would not be emitted.
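A minimal sketch of how an in-process filter can skip argument
evaluation, assuming syslog-style numeric levels. `SKETCH_LOG`,
`g_log_level`, and `expensive_value` are hypothetical stand-ins and are
not the actual BEESLOG* macros.

```cpp
#include <iostream>
#include <sstream>

// Hypothetical run-time threshold; lower numbers are more severe,
// following the syslog convention (0 = emergency, 7 = debug).
static int g_log_level = 6;

// Hypothetical stand-in for a BEESLOG*-style macro. The stream
// expression `x` sits inside the `if`, so it is evaluated only when
// the message passes the filter.
#define SKETCH_LOG(lv, x) do { \
    if ((lv) <= g_log_level) { \
        std::ostringstream sketch_oss; \
        sketch_oss << x; \
        std::cerr << "<" << (lv) << ">" << sketch_oss.str() << std::endl; \
    } \
} while (0)

// Counts how often the expensive argument is actually evaluated.
static int g_expensive_calls = 0;
int expensive_value() { ++g_expensive_calls; return 42; }
```

With `g_log_level` set to 3, `SKETCH_LOG(7, "debug " << expensive_value())`
never calls `expensive_value()`, which a filter outside the process
could not achieve.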
Fixes: https://github.com/Zygo/bees/issues/67
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
When BEESLOGINFO is called multiple times it generates separate log
records that can be mixed up when multiple threads dedup.
Use a single BEESLOGINFO call for each dedup to prevent this.
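As an illustration, the whole dedup report can be formatted into one
string before logging. `format_dedup_record` and the message layout are
hypothetical, not the format bees actually emits.

```cpp
#include <sstream>
#include <string>

// Hypothetical formatter: build the complete dedup report as one record
// so that it can be handed to a single log call instead of several.
std::string format_dedup_record(const std::string &src, const std::string &dst)
{
    std::ostringstream oss;
    oss << "dedup: src " << src << " dst " << dst;
    return oss.str();
}
```

The result would then be passed to one BEESLOGINFO call, so records from
other threads cannot land between the src and dst parts.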
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
set() was broken and redundant. Calling hold() and discarding the
returned object has the correct effect.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This commit squashes all the little changes from the previous
integration branch into one, adjusts to the new Makefile changes, and
introduces an overlay layout so that the contrib/gentoo-bees subtree
can be directly added as a Portage overlay to the system.
The following list contains the previous commit descriptions:
sys-fs/bees: Keyword tested architecture ~amd64
Bees was tested on this platform.
sys-fs/bees: Add kernel version checks
Check the kernel version and print some info and/or warnings
before building and installing the package. Running bees on older
kernels may have serious performance and stability impacts, so let's
tell the user about it.
Closes #55
sys-fs/bees: Add metadata.xml
sys-fs/bees: There's no configure script
So, there's no point in calling "default".
sys-fs/bees: Simplify src_configure()
sys-fs/bees: Don't depend on markdown
It makes no sense to install both README.md and README.html, and we can
get rid of one dependency.
Dependencies: btrfs-progs is no longer a buildtime-only dep
It is actually needed by the bees service wrapper script, as pointed out
by Gentoo QA review.
sys-fs/bees: DOCS is not needed
"COPYING" is already covered by the licensing. The ebuild defaults
already include README*
sys-fs/bees: Make warnings exclusive
Gentoo QA recommended showing only one warning or the other, and
changing the texts accordingly.
sys-fs/bees: RDEPEND is not implicit
RDEPEND does not implicitly default to DEPEND. Let's explicitly set the
variable.
sys-fs/bees: IUSE=test is only needed for explicit dependencies
Thus, remove it.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Make life easier for package maintainers by not forcing architecture or
compiler optimizations by default. E.g., Gentoo QA refuses to accept
both "-march=native" and "-O3". These are usually provided by the
package tooling.
Instead, we provide easily accessible templates in "makeflags".
Signed-off-by: Kai Krakow <kai@kaishome.de>
This forces us to depend on markdown, which would otherwise be optional.
Most of the time it is sufficient to let package managers just install
the README.md file.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Due to VPATH and how make resolves source paths, libcrucible.so ends up
with a hard-coded path to link against libuuid.so. Let's fix it by
turning the general rule into an explicit rule for libcrucible.so.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Since systemd prefixes its own timestamps, we can unconditionally remove
timestamps when bees is executed by systemd.
Signed-off-by: Kai Krakow <kai@kaishome.de>
We should probably not put it into the objects list. Let's instead
explicitly make it a dependency of libcrucible.so.
This allows us to not use *.cc as a dependency of .version.cc, which
makes more sense as CRUCIBLE_OBJS is also explicitly defined and not
built from wildcards.
Signed-off-by: Kai Krakow <kai@kaishome.de>
This commit adds support for putting package configuration options into
header files. This is needed to prepare reading config files from /etc.
Signed-off-by: Kai Krakow <kai@kaishome.de>
This commit removes USR_PREFIX and introduces ETC_PREFIX instead. The
purpose of PREFIX is the installation prefix in the system, not the
installation destination. The latter one is what DESTDIR is used for.
This should clear up the confusion. PREFIX was already misused as the
installation destination, but that doesn't mix well with how the make
targets are designed.
CC: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Kai Krakow <kai@kaishome.de>
There's now a new make target called "install_tools" which does not run
by default on installation.
One can add "OPTIONAL_INSTALL_TARGETS=install_tools" to localconf to
install these by default.
fiewalk is installed to sbin since only root can run it; the other tool
goes to bin.
Gentoo can use this to optionally install these tools as a package
feature.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Instead, introduce "make reallyall" and make it the default target. Now,
one can override the default target using localconf.
Needed for preparing Gentoo ebuild test behavior.
Signed-off-by: Kai Krakow <kai@kaishome.de>
Also split "bad feature interactions" into "unknown" (which is what it
really was before) and "bad" (which includes some filesystem-destroying
problems).
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Linux kernel 4.14, while resistant to extent toxicity, is not immune to it.
Go back to the paranoid setting to avoid tying up filesystems in
ridiculously long kernel loops in find_parent_nodes.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
The memset is just doing an assignment from one dereferenced pointer to
another, so do an assignment to keep GCC 8 happy.
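A small sketch of the kind of change described, assuming the GCC 8
complaint is -Wclass-memaccess, which can warn when raw memory functions
are applied to class types. `Cell` and `copy_cell` are hypothetical and
are not the types used in bees.

```cpp
#include <cstdint>

// Hypothetical cell type with a user-provided constructor. Under the
// assumption above, GCC 8 can warn when a raw memory function writes
// over an object like this.
struct Cell {
    std::uint64_t hash;
    std::uint64_t addr;
    Cell() : hash(0), addr(0) {}
};

// Copy through plain assignment instead of a raw memory call. The
// compiler generates the same member-wise copy without the warning.
void copy_cell(Cell *dst, const Cell *src)
{
    *dst = *src;
}
```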
Fixes: https://github.com/Zygo/bees/issues/64
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
An empty BeesBlockData from the chasing algorithm used to mean that data
was found at the expected location but it does not match; however, there
are now other reasons for this and they occur much more often. The name
is misleading.
Change the name to report more correctly what happens: no data, without
any guess about the reason.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>