mirror of https://github.com/Zygo/bees.git synced 2025-08-02 05:43:29 +02:00

11 Commits

Author SHA1 Message Date
Zygo Blaxell
a466ccf2f1 build: include localconf everywhere
Overriding makeflags did not work from localconf in the src, lib, or
test directories.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:52:45 -05:00
Zygo Blaxell
ba04fe1349 roots: make it build with clang
Remove an unnecessary cast that was breaking namespace lookup for clang.

Closes: #159

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:48 -05:00
Zygo Blaxell
830df63d4c chatter: make it build with clang
Silence the unused variable warning.  The compiler is correct, but we
may implement line-level debug at some point in the future, so we
want to keep the member and parameters.
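
The idiom described above can be sketched roughly as follows. This is a minimal illustration, assuming a compiler that warns about unused private fields (clang's `-Wunused-private-field`); `Tracer` and its members are hypothetical names, not the crucible class touched by this commit.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a logging class that stores a source line
// number it does not use yet.
class Tracer {
	std::string m_file;
	int m_line;	// kept for a possible future line-level debug feature
public:
	Tracer(std::string file, int line) :
		m_file(std::move(file)),
		m_line(line)
	{
		// Reference the member once in a void context so the
		// compiler no longer reports it as unused.
		(void)m_line;
	}

	const std::string &file() const { return m_file; }
};
```

The cast to `void` has no runtime effect; it only records that leaving the member unused is intentional.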

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:42 -05:00
Zygo Blaxell
20c9d2ff6a clang: fix struct/class declaration/definition mismatches
clang does not like a defined class to be declared as a struct.
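
A small sketch of the kind of consistency clang checks for (its `-Wmismatched-tags` warning), assuming hypothetical `Counter` and `Stats` types that are not part of bees:

```cpp
// The forward declaration uses the same class-key ("struct") as the
// definition further down.
struct Stats;

class Counter {
	int m_hits = 0;
	// The friend declaration also says "struct", matching the definition.
	friend struct Stats;
public:
	void hit() { ++m_hits; }
};

struct Stats {
	static int hits(const Counter &c) { return c.m_hits; }
};
```

Writing `class Stats;` or `friend class Stats;` here would still compile, but clang would warn that the tag differs from the definition.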

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:40 -05:00
Zygo Blaxell
7bbb4d14cb bees context: make it build with clang
Remove unused function getenv_or_die.  All of our environment variable
parameters are optional or have default values.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:38 -05:00
Zygo Blaxell
363c45b8cd bees: make it build with clang
Remove unused "addr check" functions.  We have ranged_cast for detecting
overflow bits.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:35 -05:00
Zygo Blaxell
4ec2b8ac16 task: make it build with clang
Remove unused closure captures.
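
As an illustration of the rule clang enforces with `-Wunused-lambda-capture`, here is a sketch in which the capture list names only what the lambda body uses. `make_task` is a hypothetical helper, not the `Task` class from the test suite.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical helper: returns a callable that marks slot `c` as done.
std::function<void()>
make_task(std::size_t c, std::vector<bool> &done, std::mutex &mtx)
{
	// Every name in the capture list is used in the body.  Adding a
	// capture the body never touches would trigger the clang warning.
	return [c, &done, &mtx]() {
		std::unique_lock<std::mutex> lock(mtx);
		done.at(c) = true;
	};
}
```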

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:30 -05:00
Zygo Blaxell
26d31225fa extentwalker: make it build with clang
Remove unused MAX_OFFSET.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2021-02-08 12:49:27 -05:00
Zygo Blaxell
21ae937201 roots: reimplement transid_max_nocache using extent tree root
Commit 9a97699dd9 upstream.

This commit accidentally fixes a bug where we call btrfs_get_root_transid
with BTRFS_FS_TREE_OBJECTID instead of m_ctx->root_fd().  This leads
to storms of messages like this:

	crawl_transid[5334]: exception type std::system_error: BTRFS_IOC_INO_LOOKUP: rv = readlink(path.c_str(), buf, size + 1): No such file or directory at fs.cc:430: No such file or directory

The code was working before because BTRFS_FS_TREE_OBJECTID == 5.
bees is constantly opening files, and the Linux kernel fills in unused
fd numbers starting from 0, so it's quite likely that the process has fd
5 open to some existing file somewhere on the target btrfs filesystem
most of the time.  If fd 5 is closed, or if it is open to an orphan
file (one without an existing name), the ioctl in btrfs_get_root_id
(called by btrfs_get_root_transid) will fail and throw an exception.
The exception breaks out of the crawl_transid task before it can do any
scanning work, so bees will stop deduping until fd 5 is open again with
an existing file.  This can only happen if other threads are opening
files, so if bees is idle at the instant when this failure occurs,
it will never dedupe again until the process is terminated and restarted.
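
The coincidence described above depends on the POSIX rule that `open()` returns the lowest-numbered unused descriptor. The following is a minimal sketch of that rule, assuming a POSIX system with `/dev/null`; `fd_is_open` and `occupy_fd` are hypothetical helpers written for this illustration and are not bees functions.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Numeric value of BTRFS_FS_TREE_OBJECTID, per the commit message.
const int FS_TREE_OBJECTID = 5;

// True if `fd` currently names an open descriptor in this process.
bool
fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}

// Open /dev/null until descriptor number `target` is in use.
// Descriptors opened along the way are deliberately left open, which
// imitates a busy process that keeps many files open.
bool
occupy_fd(int target)
{
	while (!fd_is_open(target)) {
		const int fd = open("/dev/null", O_RDONLY);
		if (fd < 0) {
			return false;
		}
		// Lowest-unused-number rule: while `target` is free, the
		// new descriptor can never be numbered above it.
		assert(fd <= target);
	}
	return true;
}
```

Once `occupy_fd(FS_TREE_OBJECTID)` succeeds, any call that mistakes the constant 5 for a descriptor will appear to work, which matches the behaviour the message describes.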

The remainder is the original commit message:

ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE.  Every modification
(that we care about) to a btrfs must go through EXTENT_TREE, and must
modify the page in ROOT_TREE pointing to the root of EXTENT_TREE...
which makes that a very good source for the filesystem transid.

Remove the loop and the root lookups, and just look at one item for
max_transid.

Also note that every caller of transid_max_nocache() immediately
feeds the return value to m_transid_re.update(), so don't do that
inside transid_max_nocache().
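
The selection logic described above can be modelled without touching a filesystem. In this sketch `SearchItem` is a hypothetical in-memory record standing in for one tree search result, and the two constants carry the values of `BTRFS_EXTENT_TREE_OBJECTID` and `BTRFS_ROOT_ITEM_KEY` from the kernel's btrfs headers. No ioctl is issued here.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

const uint64_t EXTENT_TREE_OBJECTID = 2;	// BTRFS_EXTENT_TREE_OBJECTID
const uint8_t ROOT_ITEM_KEY = 132;		// BTRFS_ROOT_ITEM_KEY

// Hypothetical in-memory form of one search result header.
struct SearchItem {
	uint64_t objectid;
	uint8_t type;
	uint64_t transid;
};

// Return the transid of the extent tree's ROOT_ITEM, ignoring all other
// items.  Like the commit, this does not update any rate estimator; the
// caller is expected to do that with the returned value.
uint64_t
transid_max_from_items(const std::vector<SearchItem> &items)
{
	uint64_t rv = 0;
	for (const auto &i : items) {
		if (i.objectid != EXTENT_TREE_OBJECTID || i.type != ROOT_ITEM_KEY) {
			continue;
		}
		if (i.transid > rv) {
			rv = i.transid;
		}
	}
	// A transid of zero means the item was not found.
	if (rv == 0) {
		throw std::runtime_error("extent tree root item not found");
	}
	return rv;
}
```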

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-23 17:27:44 -05:00
Zygo Blaxell
7283126e5c bees: initialize context in the correct order
We cannot use BeesContext::roots() until after
BeesContext::set_root_path() has been called.
Save up the parameter settings until then.
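
The ordering constraint can be illustrated with a small model. `Context`, `Roots`, and `configure` below are hypothetical and much simpler than the real `BeesContext`; the sketch only shows the "parse now, apply later" pattern.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical sub-object that exists only after the root path is set.
struct Roots {
	int scan_mode = 0;
};

class Context {
	std::unique_ptr<Roots> m_roots;
public:
	void set_root_path(const std::string &path)
	{
		if (path.empty()) {
			throw std::invalid_argument("empty root path");
		}
		m_roots.reset(new Roots);
	}

	Roots &roots()
	{
		if (!m_roots) {
			throw std::logic_error("roots() called before set_root_path()");
		}
		return *m_roots;
	}
};

// Parse the option first, remember it in a local variable, and apply it
// only once the context is able to hand out its Roots object.
int
configure(Context &ctx, const std::string &mode_arg, const std::string &path)
{
	const int scan_mode = std::stoi(mode_arg);	// saved, not applied yet
	ctx.set_root_path(path);
	ctx.roots().scan_mode = scan_mode;
	return ctx.roots().scan_mode;
}
```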

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-08-31 22:39:51 -04:00
Zygo Blaxell
ac53e50d3e context: workaround to prevent LOGICAL_INO and btrfs balance from running concurrently
This avoids some kernel bugs.  One of them is fixed in 5.3.4 and later:

	efad8a853a "Btrfs: fix use-after-free when using the tree modification log"

There are apparently others in current kernels, so for now just put bees
on pause until the balance is done.

At some point we may want to provide an option to disable this
workaround; however, running bees and balance at the same time makes
neither particularly fast, so maybe we'll just leave it this way.
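
The pause amounts to a polling loop. Below is a generic sketch in which the balance check is an injected callable, so it can be exercised without a btrfs filesystem; `wait_until_idle` and its parameters are assumptions made for illustration, not the bees implementation.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Sleep in `poll_interval` steps while `balance_running` reports true.
// Returns the number of sleeps performed.
unsigned
wait_until_idle(const std::function<bool()> &balance_running,
	std::chrono::milliseconds poll_interval)
{
	unsigned polls = 0;
	while (balance_running()) {
		std::this_thread::sleep_for(poll_interval);
		++polls;
	}
	return polls;
}
```

In a real caller the predicate would wrap a balance status query, and a failed query would be treated the same as "not running" so that the loop cannot stall on an error.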

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2019-11-28 11:32:30 +01:00
12 changed files with 65 additions and 55 deletions

View File

@@ -13,7 +13,7 @@ namespace crucible {
template <class T>
class ProgressTracker {
class ProgressTrackerState;
struct ProgressTrackerState;
class ProgressHolderState;
public:
using value_type = T;

View File

@@ -20,6 +20,7 @@ CRUCIBLE_OBJS = \
uuid.o \
include ../makeflags
-include ../localconf
include ../Defines.mk
configure.h: configure.h.in

View File

@@ -124,6 +124,7 @@ namespace crucible {
} else if (!chatter_names->empty()) {
cerr << "CRUCIBLE_CHATTER does not list '" << m_file << "' or '" << m_pretty_function << "'" << endl;
}
(void)m_line; // not implemented yet
// cerr << "ChatterBox " << reinterpret_cast<void*>(this) << " constructed" << endl;
}

View File

@@ -14,7 +14,6 @@ namespace crucible {
// fm_start, fm_length, fm_flags, m_extents
// fe_logical, fe_physical, fe_length, fe_flags
static const off_t MAX_OFFSET = numeric_limits<off_t>::max();
static const off_t FIEMAP_BLOCK_SIZE = 4096;
static bool __ew_do_log = getenv("EXTENTWALKER_DEBUG");

View File

@@ -110,9 +110,6 @@ namespace crucible {
}
}
template<>
struct ResourceHandle<Process::id, Process>;
pid_t
gettid()
{

View File

@@ -6,6 +6,7 @@ PROGRAMS = \
all: $(PROGRAMS)
include ../makeflags
-include ../localconf
LIBS = -lcrucible -lpthread
LDFLAGS = -L../lib

View File

@@ -11,17 +11,6 @@
using namespace crucible;
using namespace std;
static inline
const char *
getenv_or_die(const char *name)
{
const char *rv = getenv(name);
if (!rv) {
THROW_ERROR(runtime_error, "Environment variable " << name << " not defined");
}
return rv;
}
BeesFdCache::BeesFdCache()
{
m_root_cache.func([&](shared_ptr<BeesContext> ctx, uint64_t root) -> Fd {
@@ -773,11 +762,42 @@ BeesResolveAddrResult::BeesResolveAddrResult()
{
}
void
BeesContext::wait_for_balance()
{
Timer balance_timer;
BEESNOTE("WORKAROUND: waiting for balance to stop");
while (true) {
btrfs_ioctl_balance_args args;
memset_zero<btrfs_ioctl_balance_args>(&args);
const int ret = ioctl(root_fd(), BTRFS_IOC_BALANCE_PROGRESS, &args);
if (ret < 0) {
// Either can't get balance status or not running, exit either way
break;
}
if (!(args.state & BTRFS_BALANCE_STATE_RUNNING)) {
// Balance not running, doesn't matter if paused or cancelled
break;
}
BEESLOGDEBUG("WORKAROUND: Waiting " << balance_timer << "s for balance to stop");
sleep(BEES_BALANCE_POLL_INTERVAL);
}
}
BeesResolveAddrResult
BeesContext::resolve_addr_uncached(BeesAddress addr)
{
THROW_CHECK1(invalid_argument, addr, !addr.is_magic());
THROW_CHECK0(invalid_argument, !!root_fd());
// Is there a bug where resolve and balance cause a crash (BUG_ON at fs/btrfs/ctree.c:1227)?
// Apparently yes, and more than one.
// Wait for the balance to finish before we run LOGICAL_INO
wait_for_balance();
// Time how long this takes
Timer resolve_timer;
// There is no performance benefit if we restrict the buffer size.

View File

@@ -207,19 +207,15 @@ uint64_t
BeesRoots::transid_max_nocache()
{
uint64_t rv = 0;
uint64_t root = BTRFS_FS_TREE_OBJECTID;
BEESNOTE("Calculating transid_max (" << rv << " as of root " << root << ")");
BEESTRACE("Calculating transid_max...");
rv = btrfs_get_root_transid(root);
// XXX: Do we need any of this? Or is
// m_transid_re.update(btrfs_get_root_transid(BTRFS_FS_TREE_OBJECTID)) good enough?
BEESNOTE("Calculating transid_max");
BEESTRACE("Calculating transid_max");
// We look for the root of the extent tree and read its transid.
// Should run in O(1) time and be fairly reliable.
BtrfsIoctlSearchKey sk;
sk.tree_id = BTRFS_ROOT_TREE_OBJECTID;
sk.min_type = sk.max_type = BTRFS_ROOT_BACKREF_KEY;
sk.min_objectid = root;
sk.min_type = sk.max_type = BTRFS_ROOT_ITEM_KEY;
sk.min_objectid = sk.max_objectid = BTRFS_EXTENT_TREE_OBJECTID;
while (true) {
sk.nr_items = 1024;
@@ -229,21 +225,18 @@ BeesRoots::transid_max_nocache()
break;
}
// We are just looking for the highest transid on the filesystem.
// We don't care which object it comes from.
for (auto i : sk.m_result) {
sk.next_min(i);
if (i.type == BTRFS_ROOT_BACKREF_KEY) {
if (i.transid > rv) {
BEESLOGDEBUG("transid_max root " << i.objectid << " parent " << i.offset << " transid " << i.transid);
BEESCOUNT(transid_max_miss);
}
root = i.objectid;
}
if (i.transid > rv) {
rv = i.transid;
}
}
}
m_transid_re.update(rv);
// transid must be greater than zero, or we did something very wrong
THROW_CHECK1(runtime_error, rv, rv > 0);
return rv;
}
@@ -980,7 +973,7 @@ BeesCrawl::fetch_extents()
// Lock in the old state
set_state(old_state);
BEESTRACE("Searching crawl sk " << static_cast<btrfs_ioctl_search_key&>(sk));
BEESTRACE("Searching crawl sk " << sk);
bool ioctl_ok = false;
{
BEESNOTE("searching crawl sk " << static_cast<btrfs_ioctl_search_key&>(sk));

View File

@@ -204,20 +204,6 @@ BeesNote::get_status()
// static inline helpers ----------------------------------------
static inline
bool
bees_addr_check(uint64_t v)
{
return !(v & (1ULL << 63));
}
static inline
bool
bees_addr_check(int64_t v)
{
return !(v & (1ULL << 63));
}
string
pretty(double d)
{
@@ -667,6 +653,7 @@ bees_main(int argc, char *argv[])
unsigned thread_min = 0;
double load_target = 0;
bool workaround_btrfs_send = false;
BeesRoots::ScanMode root_scan_mode = BeesRoots::SCAN_MODE_ZERO;
// Configure getopt_long
static const struct option long_options[] = {
@@ -735,7 +722,7 @@ bees_main(int argc, char *argv[])
load_target = stod(optarg);
break;
case 'm':
bc->roots()->set_scan_mode(static_cast<BeesRoots::ScanMode>(stoul(optarg)));
root_scan_mode = static_cast<BeesRoots::ScanMode>(stoul(optarg));
break;
case 'p':
crucible::set_relative_path("");
@@ -806,11 +793,16 @@ bees_main(int argc, char *argv[])
BEESLOGNOTICE("setting worker thread pool maximum size to " << thread_count);
TaskMaster::set_thread_count(thread_count);
// Set root path
string root_path = argv[optind++];
BEESLOGNOTICE("setting root path to '" << root_path << "'");
bc->set_root_path(root_path);
// Workaround for btrfs send
bc->roots()->set_workaround_btrfs_send(workaround_btrfs_send);
// Create a context and start crawlers
bc->set_root_path(argv[optind++]);
// Set root scan mode
bc->roots()->set_scan_mode(root_scan_mode);
BeesThread status_thread("status", [&]() {
bc->dump_status();

View File

@@ -117,6 +117,9 @@ const size_t BEES_TRANSID_FACTOR = 10;
// The actual limit in LOGICAL_INO seems to be 2730, but let's leave a little headroom
const size_t BEES_MAX_EXTENT_REF_COUNT = 2560;
// Wait this long for a balance to stop
const double BEES_BALANCE_POLL_INTERVAL = 60.0;
// Flags
const int FLAGS_OPEN_COMMON = O_NOFOLLOW | O_NONBLOCK | O_CLOEXEC | O_NOATIME | O_LARGEFILE | O_NOCTTY;
const int FLAGS_OPEN_DIR = FLAGS_OPEN_COMMON | O_RDONLY | O_DIRECTORY;
@@ -170,7 +173,7 @@ public:
T at(string idx) const;
friend ostream& operator<< <>(ostream &os, const BeesStatTmpl<T> &bs);
friend class BeesStats;
friend struct BeesStats;
};
using BeesRates = BeesStatTmpl<double>;
@@ -716,6 +719,7 @@ class BeesContext : public enable_shared_from_this<BeesContext> {
void set_root_fd(Fd fd);
BeesResolveAddrResult resolve_addr_uncached(BeesAddress addr);
void wait_for_balance();
BeesFileRange scan_one_extent(const BeesFileRange &bfr, const Extent &e);
void rewrite_file_range(const BeesFileRange &bfr);

View File

@@ -14,6 +14,7 @@ test: $(PROGRAMS:%=%.txt) Makefile
FORCE:
include ../makeflags
-include ../localconf
LIBS = -lcrucible -lpthread
LDFLAGS = -L../lib -Wl,-rpath=$(shell realpath ../lib)

View File

@@ -99,7 +99,7 @@ test_barrier(size_t count)
oss << "task #" << c;
Task t(
oss.str(),
[c, &task_done, &mtx, &cv, bl]() mutable {
[c, &task_done, &mtx, bl]() mutable {
// cerr << "Task #" << c << endl;
unique_lock<mutex> lock(mtx);
task_done.at(c) = true;
@@ -166,8 +166,9 @@ test_exclusion(size_t count)
oss << "task #" << c;
Task t(
oss.str(),
[c, &only_one, &mtx, &excl, bl]() mutable {
[c, &only_one, &excl, bl]() mutable {
// cerr << "Task #" << c << endl;
(void)c;
auto lock = excl.try_lock();
if (!lock) {
excl.insert_task(Task::current_task());