mirror of https://github.com/Zygo/bees.git synced 2025-08-02 13:53:28 +02:00

Zygo Blaxell
21ae937201 roots: reimplement transid_max_nocache using extent tree root
Commit 9a97699dd9 upstream.

This commit accidentally fixes a bug where we call btrfs_get_root_transid
with BTRFS_FS_TREE_OBJECTID instead of m_ctx->root_fd().  This leads
to storms of messages like this:

	crawl_transid[5334]: exception type std::system_error: BTRFS_IOC_INO_LOOKUP: rv = readlink(path.c_str(), buf, size + 1): No such file or directory at fs.cc:430: No such file or directory

The code was working before because BTRFS_FS_TREE_OBJECTID == 5.
bees is constantly opening files, and the Linux kernel fills in unused
fd numbers starting from 0, so it's quite likely that the process has fd
5 open to some existing file somewhere on the target btrfs filesystem
most of the time.  If fd 5 is closed, or if it is open to an orphan
file (one without an existing name), the ioctl in btrfs_get_root_id
(called by btrfs_get_root_transid) will fail and throw an exception.
The exception breaks out of the crawl_transid task before it can do any
scanning work, so bees will stop deduping until fd 5 is open again with
an existing file.  This can only happen if other threads are opening
files, so if bees is idle at the instant when this failure occurs,
it will never dedupe again until the process is terminated and restarted.
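
The fd-numbering behaviour described above can be illustrated with a small sketch. It is not bees code: the function name lowest_fd_is_reused is hypothetical, and /dev/null stands in for files on the target filesystem. It relies only on the POSIX rule that open() returns the lowest-numbered unused descriptor, which is why a small constant such as 5 so often happens to name a live fd in a busy process.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns true if an fd number that was just closed
// is handed out again by the next open() call.
bool lowest_fd_is_reused()
{
	// Open three descriptors; each gets the lowest free number at that time.
	int a = open("/dev/null", O_RDONLY);
	int b = open("/dev/null", O_RDONLY);
	int c = open("/dev/null", O_RDONLY);
	if (a < 0 || b < 0 || c < 0) {
		if (a >= 0) close(a);
		if (b >= 0) close(b);
		if (c >= 0) close(c);
		return false;
	}

	// Free the middle one.  In a single-threaded process, b is now the
	// lowest unused number, because everything below it was already in
	// use when b was allocated.
	close(b);

	// The next open() fills the hole.
	int d = open("/dev/null", O_RDONLY);
	bool reused = (d == b);

	close(a);
	close(c);
	if (d >= 0) close(d);
	return reused;
}
```

In a multi-threaded process another thread may take the freed number first, which matches the description above: whether fd 5 refers to a usable file at any instant depends on what the other threads have been opening and closing.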

The remainder is the original commit message:

ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE.  Every modification
(that we care about) to a btrfs must go through EXTENT_TREE, and must
modify the page in ROOT_TREE pointing to the root of EXTENT_TREE...
which makes that a very good source for the filesystem transid.

Remove the loop and the root lookups, and just look at one item for
max_transid.
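
As a rough illustration of the single-item lookup described above, the sketch below asks the kernel for the ROOT_ITEM of EXTENT_TREE in ROOT_TREE using the raw BTRFS_IOC_TREE_SEARCH ioctl. It is not the bees implementation, which goes through its own search-key wrapper. The function name extent_root_generation is hypothetical, and reading the generation field of the root item as the transid is an assumption based on the description above. The ioctl normally requires CAP_SYS_ADMIN and an fd open on a btrfs filesystem.

```cpp
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>
#include <sys/ioctl.h>
#include <endian.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: look up the one ROOT_ITEM for EXTENT_TREE and
// report its generation.  Returns false if the ioctl fails or the item
// is not found.
bool extent_root_generation(int fs_fd, uint64_t &generation_out)
{
	btrfs_ioctl_search_args args;
	memset(&args, 0, sizeof(args));

	// Search ROOT_TREE for exactly (EXTENT_TREE, ROOT_ITEM, any offset).
	args.key.tree_id = BTRFS_ROOT_TREE_OBJECTID;
	args.key.min_objectid = BTRFS_EXTENT_TREE_OBJECTID;
	args.key.max_objectid = BTRFS_EXTENT_TREE_OBJECTID;
	args.key.min_type = BTRFS_ROOT_ITEM_KEY;
	args.key.max_type = BTRFS_ROOT_ITEM_KEY;
	args.key.max_offset = UINT64_MAX;
	args.key.max_transid = UINT64_MAX;
	args.key.nr_items = 1;

	if (ioctl(fs_fd, BTRFS_IOC_TREE_SEARCH, &args) < 0) {
		return false;
	}
	// On return, nr_items holds the number of items actually found.
	if (args.key.nr_items < 1) {
		return false;
	}

	// Each result is a search header followed by the item payload.
	btrfs_ioctl_search_header hdr;
	memcpy(&hdr, args.buf, sizeof(hdr));
	const size_t gen_off = offsetof(btrfs_root_item, generation);
	if (hdr.objectid != BTRFS_EXTENT_TREE_OBJECTID ||
	    hdr.type != BTRFS_ROOT_ITEM_KEY ||
	    hdr.len < gen_off + sizeof(uint64_t)) {
		return false;
	}

	// On-disk fields are little-endian; copy out and convert.
	uint64_t raw;
	memcpy(&raw, args.buf + sizeof(hdr) + gen_off, sizeof(raw));
	generation_out = le64toh(raw);
	return true;
}
```

Because only one key can match, there is no loop over roots and no per-root lookup that could fail partway through.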

Also note that every caller of transid_max_nocache() immediately
feeds the return value to m_transid_re.update(), so don't do that
inside transid_max_nocache().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2020-12-23 17:27:44 -05:00


@@ -219,7 +219,6 @@ BeesRoots::transid_max_nocache()
while (true) {
sk.nr_items = 1024;
BEESTRACE("transid_max search sk " << sk);
sk.do_ioctl(m_ctx->root_fd());
if (sk.m_result.empty()) {
@@ -416,15 +415,13 @@ BeesRoots::crawl_thread()
BEESNOTE("tracking transid");
auto last_count = m_transid_re.count();
while (true) {
BEESTRACE("Measure current transid");
// Measure current transid
catch_all([&]() {
BEESTRACE("calling transid_max_nocache");
m_transid_re.update(transid_max_nocache());
});
BEESTRACE("Make sure we have a full complement of crawlers");
// Make sure we have a full complement of crawlers
catch_all([&]() {
BEESTRACE("calling insert_new_crawl");
insert_new_crawl();
});
@@ -492,24 +489,19 @@ BeesRoots::insert_new_crawl()
unique_lock<mutex> lock(m_mutex);
set<uint64_t> excess_roots;
for (auto i : m_root_crawl_map) {
BEESTRACE("excess_roots.insert(" << i.first << ")");
excess_roots.insert(i.first);
}
lock.unlock();
while (new_bcs.m_root) {
BEESTRACE("excess_roots.erase(" << new_bcs.m_root << ")");
excess_roots.erase(new_bcs.m_root);
BEESTRACE("insert_root(" << new_bcs << ")");
insert_root(new_bcs);
BEESCOUNT(crawl_create);
BEESTRACE("next_root(" << new_bcs.m_root << ")");
new_bcs.m_root = next_root(new_bcs.m_root);
}
for (auto i : excess_roots) {
new_bcs.m_root = i;
BEESTRACE("crawl_state_erase(" << new_bcs << ")");
crawl_state_erase(new_bcs);
}
}