mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00

roots: reimplement transid_max_nocache using extent tree root

Commit 9a97699dd9045715d9943cf98ca5573708eb1a53 upstream.

This commit accidentally fixes a bug where we call btrfs_get_root_transid
with BTRFS_FS_TREE_OBJECTID instead of m_ctx->root_fd().  This leads
to storms of messages like this:

	crawl_transid[5334]: exception type std::system_error: BTRFS_IOC_INO_LOOKUP: rv = readlink(path.c_str(), buf, size + 1): No such file or directory at fs.cc:430: No such file or directory

The code was working before because BTRFS_FS_TREE_OBJECTID == 5.
bees is constantly opening files, and the Linux kernel fills in unused
fd numbers starting from 0, so it's quite likely that the process has fd
5 open to some existing file somewhere on the target btrfs filesystem
most of the time.  If fd 5 is closed, or if it is open to an orphan
file (one without an existing name), the ioctl in btrfs_get_root_id
(called by btrfs_get_root_transid) will fail and throw an exception.
The exception breaks out of the crawl_transid task before it can do any
scanning work, so bees will stop deduping until fd 5 is open again with
an existing file.  This can only happen if other threads are opening
files, so if bees is idle at the instant when this failure occurs,
it will never dedupe again until the process is terminated and restarted.
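
The fd-numbering behaviour described above can be checked in isolation. The following is a minimal sketch, not bees code: the helper names `lowest_fd_is_reused` and `fd_is_open` are hypothetical, and it assumes a single-threaded POSIX process with a readable /dev/null. It shows that open() hands out the lowest unused descriptor number, which is why an integer constant such as 5 can happen to name a live fd.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: true if `fd` currently names an open descriptor.
// fcntl(F_GETFD) fails with EBADF when the descriptor is closed.
bool fd_is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}

// Hypothetical helper: true if closing a descriptor and opening another
// file hands back the same number (POSIX picks the lowest unused fd).
// Assumes no other thread opens or closes files in between.
bool lowest_fd_is_reused() {
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);
    if (a < 0 || b < 0) {
        return false;
    }
    close(a);                             // `a` is now the lowest free number
    int c = open("/dev/null", O_RDONLY);  // so this open gets it back
    bool reused = (c == a);
    close(b);
    close(c);
    return reused;
}
```

Calling `lowest_fd_is_reused()` from a single-threaded program should return true. In a busy multi-threaded process like bees, which thread ends up holding a given number is a matter of timing, which matches the intermittent failures described above.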

The remainder is the original commit message:

ROOT_TREE contains the ROOT_ITEM for EXTENT_TREE.  Every modification
(that we care about) to a btrfs must go through EXTENT_TREE, and must
modify the page in ROOT_TREE pointing to the root of EXTENT_TREE...
which makes that a very good source for the filesystem transid.

Remove the loop and the root lookups, and just look at one item for
max_transid.
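
As a rough illustration of the reduced logic, here is a self-contained sketch that works on an in-memory list instead of a real tree search. `SearchItem` and `highest_transid` are hypothetical stand-ins, not the bees types. The two constants mirror the btrfs values for BTRFS_EXTENT_TREE_OBJECTID (2) and BTRFS_ROOT_ITEM_KEY (132) but are redefined locally so the example builds without kernel headers.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Local copies of the btrfs constants, so no kernel headers are needed.
constexpr uint64_t EXTENT_TREE_OBJECTID = 2;
constexpr uint32_t ROOT_ITEM_KEY = 132;

// Hypothetical stand-in for one search result header (not the bees type).
struct SearchItem {
    uint64_t objectid;
    uint32_t type;
    uint64_t transid;
};

// Return the highest transid among the items. With a search restricted to
// the extent tree's ROOT_ITEM there is normally only one item to look at.
uint64_t highest_transid(const std::vector<SearchItem> &items) {
    uint64_t rv = 0;
    for (const auto &i : items) {
        if (i.transid > rv) {
            rv = i.transid;
        }
    }
    // A transid of zero means the search found nothing usable.
    if (rv == 0) {
        throw std::runtime_error("transid must be greater than zero");
    }
    return rv;
}
```

Under these assumptions, passing a single item `{EXTENT_TREE_OBJECTID, ROOT_ITEM_KEY, 57}` returns 57, and an empty result throws rather than reporting a transid of zero.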

Also note that every caller of transid_max_nocache() immediately
feeds the return value to m_transid_re.update(), so don't do that
inside transid_max_nocache().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Zygo Blaxell 2018-10-30 23:31:11 -04:00
parent 7283126e5c
commit 21ae937201


@@ -207,19 +207,15 @@ uint64_t
BeesRoots::transid_max_nocache()
{
uint64_t rv = 0;
uint64_t root = BTRFS_FS_TREE_OBJECTID;
BEESNOTE("Calculating transid_max (" << rv << " as of root " << root << ")");
BEESTRACE("Calculating transid_max...");
rv = btrfs_get_root_transid(root);
// XXX: Do we need any of this? Or is
// m_transid_re.update(btrfs_get_root_transid(BTRFS_FS_TREE_OBJECTID)) good enough?
BEESNOTE("Calculating transid_max");
BEESTRACE("Calculating transid_max");
// We look for the root of the extent tree and read its transid.
// Should run in O(1) time and be fairly reliable.
BtrfsIoctlSearchKey sk;
sk.tree_id = BTRFS_ROOT_TREE_OBJECTID;
sk.min_type = sk.max_type = BTRFS_ROOT_BACKREF_KEY;
sk.min_objectid = root;
sk.min_type = sk.max_type = BTRFS_ROOT_ITEM_KEY;
sk.min_objectid = sk.max_objectid = BTRFS_EXTENT_TREE_OBJECTID;
while (true) {
sk.nr_items = 1024;
@@ -229,21 +225,18 @@ BeesRoots::transid_max_nocache()
break;
}
// We are just looking for the highest transid on the filesystem.
// We don't care which object it comes from.
for (auto i : sk.m_result) {
sk.next_min(i);
if (i.type == BTRFS_ROOT_BACKREF_KEY) {
if (i.transid > rv) {
BEESLOGDEBUG("transid_max root " << i.objectid << " parent " << i.offset << " transid " << i.transid);
BEESCOUNT(transid_max_miss);
}
root = i.objectid;
}
if (i.transid > rv) {
rv = i.transid;
}
}
}
m_transid_re.update(rv);
// transid must be greater than zero, or we did something very wrong
THROW_CHECK1(runtime_error, rv, rv > 0);
return rv;
}