From 5a8c655fc447c08772f01107a87e3364f093bb46 Mon Sep 17 00:00:00 2001
From: Zygo Blaxell
Date: Sun, 1 Oct 2017 15:36:08 -0400
Subject: [PATCH] roots: filter out obsolete extents from extent refs

When an extent ref is modified, all of the refs in the same metadata
page get the same transid in the TREE_SEARCH_V2 header. All of the
extents are rescanned by later subvol scans. This causes up to 80%
overhead due to redundant reads of the same extents.

A proper fix for this requires extent-based scanning instead of
extent-ref-based scanning. Until that happens, filter out new
references to old extents.

Signed-off-by: Zygo Blaxell
---
 src/bees-roots.cc | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/bees-roots.cc b/src/bees-roots.cc
index 9661243..be0ba92 100644
--- a/src/bees-roots.cc
+++ b/src/bees-roots.cc
@@ -742,13 +742,24 @@ BeesCrawl::fetch_extents()
 		if (gen < get_state().m_min_transid) {
 			BEESCOUNT(crawl_gen_low);
 			++count_low;
-			// We probably want (need?) to scan these anyway.
-			// continue;
+			// We want (need?) to scan these anyway?
+			// The header generation refers to the transid
+			// of the metadata page holding the current ref.
+			// This includes anything else in that page that
+			// happened to be modified, regardless of how
+			// old it is.
+			// The file_extent_generation refers to the
+			// transid of the extent item's page, which is
+			// a different approximation of what we want.
+			// Combine both of these filters to minimize
+			// the number of times we unnecessarily re-read
+			// an extent.
+			continue;
 		}
 		if (gen > get_state().m_max_transid) {
 			BEESCOUNT(crawl_gen_high);
 			++count_high;
-			// This shouldn't ever happen
+			// This shouldn't ever happen...and so far, doesn't.
 			// continue;
 		}
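The combined filter described in the commit message and the new comment can be sketched in isolation. This is a minimal illustration, not the bees implementation: `ExtentRef`, `CrawlWindow`, `FilterResult`, and `filter_refs` are hypothetical names, and the sketch assumes that the header transid check is applied as a separate per-ref test, whereas in the real crawler the search itself may already do that filtering. Only the `min_transid`/`max_transid` window and the low/high counters mirror the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one extent ref returned by a tree search.
struct ExtentRef {
    uint64_t header_transid;     // transid of the metadata page holding the ref
    uint64_t extent_generation;  // generation stored in the file extent item
    uint64_t bytenr;             // identifies the extent (illustrative only)
};

// Hypothetical transid window for the current crawl pass.
struct CrawlWindow {
    uint64_t min_transid;
    uint64_t max_transid;
};

struct FilterResult {
    std::vector<ExtentRef> to_scan;
    std::size_t count_low = 0;   // refs dropped because the extent itself is old
    std::size_t count_high = 0;  // refs newer than the window (counted, not dropped)
};

// Keep a ref only if both approximations say it is new:
// the page holding the ref changed inside the window (header transid),
// and the extent it points to is not older than the window (extent generation).
inline FilterResult filter_refs(const std::vector<ExtentRef> &refs, const CrawlWindow &w)
{
    FilterResult result;
    for (const ExtentRef &ref : refs) {
        // Filter 1: the metadata page was not modified in this window.
        if (ref.header_transid < w.min_transid) {
            continue;
        }
        // Filter 2: new reference to an old extent; skip it, as the patch now does.
        if (ref.extent_generation < w.min_transid) {
            ++result.count_low;
            continue;
        }
        // As in the patch, an unexpectedly high generation is counted but still scanned.
        if (ref.extent_generation > w.max_transid) {
            ++result.count_high;
        }
        result.to_scan.push_back(ref);
    }
    return result;
}
```

With a window of 100 to 200, a ref whose page transid is 150 but whose extent generation is 50 is dropped by the second filter. Before this patch that ref would have been rescanned.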