1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-18 21:55:45 +02:00

166 Commits

Author SHA1 Message Date
Zygo Blaxell
b0ba4c4f38 bees: time tmpfile create and copy operations
Add time spent in file create and copy operations to the stats.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f01c20f97269083175a74d1a1fd3ebaced2d9560)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
74d256f0fe bees: handle trace functions that throw exceptions
A BEESTRACE closure could throw an exception.  Trap those so we don't
end up in terminate().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 59660cfc00b9ca233eeb1a7cdf6df34a45a2deba)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
8cde833863 bees: make a thread note when we read data
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance.  Record the block origin data in the thread state
so we can see which reads are problematic.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f56f736d28970a0f03ee887a5bd5515cc749d413)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
e0951ed4ba bees: use C++11 syntax for constant initializers
This lets us use more default constructors.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 8a932a632ff4602a0357ed5fbcd3f86b6bc50283)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
c479b361cd bees: remove file open serialization mutex
It is no longer necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 5c9104555768801701df15f512c4d3d7c579398b)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
c6c3990d19 bees: types: improve serialization of byte ranges
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file.  Also omit the '0' at beginning of
file.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 3023b7f57a3003242bc770bcfe55f666227680ff)
2017-06-17 10:15:11 -04:00
Zygo Blaxell
3fdc217b4f bees: change formatting for physical bytenr ranges in dedup
Use a different character to make it easier to search for bytenr ranges
in the logs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit d43199e3d6e6469264eb10de8b0a783f8573e0e8)
2017-06-17 10:15:08 -04:00
Zygo Blaxell
6c8d2bf428 bees: limit FD cache size explicitly
This will allow the default size limit for cache objects to be changed
with impunity.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 9daa51edaab44c02ce0917ff94b20683036d7594)
2017-06-17 10:15:08 -04:00
Zygo Blaxell
d6f97edf4a crucible: fs: keep ioctl buffer between runs
perf blames the SEARCH_V2 ioctl wrapper for a lot of time spent in malloc.
Use a thread_local buffer for ioctl results, and reuse it between runs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit e509210428951e645d33916694a17aed1950991d)
2017-06-17 10:15:08 -04:00
Zygo Blaxell
312254a47b crucible: cache: no need to use explicit lock type
C++11 'auto' keyword is sufficient.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 44fedfc928a5ac6b8040fd94907954f0f2f806f1)
2017-06-17 10:14:25 -04:00
Zygo Blaxell
cc7b4f22b5 bees: trace calls to BeesResolver
This helps identify causes of the "same physical address in dedup"
exception.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:09:24 -04:00
Zygo Blaxell
a3d7032eda bees: drop unused constants
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used.  Remove them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
f01c20f972 bees: time tmpfile create and copy operations
Add time spent in file create and copy operations to the stats.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
59660cfc00 bees: handle trace functions that throw exceptions
A BEESTRACE closure could throw an exception.  Trap those so we don't
end up in terminate().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
f56f736d28 bees: make a thread note when we read data
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance.  Record the block origin data in the thread state
so we can see which reads are problematic.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
8a932a632f bees: use C++11 syntax for constant initializers
This lets us use more default constructors.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
5c91045557 bees: remove file open serialization mutex
It is no longer necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:17 -04:00
Zygo Blaxell
3023b7f57a bees: types: improve serialization of byte ranges
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file.  Also omit the '0' at beginning of
file.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:16 -04:00
Zygo Blaxell
c1dbd30d82 bees: don't limit number of active crawlers
All testing so far incidates more crawlers go faster up to a limit
much larger than btrfs's performance limitations on subvols, even on
spinning rust.  Remove the artificial constraint.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:06:16 -04:00
Zygo Blaxell
d43199e3d6 bees: change formatting for physical bytenr ranges in dedup
Use a different character to make it easier to search for bytenr ranges
in the logs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
9daa51edaa bees: limit FD cache size explicitly
This will allow the default size limit for cache objects to be changed
with impunity.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
e509210428 crucible: fs: keep ioctl buffer between runs
perf blames the SEARCH_V2 ioctl wrapper for a lot of time spent in malloc.
Use a thread_local buffer for ioctl results, and reuse it between runs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
235a3b6877 crucible: resource: optimize map cleanup
We were holding weak refs until the next time the resource ID was used.
This is a bad thing if resource IDs are sparse (e.g. pointers or hashes)
because we'll never see an ID twice.

To fix, determine whether we released the last instance of a resource,
and if so, free its weak ref immediately.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
aa0b22d445 crucible: lockset: track lockers and use handle type
Keep track of the locking thread so we can see why we are deadlocked
in gdb.

Use a handle type for locks based on shared_ptr.  Change the handle type
name to flush out any non-auto local variables.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
44fedfc928 crucible: cache: no need to use explicit lock type
C++11 'auto' keyword is sufficient.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 09:50:59 -04:00
Zygo Blaxell
b004b22e47 Merge branch 'master' into subvol-threads 2017-06-17 08:15:34 -04:00
Timofey Titovets
5350b0f113 Bees: fix [-Werror=implicit-fallthrough=]
In gcc 7+ warning: implicit-fallthrough has been added
In some places fallthrough is expectable, disable warning

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
v0.4
2017-06-13 18:05:38 +03:00
Zygo Blaxell
5a3f1be09e Merge git://github.com/Nefelim4ag/bees 2017-02-09 20:01:29 -05:00
Timofey Titovets
4b592ec2a3 Check: if disk with UUID are btrfs by blkid
Old check can't find btrfs fs, if fs not mounted

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
2017-02-09 11:56:31 +03:00
Zygo Blaxell
dc00dce842 context: purge FD cache every COMMIT_INTERVAL
Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.

Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache):  forcibly close all cached
FDs at regular intervals.  The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths.  The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.

Fixes: https://github.com/Zygo/bees/issues/18

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-02-08 22:01:00 -05:00
Timofey Titovets
82b3ba76fa Makefile: make service install compatible with debian systems
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
2017-01-30 05:29:28 +03:00
Zygo Blaxell
5a7f4f7899 makeflags: fix missing -D_FILE_OFFSET_BITS=64 in comment
Interesting things happen when blindly swapping the release-build CCFLAGS
with the debug-build commented-out CCFLAGS.  None of these things that
happen are good.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-26 22:09:17 -05:00
Zygo Blaxell
dc975f1fa4 crucible: resource: remove excess locking
The bugs in other parts of the code have been identified and fixed,
so the overprotective locks around shared_ptr can be removed.

Keep the other improvements to the Resource class.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-26 22:03:45 -05:00
Zygo Blaxell
99fe452101 context: raise limit on the number of concurrent ioctls to cpu_cores/2
This might improve performance on systems with more than 3 CPU cores...or
it might bring such a machine to its knees.

TODO:  find out which of those two things happens.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:05 -05:00
Zygo Blaxell
9cb48c35b9 crucible: lockset: add LockSet<T>::Lock make_lock
Before:  decltype(foo)::Lock lock(foo, key);

After:   auto lock = foo.make_lock(key);

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:03 -05:00
Zygo Blaxell
be1aa049c6 context: allow concurrent dedup
Dedup was spending a lot of time waiting for the ioctl mutex while it
was held by non-dedup ioctls; however, when dedup finally locked the
mutex, its average run time was comparatively short and the variance
was low.

With the various workarounds and kernel fixes in place, FILE_EXTENT_SAME
and bees play well enough together that we can allow multiple threads
to do dedup at the same time.  The extent bytenr lockset should be
sufficient to prevent undesirable results (i.e. dup data not removed,
or deadlocks on old kernels).

Remove the ioctl lock on dedup.

LOGICAL_INO and SEARCH_V2 (as used by BeesCrawl) remain under the ioctl
mutex because they can still have abitrarily large run times.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:03 -05:00
Zygo Blaxell
e46b96d23c context: lock extents by bytenr instead of globally prohibiting tmpfiles
This prevents two threads from attempting to dispose of the same physical
extent at the same time.  This is a more precise exclusion than the
general lock on all tmpfiles.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:03 -05:00
Zygo Blaxell
e7fddcbc04 hash: use the LockSet max_size to read hash table from only one thread at a time
This reduces disk thrashing at startup.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:03 -05:00
Zygo Blaxell
920cfbc1f6 crawl: put the current crawl state in the thread status
It's more useful than a generic "waiting for thread limit" status

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:02 -05:00
Zygo Blaxell
4f9c2c0310 roots: don't deadlock while deleting a crawl thread
BeesRoots::crawl_state_erase may invoke BeesCrawl::~BeesCrawl, which
will do a join on its crawl thread, which might be trying to lock
BeesRoots::m_mutex, which is locked by crawl_state_erase at the time.

Fix this by creating an extra reference to the BeesCrawl object, then
releasing the lock on BeesRoots::m_mutex, then deleting the reference.

The BeesCrawl object may still call methods on BeesRoots, but the only
such method is BeesRoots::crawl_state_set_dirty, and that method has
no dependency on the erased BeesCrawl shared_ptr.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:18:00 -05:00
Zygo Blaxell
4604f5bc96 crawl: remove the unused single-threaded crawl implementation
This is a TODO from "bees: process each subvol in its own thread"

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:17:59 -05:00
Zygo Blaxell
09ab0778e8 README: we have multiple worker threads now, so don't say that we don't
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:17:58 -05:00
Zygo Blaxell
b22b4ed427 bees: process each subvol in its own thread
This is yet another multi-threaded Bees experiment.

This time we are dividing the work by subvol:  one thread is created to
process each subvol in the filesystem.  There is no change in behavior
on filesystems containing only one subvol.

In order to avoid or mitigate the impact of kernel bugs and performance
issues, the btrfs ioctls FILE_EXTENT_SAME, SEARCH_V2, and LOGICAL_INO are
serialized.  Only one thread may execute any of these ioctls at any time.
All three ioctls share a single lock.

In order to simplify the implementation, only one thread is permitted to
create a temporary file during one call to scan_one_extent.  This prevents
multiple threads from racing to replace the same physical extent with
separate physical copies.

The single "crawl" thread is replaced by one "crawl_<root_number>"
for each subvol.

The crawl size is reduced from 4096 items to 1024.  This reduces the
memory requirement per subvol and keeps the data in memory fresher.
It also increases the number of log messages, so turn some of them off.

TODO:  Currently there is no configurable limit on the total number
of threads.  The number of CPUs is used as an upper bound on the number
of active threads; however, we still have one thread per subvol even if
all most of the threads do is wait for locks.

TODO:  Some of the single-threaded code is left behind until I make up
my mind about whether this experiment is successful.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:17:54 -05:00
Zygo Blaxell
4113a171be crucible: cache: clean up use of iterators
check_overflow() will invalidate iterators if it decides there are too
many cache entries.

If items are deleted from the cache, search for the inserted item again
to ensure the iterator is valid.

Increase size of timestamp to size_t.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:12:34 -05:00
Zygo Blaxell
5713fcd770 bees: clean up statistics class
Some whitespace fixes.  Remove some duplicate code.  Don't lock
two BeesStats objects in the - operator method.

Get the locking for T& at(const K&) right to avoid locking a mutex
recursively.  Make the non-const version of the function private.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:28 -05:00
Zygo Blaxell
db8ea92133 bees: fix further instances of copy-after-unlock bug
Before:

        unique_lock<mutex> lock(some_mutex);
        // run lock.~unique_lock() because return
        // return reference to unprotected heap
        return foo[bar];

After:

        unique_lock<mutex> lock(some_mutex);
        // make copy of object on heap protected by mutex lock
        auto tmp_copy = foo[bar];
        // run lock.~unique_lock() because return
        // pass locally allocated object to copy constructor
        return tmp_copy;

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:27 -05:00
Zygo Blaxell
6099bf0b01 crucible: fix further instances of copy-after-unlock bug
Before:

	unique_lock<mutex> lock(some_mutex);
	// run lock.~unique_lock() because return
	// return reference to unprotected heap
	return foo[bar];

After:

	unique_lock<mutex> lock(some_mutex);
	// make copy of object on heap protected by mutex lock
	auto tmp_copy = foo[bar];
	// run lock.~unique_lock() because return
	// pass locally allocated object to copy constructor
	return tmp_copy;

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:27 -05:00
Zygo Blaxell
c58e5cd75b crucible: cache: construct return value before releasing lock
If we release the lock first (and C++ destructor order says we do), then
the return value will be constructed from data living in an unprotected
container object.  That data might be destroyed before we get to the
copy constructor for the return value.

Make a temporary copy of the return value that won't be destroyed by any
other thread, then unlock the mutex, then return the copy object.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 12:15:07 -05:00
Paul Jones
123d4e83c5 Remove reference to *.c files in Makefile
On Gentoo it errors out because there is no *.c


Signed-off-by: Paul Jones <paul@pauljones.id.au>
2017-01-22 16:49:50 +11:00
Zygo Blaxell
5de3b15daa src: Update bees-version.c more often
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-18 22:17:03 -05:00