1
0
mirror of https://github.com/Zygo/bees.git synced 2025-06-16 17:46:16 +02:00
Commit Graph

105 Commits

Author SHA1 Message Date
5275249396 roots: trace transid_max calculation
transid_max calculations can take considerable time.  Report their
progress in more detail.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:45 -04:00
a07728bc7e tmpfiles: note that kernel race condition is not yet fixed
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:36 -04:00
732896b471 log: simplify output for dedup and scan
With many threads it is inconvenient to reassemble the elided parts of
the dedup src/dst and scan filenames output.  Simply output them
unconditionally, and balance the line lengths.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:30 -04:00
5cc5a44661 bees: drop unused BeesWorkQueue classes
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:30:22 -04:00
f6a6992ac9 README: update list of currently known kernel bugs
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:28:50 -04:00
ceda8ee6c3 Makefile: add test to PHONY list
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 17:27:36 -04:00
18ae15658e README: remove stray whitespace
No content changes.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:57:24 -04:00
339579096f roots: move flags check after file identity checks and make error message style consistent
If we lose a race and open the wrong file, we will not retry with the
next path if the file we opened had incompatible flags.  We need to keep
trying paths until we open the correct file or run out of paths.
Fix by moving the inode flag check after the checks for file identity.

Output attributes in hex to be consistent with other attribute error
messages.

There is no need to report root and file paths separately in the error
message for incompatible flags because we have confirmed the identity of
the file before the incompatible flag error is detected.  Other messages
in this loop still output root path and file_path separately because
the identity of 'rv' is unknown at the time these messages are emitted.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:49:09 -04:00
702a8eec8c bees: use ioctl_iflags_get and ioctl_iflags_set instead of opencoded versions
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:31:43 -04:00
5f18fcda52 crucible: add ioctl_iflags_set to complement ioctl_iflags_get
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-09-16 14:31:27 -04:00
088cbd24ff Merge branch 'master' of git://github.com/kakra/bees 2017-09-16 13:55:58 -04:00
8c9a44998d Verbatim Ubuntu build instructions
And link to work done so far on 14.04... (Doesn't work yet)
2017-09-16 09:40:52 +02:00
a5e2bdff47 Skip nocow files to speed up processing
If you have a lot of or a few big nocow files (like vm images) which
contain a lot of potential deduplication candidates, bees becomes
incredibly slow running through a lot "invalid operation" exceptions.

Let's just skip over such files to get more bang for the buck. I did no
regression testing as this patch seems trivial (and I cannot imagine any
pitfalls either). The process progresses much faster for me now.
2017-09-12 02:09:22 +02:00
703bb7c1a3 bees: use handle type for hash table extent locks
Fixes build breakage after "crucible: lockset: track lockers and use
handle type".

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-06-17 10:22:06 -04:00
4f66d1cb44 crucible: lockset: track lockers and use handle type [bees master branch edition]
Keep track of the locking thread so we can see why we are deadlocked
in gdb.

Use a handle type for locks based on shared_ptr.  Change the handle type
name to flush out any non-auto local variables.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit aa0b22d445)
2017-06-17 10:21:33 -04:00
3901962379 bees: trace calls to BeesResolver
This helps identify causes of the "same physical address in dedup"
exception.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit cc7b4f22b5)
2017-06-17 10:15:11 -04:00
48aac8a99a bees: drop unused constants
BLOCK_SIZE_MIN_EXTENT_DEFRAG, BLOCK_SIZE_MIN_EXTENT_SPLIT, and others
are no longer used.  Remove them.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit a3d7032eda)
2017-06-17 10:15:11 -04:00
b0ba4c4f38 bees: time tmpfile create and copy operations
Add time spent in file create and copy operations to the stats.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f01c20f972)
2017-06-17 10:15:11 -04:00
74d256f0fe bees: handle trace functions that throw exceptions
A BEESTRACE closure could throw an exception.  Trap those so we don't
end up in terminate().

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 59660cfc00)
2017-06-17 10:15:11 -04:00
8cde833863 bees: make a thread note when we read data
Reads can block indefinitely due to bugs, low io priority, or poor
storage performance.  Record the block origin data in the thread state
so we can see which reads are problematic.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit f56f736d28)
2017-06-17 10:15:11 -04:00
e0951ed4ba bees: use C++11 syntax for constant initializers
This lets us use more default constructors.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 8a932a632f)
2017-06-17 10:15:11 -04:00
c479b361cd bees: remove file open serialization mutex
It is no longer necessary.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 5c91045557)
2017-06-17 10:15:11 -04:00
c6c3990d19 bees: types: improve serialization of byte ranges
Use () instead of [] when the respective end of the byte range touches
the beginning or end of the file.  Also omit the '0' at beginning of
file.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 3023b7f57a)
2017-06-17 10:15:11 -04:00
3fdc217b4f bees: change formatting for physical bytenr ranges in dedup
Use a different character to make it easier to search for bytenr ranges
in the logs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit d43199e3d6)
2017-06-17 10:15:08 -04:00
6c8d2bf428 bees: limit FD cache size explicitly
This will allow the default size limit for cache objects to be changed
with impunity.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 9daa51edaa)
2017-06-17 10:15:08 -04:00
d6f97edf4a crucible: fs: keep ioctl buffer between runs
perf blames the SEARCH_V2 ioctl wrapper for a lot of time spent in malloc.
Use a thread_local buffer for ioctl results, and reuse it between runs.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit e509210428)
2017-06-17 10:15:08 -04:00
312254a47b crucible: cache: no need to use explicit lock type
C++11 'auto' keyword is sufficient.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 44fedfc928)
2017-06-17 10:14:25 -04:00
5350b0f113 Bees: fix [-Werror=implicit-fallthrough=]
In gcc 7+ warning: implicit-fallthrough has been added
In some places fallthrough is expectable, disable warning

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
v0.4
2017-06-13 18:05:38 +03:00
5a3f1be09e Merge git://github.com/Nefelim4ag/bees 2017-02-09 20:01:29 -05:00
4b592ec2a3 Check: if disk with UUID are btrfs by blkid
Old check can't find btrfs fs, if fs not mounted

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
2017-02-09 11:56:31 +03:00
dc00dce842 context: purge FD cache every COMMIT_INTERVAL
Holding file FDs open for long periods of time delays inode destruction.
For very large files this can lead to excessive delays while bees dedups
data that will cease to be reachable.

Use the same workaround for file FDs (in the root_ino cache) that
is used for subvols (in the root cache):  forcibly close all cached
FDs at regular intervals.  The FD cache will reacquire FDs from files
that still have existing paths, and will abandon FDs from files that
no longer have existing paths.  The non-existing-path case is not new
(bees has always been able to discover deleted inodes) so it is already
handled by existing code.

Fixes: https://github.com/Zygo/bees/issues/18

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-02-08 22:01:00 -05:00
82b3ba76fa Makefile: make service install compatible with debian systems
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
2017-01-30 05:29:28 +03:00
4113a171be crucible: cache: clean up use of iterators
check_overflow() will invalidate iterators if it decides there are too
many cache entries.

If items are deleted from the cache, search for the inserted item again
to ensure the iterator is valid.

Increase size of timestamp to size_t.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-23 21:12:34 -05:00
5713fcd770 bees: clean up statistics class
Some whitespace fixes.  Remove some duplicate code.  Don't lock
two BeesStats objects in the - operator method.

Get the locking for T& at(const K&) right to avoid locking a mutex
recursively.  Make the non-const version of the function private.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:28 -05:00
db8ea92133 bees: fix further instances of copy-after-unlock bug
Before:

        unique_lock<mutex> lock(some_mutex);
        // run lock.~unique_lock() because return
        // return reference to unprotected heap
        return foo[bar];

After:

        unique_lock<mutex> lock(some_mutex);
        // make copy of object on heap protected by mutex lock
        auto tmp_copy = foo[bar];
        // run lock.~unique_lock() because return
        // pass locally allocated object to copy constructor
        return tmp_copy;

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:27 -05:00
6099bf0b01 crucible: fix further instances of copy-after-unlock bug
Before:

	unique_lock<mutex> lock(some_mutex);
	// run lock.~unique_lock() because return
	// return reference to unprotected heap
	return foo[bar];

After:

	unique_lock<mutex> lock(some_mutex);
	// make copy of object on heap protected by mutex lock
	auto tmp_copy = foo[bar];
	// run lock.~unique_lock() because return
	// pass locally allocated object to copy constructor
	return tmp_copy;

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 22:00:27 -05:00
c58e5cd75b crucible: cache: construct return value before releasing lock
If we release the lock first (and C++ destructor order says we do), then
the return value will be constructed from data living in an unprotected
container object.  That data might be destroyed before we get to the
copy constructor for the return value.

Make a temporary copy of the return value that won't be destroyed by any
other thread, then unlock the mutex, then return the copy object.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-22 12:15:07 -05:00
123d4e83c5 Remove reference to *.c files in Makefile
On Gentoo it errors out because there is no *.c


Signed-off-by: Paul Jones <paul@pauljones.id.au>
2017-01-22 16:49:50 +11:00
5de3b15daa src: Update bees-version.c more often
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-18 22:17:03 -05:00
38fffa8e27 lib: add a version string
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-18 22:17:02 -05:00
50417e961f crucible: rework the Resource class
Get rid of the ResourceHolder class.

Fix GCC static template member instantiation issues.

Replace assert() with exceptions.

shared_ptr can't seem to do reference counting in a multi-threaded
environment.  The code looks correct (for both ResourceHandle and
std::shared_ptr); however, continual segfaults don't lie.

Carpet-bomb with mutex locks to reduce the likelihood of losing shared_ptr
races.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-18 22:09:18 -05:00
6cc9b267ef crucible: time: fix uninitialized member
Found by valgrind.  It was mostly harmless because the range of
usable values is limited by m_burst (which was initialized) and 0.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-16 22:02:14 -05:00
9f120e326b bees: fix deadlock in thread status reporting
"s_name" was a thread_local variable, not static, and did not require a
mutex to protect access.  A deadlock is possible if a thread triggers an
exception with a handler that attempts to log a message (as the top-level
exception handler in bees does).

Remove multiple unnecessary mutex locks.  Rename the thread_local variables
to make their scope clearer.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-15 01:55:34 -05:00
382f8bf06a hash: prevent eleventy-gigabyte core dumps
Add MADV_DONTDUMP to the list of advice flags.

There are now three flags which may or may not be supported by the
target kernel.  Try each one and log its success or failure separately.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-12 22:55:08 -05:00
5e91529ad2 hash: remove the unused m_prefetch_rate_limit
The hash table statistics calculation in BeesHashTable::prefetch_loop
and the data-driven operation of the extent scanner always pulls the
hash table into RAM as fast as the disk will push the data.  We never
use the prefetch rate limit, so remove it.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-11 21:15:12 -05:00
bddc07bd28 hash: make thread status message more consistent
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-11 21:15:12 -05:00
ffe2a767d3 crucible: extentwalker: add compressed() and bytenr() methods
Also use C++11 syntax for construction.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-11 21:15:11 -05:00
845267821c main: count arguments correctly
Replace one braindead mistake for another.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-10 01:10:38 -05:00
3138002a1f main: ArgList would silently drop the first argument
This fixes a bug where bees tries to process itself as a btrfs filesystem.
This is a species of bug that I only notice *after* pushing to a public
git repo.

Signed-off-by: Zygo Blaxell <bees@furryterror.org>
2017-01-09 23:42:02 -05:00
4a57c5f499 README: update copyright year, remove some obsolete statements 2017-01-09 23:32:33 -05:00