Use a static function instead of embedding side-effects in the constructor
of an unrelated class.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit 85106bd9a9fb3be2c574df1a3aed6674e3951465)
To make bees more friendly to use with syslog/systemd, we add an option
to omit timestamps from the log output.
Signed-off-by: Kai Krakow <kai@kaishome.de>
If you have a lot of or a few big nocow files (like vm images) which
contain a lot of potential deduplication candidates, bees becomes
incredibly slow running through a lot "invalid operation" exceptions.
Let's just skip over such files to get more bang for the buck. I did no
regression testing as this patch seems trivial (and I cannot imagine any
pitfalls either). The process progresses much faster for me now.
Keep track of the locking thread so we can see why we are deadlocked
in gdb.
Use a handle type for locks based on shared_ptr. Change the handle type
name to flush out any non-auto local variables.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
(cherry picked from commit aa0b22d445664409c36503c6fd808bc49b6816d0)
check_overflow() will invalidate iterators if it decides there are too
many cache entries.
If items are deleted from the cache, search for the inserted item again
to ensure the iterator is valid.
Increase size of timestamp to size_t.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Before:
unique_lock<mutex> lock(some_mutex);
// run lock.~unique_lock() because return
// return reference to unprotected heap
return foo[bar];
After:
unique_lock<mutex> lock(some_mutex);
// make copy of object on heap protected by mutex lock
auto tmp_copy = foo[bar];
// run lock.~unique_lock() because return
// pass locally allocated object to copy constructor
return tmp_copy;
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
If we release the lock first (and C++ destructor order says we do), then
the return value will be constructed from data living in an unprotected
container object. That data might be destroyed before we get to the
copy constructor for the return value.
Make a temporary copy of the return value that won't be destroyed by any
other thread, then unlock the mutex, then return the copy object.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Get rid of the ResourceHolder class.
Fix GCC static template member instantiation issues.
Replace assert() with exceptions.
shared_ptr can't seem to do reference counting in a multi-threaded
environment. The code looks correct (for both ResourceHandle and
std::shared_ptr); however, continual segfaults don't lie.
Carpet-bomb with mutex locks to reduce the likelihood of losing shared_ptr
races.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Found by valgrind. It was mostly harmless because the range of
usable values is limited by m_burst (which was initialized) and 0.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
Extend the LockSet class so that the total number of locked (active)
items can be limited. When the limit is reached, no new items can be
locked until some existing locked items are unlocked.
Signed-off-by: Zygo Blaxell <bees@furryterror.org>
This gets rid of some more big memsets. It may replace them
with a lot of tiny mallocs, though. If this turns out to be
a bad idea then at least we can easily revert the change.
We really do need some large buffers for BtrfsIoctlSearchKey in some
cases, but we don't need to zero them out first. Don't do that so we
save some CPU.
Reduce the default buffer size to 4K because most BISK users don't get
need much more than 1K. Set the buffer size explicitly to the product of
the number of items and the desired item size in the places that really
need a lot of items.
The current crc64 algorithm is a variant of the Redis implementation.
Change it to a variant of the Adler implementation as described
at https://matt.sh/redis-crcspeed
Test program at https://github.com/PeeJay/crc64-compare
Filesize: 1.1G
Asking crc64-redis to sum "/media/peejay/BTRFS/1/ubuntu-14.04.5-desktop-amd64.iso"...
Asking crc64-adler to sum "/media/peejay/BTRFS/1/ubuntu-14.04.5-desktop-amd64.iso"...
Redis CRC-64: f971f9ac6c8ba458
Adler CRC-64: f971f9ac6c8ba458
Adler throughput: 1659.913308 MB/s
Redis throughput: 437.284661 MB/s
Adler is 3.79x faster than Redis
Signed-off-by: Paul Jones <paul@pauljones.id.au>
It turns out we never use a value for m_buf_size that isn't the default,
and we also never ask for more than a few thousand items; however,
we do spend a ton of time memsetting the huge buffer to zero.
I don't know what the ideal size is, but 16K is a far better guess
than 1MB. Let's reduce it for some immediate CPU benefit, and determine
what the size should be later.
Reported at https://github.com/Zygo/bees/issues/11
I got a little too enthusiastic when redacting the code, and removed some
overloaded functions bees was using. C++ silently found replacements,
and the result was a bug that prevented any data from being persisted
from the hash table.
Fixes: https://github.com/Zygo/bees/issues/7
I accidentally did a pre-push verification on a 32-bit build host.
There were a surprisingly small number of problems, so fix them.
Bees now builds on a 32-bit host. Let's not update README just yet,
though: the 32-bit ioctl support fails immediately after startup on a
64-bit kernel.
I'm not surprised that GCC 6 doesn't let me send an ostream ref to itself,
even inside an uninstantiated template specialization. I am a little
surprised I was trying to, and 4.9 let me get away with it.
It's 2016. auto_ptr is deprecated now.
Some things were including vector that don't any more.
https://github.com/Zygo/bees/issues/1