1
0
mirror of https://github.com/Zygo/bees.git synced 2025-05-17 21:35:45 +02:00
bees/README.md
2016-11-17 12:12:14 -05:00

2.2 KiB

BEES

Best-Effort Extent-Same, a btrfs deduplicator.

TODO

Write some docs here:

  • copyright (Zygo Blaxell 2015-2016), license (GPL3+)
  • what it is
  • what it isn't
  • building it
  • what works
  • what doesn't work
  • a brief history of btrfs kernel bugs
  • things that could have been, and why they aren't
  • roadmap (and anti-roadmap)
  • how to report bugs
  • how to contribute

Build

Requirements: * C++11 compiler (I use GCC 4.9) * btrfs-progs (I've used 4.1..4.7) for /usr/include/btrfs/* * libuuid-dev (TODO: remove the one function we call from this library)

Build with make.

The build produces bin/bees and lib/libcrucible.so, which must be copied to somewhere in $PATH and $LD_LIBRARY_PATH on the target system respectively.

Setup

Create a directory for bees state files:

export BEESHOME=/some/path
mkdir -p "$BEESHOME"

Create an empty hash table (your choice of size, but it must be a multiple of 16M). This example creates a 1GB hash table:

truncate -s 1g "$BEESHOME/beeshash.dat"
chmod 700 "$BEESHOME/beeshash.dat"

Configuration

The only runtime configurable options are environment variables:

  • BEESHOME: Directory containing Bees state files:
  • beeshash.dat | persistent hash table (must be a multiple of 16M)
  • beescrawl.UUID.dat | state of SEARCH_V2 crawlers
  • beesstats.txt | statistics and performance counters
  • BEESSTATS: File containing a snapshot of current Bees state (performance counters and current status of each thread).

Other options (e.g. interval between filesystem crawls) can be configured in src/bees.h.

Running

We created this directory in the previous section.

export BEESHOME=/some/path

Use a tmpfs for BEESSTATUS, it updates once per second

export BEESSTATUS=/run/bees.status

bees can only process the root subvol of a btrfs. Use a bind mount, and let only bees access it.

mount -osubvol=/ /dev/<your-filesystem> /var/lib/bees/root

Let the bees fly!

bees /var/lib/bees/root >> /var/log/bees.log 2>&1

You'll probably want to arrange for /var/log/bees.log to be rotated periodically. You may also want to set umask to 077 to prevent disclosure of information about the contents of the filesystem through the log file.