diff --git a/.gitignore b/.gitignore
index 91fdf51..7a91af7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
 *.new
 *.so*
 Doxyfile
+README.html
 depends.mk
 doxygen_*
 html/
diff --git a/Makefile b/Makefile
index ffb614c..80f2225 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,4 @@
-default install all: lib src test
+default install all: lib src test README.html
 
 clean:
 	git clean -dfx
@@ -13,3 +13,7 @@ src: lib
 
 test: lib src
 	$(MAKE) -C test
+
+README.html: README.md
+	markdown README.md > README.html.new
+	mv -f README.html.new README.html
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..b5abeaf
--- /dev/null
+++ b/README.md
@@ -0,0 +1,88 @@
+BEES
+====
+
+Best-Effort Extent-Same, a btrfs deduplicator.
+
+TODO
+----
+
+Write some docs here:
+
+* copyright (Zygo Blaxell 2015-2016), license (GPL3+)
+* what it is
+* what it isn't
+* building it
+* what works
+* what doesn't work
+* a brief history of btrfs kernel bugs
+* things that could have been, and why they aren't
+* roadmap (and anti-roadmap)
+* how to report bugs
+* how to contribute
+
+Build
+-----
+
+Requirements:
+  * C++11 compiler (I use GCC 4.9)
+  * btrfs-progs (I've used 4.1..4.7) for /usr/include/btrfs/*
+  * libuuid-dev (TODO: remove the one function we call from this library)
+
+Build with `make`.
+
+The build produces `bin/bees` and `lib/libcrucible.so`, which must be
+copied to somewhere in `$PATH` and `$LD_LIBRARY_PATH` on the target
+system respectively.
+
+Setup
+-----
+
+Create a directory for bees state files:
+
+    export BEESHOME=/some/path
+    mkdir -p "$BEESHOME"
+
+Create an empty hash table (your choice of size, but it must be a multiple
+of 16M). This example creates a 1GB hash table:
+
+    truncate -s 1g "$BEESHOME/beeshash.dat"
+    chmod 700 "$BEESHOME/beeshash.dat"
+
+Configuration
+-------------
+
+The only runtime configurable options are environment variables:
+
+* BEESHOME: Directory containing Bees state files:
+    * beeshash.dat | persistent hash table (must be a multiple of 16M)
+    * beescrawl.`UUID`.dat | state of SEARCH_V2 crawlers
+    * beesstats.txt | statistics and performance counters
+* BEESSTATUS: File containing a snapshot of current Bees state (performance
+  counters and current status of each thread).
+
+Other options (e.g. interval between filesystem crawls) can be configured
+in src/bees.h.
+
+Running
+-------
+
+We created this directory in the Setup section:
+
+    export BEESHOME=/some/path
+
+Use a tmpfs for BEESSTATUS, because it updates once per second:
+
+    export BEESSTATUS=/run/bees.status
+
+bees can only process the root subvol of a btrfs.
+Use a bind mount, and let only bees access it.
+
+    mount -osubvol=/ /dev/ /var/lib/bees/root
+
+Let the bees fly!
+
+    bees /var/lib/bees/root >> /var/log/bees.log 2>&1
+
+You'll probably want to arrange for /var/log/bees.log to be rotated
+periodically. You may also want to set umask to 077 to prevent disclosure
+of information about the contents of the filesystem through the log file.
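The Setup section of the new README requires the hash table size to be a multiple of 16M, but the `truncate` example does not enforce it. Below is a minimal sketch of how that check could be scripted before the file is created. The helper names `bees_hash_size_ok` and `bees_create_hash` are hypothetical and not part of bees, and the sketch assumes "16M" means 16 MiB (16 * 1024 * 1024 bytes) and that sizes are given in plain bytes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of bees.
# Assumption: "16M" in the README means 16 MiB.
BEES_HASH_ALIGN=$((16 * 1024 * 1024))

# Succeed only if $1 is a positive byte count divisible by 16 MiB.
bees_hash_size_ok() {
    case "$1" in
        ''|0*|*[!0-9]*) return 1 ;;  # reject empty, zero-leading, or non-numeric input
    esac
    [ $(( $1 % BEES_HASH_ALIGN )) -eq 0 ]
}

# Create $1/beeshash.dat with a size of $2 bytes, refusing misaligned sizes.
# Mirrors the README's mkdir/truncate/chmod steps.
bees_create_hash() {
    dir=$1
    bytes=$2
    if ! bees_hash_size_ok "$bytes"; then
        echo "hash table size must be a multiple of 16M: $bytes" >&2
        return 1
    fi
    mkdir -p "$dir" &&
    truncate -s "$bytes" "$dir/beeshash.dat" &&
    chmod 700 "$dir/beeshash.dat"
}
```

With these definitions sourced, `bees_create_hash "$BEESHOME" $((1024 * 1024 * 1024))` would produce the same 1GB table as the README example, while a size such as `1000` is rejected with an error on stderr.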