Zerotools - Zero out that garbage

Copyright 2006, 2007 Aleksandr Koltsoff

Introduction

Zerotools are a set of tools that help keep virtual disks clean by filling regions that are no longer in "use" with binary zeroes. This is done on the fly or manually, depending on which tool suits your needs best. In technical terms, the on-the-fly mechanism uses LD_PRELOAD to wrap the unlink library call, and zerounlink-wrapper provides a symlink-based mechanism to target individual programs so that they are run with the unlink wrapper. For a comparison against other tools, please see the "Other solutions" section.

Please note that zerotools don't compete with various tools that rewrite file contents with "random" data in order to make data retrieval close to impossible. Do not use zerotools for this, or claim that they are good tools for this. They're not. (This text was added for people who only read introductions.)

Zerotools are Linux-specific, implemented in C and licensed under the GNU General Public License (v2). The tools are provided without a warranty of any kind, in the understanding that software bugs do exist and bad things can happen.

Document conventions

The following markup list will aid you in understanding the markup used in this document:

This document was prepared and generated using DOCSE, an automated document preparation system in fluid motion (not released to the public).

Rationale and history

Zerotools were born out of the frustration of trying to get virtual machine disk images to compress better. Even with minimal Ubuntu installations they seemed overly large. One cause of the problem was package updates, which were left on the disk image even if one blew away /var/cache/apt/archives/ and other files. Since removing a file doesn't "physically" remove its contents from disk, the contents were still part of the virtual disk image. Hence zerofile was born (one of the tools).

Another idea was to integrate a run of zerofile into dpkg so that whenever a package was upgraded or removed, the files belonging to that package would first be filled with binary zeroes and only then unlinked. I assumed that dpkg must have some external hook/script mechanism for this, but alas, no. Only apt has hooks, and the mechanism was generally unsuitable for zerofile. Also, administrators could still avoid zerofile by using dpkg directly. Another use for zerofile would have been localepurge, and integrating that too would have been very tricky. The solution? A wrapper shared object that overrides the operation of the unlink library call. Implementing this took a couple of hours and was even fun. Cleaning up and handling all the error cases, however, took much longer (as always). The end result is zerounlink(.so). It is generic enough to work with dpkg and rm and probably anything that unlink()s files (all programs on UNIX/Linux).

Since setting LD_PRELOAD manually each time one wants to use the wrapper object was a bit tedious, using a wrapper script was the next choice. However, this meant that each time a program started, a shell would start just for that occasion, a lot of startup files would be read and executed, and it would take extra time. This led to the development of zerounlink-wrapper, a sneaky program that will execute target programs via the wrapper automatically. One only needs to set up one symlink per target executable and everything else is automatic. It's almost scary. It also safeguards against circular PATH operations, which could happen with wrapper scripts, and handles the case where LD_PRELOAD already contains existing elements.

Other solutions

Some people suggested just ignoring the whole issue. Those people can keep on ignoring the issue, I guess. Other people suggested filling the virtual disk with a file (using dd) full of binary zeroes. While technically the end result (after removing the file) would be the same, in practice this has a couple of problems. First, if the virtual disk is dynamic (storage is allocated on the host on demand), the disk will grow to its maximum size, which might be very large. The second problem is the time and load that the operation causes; in addition, while disk space is scarce, some services might become unstable. The wrapper solution is "online" and doesn't cause additional instability, but it does cause some additional load. Since it only ever writes to the removed files, the additional I/O load depends on the size of the removed files. Since fine-grained targeting for the wrapper is possible, I don't consider this a big issue.

As for other solutions that do what zerounlink does, I'm not aware of any. There are programs that specialize in rewriting existing files with "random" data in order to make restoring their contents after deletion close to impossible, but they solve a different problem. Also most of them do not run as an LD_PRELOAD object, but instead are regular file-oriented programs (like zerofile).

libtrash is an LD_PRELOAD:able object that implements a "safe recycle bin" for unlink (solves a different problem).

The zum utility in perforate is a file resparser utility. Again, different problem space.

Building and doing simple tests

After downloading the newest relevant release (see below) you will want to build, test and install the tools on your system. The build system is based around a user-configurable Makefile which contains comments on which parts affect what. By default, the tools will be installed under /usr/local/bin/ and /usr/local/lib/zerofile/ (the wrapper shared object). Using the default paths will probably require root access during the installation phase. A user-local installation is also supported, which makes root access unnecessary (covered below).

The following tools must be available in order to build the software: make, gcc, binutils. You will also need the glibc development headers (required to build any C software, so this shouldn't be a big problem).

The wrapper shared object will do a run-time dynamic link against the C library on your system (or more specifically, it should automatically find the library that contains the real unlink code provided by your system). It is also possible to provide a static path (similar to /lib/libc.so.6) in the Makefile if this automatic detection doesn't work. zerounlink will tell you if auto-detection fails. If it does fail, I'd appreciate feedback on the exact distribution, glibc version, kernel version and the architecture (and other relevant information).
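The static-path variant described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from zerounlink.c: the helper name load_real_unlink and the unlink_fn type are made up for this example, and the automatic detection of the owning library is not shown. Only the standard dlopen, dlsym and dlerror calls are used.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Type of the real unlink(); the name unlink_fn is an assumption. */
typedef int (*unlink_fn)(const char *);

/* Hypothetical helper: look up unlink() in the C library found at
 * `libc_path` (for example "/lib/libc.so.6").
 * Returns NULL if the library or the symbol cannot be found. */
static unlink_fn load_real_unlink(const char *libc_path)
{
    void *handle;
    void *sym;

    handle = dlopen(libc_path, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", libc_path, dlerror());
        return NULL;
    }

    /* The handle is deliberately never closed: the returned pointer
     * has to stay valid for the lifetime of the process. */
    sym = dlsym(handle, "unlink");
    if (sym == NULL) {
        fprintf(stderr, "dlsym(unlink) failed: %s\n", dlerror());
        return NULL;
    }
    return (unlink_fn)sym;
}
```

A wrapper would call the returned pointer once it has finished zeroing the file. On older glibc versions a program using these calls has to be linked with -ldl.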

 1  user@system:~$ tar xzvf zerotools-0.1.2.tar.gz
 2  zerotools-0.1.2/
 3  zerotools-0.1.2/zerounlink.c
 4  zerotools-0.1.2/zerounlink-wrapper.c
 5  zerotools-0.1.2/zerofile.c
 6  zerotools-0.1.2/Makefile
 7  zerotools-0.1.2/COPYING
 8  zerotools-0.1.2/README
 9  zerotools-0.1.2/CHANGELOG
10  user@system:~$ cd zerotools-0.1.2
11  user@system:~/zerotools-0.1.2$ make build
12  (DEP) zerofile.c -> zerofile.d
13  (DEP) zerounlink-wrapper.c -> zerounlink-wrapper.d
14  (DEP) zerounlink.c -> zerounlink.d
15   (CC) zerounlink.c -> zerounlink.o
16   (SO) zerounlink.o -> zerounlink.so
17   (CC) zerofile.c -> zerofile
18   (CC) zerounlink-wrapper.c -> zerounlink-wrapper
19  user@system:~/zerotools-0.1.2$ dd if=/dev/urandom of=testfile bs=2048 count=4
20  4+0 records in
21  4+0 records out
22  8192 bytes (8.2 kB) copied, 0.005386 seconds, 1.5 MB/s
23  user@system:~/zerotools-0.1.2$ ./zerofile testfile
24  zerofile: Zeroed and removed testfile
25  user@system:~/zerotools-0.1.2$ ls -la testfile
26  ls: testfile: No such file or directory

Listing 1: Extracting, building and testing zerotools

Configuration is covered a bit later on, but the above session shows one example of how to test that the zerofile utility works. Should you see any compiler or linker warnings during the build process, please contact the author with a full listing of the output from make and the relevant system configuration details.

It is possible to override the configuration variables inside the Makefile from the command line. This is mainly useful if you want to build and install the software with non-default settings or want to automate the process without modifying the Makefile.

 1  user@system:~/zerotools-0.1.2$ make build prefix=~/local dochown=no
 2  (DEP) zerofile.c -> zerofile.d
 3  (DEP) zerounlink-wrapper.c -> zerounlink-wrapper.d
 4  (DEP) zerounlink.c -> zerounlink.d
 5   (CC) zerounlink.c -> zerounlink.o
 6   (SO) zerounlink.o -> zerounlink.so
 7   (CC) zerofile.c -> zerofile
 8   (CC) zerounlink-wrapper.c -> zerounlink-wrapper
 9  user@system:~/zerotools-0.1.2$ make install prefix=~/local dochown=no
10  (INSTALL)
11  * (MKDIR) /home/user/local/lib/zerofile/ /home/user/local/bin/
12  * (CP) zerofile zerounlink-wrapper -> /home/user/local/bin/
13  * (CP) zerounlink.so -> /home/user/local/lib/zerofile/
14  user@system:~/zerotools-0.1.2$ make uninstall prefix=~/local dochown=no
15  (UNINSTALL)
16  (RM) /home/user/local/bin//zerofile
17  (RM) /home/user/local/bin//zerounlink-wrapper
18  (RM) /home/user/local/lib/zerofile//zerounlink.so

Listing 2: Building and installing into a user specified directory

The dochown variable was overridden above so that install doesn't try to change the ownership and access mode of resulting files after copying. By default ownership is switched to root.root at install phase, but this of course can be overridden as well.

Upgrading from a previous release only requires a rebuild and reinstall. There is no need to set up the symlinks again (described below). If you used additional build-time parameters with the previous version, you'll need to use them again in both the rebuild and reinstall phases.

Configuring and running zerotools

In the following examples the default Makefile has been used when building and installing. This means that the software will be installed under /usr/local/bin/ and /usr/local/lib/zerofile/. Writing to these directories will normally require root privileges, so sudo is used each time an action requires root privileges.

If you use a different installation prefix, your results will obviously differ from the examples shown here. Also, you'll have to use the same command line variables that you use during building.

zerounlink-wrapper (covered shortly) uses a hard-coded path to zerounlink. For this reason, you cannot move zerounlink after installing it. If you need to install it to another location, you need to uninstall the previous version, rebuild zerotools with the new settings and install with new settings. This has been done in order to reduce uncertainties in the preloading process that zerounlink-wrapper does.

 1  user@system:~/zerotools-0.1.2$ sudo make install
 2  (INSTALL)
 3  * (MKDIR) /usr/local//lib/zerofile/ /usr/local//bin/
 4  * (CP) zerofile zerounlink-wrapper -> /usr/local//bin/
 5  * (CP) zerounlink.so -> /usr/local//lib/zerofile/
 6  * (CHOWN) root.root
 7  * (CHMOD) r-xr-xr-x
 8  user@system:~/zerotools-0.1.2$ which zerofile
 9  /usr/local/bin/zerofile
10  user@system:~/zerotools-0.1.2$ ls -la /usr/local/lib/zerofile/
11  total 20
12  drwxr-sr-x  2 root root  4096 2007-01-02 13:03 .
13  drwxrwsr-x  5 root staff 4096 2007-01-02 13:03 ..
14  -r-xr-xr-x  1 root root  8516 2007-01-02 13:03 zerounlink.so

Listing 3: Installing and verifying installation

There is also a Makefile target called uninstall. If you used any prefix options when building the software, you'll have to use the same options at install (and uninstall) phase.

Using zerofile is pretty simple. It will process each command line parameter in turn and attempt to zerofill and unlink it if it is not a directory.

 1  user@system:~$ cd /tmp/
 2  user@system:/tmp$ zerofile
 3  USAGE: zerofile path[s]
 4  zerofile will process each file in turn and
 5  overwrite the contents of all regular files specified
 6  with binary zero.
 7  It will unlink all non-directories (even filled regulars)
 8  user@system:/tmp$ dd if=/dev/urandom of=hello.random bs=1234 count=1
 9  1+0 records in
10  1+0 records out
11  1234 bytes transferred in 0.000642 seconds (1921849 bytes/sec)
12  user@system:/tmp$ zerofile hello.random
13  zerofile: Zeroed and removed hello.random
14  user@system:/tmp$ ls -la hello.random
15  ls: hello.random: No such file or directory

Listing 4: Running zerofile

So that you don't need to set LD_PRELOAD on the command line each time you want to zerofill unlinked files, we'll configure zerounlink-wrapper. It is a simple replacement for a wrapper script that makes it easy to select which programs are run with zerounlink. It works by automatically starting the program whose name was used to start zerounlink-wrapper. So, if you have a symbolic link rm which points to the wrapper, it will run rm.
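The two pieces of bookkeeping implied here (working out which program name the wrapper was started as, and putting the shared object in front of anything already present in LD_PRELOAD) might look something like the sketch below. The function names are invented for this example and are not taken from zerounlink-wrapper.c. The colon separator is a choice made for this sketch; ld.so accepts both colons and spaces between LD_PRELOAD entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the name the wrapper was invoked as, that is,
 * the part of argv[0] after the last '/'. */
static const char *invoked_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash != NULL ? slash + 1 : argv0;
}

/* Hypothetical helper: build a new LD_PRELOAD value that lists
 * `wrapper_so` first and keeps whatever was already in `existing`
 * (which may be NULL or empty). The caller frees the result. */
static char *build_preload(const char *wrapper_so, const char *existing)
{
    size_t len;
    char *value;

    if (existing == NULL)
        existing = "";
    len = strlen(wrapper_so) + 1 + strlen(existing) + 1;
    value = malloc(len);
    if (value == NULL)
        return NULL;

    if (*existing == '\0')
        snprintf(value, len, "%s", wrapper_so);
    else
        snprintf(value, len, "%s:%s", wrapper_so, existing);
    return value;
}
```

The result would be passed to setenv("LD_PRELOAD", value, 1) before the target program is executed.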

zerounlink-wrapper will process the PATH that it gets when invoked and search for the target executable using the same rules as the shell. It will also skip those PATH entries which would lead it to start itself (so that it doesn't end up in an infinite loop). This additional protection utilizes the block device and inode number of zerounlink-wrapper so that it is not confused by any symbolic links which might lead to aliasing. Unfortunately this feature requires support for /proc/ and makes this program Linux-specific.
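A PATH search with this kind of self-detection could be sketched as below. It is a simplified illustration, not the actual zerounlink-wrapper code: the function names are made up, and reading the wrapper's own identity from /proc/self/exe is an assumption based on the remark about /proc/ above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: device and inode of the running executable.
 * Assumes that /proc/self/exe is how the wrapper identifies itself. */
static int stat_self(struct stat *self)
{
    return stat("/proc/self/exe", self);
}

/* Hypothetical helper: search the colon-separated `path` for an
 * executable called `name`, skipping every candidate that is the same
 * file (same device and inode) as `self`. stat() follows symbolic
 * links, so a symlink to the wrapper is skipped as well.
 * On success the full path is stored in `out` and 0 is returned. */
static int find_target(const char *path, const char *name,
                       const struct stat *self, char *out, size_t outlen)
{
    const char *start = path;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t dirlen = end != NULL ? (size_t)(end - start) : strlen(start);
        struct stat st;
        int n;

        if (dirlen == 0)   /* an empty entry means the current directory */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, start, name);

        if (n > 0 && (size_t)n < outlen &&
            stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
            access(out, X_OK) == 0 &&
            !(st.st_dev == self->st_dev && st.st_ino == self->st_ino))
            return 0;

        if (end == NULL)
            break;
        start = end + 1;
    }
    return -1;
}
```

Comparing device and inode numbers instead of path strings is what keeps the check from being fooled by differently spelled paths or chains of symbolic links.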

An example will hopefully make things clear. The environment variable ZEROFILE_VERBOSE is used to increase the verbosity level of zerounlink so that we can see what is going on. Normally you would run with lower verbosity (level 2) or with no verbosity at all (1 for errors only, 0 for no output, not even errors). If ZEROFILE_VERBOSE is not set, level 0 (no output) is used.
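Mapping the variable to a level could be done along these lines. This is a sketch only: the function name is invented, and treating malformed values as 0 and capping the level at 3 (the highest value used in this document) are assumptions, not documented behaviour.

```c
#include <stdlib.h>

/* Hypothetical helper: turn the text of ZEROFILE_VERBOSE into a level.
 * Unset or empty means 0 (no output). Assumptions made for this sketch:
 * malformed or negative values also mean 0, and 3 is the highest level. */
static int verbosity_from_env(const char *value)
{
    char *end;
    long level;

    if (value == NULL || *value == '\0')
        return 0;

    level = strtol(value, &end, 10);
    if (*end != '\0' || level < 0)
        return 0;
    return level > 3 ? 3 : (int)level;
}
```

It would typically be called once at start-up as verbosity_from_env(getenv("ZEROFILE_VERBOSE")).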

 1  user@system:~/zerotools-0.1.2$ cd /usr/local/bin
 2  user@system:/usr/local/bin$ sudo ln -s zerounlink-wrapper rm
 3  user@system:/usr/local/bin$ ls -la rm
 4  lrwxrwxrwx  1 root root 18 2007-01-02 13:04 rm -> zerounlink-wrapper
 5  user@system:/usr/local/bin$ export ZEROFILE_VERBOSE=3
 6  user@system:/usr/local/bin$ rm foo
 7  zerounlink: SETUP: Running verbose, starting init-phase for process 921)
 8  zerounlink: SETUP: Starting critical
 9  zerounlink: SETUP: (doing init as the first thread)
10  zerounlink: SETUP: Opening libc to setup a pointer to real unlink
11  zerounlink: SETUP: Finding library owning the symbol 'div'
12  zerounlink: SETUP:  Owner: '/lib/libc.so.6'
13  zerounlink: SETUP: Getting address of real unlink
14  zerounlink: SETUP:  Real unlink at 0x4010a850
15  zerounlink: SETUP: Preparing zero-buffer
16  zerounlink: SETUP: Ready for operation
17  zerounlink: Invoked with ('foo')
18  zerounlink: Failed to stat 'foo', proceeding with normal unlink
19  zerounlink: Real unlink returns -1
20  rm: cannot remove `foo': No such file or directory
21  user@system:/usr/local/bin$ cd /tmp/
22  user@system:/tmp$ dd if=/dev/zero of=testfile bs=1024 count=5
23  5+0 records in
24  5+0 records out
25  5120 bytes transferred in 0.000482 seconds (10621218 bytes/sec)
26  user@system:/tmp$ export ZEROFILE_VERBOSE=2
27  user@system:/tmp$ rm testfile
28  zerounlink: ZEROING testfile (5120 bytes)
29  user@system:/tmp$ ls -la testfile
30  ls: testfile: No such file or directory

Listing 5: Configuring and testing zerounlink-wrapper

We next disable the zerounlink for the rm command. This is done on purpose so that you have to decide yourself whether you want to force such functionality on all users of the system when they run rm. It is also important to realize that a lot of background system activity will run ordinary system commands (from within scripts or via the shell), so it is not always easy to foretell exactly which program is executed and by whom. Since programs started by background services will not normally have a terminal at all, no amount of verbosity will be seen from zerounlink, since it writes all messages to stderr.

As zerounlink is generic, enabling or disabling the functionality on a program by program basis is purely an administrative decision. You can still bypass zerounlink by executing commands using their real path (/bin/rm in this case) so in specific cases this shouldn't be a big problem.

It is also possible to force zerounlink on all dynamically linked programs by using a feature of ld.so. Because this is a potentially dangerous operation with unforeseen side effects, it is not documented here. The manual page for the dynamic linker contains the necessary information on preloading specific files for all executables that it will start. Forcing zerounlink on statically linked programs is not possible (since they're not started with ld.so).

1  user@system:/tmp$ cd /usr/local/bin
2  user@system:/usr/local/bin$ sudo rm rm
3  user@system:/usr/local/bin$ ls -la rm
4  ls: rm: No such file or directory

Listing 6: Disabling zerounlink-wrapper functionality for a program

The next examples are purely of informational value. They demonstrate how one would integrate zerounlink (via zerounlink-wrapper) with the Debian package management tools. As the tools are used by many other Linux distributions as well (like Ubuntu), this might make it easier to start using zerounlink. Please read and understand the previous examples in order to decide whether zerounlink is really a tool that you want to use. Unless you plan to do block-level backups or delta comparisons at block level, zerounlink probably is not the tool that you're looking for. It is not a security-enhancing tool in any way, since there are already better tools for this.

We aim to integrate apt-get and dpkg with zerounlink. We also integrate localepurge, as it is an often used and useful utility to go with apt-get. Note that localepurge is normally run automatically on any new package installation via the apt-get hook script mechanism. There is no hook mechanism for dpkg (and this is partly the reason why zerotools exist now). In a perfect world all of the aforementioned tools would have some flag to enable zerofilling, but adding a very specific feature to all UNIX programs that unlink files at some point seems to be the wrong solution. Most of them already contain rarely used features.

If you don't have localepurge, feel free to skip the commands which are related to it.

 1  user@system:/tmp$ cd /usr/local/bin
 2  user@system:/usr/local/bin$ sudo ln -s zerounlink-wrapper apt-get
 3  user@system:/usr/local/bin$ sudo ln -s zerounlink-wrapper dpkg
 4  user@system:/usr/local/bin$ sudo which localepurge
 5  /usr/sbin/localepurge
 6  user@system:/usr/local/bin$ cd ../sbin
 7  user@system:/usr/local/sbin$ sudo ln -s /usr/local/bin/zerounlink-wrapper localepurge
 8  user@system:/usr/local/sbin$ which localepurge
 9  user@system:/usr/local/sbin$ sudo which localepurge
10  /usr/local/sbin/localepurge

Listing 7: Configuring zerounlink for apt-get, dpkg and localepurge

Since localepurge is normally installed in a directory not included in the regular user PATH, we create the symbolic link in the non-regular place (/usr/local/sbin/). This way we try to mirror the policy decision of the distribution and localepurge won't be readily available to regular users. They don't have the necessary privileges to run it properly anyway.

After "installing" zerounlink for the tools, we take the tools for a spin.

  1  user@system:/usr/local/sbin$ sudo su -
  2  system:~# export ZEROFILE_VERBOSE=2
  3  system:~# echo $ZEROFILE_VERBOSE
  4  2
  5  system:~# apt-get --purge remove nfs-kernel-server
  6  zerounlink: ZEROING /var/cache/apt/pkgcache.bin (5591390 bytes)
  7  Reading Package Lists... Done
  8  Building Dependency Tree... Done
  9  The following packages will be REMOVED:
 10    nfs-kernel-server*
 11  0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
 12  Need to get 0B of archives.
 13  After unpacking 274kB disk space will be freed.
 14  Do you want to continue? [Y/n] y
 15  (Reading database ... 20056 files and directories currently installed.)
 16  Removing nfs-kernel-server ...
 17  Stopping NFS kernel daemon: mountd nfsd.
 18  Unexporting directories for NFS kernel daemon...done.
 19  zerounlink: ZEROING /usr/share/doc/nfs-kernel-server/changelog.Debian.gz (5505 bytes)
 20  zerounlink: ZEROING /usr/share/doc/nfs-kernel-server/changelog.gz (16567 bytes)
 21  zerounlink: ZEROING /usr/share/doc/nfs-kernel-server/NEWS.Debian.gz (277 bytes)
 22  zerounlink: ZEROING /usr/share/doc/nfs-kernel-server/copyright (457 bytes)
 23  zerounlink: ZEROING /usr/share/doc/nfs-kernel-server/README (1162 bytes)
 24  zerounlink: ZEROING /usr/share/man/man8/nfsd.8.gz (703 bytes)
 25  zerounlink: ZEROING /usr/share/man/man8/exportfs.8.gz (2539 bytes)
 26  zerounlink: ZEROING /usr/share/man/man8/mountd.8.gz (1728 bytes)
 27  zerounlink: ZEROING /usr/share/man/man7/nfsd.7.gz (2790 bytes)
 28  zerounlink: ZEROING /usr/share/man/man5/exports.5.gz (7161 bytes)
 29  zerounlink: ZEROING /usr/sbin/rpc.nfsd (5148 bytes)
 30  zerounlink: ZEROING /usr/sbin/rpc.mountd (60280 bytes)
 31  zerounlink: ZEROING /usr/sbin/exportfs (35928 bytes)
 32  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.postinst (577 bytes)
 33  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.prerm (265 bytes)
 34  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.conffiles (74 bytes)
 35  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.md5sums (1050 bytes)
 36  Purging configuration files for nfs-kernel-server ...
 37  zerounlink: ZEROING /etc/exports (127 bytes)
 38  zerounlink: ZEROING /etc/default/nfs-kernel-server (88 bytes)
 39  zerounlink: ZEROING /etc/init.d/nfs-kernel-server (2356 bytes)
 40  zerounlink: Failed to stat '/var/lib/nfs/etab', proceeding with normal unlink
 41  zerounlink: Failed to stat '/var/lib/nfs/rmtab', proceeding with normal unlink
 42  zerounlink: Failed to stat '/var/lib/nfs/xtab', proceeding with normal unlink
 43  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.list (74 bytes)
 44  zerounlink: ZEROING /var/lib/dpkg/info/nfs-kernel-server.postrm (192 bytes)
 45  zerounlink: ZEROING /var/lib/dpkg/status-old (195484 bytes)
 46  zerounlink: ZEROING /var/lib/dpkg/updates/0000 (951 bytes)
 47  zerounlink: ZEROING /var/lib/dpkg/updates/0001 (985 bytes)
 48  zerounlink: ZEROING /var/lib/dpkg/updates/0002 (984 bytes)
 49  zerounlink: ZEROING /var/lib/dpkg/updates/0003 (982 bytes)
 50  zerounlink: ZEROING /var/lib/dpkg/updates/0004 (954 bytes)
 51  zerounlink: ZEROING /var/lib/dpkg/updates/0005 (891 bytes)
 52  zerounlink: ZEROING /var/lib/dpkg/updates/0006 (767 bytes)
 53  zerounlink: ZEROING /var/lib/dpkg/updates/0007 (767 bytes)
 54  zerounlink: ZEROING /var/lib/dpkg/updates/0008 (767 bytes)
 55  zerounlink: ZEROING /var/lib/dpkg/updates/0009 (109 bytes)
 56  zerounlink: ZEROING /var/lib/dpkg/available-old (174497 bytes)
 57  zerounlink: ZEROING /var/lib/dpkg/updates/tmp.i (4608 bytes)
 58  system:~# apt-get install vlock
 59  Reading Package Lists... Done
 60  Building Dependency Tree... Done
 61  The following NEW packages will be installed:
 62    vlock
 63  0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
 64  Need to get 0B/13.4kB of archives.
 65  After unpacking 86.0kB of additional disk space will be used.
 66  zerounlink: Failed to stat '/var/lib/dpkg/reassemble.deb', proceeding with normal unlink
 67  Selecting previously deselected package vlock.
 68  (Reading database ... 20037 files and directories currently installed.)
 69  Unpacking vlock (from .../archives/vlock_1.3-8_i386.deb) ...
 70  zerounlink: ZEROING control (574 bytes)
 71  zerounlink: ZEROING /var/lib/dpkg/status-old (195548 bytes)
 72  zerounlink: ZEROING /var/lib/dpkg/updates/0000 (124 bytes)
 73  zerounlink: ZEROING /var/lib/dpkg/updates/0001 (650 bytes)
 74  zerounlink: ZEROING /var/lib/dpkg/updates/0002 (643 bytes)
 75  zerounlink: ZEROING /var/lib/dpkg/available-old (174497 bytes)
 76  zerounlink: ZEROING /var/lib/dpkg/updates/tmp.i (4608 bytes)
 77  Setting up vlock (1.3-8) ...
 78  zerounlink: ZEROING /var/lib/dpkg/status-old (194704 bytes)
 79  zerounlink: ZEROING /var/lib/dpkg/updates/0000 (643 bytes)
 80  zerounlink: ZEROING /var/lib/dpkg/updates/0001 (664 bytes)
 81  zerounlink: ZEROING /var/lib/dpkg/updates/0002 (671 bytes)
 82  zerounlink: ZEROING /var/lib/dpkg/updates/0003 (665 bytes)
 83  zerounlink: ZEROING /var/lib/dpkg/available-old (174497 bytes)
 84  zerounlink: ZEROING /var/lib/dpkg/updates/tmp.i (4608 bytes)
 85  system:~# apt-get --purge remove vlock
 86  zerounlink: ZEROING /var/cache/apt/pkgcache.bin (5591390 bytes)
 87  Reading Package Lists... Done
 88  Building Dependency Tree... Done
 89  The following packages will be REMOVED:
 90    vlock*
 91  0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
 92  Need to get 0B of archives.
 93  After unpacking 86.0kB disk space will be freed.
 94  Do you want to continue? [Y/n] y
 95  (Reading database ... 20045 files and directories currently installed.)
 96  Removing vlock ...
 97  zerounlink: ZEROING /usr/share/man/man1/vlock.1.gz (944 bytes)
 98  zerounlink: ZEROING /usr/share/doc/vlock/changelog.Debian.gz (1796 bytes)
 99  zerounlink: ZEROING /usr/share/doc/vlock/README.gz (2990 bytes)
100  zerounlink: ZEROING /usr/share/doc/vlock/changelog.gz (331 bytes)
101  zerounlink: ZEROING /usr/share/doc/vlock/copyright (1544 bytes)
102  zerounlink: ZEROING /usr/bin/vlock (9944 bytes)
103  zerounlink: ZEROING /var/lib/dpkg/info/vlock.conffiles (17 bytes)
104  zerounlink: ZEROING /var/lib/dpkg/info/vlock.md5sums (431 bytes)
105  Purging configuration files for vlock ...
106  zerounlink: ZEROING /etc/pam.d/vlock (42 bytes)
107  zerounlink: ZEROING /var/lib/dpkg/info/vlock.list (17 bytes)
108  zerounlink: Failed to stat '/var/lib/dpkg/info/vlock.postrm', proceeding with normal unlink
109  zerounlink: ZEROING /var/lib/dpkg/status-old (195248 bytes)
110  zerounlink: ZEROING /var/lib/dpkg/updates/0000 (663 bytes)
111  zerounlink: ZEROING /var/lib/dpkg/updates/0001 (691 bytes)
112  zerounlink: ZEROING /var/lib/dpkg/updates/0002 (690 bytes)
113  zerounlink: ZEROING /var/lib/dpkg/updates/0003 (688 bytes)
114  zerounlink: ZEROING /var/lib/dpkg/updates/0004 (666 bytes)
115  zerounlink: ZEROING /var/lib/dpkg/updates/0005 (645 bytes)
116  zerounlink: ZEROING /var/lib/dpkg/updates/0006 (604 bytes)
117  zerounlink: ZEROING /var/lib/dpkg/updates/0007 (604 bytes)
118  zerounlink: ZEROING /var/lib/dpkg/updates/0008 (604 bytes)
119  zerounlink: ZEROING /var/lib/dpkg/updates/0009 (99 bytes)
120  zerounlink: ZEROING /var/lib/dpkg/available-old (174497 bytes)
121  zerounlink: ZEROING /var/lib/dpkg/updates/tmp.i (4608 bytes)

Listing 8: Testing zerounlink with package operations

Since we're using Debian (although the examples will also work on Ubuntu), we next proceed to limit some of the unnecessary files that are normally present. It is important that you understand what the effects of these actions are before you implement them. Disabling cache files for apt-cache will make it (and apt-get) slower, especially when you have many repositories.

In order to stop apt-get and apt-cache from creating the cache files, you'll need to modify your apt configuration. That is not shown here, as it is assumed that you're clever enough to be able to edit files (and make backups).

After editing the apt configuration, we remove the cache files manually using zerofile. The files would normally be recreated, but our new configuration will stop that from happening.

We then proceed to disable source lists (assuming we're not going to need source packages) and then misuse apt-get in order to remove the source list files.

In the last step we finish up by removing all the downloaded package files. Again, you should know what the ramifications of this action are. If in doubt, you shouldn't be doing any of the commands in the next example.

Since we have integrated apt-get with zerounlink, you'll notice that we don't need to use rm at all, but instead use the mechanisms that one would normally use to clean up a Debian system.

 1  system:~# cat /etc/apt/apt.conf.d/99-zerounlink
 2  // disable apt-cache from writing the cache files to disk
 3  // do this only if you have a moderately fast system and
 4  // don't use apt-cache very often. probably not suitable
 5  // for regular desktops at all
 6  Dir::Cache::pkgcache ;
 7  // unless you're doing development work with .deb packages
 8  // you might just as well disable the sources cache as well
 9  // (/etc/apt/sources.list will be modified shortly to comment
10  //  out the source repository entries)
11  Dir::Cache::srcpkgcache ;
12  system:~# zerofile /var/cache/apt/{pkgcache,srcpkgcache}.bin
13  zerofile: Zeroed and removed /var/cache/apt/pkgcache.bin
14  zerofile: Zeroed and removed /var/cache/apt/srcpkgcache.bin
15  system:~# # do this step only if you understand what disabling
16  system:~# # the source cache means.
17  system:~# sed -i 's/^deb-src/#deb-src/' /etc/apt/sources.list
18  system:~# # get rid of the source list files
19  system:~# apt-get update
20  Hit http://ftp.fi.debian.org stable/main Packages
21  Hit http://ftp.fi.debian.org stable/main Release
22  Hit http://security.debian.org stable/updates/main Packages
23  Hit http://security.debian.org stable/updates/main Release
24  Reading Package Lists... Done
25  system:~# ls -la /var/lib/apt/lists/*source*
26  ls: /var/lib/apt/lists/*source*: No such file or directory
27  system:~# du -s /var/cache/apt/archives/
28  14224   /var/cache/apt/archives/
29  system:~# apt-get clean
30  system:~# du -s /var/cache/apt/archives/
31  12      /var/cache/apt/archives/

Listing 9: Finishing up integration with Debian

That pretty much covers the normal operation of zerotools. More examples might be added with time and user participation (only examples captured from real sessions please).

Bugs and limitations

There are no known bugs. Known to me at least. Bugs will be fixed based on intelligent reporting by end users or automatically if I have extra time. My email address is at the start of each source file should you feel the need to send patches.

Thread-protection is not yet tested properly. Building or running zerotools on older Linux systems has not been attempted. User participation in testing is also appreciated.

Adding LFS support in 0.1.1 exposed a bug in the zero-filling write loop; the bug has been fixed in 0.1.1 (0.1 has the bug).

Version 0.1.2 fixes a "mis-feature" in which writes to unlinked files were never written to disk if all or some of the writes were still in the Linux kernel page cache. This meant that disk space wasn't actually being zeroed out when operating on small files. Version 0.1.2 fixes this by doing an explicit fdatasync before the final close. This also means that the tools will now run slower (or rather, run at the speed that they were supposed to run at originally). /me goes looking for a brown paper bag.
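The order of operations described here (overwrite, fdatasync, close, and only then unlink) can be illustrated with the sketch below. It is not the actual zerotools write loop: the function names and the 64 KiB buffer size are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first `size` bytes of the file
 * open on `fd` with zeroes and push them out of the page cache. */
static int zero_fill(int fd, off_t size)
{
    static const char zeroes[65536];   /* static storage is zero-initialised */
    off_t left = size;

    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    while (left > 0) {
        size_t chunk = left < (off_t)sizeof zeroes ? (size_t)left
                                                   : sizeof zeroes;
        ssize_t n = write(fd, zeroes, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted, try again */
            return -1;
        }
        left -= n;                     /* also copes with short writes */
    }
    /* Without this, the zeroes may never reach the disk once the file
     * has been unlinked and closed. */
    return fdatasync(fd);
}

/* Hypothetical helper: zero a file's contents, then remove its name. */
static int zero_and_unlink(const char *path)
{
    struct stat st;
    int rc;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    rc = zero_fill(fd, st.st_size);
    if (close(fd) != 0)
        rc = -1;
    return rc != 0 ? -1 : unlink(path);
}
```

The file is overwritten in place rather than truncated, so the zeroes land on the blocks that held the old contents.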

Version 0.1.2 adds a chmod for files which zerotools fail to open with O_WRONLY. In these cases the previous versions couldn't fill the files with zeroes. This exposes a small window of attack into the file contents (write direction only) from external processes. However, all other access is disabled (the bits are 0222 if this code path is taken) and the file is removed shortly after the chmod (i.e. the attack vector is not useful).
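The fallback might look roughly like this. Again this is a sketch under assumptions: the function name is invented, and retrying only when open fails with EACCES is a guess at the condition, since the text above only says that the open fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `path` for writing. If that is refused,
 * switch the mode to 0222 (write-only for everyone, no read or execute
 * access) and try once more. Returns a descriptor or -1.
 * Assumption: only EACCES triggers the retry. */
static int open_for_zeroing(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd >= 0 || errno != EACCES)
        return fd;

    /* chmod only succeeds for the file's owner or for root. */
    if (chmod(path, 0222) != 0)
        return -1;
    return open(path, O_WRONLY);
}
```

For root the first open normally succeeds regardless of the mode bits, so the chmod path is mostly relevant to unprivileged users removing their own read-only files.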

After the 0.1.2 fixes, a virtual machine (Feisty/x86-server install, with a full system upgrade, old-kernel removal and apt-get clean) compresses properly. Without zerotools the compressed (tar czvf) VM is 328 MiB. With zerofile integration it is 167 MiB.

Not using RTLD_NEXT with dlsym is not a bug (implementing both ways would have been ideal but not really critical).

Not using O_DIRECT is not a bug. Since 0.1.2 fdatasync is used for each file and it is more suited for synchronization.

Testing has been done on many Linux distributions running 2.4 and 2.6 kernels with glibc versions from 2.3.2 to 2.5.3 (with vendor patches, so this is advisory information only). The architectures that were tested are x86, x86_64, ia_64, and ppc (32-bit). Test results from other architectures are welcome.

zerounlink and zerounlink-wrapper do not work with statically linked executables. This is a limitation of the technique and there are no workarounds (known to me). Pushing zerounlink functionality into the kernel would be one possible solution (see below for why this hasn't been done).

One mis-feature exists in zerounlink: when checking whether a file should be zeroed or not, zerounlink doesn't take sparse files into account. If the file is sparse, it would make sense to overwrite only the parts of the file which have non-zero content. zerounlink would then have to read through the whole original file to find the non-zero content, instead of only writing to the file, and this would cause extra I/O operations. I'm not sure whether special sparse-file support is worth the extra effort and complexity, so this feature has not been implemented. Drop me a note if you think you need it. (The reason for wanting this feature has to do with how most dynamic virtual disks are implemented. They will allocate real space when any data is written to a vdisk area, even if the writes consist of only binary zeroes. It's a bit silly, but that's how most of them work. This is the same reason why filling the filesystem with a file containing only binary zeroes (with dd or another tool) is not really a good solution for cleaning up the vdisk.)
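To make the cost concrete, here is a rough sketch of what sparse-aware zeroing could look like. Nothing like this exists in zerounlink; the function names and the 4096-byte ZT_BLOCK granularity are hypothetical. Every block has to be read before deciding whether to write it, which is the extra I/O mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define ZT_BLOCK 4096   /* hypothetical granularity for this sketch */

/* Returns 1 if the first `len` bytes of `buf` are all zero. */
static int block_is_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical helper: overwrite only those blocks of `path` that hold
 * non-zero data, leaving holes and already-zero blocks untouched.
 * Returns the number of blocks rewritten, or -1 on error. */
long zero_only_nonzero_blocks(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char buf[ZT_BLOCK];
    char zeroes[ZT_BLOCK];
    memset(zeroes, 0, sizeof zeroes);

    long rewritten = 0;
    off_t offset = 0;
    ssize_t got;

    /* Reading a hole returns zeroes without allocating anything. */
    while ((got = pread(fd, buf, sizeof buf, offset)) > 0) {
        if (!block_is_zero(buf, (size_t)got)) {
            if (pwrite(fd, zeroes, (size_t)got, offset) != got) {
                close(fd);
                return -1;
            }
            rewritten++;
        }
        offset += got;
    }

    if (got < 0 || fdatasync(fd) < 0) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;
    return rewritten;
}
```

For a file with two data blocks separated by a hole, the function rewrites two blocks and never touches the hole, so a dynamic virtual disk would not be asked to allocate space for it.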

When reporting bugs about listings on this web page, please include the listing number and line number of the problematic bit to speed up fixes. For other web-related fixes, try to include a two- to three-word phrase that can be grepped for. Do not send diffs against xhtml.

Problem cases

There is one scenario where you should avoid using zerounlink (or any LD_PRELOAD-based unlink-triggered action). This is a corner case that has to do with guaranteed BSD VFS-based operation semantics. Namely: a single inode can be held open by a process even when its on-disk representation is unlinked. This is sometimes used by programs to create disk-backed shared memory segments: after all processes have mapped the region, the on-disk backing file is "removed" by unlinking it. Of course the on-disk area won't actually be deallocated, but there won't be any name that one can use to access it. The same happens when you rm a logfile while a daemon is still keeping it open in append mode: the file doesn't actually disappear anywhere, but you lose the name through which you can access the contents.

Since zerounlink will overwrite the old contents on unlink-action, it will cause problems for processes which rely on the above semantics. zerounlink will happily go on and fill the current file contents with binary zero, then do the unlink, but the process will still keep the target file open. For a logfile this means that previous log entries are replaced with binary zeroes (not that critical), but for processes that actually read the file later, it will cause data corruption. Fixing this properly would require integrating zerounlink functionality at kernel level, and after spending a couple of nights on the subject I decided to "move along". There is no easy way to associate file object closes with specific processes in Linux (easy in this case would mean fast as well). There are slow ways of doing it, but that would mean that using zerounlink would slow down each unlink operation in the system, even if one only wants to target the zero filling to specific processes (or programs). Some time was spent on discussing what would be the best way of doing "in kernel block device area releasing", in which case the virtualizing environment would get signals that specific ranges of blocks are no longer required by the virtual machine. Even in this case, associating the released blocks with file object closes becomes a problem that cannot be fixed without modifying each filesystem driver separately (i.e. no generic VFS-level solution exists that I know of).
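The corruption case is easy to reproduce in a single process. The sketch below plays both roles: a "holder" that keeps the file open and a "remover" that does what a zeroing unlink wrapper does. The function name and the payload string are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demonstration: returns 1 if a descriptor held open across
 * a zero-fill-then-unlink sees zeroes instead of its own data, 0 if it
 * still sees the data, and -1 on error. */
int holder_sees_zeroes(const char *path)
{
    static const char payload[] = "important state";
    char zeroes[sizeof payload];
    char seen[sizeof payload];
    int result = -1;

    assert(path != NULL);
    memset(zeroes, 0, sizeof zeroes);

    /* Holder: creates the file, writes to it and keeps it open. */
    int holder = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (holder < 0)
        return -1;
    if (write(holder, payload, sizeof payload) != (ssize_t)sizeof payload) {
        close(holder);
        return -1;
    }

    /* Remover: zero the contents through a second descriptor, then
     * unlink the name, as a zeroing unlink wrapper would. */
    int remover = open(path, O_WRONLY);
    if (remover >= 0) {
        ssize_t n = write(remover, zeroes, sizeof zeroes);
        close(remover);

        /* The inode is still alive because the holder has it open,
         * but what the holder reads back is no longer its own data. */
        if (n == (ssize_t)sizeof zeroes && unlink(path) == 0 &&
            pread(holder, seen, sizeof seen, 0) == (ssize_t)sizeof seen)
            result = memcmp(seen, zeroes, sizeof seen) == 0;
    }

    close(holder);
    return result;
}
```

With a plain unlink (no zero filling) the holder would read "important state" back, which is exactly the behaviour that programs relying on these semantics expect.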

Seeing as zerounlink works for me (and a number of other people), I leave the in-kernel integration to other people who might be able to solve the above problem properly. I only use zerounlink with rm and apt so the corner case isn't a problem for me. You've been warned however :-). The above case is the reason why you shouldn't force the zerounlink wrapper to be used with all loaded programs and this was the original reason why system wide integration of zerounlink isn't documented.

Obtaining the source code

For release tar-balls please consult the release directory. The most recent changelog is also included there.

You can also browse the source code online.

If you're bothered by the amount of comments in the code, feel free to remove them from your own copy. The comments exist so that the source code has some educational value. The same probably goes for this page, although I will be positively surprised if someone finds the software useful.