I don’t know if I actually know enough to write this post. But I want to record what little I do know about this.
The symptom is that my tarragondata volume on this system, tarragon, claims to be out of space. This is a btrfs volume, about which there are other posts. It contains most of the dynamic parts of the system. The root volume ‘/’ is very small, about 20GB: just enough to install the CentOS code and keep a few little things. The great majority of the information needed to run the system lives on tarragondata and is reached by symlinks from /: /home, mail, databases, websites and their data, the repositories, certificates, local scripts, and so on.
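The general pattern looks like this (illustrative only; these are not the actual commands or paths I used):

[root@tarragon ~]# mv /var/www /tarragon5data/www
[root@tarragon ~]# ln -s /tarragon5data/www /var/www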
This is a 180GB disk, currently running about 55% full, i.e. almost 100GB used. Among the information on this disk are nightly snapshots of all of tarragondata, kept for 30 days. This isn’t disaster backup or disk-failure backup (which is elsewhere), this is “operator error” backup.
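I won’t reproduce the actual snapshot job here, but a minimal sketch of the idea, assuming the data lives in a btrfs subvolume mounted at /tarragon5data with date-named snapshots kept under a .snapshots directory (all illustrative), looks like this:

#!/bin/bash
# Sketch of a nightly snapshot job -- not the real script.
SNAPDIR=/tarragon5data/.snapshots
cutoff=$(date -d '30 days ago' +%Y-%m-%d)

# Take tonight's read-only snapshot, named by date.
btrfs subvolume snapshot -r /tarragon5data "$SNAPDIR/$(date +%Y-%m-%d)"

# Prune snapshots older than 30 days; date-named, so a string compare works.
for snap in "$SNAPDIR"/*; do
    [[ $(basename "$snap") < "$cutoff" ]] && btrfs subvolume delete "$snap"
done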
A couple of weeks ago I began to experience a new kind of failure. In the middle of the night, this btrfs volume would suddenly report that it was out of space (usage 100%), although the amount of storage in use was still roughly the 100GB that it normally uses. It manifestly was not actually out of space.
When this disk is out of space, everything dies pretty quickly: mysqld, dovecot, postfix, and apache all start throwing errors. Mail delivery stops, people can’t authenticate to IMAP, and wailing and gnashing of teeth ensues.
I know a little more about this now, though by no means do I claim to have a full understanding. But this has to do with the way that btrfs manages space.
Here is the ordinary system-level df:
[root@tarragon ~]# df /tarragon5data
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/xvdf1     178256896 97931924  80189804  55% /tarragon5data
But here is what btrfs’ private version of df says:
[root@tarragon ~]# btrfs fi df /tarragon5data
Data, single: total=164.99GiB, used=88.51GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=5.01GiB, used=4.56GiB
GlobalReserve, single: total=326.34MiB, used=0.00B
The thing to note is that for Metadata, used is almost all of total. In addition, the overall device allocation, shown by btrfs fi show, was:
[root@tarragon ~]# btrfs fi show
Label: 'tarragondata' uuid: xxx
Total devices 1 FS bytes used 93.08GiB
devid 1 size 170.00GiB used 170.00GiB path /dev/xvdf1
Notice in this display that btrfs has allocated all of the space on the device (size 170.00GiB, used 170.00GiB). Internally it allocated only 5.01GiB for Metadata, and all the rest has already been handed to Data. When it suddenly needs more space for metadata, there is no way to allocate chunks for it, because every chunk on the device is already assigned.
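As an aside, newer versions of btrfs-progs can show both views at once, including the unallocated space directly; I didn’t capture its output at the time, but the command is:

[root@tarragon ~]# btrfs fi usage /tarragon5data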
In order to fix this problem, one has to instruct btrfs to free up some of the chunks currently assigned to data, making them available for metadata. I encountered two different commands which can help with this; both are variants of the btrfs balance command.
[root@tarragon]# btrfs fi balance start -dusage=5 /tarragon5data
Done, had to relocate 0 out of 172 chunks
[root@tarragon]# btrfs fi balance start -dusage=10 /tarragon5data
Done, had to relocate 10 out of 172 chunks
This tells btrfs to move data around, de-allocating chunks that are 5% or less full. It freed no chunks at 5%, so I ran it again at 10%, and that freed some up. This may have done the trick, but I looked at the btrfs fi df expecting to see a change, and when I didn’t, I read some more and found another article which advised:
[root@tarragon]# btrfs fi balance start -dlimit=3 /tarragon5data
Done, had to relocate 3 out of 163 chunks
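As an aside, btrfs balance start runs in the foreground, and on a big volume it can take a while; from another shell you can check how far along it is with:

[root@tarragon]# btrfs balance status /tarragon5data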
The first command may actually have done the job: although the btrfs fi df output didn’t change, had I thought to look at btrfs fi show I would have seen the difference. It now says:
[root@tarragon]# btrfs fi show
Label: 'tarragondata' uuid: xxx
Total devices 1 FS bytes used 93.07GiB
devid 1 size 170.00GiB used 158.02GiB path /dev/xvdf1
Now btrfs has unallocated space again, about 12GiB (170.00GiB minus 158.02GiB), which it can draw on the next time Metadata needs to grow.
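Since this failure mode gives no warning until services start dying, a cron job could watch for it. Here is a sketch of my own devising (the 5GiB threshold is an arbitrary choice, and it assumes a btrfs-progs new enough to have btrfs fi usage and its -b raw-bytes option):

#!/bin/bash
# Sketch: warn by mail before the volume runs out of unallocated chunks.
# Mount point and threshold are illustrative choices.
MOUNT=/tarragon5data
THRESHOLD=$((5 * 1024 * 1024 * 1024))   # 5GiB, in bytes

# "Device unallocated" is space btrfs has not yet committed to Data or
# Metadata chunks; -b reports raw bytes for easy comparison.
unalloc=$(btrfs fi usage -b "$MOUNT" | awk '/Device unallocated:/ {print $3}')

if [ "$unalloc" -lt "$THRESHOLD" ]; then
    echo "btrfs on $MOUNT: only $unalloc bytes unallocated; consider" \
         "btrfs balance start -dusage=10 $MOUNT" \
      | mail -s "btrfs chunk space low on $(hostname)" root
fi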