{"id":1126,"date":"2020-05-23T10:55:31","date_gmt":"2020-05-23T16:55:31","guid":{"rendered":"https:\/\/wmbuck.net\/blog\/?p=1126"},"modified":"2020-05-23T10:55:31","modified_gmt":"2020-05-23T16:55:31","slug":"out-of-space-on-btrfs","status":"publish","type":"post","link":"https:\/\/wmbuck.net\/blog\/?p=1126","title":{"rendered":"Out of space on btrfs"},"content":{"rendered":"\n<p>I don&#8217;t know if I actually know enough to write this post. But I want to record what little I do know about this. <\/p>\n\n\n\n<p>The symptom is that my tarragondata volume on this system, tarragon, claims to be out of space. This is a btrfs volume, about which there are other posts. It contains most of the dynamic parts of the system. The root volume &#8216;\/&#8217; is very small, about 20GB, just enough to install the CentOS code and keep a few little things. The great majority of the information needed to run the system is symlinked out of \/: this includes \/home, mail, databases, websites and their data, the repositories, certificates, local scripts, etc. <\/p>\n\n\n\n<p>This is a 180GB disk, and it is currently running about 55% full, i.e. almost 100GB used. Among the information on this disk are snapshots of all the tarragondata, taken every night and kept for 30 days. This isn&#8217;t disaster backup\/disk failure backup (which is elsewhere); this is &#8220;operator error&#8221; backup. <\/p>\n\n\n\n<p>A couple of weeks ago I began to experience a new kind of failure. In the middle of the night, this btrfs volume would suddenly report that it was out of space &#8211; usage 100% &#8211; although the amount of storage in use was still roughly the 100GB that it normally uses. It manifestly was not actually out of space. <\/p>\n\n\n\n<!--more-->\n\n\n\n<p>When this disk is out of space, everything dies pretty quickly: mysqld, dovecot, postfix, and apache all start throwing errors. Mail delivery stops, people can&#8217;t authenticate to IMAP, wailing and gnashing of teeth ensues. 
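<\/p>\n\n\n\n<p>An aside of my own, not part of the original incident: because the ordinary df keeps reporting plenty of free space in this failure mode, watching df percentages will not see it coming. What can see it coming is the devid line of btrfs fi show, where &#8220;used&#8221; means space already handed out to chunks. A minimal sketch of a check; the field positions and the 95% threshold are my assumptions: <\/p>\n\n\n\n

```shell
#!/bin/sh
# Sketch only: warn when btrfs has allocated (nearly) all raw device
# space to chunks. In that state `df` can still show ~55% used while
# metadata has nowhere left to grow.

# parse_allocated_pct: reads the "devid ... size ... used ..." line of
# `btrfs fi show` on stdin and prints the percentage of the device
# already allocated to chunks (integer, truncated).
parse_allocated_pct() {
  awk '/devid/ {
    gsub(/GiB/, "", $4)   # size of the device
    gsub(/GiB/, "", $6)   # amount already allocated to chunks
    printf "%d\n", ($6 / $4) * 100
  }'
}

# Hypothetical usage (needs root):
#   pct=$(btrfs fi show /tarragon5data | parse_allocated_pct)
#   [ "$pct" -ge 95 ] && echo "btrfs chunks nearly exhausted: ${pct}%"
```

\n\n\n\n<p>Back to the incident itself. 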
<\/p>\n\n\n\n<p>I know a little more about this now, though by no means do I claim to have a full understanding. But it has to do with the way that btrfs manages space. <\/p>\n\n\n\n<p>Here is the ordinary system-level df: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon ~]# df \/tarragon5data\nFilesystem     1K-blocks     Used Available Use% Mounted on\n\/dev\/xvdf1     178256896 97931924  80189804  55% \/tarragon5data<\/code><\/pre>\n\n\n\n<p>But here is what btrfs&#8217; private version of df says: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon ~]# btrfs fi df \/tarragon5data\nData, single: total=164.99GiB, used=88.51GiB\nSystem, single: total=4.00MiB, used=48.00KiB\nMetadata, single: total=5.01GiB, used=4.56GiB\nGlobalReserve, single: total=326.34MiB, used=0.00B<\/code><\/pre>\n\n\n\n<p>The thing to note is that for the type Metadata, the used is almost all of the total. In addition, the btrfs total usage, shown by btrfs fi show, was: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon ~]# btrfs fi show\nLabel: 'tarragondata'  uuid: xxx\n\tTotal devices 1 FS bytes used 93.08GiB\n\tdevid    1 size 170.00GiB used 170.00GiB path \/dev\/xvdf1<\/code><\/pre>\n\n\n\n<p>Notice in this display that btrfs has allocated all of the space on the device for use: it seems that it initially allocated only 5.01GiB for Metadata, and all the rest has already been allocated internally to the Data space. When it suddenly needs more space for metadata, there is no way to allocate blocks for that, because all the blocks are already in use. <\/p>\n\n\n\n<p>To fix this problem, one has to instruct btrfs to free up some of the blocks currently assigned to data so that they become available for metadata. I encountered two different commands which can help with this &#8211; both are variants of the btrfs balance command. 
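<\/p>\n\n\n\n<p>An aside of my own before the fix: since the root cause is that every chunk on the device has been allocated, one preventive measure is to run a small filtered balance periodically, so that mostly-empty data chunks get compacted and their space returned to the unallocated pool before metadata runs out of room. A hedged sketch of a cron entry; the file name, schedule, and threshold are my assumptions, not something from this incident: <\/p>\n\n\n\n

```
# Sketch of /etc/cron.d/btrfs-balance (hypothetical file name).
# Once a month, compact data chunks that are 10% full or less and
# return their space to the unallocated pool. Schedule, threshold,
# and mount point are examples only.
30 3 1 * * root /usr/sbin/btrfs balance start -dusage=10 /tarragon5data
```

\n\n\n\n<p>Now, the immediate recovery: 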
<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon]# btrfs fi balance start -dusage=5 \/tarragon5data\nDone, had to relocate 0 out of 172 chunks\n&#91;root@tarragon]# btrfs fi balance start -dusage=10 \/tarragon5data\nDone, had to relocate 10 out of 172 chunks<\/code><\/pre>\n\n\n\n<p>This tells btrfs to move data around, de-allocating blocks that are 5% or less full. It freed no blocks at 5%, so I did it again at 10%, and that freed some up. This may have done the trick, but I looked at the btrfs fi df output expecting to see a change, and when I didn&#8217;t I read some more and found another article which advised: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon]# btrfs fi balance start -dlimit=3 \/tarragon5data\nDone, had to relocate 3 out of 163 chunks<\/code><\/pre>\n\n\n\n<p>The first command may actually have done the job on its own: although the btrfs fi df output didn&#8217;t change, I should have looked at btrfs fi show, which now says: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;root@tarragon]# btrfs fi show\nLabel: 'tarragondata'  uuid: xxx\n\tTotal devices 1 FS bytes used 93.07GiB\n\tdevid    1 size 170.00GiB used 158.02GiB path \/dev\/xvdf1<\/code><\/pre>\n\n\n\n<p>Now btrfs has space available to allocate for metadata. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>I don&#8217;t know if I actually know enough to write this post. But I want to record what little I do know about this. The symptom is that my tarragondata volume on this system, tarragon, claims to be out of space. This is a btrfs volume, about which there are other posts. 
It contains most &hellip; <a href=\"https:\/\/wmbuck.net\/blog\/?p=1126\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Out of space on btrfs<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28,54,47,4],"tags":[],"class_list":["post-1126","post","type-post","status-publish","format-standard","hentry","category-backup","category-btrfs","category-centos","category-linux"],"_links":{"self":[{"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1126"}],"collection":[{"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1126"}],"version-history":[{"count":1,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1126\/revisions"}],"predecessor-version":[{"id":1127,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1126\/revisions\/1127"}],"wp:attachment":[{"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wmbuck.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1126"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}