macOS / OS X disk compression

While reading about AppNap and the fun tricks Apple employs to improve battery life, I happened across something I never knew: the HFS+ filesystem supports transparent compression. It seems like Apple’s intention was for this to be used for shipping system files, and not applied to user files, as evidenced by the fact that it’s virtually impossible to even figure out if it’s enabled, much less find a checkbox to simply turn it on for the filesystem.

But, enter afsctool. (Also available through Homebrew.) Wanting to try it out, I for some reason decided to try to compress the 1.5GB git repo I spend most of my time in at work. Let’s look at compression stats (-v, indirectly) for PHP files (-t php) and then the whole directory:

$ afsctool -vt php .
/Users/Matthew.Wagner/workrepo/.:

File content type: public.php-script
File extension(s): php
Number of HFS+ compressed files: 4895
Total number of files: 4904
File(s) size (uncompressed; reported size by Mac OS 10.6+ Finder): 22705078 bytes / 34.8 MB (megabytes) / 33.2 MiB (mebibytes)
File(s) size (compressed - decmpfs xattr; reported size by Mac OS 10.0-10.5 Finder): 1616066 bytes / 2.1 MB (megabytes) / 2 MiB (mebibytes)
File(s) size (compressed): 5948389 bytes / 6.4 MB (megabytes) / 6.1 MiB (mebibytes)
Compression savings: 73.8%
Approximate total file(s) size (files + file overhead): 9050906 bytes / 9.1 MB (megabytes) / 8.6 MiB (mebibytes)

Number of HFS+ compressed files: 17951
Total number of files: 21797
Total number of folders: 3706
Total number of items (number of files + number of folders): 25503
Folder size (uncompressed; reported size by Mac OS 10.6+ Finder): 1697530893 bytes / 1.75 GB (gigabytes) / 1.63 GiB (gibibytes)
Folder size (compressed - decmpfs xattr; reported size by Mac OS 10.0-10.5 Finder): 1615203435 bytes / 1.62 GB (gigabytes) / 1.51 GiB (gibibytes)
Folder size (compressed): 1628829083 bytes / 1.64 GB (gigabytes) / 1.52 GiB (gibibytes)
Compression savings: 4.0%
Approximate total folder size (files + file overhead + folder overhead): 1648262029 bytes / 1.65 GB (gigabytes) / 1.54 GiB (gibibytes)

The results really shouldn’t be surprising. The .git/ files are already stored compressed, so there’s not much to be gained there, hence an overall savings of only 4% in the repo. PHP files averaged a 73.8% reduction in size thanks to compression, saving me approximately 30MB on a 512GB disk. Hardly worthwhile, and I have to imagine this is going to come back to bite me down the road. (Why would I even think that compressing stuff I use hundreds of times a day was a good idea?!) afsctool -d will decompress a directory, though, so, assuming you don’t corrupt anything, it’s easy enough to roll back things that you compress for no good reason. (You could also use something like afsctool -c -s10 dirname/ to skip files unless they could be compressed more than 10%, to avoid doing the nonsense I did and “compressing” already-compressed files.)
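
The idea behind that threshold can be sketched in a few lines of Ruby. This is not how afsctool is implemented; it is a rough illustration that assumes zlib (through Ruby's standard Zlib module) is a fair stand-in for the filesystem's compressor. The helper names savings_ratio and worth_compressing? are made up for this example.

```ruby
require 'zlib'

# Fraction of bytes saved by deflating the data (0.0 when nothing is gained).
# Hypothetical helper; zlib here only approximates what the filesystem would do.
def savings_ratio(data)
  return 0.0 if data.empty?

  compressed = Zlib::Deflate.deflate(data)
  saved = 1.0 - compressed.bytesize.to_f / data.bytesize
  [saved, 0.0].max
end

# Mirrors the spirit of a "skip unless it shrinks by N percent" rule.
def worth_compressing?(data, min_savings_percent = 10)
  savings_ratio(data) * 100 >= min_savings_percent
end

source_like = "<?php echo 'hello world'; ?>\n" * 500 # repetitive text
random_like = Random.new(42).bytes(16_384)           # stands in for already-compressed data

puts "text worth compressing?   #{worth_compressing?(source_like)}"
puts "random worth compressing? #{worth_compressing?(random_like)}"
```

Random bytes play the role of the packed .git objects here: there is no redundancy left to squeeze out, so a threshold check like this would skip them.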

As I referenced in the beginning, I see no way to enable this for newly created files. You can compress the contents of an existing directory, but new stuff written there won’t benefit. It’s possible I haven’t found it yet, but it really feels like compression was intended for Apple to use at install-time.

With that in mind, here was a perhaps-more-reasonable thing to do:

# afsctool -s 10 -vc /Applications/Adobe\ Photoshop\ CC\ 2015
/Applications/Adobe Photoshop CC 2015:
Number of HFS+ compressed files: 7286
Total number of files: 8218
Total number of folders: 2499
Total number of items (number of files + number of folders): 10717
Folder size (uncompressed; reported size by Mac OS 10.6+ Finder): 1887395578 bytes / 1.91 GB (gigabytes) / 1.78 GiB (gibibytes)
Folder size (compressed - decmpfs xattr; reported size by Mac OS 10.0-10.5 Finder): 1030642518 bytes / 1.04 GB (gigabytes) / 991.6 MiB (mebibytes)
Folder size (compressed): 1035466738 bytes / 1.04 GB (gigabytes) / 996.2 MiB (mebibytes)
Compression savings: 45.1%
Approximate total folder size (files + file overhead + folder overhead): 1049476316 bytes / 1.05 GB (gigabytes) / 1000.9 MiB (mebibytes)

I shrank Photoshop by 45%. Note that some things, like Apple’s iMovie, are already compressed. (Also: why is iMovie on my work laptop?!) It seems like most non-Apple applications are uncompressed, and on average drop somewhere close to 50% in size.
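
Judging by the numbers in the outputs above, the “Compression savings” figure appears to be one minus the ratio of the compressed size to the uncompressed size. A quick check in Ruby (savings_percent is a name made up for this sketch, not anything afsctool exposes):

```ruby
# Percentage saved, given sizes in bytes. Hypothetical helper for checking
# the figures afsctool printed; not part of afsctool itself.
def savings_percent(uncompressed_bytes, compressed_bytes)
  ((1.0 - compressed_bytes.to_f / uncompressed_bytes) * 100).round(1)
end

puts savings_percent(1_887_395_578, 1_035_466_738) # Photoshop folder
puts savings_percent(22_705_078, 5_948_389)        # PHP files in the repo
puts savings_percent(1_697_530_893, 1_628_829_083) # the whole repo
```

These come out to 45.1, 73.8, and 4.0, matching the percentages afsctool reported.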

Of course, it’s very clear that, armed with this new hammer, every directory is a nail. Compressing rarely-used applications like Photoshop is probably reasonable; attempting to compress my working git repo was clearly not. Still, a neat tool for your toolbox.

5 indispensable bash tricks

Don’t mind the lame Buzzfeed title… Here are a few handy bash tricks and tips that people either use every day or never knew existed. Hopefully I can help move some of you into the first camp!

Introductory notes

A few of these commands involve working with bash history. On the advice of a coworker, I dropped this in my .bashrc to keep tons of history:

HISTSIZE=100000 # keep 100k commands in a session history (memory)
HISTFILESIZE=200000 # store 200k commands in my history file (on disk)

Disk space is cheap, as is memory. The number of times (prior to this change) that I wanted a command that had aged out of my bash history is much greater than the number of times I’ve found bash cumbersome because my history file is almost 1MB in size (when I have a 500GB SSD and 16GB RAM in my 2-year-old laptop).

Meta key

A number of bash commands reference a Meta key. In general, on a Mac, the Escape key fills that role. On Linux, it’s generally the Alt key. You can change that, but if you’ve done so, you don’t need me to tell you about it. My examples will use Esc for these commands, but if you’re on a Linux box, you will likely want to substitute Alt for it.

Esc-. | Insert last argument

Described in the docs as insert-last-argument (M-., M-_), this keyboard shortcut will spit out the last argument to the previous command. On a Mac, the Meta key is Escape; on Linux, it’s often Alt.

Example usage:

$ mkdir -p long/directory/name/that/would_suck_to/type
$ cd Esc .

The Esc + . will be expanded into long/directory/name/that/would_suck_to/type.

Note that Esc + _ is bound to the same function, but is a bit tougher to type.

Ctrl+R | Reverse history search

This one is tough to explain, but magical. Have you ever hit the up arrow a bunch of times to scroll through history, trying to find something you ran recently? Ctrl + r will open up an interactive search, or reverse-i-search in bash parlance.

Recently used vim on a file with a long filename? Press Ctrl + r and start typing vim. The most recent command matching vim will be shown. Keep typing to make your search more specific, or press Ctrl + r again to step back to the next-most-recent match. When you find what you want, press Enter to run it, or the right arrow to start moving the cursor through the command. (Or something like Ctrl+E to jump to the end of the line.)

If you want to be really nutty, you can start commenting your commands at the end. vim /etc/X11/xorg.conf # fix video settings will allow Ctrl + r + video to match a search, for example. I’ve been known to throw in random keywords I think I might try looking for later on.

cd - | Return to previous directory

pushd and popd are awesome and you should use them. But sometimes you forget. bash has got your back. cd - will return you to the previous directory you were in. (This is stored in the OLDPWD environment variable.)

git checkout - | Switch back to the previous branch

If you use git, you’ll be delighted to know that it does something similar. git checkout - will check out the previous branch you were on. I’m often bad at cleaning up topic branches, and will git checkout master to do some catching up, and then realize I don’t remember what my topic branch was called. Sure, it would probably take me all of 30 seconds to figure it out, but checking out - is so much easier.

!! | Re-run the previous command

!! will re-run the command you just ran. Why not just hit the up arrow? Because !! can be combined. The most common usage:

$ cat /root/whatever
Permission denied
$ sudo !!
sudo cat /root/whatever
whatever


Hope you learned something useful! What other neat tricks should I know about?

Counting open files by process

A site I host is offline, throwing the error “Too many open files.” The obvious solution would be to bounce the webserver to release all the file handles, but I wanted to figure out what was using all of them and see if I could figure out why they were leaking in the first place.

I had a few hunches, so I ran lsof -p PID on a few of them. But none of them had excessive files open. After a couple of minutes of guessing, I realized this was stupid, and set out to script things.

I hacked this quick-and-dirty script together:


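# Linux-only: relies on procfs. Run as root to see every process's fd directory.
pids = Dir.entries('/proc').select { |p| p.match?(/\A\d+\z/) }
puts "Found #{pids.size} processes..."
pfsmap = {}
pids.each do |pid|
  begin
    # Dir.children skips "." and "..", so this counts only real descriptors
    files = Dir.children("/proc/#{pid}/fd").size
    # cmdline arguments are NUL-separated; swap in spaces for display
    cmdline = File.read("/proc/#{pid}/cmdline").tr("\0", ' ').strip
  rescue SystemCallError
    next # process exited, or we lack permission to inspect it
  end
  pfsmap[pid] = {
    :files => files,
    :name => cmdline
  }
end

# Sort ascending by open-file count so the worst offenders print last
pfsmap.sort_by { |_pid, info| info[:files] }.each do |pid, info|
  puts "#{info[:files]}\t#{pid}\t#{info[:name]}"
end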

There’s got to be a better way to get a process list from procfs than regexp-matching the directories in /proc whose names are purely numeric. But that is what I do, and then, for each process, I count how many entries are in /proc/PID/fd and sort by that. So that the output isn’t just a giant mess of numbers, I also read /proc/PID/cmdline.

This is hardly a polished script, but it did the job — it identified a script that was hitting the default 1024 FD limit. I was then able to lsof that and find… that they’re all UNIX sockets, so it’s anyone’s guess what they go to. So I just rolled Apache like a chump. Oh well. Maybe it’ll help someone else—or maybe someone knows of a less-ugly way to do some of this?
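
One way to take some guesswork out of “is this process near its limit?” is to compare each count against the limit the kernel reports in /proc/PID/limits. The sketch below assumes the usual Linux layout of that file (a “Max open files” row with soft and hard columns). The helper names and the 90% threshold are arbitrary choices for illustration, and the sample text is invented.

```ruby
# Pull the soft "Max open files" limit out of the text of /proc/PID/limits.
# Returns nil if the row is missing, Float::INFINITY for "unlimited".
def soft_fd_limit(limits_text)
  line = limits_text.each_line.find { |l| l.start_with?('Max open files') }
  return nil unless line

  soft = line.split[3] # fields: Max, open, files, <soft>, <hard>, files
  soft == 'unlimited' ? Float::INFINITY : Integer(soft)
end

# True when a process is using at least `threshold` of its soft limit.
def near_fd_limit?(open_count, soft_limit, threshold = 0.9)
  !soft_limit.nil? && open_count >= soft_limit * threshold
end

# Sample text in the shape /proc/PID/limits normally has (values invented).
sample = <<~LIMITS
  Limit                     Soft Limit           Hard Limit           Units
  Max processes             63432                63432                processes
  Max open files            1024                 4096                 files
LIMITS

limit = soft_fd_limit(sample)
puts "soft limit: #{limit}"
puts "1000 open files near the limit? #{near_fd_limit?(1000, limit)}"
```

On a real box you would feed it File.read("/proc/#{pid}/limits") alongside the fd count from the script above.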

Fixing Gluster’s replicate0: Unable to self-heal permissions/ownership error

I helped recover a Gluster setup that had gone bad today, and wanted to write up what I did because there’s precious little information out there on what’s going on. Note that I don’t consider myself a Gluster expert by any stretch.

The problem

The ticket actually came to me as an Apache permissions error:

[Thu Mar 03 07:37:58 2016] [error] [client 192.168.1.1] (5)Input/output error: file permissions deny server access: /var/www/html/customer/header.jpg

(Full disclosure: I’ve mucked with all the logs here to not reveal any customer data.)

We suspected that it might have something to do with Gluster, which turned out to be correct. This setup is a pair of servers running Gluster 3.0.

We looked at Gluster’s logs, where we saw a ton of stuff like this:

[2016-03-03 15:54:53] W [fuse-bridge.c:862:fuse_fd_cbk] glusterfs-fuse: 97931: OPEN() /customer/banner.png => -1 (Input/output error)
[2016-03-03 16:00:26] E [afr-self-heal-metadata.c:566:afr_sh_metadata_fix] replicate0: Unable to self-heal permissions/ownership of '/customer/style.css' (possible split-brain). Please fix the file on all backend volumes

Those are separate errors for separate files, but both share a good degree of brokenness.

The cause

For reasons I haven’t yet identified, but am arbitrarily assuming to be a Gluster bug, Gluster got into a split-brain situation on the metadata of those files. Debugging this was a bit of an adventure, because there’s little information out there on how to proceed.

getfattr

After a lot of digging and hair-pulling, I eventually came across this exchange on gluster-users that addresses an issue that looked like ours. The files appeared the same and to have the same permissions, but Gluster thought they mismatched.

Gluster stores some information in extended attributes, or xattrs, on each file. This happens on the brick, not on the mounted Gluster filesystem. You can examine that with the getfattr tool. Some of the attributes are named trusted.afr.<brickname> for each host. As Jeff explains in that gluster-users post:

The [trusted.afr] values are arrays of 32-bit counters for how many updates we believe are still pending at each brick.

In that example, their values were:

[root@ca1.sg1 /]# getfattr -m . -d -e hex /data/gluster/lfd/techstudiolfc/pub
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/lfd/techstudiolfc/pub
trusted.afr.shared-client-0=0x000000000000000000000000
trusted.afr.shared-client-1=0x000000000000001d00000000
trusted.gfid=0x3700ee06f8f74ebc853ee8277c107ec2

[root@ca2.sg1 /]# getfattr -m . -d -e hex /data/gluster/lfd/techstudiolfc/pub
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/lfd/techstudiolfc/pub
trusted.afr.shared-client-0=0x000000000000000300000000
trusted.afr.shared-client-1=0x000000000000000000000000
trusted.gfid=0x3700ee06f8f74ebc853ee8277c107ec2

Note that their values disagree. One sees shared-client-1 with a “1d” value in the middle; the other sees shared-client-0 with a “03” in the middle. Jeff explains:

Here we see that ca1 (presumably corresponding to client-0) has a count of 0x1d for client-1 (presumably corresponding to ca2). In other words, ca1 saw 29 updates that it doesn’t know completed at ca2. At the same time, ca2 saw 3 operations that it doesn’t know completed at ca1. When there seem to be updates that need to be propagated in both directions, we don’t know which ones should superseded which others, so we call it split brain and decline to do anything lest we cause data loss.
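
To make those hex strings easier to read, here is a small Ruby sketch that splits a trusted.afr value into its three 32-bit counters. The labels data, metadata, and entry reflect how I understand AFR to order the counters, which the quoted post does not spell out, so treat them as an assumption. decode_afr is a made-up helper name.

```ruby
# Split a trusted.afr.* value (as printed by `getfattr -e hex`) into its
# three big-endian 32-bit counters. The data/metadata/entry labels are an
# assumption about the counter order, not something the quoted post states.
def decode_afr(hex_value)
  hex = hex_value.sub(/\A0x/i, '')
  raise ArgumentError, "expected 24 hex digits, got #{hex.inspect}" unless hex.match?(/\A\h{24}\z/)

  data, metadata, entry = [hex].pack('H*').unpack('N3')
  { data: data, metadata: metadata, entry: entry }
end

p decode_afr('0x000000000000001d00000000') # ca1's view of shared-client-1
p decode_afr('0x000000000000000300000000') # ca2's view of shared-client-0
```

The middle counter comes out as 29 for the first value and 3 for the second, which lines up with the counts Jeff describes.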

Red Hat has a knowledgebase article on this, though it’s behind a paywall.

If you run getfattr and have no output, you’re probably running it on the shared Gluster filesystem, not on the local machine’s brick. Run it against the brick.

Fixing it

Don’t just skip here; this isn’t a copy-and-paste fix. :)

To fix this, you want to remove the offending xattrs from the brick on one of the split-brain nodes, and then stat the file to get it to automatically self-heal.

Use the trusted.afr.whatever values. I unset all of them, one per brick—but remember, only do this on one node! Don’t run it on both!

In our case, we had one node that looked like this:

trusted.afr.remote1315012=0x000000000000000000000000
trusted.afr.remote1315013=0x000000010000000100000000

And the other looked like this:

trusted.afr.remote1315012=0x000000010000000100000000
trusted.afr.remote1315013=0x000000000000000000000000

(Note here that the same two values appear on both hosts, but not for the same keys. One has the 1s on remote1315013, and one sees them on remote1315012.)
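
That mirror-image pattern is what marks the split brain: each node holds non-zero pending counters against a different brick. A rough check for a two-brick replica, written as a Ruby sketch (the function names and the simple non-zero test are my own simplification, not Gluster's actual logic):

```ruby
# trusted.afr.* keys whose pending counters are not all zero on one node.
def pending_keys(xattrs)
  xattrs.select do |key, hex|
    key.start_with?('trusted.afr.') && hex.sub(/\A0x/i, '').match?(/[1-9a-f]/i)
  end.keys.sort
end

# Heuristic for a two-brick replica: both nodes report pending updates, and
# they report them against different bricks, so neither copy can be trusted.
def looks_like_split_brain?(node_a_xattrs, node_b_xattrs)
  a = pending_keys(node_a_xattrs)
  b = pending_keys(node_b_xattrs)
  !a.empty? && !b.empty? && a != b
end

node_a = {
  'trusted.afr.remote1315012' => '0x000000000000000000000000',
  'trusted.afr.remote1315013' => '0x000000010000000100000000'
}
node_b = {
  'trusted.afr.remote1315012' => '0x000000010000000100000000',
  'trusted.afr.remote1315013' => '0x000000000000000000000000'
}

puts looks_like_split_brain?(node_a, node_b)
```

If only one node had non-zero counters, or both pointed at the same brick, there would be an obvious direction to heal in and this check would return false.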

Since it’s not like one is ‘right’ and the other is ‘wrong’, on one of the two nodes, I unset both xattrs, using setfattr:

setfattr -x trusted.afr.remote1315012 /mnt/brick1315013/customer/header.jpg
setfattr -x trusted.afr.remote1315013 /mnt/brick1315013/customer/header.jpg

I ran the getfattr command again to make sure the attributes had disappeared. (Remember: this is on the brick, not the presented Gluster filesystem.)

Then, simply stat the file on the mounted Gluster filesystem on that node, and it should automatically correct the missing attributes, filling them in from the other node. You can verify again with getfattr against the brick.

If this happens for a bunch of files, you can simply script it.
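
As a starting point, here is a Ruby sketch that only prints the setfattr commands for a list of files, so they can be reviewed before anything is run. The brick path and attribute names are the ones from my example above and will differ on your setup. removal_commands is a hypothetical helper, and its output should only ever be run against one node's brick.

```ruby
require 'shellwords'

# Build (but do not run) one `setfattr -x` command per attribute per file.
# brick_root and afr_keys are placeholders taken from the example above.
def removal_commands(brick_root, relative_paths, afr_keys)
  relative_paths.flat_map do |rel|
    path = File.join(brick_root, rel)
    afr_keys.map { |key| "setfattr -x #{Shellwords.escape(key)} #{Shellwords.escape(path)}" }
  end
end

keys  = %w[trusted.afr.remote1315012 trusted.afr.remote1315013]
files = ['customer/header.jpg', 'customer/style.css']

puts removal_commands('/mnt/brick1315013', files, keys)
```

After running the reviewed commands on the brick, each file still needs a stat through the mounted Gluster filesystem to trigger the self-heal.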