A site I host went offline, throwing the error “Too many open files.” The obvious solution would be to bounce the webserver to release all the file handles, but I wanted to find out what was using all of them and why they were leaking in the first place.
I had a few hunches about which processes were to blame, so I ran lsof -p PID on a few of them, but none of them had an excessive number of files open. After a couple of minutes of guessing, I realized this was stupid, and set out to script things.
I hacked this quick-and-dirty script together:
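# Purely numeric directory names under /proc are PIDs; anchor the regexp so partial matches don't count.
pids = Dir.entries('/proc').select { |p| p.match(/\A\d+\z/) }
puts "Found #{pids.size} processes..."
pfsmap = {}
pids.each do |pid|
  begin
    # Subtract 2 for the "." and ".." entries.
    files = Dir.entries("/proc/#{pid}/fd").size - 2
    # cmdline is NUL-separated; turn the separators into spaces.
    cmdline = File.read("/proc/#{pid}/cmdline").tr("\0", ' ').strip
  rescue SystemCallError
    next # the process exited, or we aren't allowed to read its fd directory
  end
  pfsmap[pid] = {
    :files => files,
    :name => cmdline
  }
end
# Sort by open-file count, so the worst offenders end up at the bottom.
pfsmap.sort_by { |_pid, info| info[:files] }.each do |pid, info|
  puts "#{info[:files]}\t#{pid}\t#{info[:name]}"
end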
There’s got to be a better way to get a process list from procfs than regexp-matching the directories in /proc whose names are purely numeric. But that is what I do, and then, for each process, I count how many entries are in /proc/PID/fd and sort by that. So that the output isn’t just a giant mess of numbers, I also read /proc/PID/cmdline.
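One possible variation on the above, as a rough sketch rather than a polished tool: list the PIDs with a glob, and compare each process's open-FD count against its soft "Max open files" limit as reported in /proc/PID/limits. The helper names (numeric_pids, soft_fd_limit, fd_usage) and the proc_root parameter are made up for illustration; proc_root exists only so the code can be pointed at a test directory.

```ruby
# Sketch only: helper names and the proc_root parameter are hypothetical.

# PIDs are the entries under /proc whose names are purely numeric.
def numeric_pids(proc_root = '/proc')
  Dir.glob(File.join(proc_root, '[0-9]*'))
     .map { |path| File.basename(path) }
     .select { |name| name.match?(/\A\d+\z/) }
end

# Pull the soft limit out of the text of /proc/PID/limits.
# Returns nil if there is no "Max open files" line.
def soft_fd_limit(limits_text)
  line = limits_text.each_line.find { |l| l.start_with?('Max open files') }
  return nil unless line

  soft = line.sub('Max open files', '').split.first
  soft == 'unlimited' ? Float::INFINITY : soft.to_i
end

# Open-FD count and soft limit for one process, or nil if it can't be read.
def fd_usage(pid, proc_root = '/proc')
  open_fds = Dir.children(File.join(proc_root, pid, 'fd')).size
  limit = soft_fd_limit(File.read(File.join(proc_root, pid, 'limits')))
  { pid: pid, open: open_fds, limit: limit }
rescue SystemCallError
  nil # the process exited, or we lack permission
end

if __FILE__ == $PROGRAM_NAME
  numeric_pids.map { |pid| fd_usage(pid) }
              .compact
              .sort_by { |usage| -usage[:open] }
              .first(10)
              .each { |u| puts "#{u[:pid]}\t#{u[:open]}/#{u[:limit]}" }
end
```

Run as root, this should print the ten processes with the most open FDs, each next to its limit, so anything sitting close to 1024/1024 stands out.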
This is hardly a polished script, but it did the job: it identified a script that was hitting the default 1024 FD limit. I was then able to lsof that and find… that they’re all UNIX sockets, so it’s anyone’s guess what they go to. So I just bounced Apache like a chump. Oh well. Maybe it’ll help someone else, or maybe someone knows of a less-ugly way to do some of this?
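For the UNIX socket mystery, one thing that might narrow it down (a sketch, not something I ran at the time): each socket FD in /proc/PID/fd is a symlink of the form socket:[INODE], and /proc/net/unix lists UNIX sockets with their inode and, for bound sockets, a path. The code below assumes the usual /proc/net/unix column layout (inode in the seventh column, optional path in the eighth). The helper names and the proc_root parameter are invented for illustration.

```ruby
# Sketch only: helper names and the proc_root parameter are hypothetical.

# Map inode => path for every UNIX socket in /proc/net/unix that has a path.
# Assumes the header line comes first and the columns are:
# Num RefCount Protocol Flags Type St Inode [Path]
def unix_socket_paths(net_unix_text)
  net_unix_text.each_line.drop(1).each_with_object({}) do |line, map|
    fields = line.strip.split(' ', 8)
    next if fields.size < 8 # unnamed socket: no path column

    map[fields[6]] = fields[7]
  end
end

# Inodes of all socket FDs held by a process, read from the fd symlinks.
def socket_inodes(pid, proc_root = '/proc')
  fd_dir = File.join(proc_root, pid.to_s, 'fd')
  links = Dir.children(fd_dir).map do |fd|
    begin
      File.readlink(File.join(fd_dir, fd))
    rescue SystemCallError
      nil # the FD was closed between listing and reading it
    end
  end
  links.compact.map { |target| target[/\Asocket:\[(\d+)\]\z/, 1] }.compact
end

if __FILE__ == $PROGRAM_NAME && ARGV[0]
  paths = unix_socket_paths(File.read('/proc/net/unix'))
  socket_inodes(ARGV[0]).each do |inode|
    puts "#{inode}\t#{paths.fetch(inode, '(unnamed, or not a UNIX socket)')}"
  end
end
```

Connected client sockets are usually unnamed, so this only resolves the bound ones; a pile of unnamed entries would still be anyone's guess.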
I wondered about that before too!
The last time I checked, iterating over /proc was the way to do things.
Both the matching Ruby gem and even ps seem to just parse /proc.
One thing I’ve always really liked is using wildcards with Dir[] (which globs, like Dir.glob).
This would, e.g., give you each PID plus its number of FDs:
irb(main):008:0> Dir['/proc/*/fd/*'].select{|s| s[/\d+\/fd/]}.group_by{|s| s[/(\d+)\/fd/,1]}.map{|k,v| [k,v.size]}
=> [["1", 40], ["161", 22], ["196", 14], ["260", 16], ["261", 11], ["265", 4], ["763", 22], ["787", 5], ["9383", 11], ["9385", 17], ["9387", 7], ["9392", 4], ["9407", 11]]
irb(main):009:0>
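To get from that list to "who is the worst offender", the same glob can be sorted by count. A small sketch, with fd_counts as a made-up helper name; it takes the list of paths as an argument so it can be tried on any array of strings:

```ruby
# Sketch only: fd_counts is a hypothetical helper name.

# Given paths like "/proc/123/fd/4", return [pid, count] pairs,
# largest count first. Non-numeric entries such as /proc/self are skipped.
def fd_counts(paths)
  paths.map { |path| path[%r{/(\d+)/fd/}, 1] }
       .compact
       .group_by(&:itself)
       .map { |pid, hits| [pid, hits.size] }
       .sort_by { |_pid, count| -count }
end

if __FILE__ == $PROGRAM_NAME
  fd_counts(Dir['/proc/*/fd/*']).first(10).each do |pid, count|
    puts "#{count}\t#{pid}"
  end
end
```

As with the original script, FDs belonging to other users' processes only show up when this runs as root.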