Saturday, February 4, 2012

Optimize Your Caching Scheme to Eliminate “Too many links” Errors on Linux

Hello there! If you are new here, you might want to subscribe to the RSS feed to receive updates.
Say you have an application running on a Linux server using the ext2 or ext3 file systems. You set up a caching scheme in your application to store files like so:
/cache/stamps/1/main_content.cache
/cache/stamps/1/comments.cache

/cache/stamps/2/main_content.cache
/cache/stamps/2/comments.cache

/cache/stamps/3/main_content.cache
/cache/stamps/3/comments.cache
...
Eventually, you’re going to run into a problem: Your caching will stop working since your application won’t be able to write to the “stamps” directory any more. Instead, you’ll get an error message like this:
Couldn't create cache directory: /stamps/41134/main_content (Too many links - /var/www/your-app/tmp/cache/stamps/41134)
At that point, you’ll try to search for “too many links”, and probably you won’t find much information that is actually comprehensible to normal human beings. Fortunately, it really is quite simple to repair once you understand what’s going on.

Wikipedia on ext3:
Since ext3 aims to be backwards compatible with the earlier ext2, many of the on-disk structures are similar to those of ext2. Because of that, ext3 lacks a number of features of more recent designs, such as extents, dynamic allocation of inodes, and block suballocation.[9] There is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000 links per inode.[10]
As I understand it, this is what is happening: Each subdirectory in a Linux parent directory contains 2 links: “.” and “..” ext3 can have 32,000 links per inode, but you have to subtract 2 for “.” and “..” – which gives us 32,000 – 2 = 31,998.
“..” is actually a link back to the parent directory (as in “cd ..” to go up one directory). So in our example above, when our /cache/stamps/ directory hits entry number 31999 (/cache/stamps/31999/...), we’ve got big problems. The number of links (“..”) from each numbered subdirectory back to the “stamps” parent directory has now exceeded the maximum allowed by the file system. Oops!
At the very least, we now know one thing for sure: You can’t have more than 31,998 subdirectories in any given directory on an ext3 file system.
Of course, you could just put ALL your cache files into one directory like /cache/stamps/, and forget about all this subdirectory nonsense. That would work a little better, but it isn’t terribly wise in terms of file system performance. Imagine the file system having to find one file in a directory with thousands and thousands of cache files! Bad juju.
You could also simply implement cache expiration via a cron job. The cron job would run a script that would just expire any cache files older than a certain time/date, or perhaps it could get the id of the newest stamp, and then expire all cache files than are not within, say, 10,000 of the latest id number in your stamps table. In both cases, a large number of users on your site may mean that your expiration script won’t run soon enough, and your app will still be back to the same old “Too many links” error. At that point, your caching scheme becomes useless for any stamps that don’t already have cache files, and your site may slow to a crawl anyway! Also bad juju…
If you ask me, systems that expire cache files after a certain period of time even if the content hasn’t been updated are terribly stupid. Once a cache file is generated, it shouldn’t be touched unless the content is updated or deleted. That is the best way to achieve maximum performance for the end user. Besides, why make your server regenerate cache files when it doesn’t have to?!
What to do?
The solution is actually rather clever, and it is apparently used by the likes of Wiki software and postfix.
I’m going to give the solution in Ruby, but it can be implemented just as easily in PHP or any other language.
In order to maximize file system performance and eliminate the 31,998 subdirectory bottleneck, we’re going to create a whole boatload of nested subdirectories in which we put our cache files. This may seem counterintuitive, but bear with me.
First, we use an identifier to generate an MD5 hash. For the stamp with id=31999, we could simply generate the MD5 hash of 31999. Each MD5 hash will consist of 16 bytes of data in hexadecimal format (i.e. 32 characters, each with a value of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, or f).
MD5(31999) = 5547d0f7f6110b2d0572695a61bfdfea
Next, we change our hash into a directory path 16 levels deep, like so:
55/47/d0/f7/f6/11/0b/2d/05/72/69/5a/61/bf/df/ea
We can then use this path to store our cache files. To continue with our stamps example, the path for the cache files for the stamp with id=31999 would be:
/cache/stamps/55/47/d0/f7/f6/11/0b/2d/05/72/69/5a/61/bf/df/ea/
instead of:
/cache/stamps/31999/
Why are we doing this? Well, we don’t want to have ANY directory with anywhere near 31,998 subdirectories. Using the scheme above, each stamp’s cache files will be stored in a subdirectory of /cache/stamps/ that is 16 levels deep. Each of those 16 directories has 256 possible subdirectories itself, from 0×00 to 0xff. Since none of the nested subdirectories will have more than 256 subdirectories themselves, our system will be far, far away from the 31,998 subdirectory limit. And, instead of being limited to only 31,998 stamps, we can now store cache files for 25616 stamps, or 3.403 × 1038. That’s a LOT of stamps!
If that didn’t make much sense, maybe this flowchart will help:
Click to Enlarge
Click to Enlarge
The only requirement we now have is that whenever we’re generating or reading a cache file, we have to be able to find the appropriate path to get to the right cache files. In our original scheme, we just used the stamp’s id number. Now, all we have to do is write a simple function that will take the stamp’s id number, use it to generate an MD5 hash, and then break up the hash into a 16-level deep file system path. In Ruby, you can do this like so:
require 'openssl'

def get_cache_path(id)

# OpenSSL's MD5 algorithm is 10 times faster than Ruby's!!
the_hash = OpenSSL::Digest::MD5.hexdigest("--stamp--#{id}--")
return the_hash.scan(/.{2}/).join('/')
end
First of all, I’ve used OpenSSL’s MD5 algorithm because it’s supposed to be much faster than the MD5 in Ruby. Next, the string passed to OpenSSL’s MD5 algorithm can be anything you want; the important part is that you include the id you pass into the function. That way, each hash generated will be unique and give a unique path to a particular stamp’s cache files. The last line of the function does two things:
  1. scan(/.{2}/) breaks apart the hash using RegEx into an array with 16 elements, each containing 1 byte of data (i.e. 0×55, 0×47, etc.)
  2. join('/')then iterates through the array output by scan, joining each element with a forward slash
Voila! You’ve now got the path to your cache files for a particular stamp’s id number.
If you’re using Rails, you can generate a fragment cache by wrapping your view code like so:
<% cache('stamps/' + get_cache_path(params[:id].to_i) + '/main_content') do -%>
[Your stamp page's main content here!]
<% end -%>
And you can expire the cache in a sweeper – after, say, updating a stamp – like so:
def after_update(stamp)
cachepath = get_cache_path(stamp.id)
expire_fragment("stamps/#{cachepath}/main_content")
end
The only problem with this scheme is if you need to manually delete some cache files for a particular stamp. In that case, you’ll have to manually generate the MD5 hash from the stamp id (or whatever string you used in your get_cache_path method) and then navigate through the 16 levels of subdirectories to find your desired cache files. Obviously, if this happens to you often enough, it might not be a bad idea to create an administrative function that will do all the hard work for you!
Have fun!
UPDATE: One thing I just noticed is that since each cache file is stored in 16 nested subdirectories, and each subdirectory takes up 4kB on disk, you have to remember that a 100kB cache file will now take up 100kB + (16 x 4kB) = 164kB of disk space. That’s not a huge problem with modern storage systems, but it is something to keep in mind if one day you start to run low on disk space…
Another thing to keep in mind is that MD5 is rather processor-intensive – especially in Ruby! If you don’t have a super-powered server and/or your caching scheme is serving up thousands of pages, you may very well have a whole load of MD5 computations bringing your server to a screeching halt. One way around this is to compute the MD5 when each cache file is generated, but not when the cache file is viewed. This will depend entirely on your app. The important point is that how you implement your caching scheme can make a huge difference in server load.

0 comments:

Post a Comment