HAML Caching CGI

Mike Zillion asked about how to make HAML a processor (of haml files) for Apache on the HAML Group on Google. That inspired me to write a proper wrapper with caching that will Hamlize templates into HTML and cache those for speedy access on subsequent requests.

This is what I came up with:

#!/bin/env ruby

exit if ARGV[0].nil?
exit unless File.exists?(ARGV[0])

CACHE_DIR_NAME='cache'

haml_file = ARGV[0]
haml_time = File.stat(haml_file).mtime

html_file = CACHE_DIR_NAME + '/' + haml_file.sub(/aml$/,'tml')
if File.exists?(html_file)
  html_time = File.stat(html_file).mtime

  if html_time > haml_time
    output = File.read(html_file)
  end
end

if output.nil?
  require 'rubygems'
  require 'haml'
  template = File.read(haml_file)
  haml_engine = Haml::Engine.new(template)
  output = haml_engine.to_html()

  # cache the output
  Dir.mkdir(CACHE_DIR_NAME) unless File.directory?(CACHE_DIR_NAME)
  html_file_io = File.open(html_file,"w")
  html_file_io.print(output)
  File.utime(Time.now, haml_time, html_file)
end

# unbuffer output
$stdout.sync = true

require 'cgi'
ENV['SERVER_SOFTWARE'] ||= 'not set'
cgi = CGI.new('html3')
print cgi.header(
        'type'     => 'text/html',
        'charset'  => 'UTF-8',
        'length'   => output.length,
        'server'   => ENV['SERVER_SOFTWARE'],
        'expires'  => Time.now + 10*3600*24, # 10 days
        'Pragma'   => 'no-cache',
        'Last-Modified' => haml_time,
        'Cache-Control' => 'no-cache'
)
print output

And as Mike suggested, adding a couple lines to your Apache configuration makes all the difference:

  AddType text/haml .haml
  AddHandler haml-file .haml
  Action haml-file /dev/bin/haml_cache_cgi.rb
  Action text/haml /dev/bin/haml_cache_cgi.rb

4 thoughts on “HAML Caching CGI

  1. I had a hard time getting this to work, so I thought I would share some bits of wisdom I gained.

    First of all, I had a consistent but cryptic host of error messages that all went away the moment I changed haml_cache_cgi.rb to haml_cache_cgi.cgi (well, eventually haml_cache.cgi, ‘cuz the former is repetitive). Note that permissions were always correct, i.e. this was not a permissions issue that I accidentally solved at the same time (they produce similar error behaviors).

    Second, in the script, I needed to either prefix haml_file with ../../, or just put haml_cache.cgi in the root (read: web root, etc) directory. I did the latter, as well as making it a hidden file for later convenience (.haml_cache.cgi).

    Then, another problem was that ARGV[0] is meant to get at the first argument passed to haml_cache.cgi. Thus, I assumed, say, index.haml would make a call along the lines of “>haml_cache.cgi index.haml”. What I found was that haml_cache.cgi was being passed no argument; rather, it seems (on my environment at least) that the requested file is meant to be accessed via ENV[‘REQUEST_URI’] (see http://www.ruby-doc.org/stdlib/libdoc/cgi/rdoc/CGI.html for a full list of cgi-specific environment variables). Thus, I changed ARGV[0] to ENV[‘REQUEST_URI’], and all was almost good. Actually, ENV[‘REQUEST_URI’] gives a preceding slash (ex. “/index.html”), but a little bit of gsub-ing fixed that like so: ENV[‘REQUEST_URI’].gsub(/^//, ”). To sum up, then, your lines 3 and 4 are for me:

    exit if ENV[‘REQUEST_URI’].nil?
    haml_file = ENV[‘REQUEST_URI’].gsub(/^//,”)
    exit unless File.exists?(haml_file)

    (Note to readers: line 8 would be gone too).

    As a final note, I’m hosting on dreamhost (yeah, I know… I’ve actually become a dreamhost deployment ninja (‘cuz you have to), so it’s not a bad staging environment). For those who don’t know dreamhost, they’re dirt cheap shared hosting. Anyways, like most shared hosts, you can’t install gems yourself except locally, so I had to also change “require ‘haml'” to “require ‘/home/username/.gems/gems/haml-1.7.0/lib/haml'”.

    Well, that about sums it up. I spent the better part of my (supposed) work day getting this working, mostly due to lack of understanding (read: when cut’n’paste goes wrong). Hope it helps!

  2. Line 11 has a serious security flaw – directory traversal, which in line 29 can lead to disaster.
    The “output” is an unknown object on line 20, unless there’s a cached file already. Either define it or use defined?(output)
    I’d also remove the Pragma, Cache-control and Expires headers, keeping only the Last-Modified. It makes sense, think about it.

  3. I appreciate the effort, and the concerns that were raised about the security. I would still love to see something like this work. I’ve been completely spoiled by HAML.

  4. I had a similar attept at rendering HAML from within Apache. I was able to do it using a persistent Ruby instance using mod_ruby. I realize this is an old post, but thought you may be interested to see my approach:

Leave a Reply

Your email address will not be published. Required fields are marked *