UPDATE, 8/31/2010: I'm going with a simplified approach these days, as posted here.
I've worked with Apache mod_rewrite quite a bit in the past, but never got into any terribly complex rewrite rules. I've probably used about 2% of the power that mod_rewrite offers. More recently I finally had the opportunity to learn a bit more and put this handy tool to better use. My rewrite rules evolved over the past couple of months as I got further into testing the Railo CFML engine, and then while setting up my first Mura CMS powered site. I thought I'd share what I've learned, since others may want to use some or all of this type of Apache virtual host configuration.
Here's a run-down of the "end goal," if you will, which I will walk through step by step to recap the problems and solutions I discovered along the way:
- Apache will proxy CFML requests to the Tomcat servlet engine for Railo to handle (while Apache handles everything else: static content, PHP, etc.)
- Tomcat will properly have Railo handle this common SES style URL (leveraging cgi.path_info): http://host/index.cfm/path-info/
- Protect Railo Admin URLs from public access (well, actually hide them to mitigate hacking attempts)
- Run a Mura CMS site with super SEO friendly URLs (e.g., http://host/a-cms-page/ instead of the default http://host/siteid/index.cfm/a-cms-page/), while still allowing for standard CFML requests (for custom app integration under same host as CMS)
Please note that the following Apache modules must be activated in your Apache configuration: mod_rewrite, mod_proxy and mod_proxy_ajp.
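For reference, activating those three modules by hand usually comes down to LoadModule lines like the ones below. This is only a sketch: the modules/ paths are assumptions that vary by distribution, and on Debian/Ubuntu the a2enmod helper normally manages the equivalent lines for you.

```apache
# Assumed module paths; adjust them to match your Apache install.
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
```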
Apache Proxy CFML Request to Tomcat/Railo
I'll simply direct you to existing resources for this first bit and then show my basic example virtual hosting configuration. First, I'm running Railo in a multi-web setup on a single instance of Tomcat, which Sean Corfield has nicely outlined here (including a kind nod to a prior blog post of mine :). I learned a very nice means of proxying CFML requests from Apache to Tomcat thanks to this Sean Corfield post, which was prompted by some helpful comments from Barney Boisvert on Sean's prior post.
So here is an example Apache virtual host:
<VirtualHost *:80>
ServerName railocmstest
DocumentRoot /var/www/railocmstest/webroot
DirectoryIndex index.cfm
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
ProxyPassReverse / ajp://railocmstest:8009/
RewriteEngine On
# If it's a CFML (*.cfc or *.cfm) request, just proxy it to Tomcat:
RewriteRule ^(.+\.cf[cm])$ ajp://%{HTTP_HOST}:8009$1 [P]
</VirtualHost>
SES URLs with path_info
Okay, the above gets Apache proxying standard *.cfm/*.cfc requests to Tomcat for Railo to handle, but a request to http://railocmstest/index.cfm/some-path-info/ will simply throw a 404 error, because Apache finds neither file nor directory at /var/www/railocmstest/webroot/index.cfm/some-path-info/. Plus, the current rewrite rule is not even proxying this type of request to Tomcat, because the regular expression used only looks for any URI that ends with ".cfc" or ".cfm".
Simply change this line:
RewriteRule ^(.+\.cf[cm])$ ajp://%{HTTP_HOST}:8009$1 [P]
...to this:
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]
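To make the two capture groups concrete, here is the same rule annotated with how a couple of sample requests would be split (the sample paths are just illustrations):

```apache
# /index.cfm                 -> $1 = /index.cfm, $2 is empty
# /index.cfm/some-path-info/ -> $1 = /index.cfm, $2 = /some-path-info/
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]
```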
Okay, now if we reload Apache, http://railocmstest/index.cfm/some-path-info/ will indeed be proxied to Tomcat. However, now we just get the 404 error from Tomcat instead of Apache! Not a problem -- I picked up another trick from a comment by Tony Garcia on, again, this Sean Corfield post. In either Tomcat's global conf/web.xml file or our Web root's WEB-INF/web.xml file (you can create one if it doesn't exist), add the following servlet mapping:
<servlet-mapping>
  <servlet-name>CFMLServlet</servlet-name>
  <url-pattern>/index.cfm/*</url-pattern>
</servlet-mapping>
Just be sure the servlet name you use here matches the one used for Railo, which is CFMLServlet by default, but I change mine to GlobalCFMLServlet in my multi-web setup as explained here.
We can now restart our Tomcat service and http://railocmstest/index.cfm/some-path-info/ is good to go -- no more 404 error, and cgi.path_info will properly return /some-path-info/.
You may be inclined to try a URL pattern of /*.cfm/* in your servlet mapping, as I was, but it will not work as expected. If you also need URLs ending with /other.cfm/path-info/ and /app/index.cfm/path-info/, for example, you'll have to add two more servlet mappings with URL patterns /other.cfm/* and /app/index.cfm/*.
Hide Railo Admin URLs
A simple way to hide your Railo Admin URLs is to use one rewrite rule to forbid access to any paths beginning with /railo-context/admin/ and use some unusual URL to proxy in its place:
RewriteRule ^/railo-context/admin/(.*) - [F]
RewriteRule ^/SOMETHING-DIFFICULT-TO-GUESS/admin/(.*\.cf[cm])$ ajp://%{HTTP_HOST}:8009/railo-context/admin/$1 [P]
After an Apache reload, the above rewrite rules will cause a 403 (Forbidden) error to be thrown at http://railocmstest/railo-context/admin/index.cfm, but you can find what you're looking for at http://railocmstest/SOMETHING-DIFFICULT-TO-GUESS/admin/index.cfm with no problem. In this Railo Google Group thread, Sean Corfield also suggests using a separate virtual host to access your admin, which can listen on an unusual port. You could also use SSL to encrypt transmissions at the admin URLs.
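A separate admin-only virtual host along those lines might look roughly like the following. Treat it as an untested sketch: the port number 8899 is a made-up example, and it assumes the main port 80 host keeps its forbid rule so the admin is only reachable here.

```apache
# Hypothetical port; pick your own and firewall it as appropriate.
Listen 8899
<VirtualHost *:8899>
ServerName railocmstest
ProxyPreserveHost On
# Only this host passes the real Railo Admin path through to Tomcat:
ProxyPass /railo-context/admin/ ajp://railocmstest:8009/railo-context/admin/
ProxyPassReverse /railo-context/admin/ ajp://railocmstest:8009/railo-context/admin/
</VirtualHost>
```

Using ProxyPass here instead of rewrite rules keeps the admin host small, since it has nothing else to route.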
Super SES CMS URLs
"Out of the box," Mura CMS has relatively friendly SES URLs of the form http://host/siteid/index.cfm/something-friendly/, but we can do better. In the helpful Mura forums I quickly found the information I needed to make very quick changes to the Mura code to get rid of the /siteid/index.cfm portion of the generated URLs. Since there were two steps, in two different threads, plus one other tweak I found necessary, I posted a summary thread to a Mura forum. The rewrite rule example I provide in the forum thread is more generic, intended to work without consideration for Tomcat proxying, Railo Admin hiding, etc., so here's what I've actually done for my first Mura CMS powered site...
I found that I needed to cover a few extra use cases. These four situations must be covered:
- If there's a trailing slash and it resolves to an existing physical directory, then we'll assume there should be an index.cfm in there and proxy to Tomcat.
- If it's an existing file, just let Apache serve it -- this takes care of all static and non-CFML requests.
- If rewrite rules are still processing, then we must be looking for a Mura CMS powered URL, which must have a trailing slash, so we'll permanently redirect with an appended trailing slash if it's missing.
- Finally, if rewrite rules are still processing, then we'll just prepend "/index.cfm" to the REQUEST_URI and proxy the result to Tomcat.
...and here it is:
# If trailing slash and real directory, then append index.cfm and proxy it to Tomcat/Railo:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
# If it's a real file (and we haven't proxied to Tomcat, so it must be static), just serve it:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule . - [L]
# Require trailing slash at this point:
RewriteRule ^(.+[^/])$ $1/ [R=301,L]
# Everything else must be a CMS URL path, which is rewritten and proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
If you're wondering why I have rewrite conditions with %{DOCUMENT_ROOT}%{REQUEST_URI} instead of %{REQUEST_FILENAME} when checking for file/directory existence, it's because the latter just didn't work for me! All examples I've seen suggest the latter, but it just didn't work -- possibly something Ubuntu-specific, or something that I changed elsewhere in my Apache configuration? I don't know, but what I've used should work on any setup.
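One likely explanation comes from the mod_rewrite documentation: in virtual host (per-server) context, rules run before the request has been mapped to the filesystem, so %{REQUEST_FILENAME} still holds the same value as %{REQUEST_URI}. The documented alternative is a URL-based look-ahead. Here is a sketch of the static-file check written that way, untested against this exact setup:

```apache
# Alternative to %{DOCUMENT_ROOT}%{REQUEST_URI}: look ahead to the mapped filesystem path.
RewriteCond %{LA-U:REQUEST_FILENAME} -f
RewriteRule . - [L]
```

The look-ahead triggers an internal sub-request, so the %{DOCUMENT_ROOT}%{REQUEST_URI} form used above is the cheaper of the two when the document root maps directly to the URL space.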
UPDATE (2009-06-04)
I've discovered a little bug in Mura CMS regarding the handling of would-be 404 pages, so the last two rewrite rules from above must be changed to be less broad and an additional final rule can be added to still leverage the friendly/skinned Mura 404 page (rather than Apache's default 404):
# Require trailing slash at this point, if otherwise valid CMS URL:
RewriteRule ^([a-zA-Z0-9/-]+[^/])$ $1/ [R=301,L]
# Valid CMS URL path is proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule ^([a-zA-Z0-9/-]+)$ ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
# Anything else must be a 404 error:
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm/this-will-force-a-404/ [NE,P]
I've also updated the following Complete Package section to reflect these updates...
The Complete Package
So, here is the complete virtual host example, incorporating all of the above:
<VirtualHost *:80>
ServerName railocmstest
DocumentRoot /var/www/railocmstest/webroot
DirectoryIndex index.cfm
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
ProxyPassReverse / ajp://railocmstest:8009/
RewriteEngine On
# Forbid access to Railo Admin URLs:
RewriteRule ^/railo-context/admin/(.*) - [F]
# Proxy "secret" Railo Admin URLs to "real" Railo Admin URLs on Tomcat:
RewriteRule ^/SOMETHING-DIFFICULT-TO-GUESS/admin/(.*\.cf[cm])$ ajp://%{HTTP_HOST}:8009/railo-context/admin/$1 [P]
# If it's a CFML (*.cfc or *.cfm) request, just proxy it to Tomcat:
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]
# If trailing slash and real directory, then append index.cfm and proxy it to Tomcat/Railo:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
# If it's a real file (and we haven't proxied to Tomcat, so it must be static), just serve it:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule . - [L]
# NOTE: Everything else must be a CMS URL path (letters/numbers/hyphens/slashes only), or a 404...
# Require trailing slash at this point, if otherwise valid CMS URL:
RewriteRule ^([a-zA-Z0-9/-]+[^/])$ $1/ [R=301,L]
# Valid CMS URL path is proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule ^([a-zA-Z0-9/-]+)$ ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
# Anything else must be a 404 error:
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm/this-will-force-a-404/ [NE,P]
</VirtualHost>
Cheers!