URL Rewrite Goodies for Apache, Tomcat, Railo and Mura CMS
UPDATE, 8/31/2010: I'm going with a simplified approach these days, as posted here.
I've worked with Apache mod_rewrite quite a bit in the past, but never got into any terribly complex rewrite rules. On average, I've probably leveraged about 2% of the incredible potential power that mod_rewrite offers. More recently I finally had the opportunity to learn a bit more and leverage this handy tool. My rewrite rules evolved over the past couple months as I got into testing out the Railo CFML engine a bit more, and then while setting up my first Mura CMS powered site. I thought I'd share what I've learned, as others may be likely to use some or all of this type of Apache virtual host configuration.
Here's a run-down of the "end goal," if you will, which I will walk through step by step to recap the problems and solutions I discovered along the way:
- Apache will proxy CFML requests to the Tomcat servlet engine, for Railo to handle (while Apache handles all other content: static files, PHP, etc.)
- Tomcat will properly have Railo handle this common SES style URL (leveraging cgi.path_info): http://host/index.cfm/path-info/
- Protect Railo Admin URLs from public access (well, actually hide them to mitigate hacking attempts)
- Run a Mura CMS site with super SEO friendly URLs (e.g., http://host/a-cms-page/ instead of the default http://host/siteid/index.cfm/a-cms-page/), while still allowing for standard CFML requests (for custom app integration under same host as CMS)
Please note that the following Apache modules must be activated in your Apache configuration: mod_rewrite, mod_proxy and mod_proxy_ajp.
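For reference, on a stock Apache 2.2 build these modules are typically loaded with LoadModule directives along these lines. This is only a sketch: the modules/ paths are an assumption and vary by platform (Debian and Ubuntu packages, for instance, manage module loading with a2enmod rather than hand-edited LoadModule lines).

```apache
# Module file paths below are assumptions; adjust for your install.
LoadModule rewrite_module   modules/mod_rewrite.so
LoadModule proxy_module     modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
```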
Apache Proxy CFML Request to Tomcat/Railo
I'll simply direct you to existing resources for this first bit and then show my basic example virtual hosting configuration. First, I'm running Railo in a multi-web setup on a single instance of Tomcat, which Sean Corfield has nicely outlined here (including a kind nod to a prior blog post of mine :). I learned a very nice means of proxying CFML requests from Apache to Tomcat thanks to this Sean Corfield post, which was prompted by some helpful comments from Barney Boisvert on Sean's prior post.
So here is an example Apache virtual host:
<VirtualHost *:80>
ServerName railocmstest
DocumentRoot /var/www/railocmstest/webroot
DirectoryIndex index.cfm
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
ProxyPassReverse / ajp://railocmstest:8009/
RewriteEngine On
# If it's a CFML (*.cfc or *.cfm) request, just proxy it to Tomcat:
RewriteRule ^(.+\.cf[cm])$ ajp://%{HTTP_HOST}:8009$1 [P]
</VirtualHost>
SES URLs with path_info
Okay, the above gets Apache proxying standard *.cfm/*.cfc requests to Tomcat for Railo to handle, but a request to http://railocmstest/index.cfm/some-path-info/ will simply throw a 404 error, because Apache finds neither file nor directory at /var/www/railocmstest/webroot/index.cfm/some-path-info/. Plus, the current rewrite rule is not even proxying this type of request to Tomcat, because the regular expression used only looks for any URI that ends with ".cfc" or ".cfm".
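Simply change this line:
RewriteRule ^(.+\.cf[cm])$ ajp://%{HTTP_HOST}:8009$1 [P]
...to this:
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]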
Okay, now if we reload Apache, http://railocmstest/index.cfm/some-path-info/ will indeed be proxied to Tomcat. However, now we just get the 404 error from Tomcat instead of Apache! Not a problem -- I picked up another trick from a comment by Tony Garcia on, again, this Sean Corfield post. In either Tomcat's conf/web.xml file or our Web root's WEB-INF/web.xml file (you can create one if it doesn't exist), add the following servlet mapping:
<servlet-mapping>
<servlet-name>CFMLServlet</servlet-name>
<url-pattern>/index.cfm/*</url-pattern>
</servlet-mapping>
Just be sure the servlet name you use here matches the one used for Railo, which is CFMLServlet by default, but I change mine to GlobalCFMLServlet in my multi-web setup as explained here.
We can now restart our Tomcat service and http://railocmstest/index.cfm/some-path-info/ is now good to go -- no more 404 error and cgi.path_info will now properly return /some-path-info/.
You may be inclined to try a URL pattern of /*.cfm/* in your servlet mapping, as I was, but it will not work as expected. If you also need URLs ending with /other.cfm/path-info/ and /app/index.cfm/path-info/, for example, you'll have to add two more servlet mappings with URL patterns /other.cfm/* and /app/index.cfm/*.
Hide Railo Admin URLs
A simple way to hide your Railo Admin URLs is to use one rewrite rule to forbid access to any paths beginning with /railo-context/admin/ and use some unusual URL to proxy in its place:
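# Forbid access to Railo Admin URLs:
RewriteRule ^/railo-context/admin/(.*) - [F]
# Proxy "secret" Railo Admin URLs to "real" Railo Admin URLs on Tomcat:
RewriteRule ^/SOMETHING-DIFFICULT-TO-GUESS/admin/(.*\.cf[cm])$ ajp://%{HTTP_HOST}:8009/railo-context/admin/$1 [P]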
After an Apache reload, the above rewrite rules will cause a 403 (Forbidden) error to be thrown at http://railocmstest/railo-context/admin/index.cfm, but you can find what you're looking for at http://railocmstest/SOMETHING-DIFFICULT-TO-GUESS/admin/index.cfm with no problem. In this Railo Google Group thread Sean Corfield also suggests using a separate virtual host to access your admin, which can listen on an unusual port. You could also use SSL to encrypt transmissions at the admin URLs.
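The separate virtual host idea might look roughly like the sketch below. Port 8888 is an arbitrary placeholder chosen for illustration, not something from Sean's thread. The AJP target host is written out explicitly rather than taken from %{HTTP_HOST}, because on a non-standard port that variable would normally include the port number as well.

```apache
# Hypothetical admin-only virtual host; port 8888 is an arbitrary example.
Listen 8888
<VirtualHost *:8888>
ServerName railocmstest
# Same proxy access block as the main virtual host:
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
RewriteEngine On
# Proxy only Railo Admin CFML requests to Tomcat:
RewriteRule ^(/railo-context/admin/.*\.cf[cm])$ ajp://railocmstest:8009$1 [P]
# Refuse everything else on this port:
RewriteRule . - [F]
</VirtualHost>
```

With something like this in place, the public virtual host would keep only the [F] rule for /railo-context/admin/, and the unusual port could additionally be firewalled to trusted addresses.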
Super SES CMS URLs
"Out of the box," Mura CMS has relatively friendly, SES URLs of the form http://host/siteid/index.cfm/something-friendly/, but we can do better. In the helpful Mura forums I quickly found the information I needed to make very quick changes to the Mura code to get rid of the /siteid/index.cfm portion of the URLs generated. Since there were two steps, in two different threads, plus one other tweak I found necessary, I posted a summary thread to a Mura forum. The rewrite rule example I provide in the form thread is more generic, intended to work without consideration for Tomcat proxying, Railo Admin hiding, etc., so here's what I've actually done for my first Mura CMS powered site...
I found that I needed to cover a few extra use cases. These four situations must be covered:
- If there's a trailing slash and it resolves to an existing physical directory, then we'll assume there should be an index.cfm in there and proxy to Tomcat.
- If it's an existing file, just let Apache serve it -- this takes care of all static and non-CFML requests.
- If rewrite rules are still processing, then we must be looking for a Mura CMS powered URL, which must have a trailing slash, so we'll permanently redirect with an appended trailing slash if it's missing.
- Finally, if rewrite rules are still processing, then we'll just prepend "index.cfm" in front of the REQUEST_URI.
...and here it is:
# If trailing slash and real directory, then append index.cfm and proxy it to Tomcat/Railo:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
# If it's a real file (and we haven't proxied to Tomcat, so it must be static), just serve it:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule . - [L]
# Require trailing slash at this point:
RewriteRule ^(.+[^/])$ $1/ [R=301,L]
# Everything else must be a CMS URL path, which is rewritten and proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
If you're wondering why I have rewrite conditions with %{DOCUMENT_ROOT}%{REQUEST_URI} instead of %{REQUEST_FILENAME} when checking for file/directory existence, it's because the latter just didn't work for me! All examples I've seen suggest the latter, but it just didn't work -- possibly something Ubuntu-specific, or something that I changed elsewhere in my Apache configuration? I don't know, but what I've used should work on any setup.
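A possible explanation, going by the mod_rewrite documentation: in virtual host (per-server) context the rules run before the request has been mapped to the filesystem, so %{REQUEST_FILENAME} still holds the same value as %{REQUEST_URI} at that point. The documentation suggests a URL-based look-ahead, %{LA-U:REQUEST_FILENAME}, for that situation. The variant below is untested against the configuration in this post, so treat it as a sketch. Note too that the look-ahead performs an internal sub-request, which has its own cost.

```apache
# Untested alternative to the %{DOCUMENT_ROOT}%{REQUEST_URI} check,
# using a look-ahead to resolve the filesystem path in virtual host context:
RewriteCond %{LA-U:REQUEST_FILENAME} -f
RewriteRule . - [L]
```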
UPDATE (2009-06-04)
I've discovered a little bug in Mura CMS regarding the handling of would-be 404 pages, so the last two rewrite rules from above must be changed to be less broad and an additional final rule can be added to still leverage the friendly/skinned Mura 404 page (rather than Apache's default 404):
# Require trailing slash at this point, if otherwise valid CMS URL:
RewriteRule ^([a-zA-Z0-9/-]+[^/])$ $1/ [R=301,L]
# Valid CMS URL path is proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule ^([a-zA-Z0-9/-]+)$ ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
# Anything else must be a 404 error:
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm/this-will-force-a-404/ [NE,P]
I've also updated the following Complete Package section to reflect these updates...
The Complete Package
So, here is the complete virtual host example, incorporating all of the above:
<VirtualHost *:80>
ServerName railocmstest
DocumentRoot /var/www/railocmstest/webroot
DirectoryIndex index.cfm
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
ProxyPassReverse / ajp://railocmstest:8009/
RewriteEngine On
# Forbid access to Railo Admin URLs:
RewriteRule ^/railo-context/admin/(.*) - [F]
# Proxy "secret" Railo Admin URLs to "real" Railo Admin URLs on Tomcat:
RewriteRule ^/SOMETHING-DIFFICULT-TO-GUESS/admin/(.*\.cf[cm])$ ajp://%{HTTP_HOST}:8009/railo-context/admin/$1 [P]
# If it's a CFML (*.cfc or *.cfm) request, just proxy it to Tomcat:
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]
# If trailing slash and real directory, then append index.cfm and proxy it to Tomcat/Railo:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
# If it's a real file (and we haven't proxied to Tomcat, so it must be static), just serve it:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule . - [L]
# NOTE: Everything else must be a CMS URL path (letters/numbers/hyphens/slashes only), or a 404...
# Require trailing slash at this point, if otherwise valid CMS URL:
RewriteRule ^([a-zA-Z0-9/-]+[^/])$ $1/ [R=301,L]
# Valid CMS URL path is proxied to Tomcat/Railo:
# MUST COME AFTER ANY OTHER FIXED/EXPECTED REWRITES!
RewriteRule ^([a-zA-Z0-9/-]+)$ ajp://%{HTTP_HOST}:8009/index.cfm%{REQUEST_URI} [NE,P]
# Anything else must be a 404 error:
RewriteRule . ajp://%{HTTP_HOST}:8009/index.cfm/this-will-force-a-404/ [NE,P]
</VirtualHost>
Cheers!
I'm glad you found this post helpful. In my example I am just assuming that index.cfm is the only directory index file. In fact, in my first production implementation like this, I decided not to bother with directory defaults at all -- there are only one or two required, and they're related to accessing an admin, so I'm just explicitly using the index.cfm in my requests. It seemed a waste to have mod_rewrite cause an I/O hit on performance with every single request, when 99% of requests are not calling for a real directory, but just a friendly alias to be rewritten for the CMS.
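A middle ground, if spelling out index.cfm in those few requests is not desirable, could be an explicit rule per known directory, with no filesystem check at all. The /admin/ path here is an assumption for illustration; substitute whichever directories actually need a default document, and place the rule ahead of the trailing-slash and CMS rules.

```apache
# Hypothetical: map one known directory URL straight to its index.cfm,
# with no RewriteCond file/directory test (so no extra I/O per request):
RewriteRule ^/admin/$ ajp://%{HTTP_HOST}:8009/admin/index.cfm [P]
```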
That said, if you did want to account for other directory defaults, the implementation would be influenced by how many others there are, which handler should process each, and maybe even how often they might be accessed...
For example, if you'd like to allow for both index.cfm and index.htm defaults, and assuming index.htm should just be handled normally by Apache, you could change these lines from my example (also be sure index.htm is in your DirectoryIndex directive):
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
...to:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}index.cfm -s
RewriteRule ^(.+/)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI}index.cfm [P]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}index.htm -s
RewriteRule ^(.+/)$ - [L]
The above revision would be a good "catch all" approach, but considering performance again, if you have only a couple directories that will contain index.htm files, you'd be better off using an explicit rewrite rule early on, which does not incur an I/O hit. For example, the following would work if you just need to take care of /static1/index.htm and /static2/index.htm:
RewriteRule ^/(static1|static2)/$ - [L]
Hope that helps!
It's useful if you want to use Flex in combination with Mura CMS:
# flex remoting
RewriteRule ^/(flex2gateway|openamf|flashservices)(.*)$ ajp://%{HTTP_HOST}:8009%{REQUEST_URI} [P]
I have used this configuration on Windows for many months, but on Ubuntu it simply does not work anymore:
<Proxy *>
Allow from 127.0.0.1
</Proxy>
ProxyPreserveHost On
ProxyPassReverse / ajp://localhost:8009/
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/$1 -f
RewriteRule ^/(railo-context/admin/.*)$ ajp://localhost:8009/$1 [P]
RewriteRule ^/mapping-tag/(.*\.cf[cm]l?)$ ajp://localhost:8009/mapping-tag/$1 [P,QSA]
RewriteRule ^/(.*\.cf[cm]l?)(/.*)?$ ajp://localhost:8009/$1$2 [P]
SES URLs are passed to Tomcat, which throws a 404 error.
Any suggestions?
Andrea
<servlet-mapping>
<servlet-name>CFMLServlet</servlet-name>
<url-pattern>/index.cfm/*</url-pattern>
</servlet-mapping>
Just a wild guess.
The issue is that the servlet mapping is already in place:
<servlet-mapping>
<servlet-name>GlobalCFMLServlet</servlet-name>
<url-pattern>*.cfm</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>GlobalCFMLServlet</servlet-name>
<url-pattern>/index.cfm/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>GlobalCFMLServlet</servlet-name>
<url-pattern>*.cfc</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>GlobalCFMLServlet</servlet-name>
<url-pattern>*.cfml</url-pattern>
</servlet-mapping>
So that is so weird.
Andrea
http://www.se3.org/se3/index.cfm/portfolio/
I'm running Apache 2.2 and Tomcat 6.0.20, Railo 3.1.2.001.
It appears you may be missing a proper url-pattern for the Railo CFML servlet-mapping, because your link works fine without any path_info following the index.cfm.
You'll probably need to add a url-pattern for:
/se3/index.cfm/*
Depending upon your Railo deployment, this can go either in (tomcat)/conf/web.xml or (webroot)/WEB-INF/web.xml.
Sean Corfield explains this here:
http://corfield.org/blog/post.cfm/Railo_for_Dummie...
cheers.
Thanks for this very detailed information. I’m not using Mura CMS in my local configuration, and I’m having two issues. 1. Requests to http://railotestsite/path-not-exist/, http://railotestsite/path-not-exist/myimage.img, or anything else that does not exist (other than .cfm or .cfc files) are redirected to the 404errorpage.cfm page as defined in the rewrite rule, but I’m not able to get CGI.PATH_INFO in order to log or check which files or paths were requested. 2. If I try to access a .cfm file that does not exist, I get a Railo "Missing Include" error. Is it possible to get the same 404errorpage.cfm here as well, whenever a nonexistent .cfm or .cfc file is requested, without using the onMissingTemplate method?
Here is the virtual host entry.
<VirtualHost *:80>
ServerName railotestsite
DocumentRoot "D:\devsites\railotestsite"
DirectoryIndex index.cfm
ErrorLog "logs/railotestsite-error.log"
CustomLog "logs/railotestsite-access.log" common
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPreserveHost On
ProxyPass / ajp://railotestsite:8009/
ProxyPassReverse / ajp://railotestsite:8009/
RewriteEngine On
RewriteRule ^(.+\.cf[cm])(/.*)?$ ajp://%{HTTP_HOST}:8009$1$2 [P]
RewriteRule . ajp://%{HTTP_HOST}:8009/404.cfm [NE,P]
ErrorDocument 404 /404.cfm
</VirtualHost>
I’m using the current stable version of Railo 3.1.2.001.
It might simplify things slightly to just use the 404 template handler in Railo. You can define the template in Railo Web Admin/Settings/Error, or with the cferror tag.
As for cgi.path_info not being available, I'm not too sure why that would be the case. It is generally passed along when using an AJP proxy, but I'm not sure how the 404 situation affects this.
Another option may be the Application.cfc:onMissingTemplate() method, which was introduced in ColdFusion 8 (http://livedocs.adobe.com/coldfusion/8/htmldocs/he...). I presume it's available with Railo 3.1.2, but I've never used it.
Hope that helps!
Thanks for your reply. Yes, we can handle it using cferror with the missingInclude exception type, and also with the Application.cfc onMissingTemplate method; both work well without any issues. I’m just curious to know whether we can handle it through Apache/Tomcat without using the 404 error methods on the Railo side. I haven’t tried the Railo Web Admin settings yet; let me give that a try as well.
The CGI variables are coming through empty, and the other Apache-specific custom CGI variables mentioned here http://httpd.apache.org/docs/2.0/custom-error.html... are not defined. It seems this is a known issue with the Railo 3.1.2.001/Tomcat/Apache configuration, and it is fixed in the current 3.2.0.001: https://jira.jboss.org/browse/RAILO-1006.
- Akbar