Apache mod_rewrite to remove index.php or index.cfm from SES URLs

The subtitle to this post might be "or how I beat my brain into submission to understand what mod_rewrite needs from me."

Background:
The premise of this story is simple:  web guy uses a CMS or framework to make the website or application totally kick butt.  The only downside is that you have URLs that tell the world what technology you are using and it makes for ugly URLs.

For example:

www.mysite.com/myapp/index.php/do/something/awesome

While your app may in fact do something awesome at that URL, it also does something blatantly not-so-awesome: it tells everybody who uses it that you use PHP.  Bad thing?  Some may say no, so long as you are crazy meticulous about security on your server.  Others may say that anonymity is better.  I am one of the latter.  Better for would-be hackers to not even have a clue what you are using behind the scenes.
This same scenario applies whether you are using PHP, JavaServer Pages, ColdFusion or Railo, or any other scripting language where you pass in Search Engine Safe (SES) URLs for application processing.


The Problem:

So I wanted to write a simple rule with mod_rewrite so that when people go to this URL:

www.mysite.com/myapp/do/something/awesome

The server actually acts like this URL:

www.mysite.com/myapp/index.php/do/something/awesome

(note the desired removal of "index.php")

Further explanation for the confused:  The bottom URL is what currently works with our app, but we really want to use the top URL.  We want the server to accept the top URL and interpret it as the bottom URL instead.

Turns out that if you don't work with mod_rewrite all the time, the documentation isn't super clear about exactly what you need to configure to do this.  I wrote countless variations of rules to try to make it work, but only after pulling out a large volume of my own hair did I piece together the things missing from all of the online tutorials.  And yes, you get to derive the benefits of my hair loss regimen.

By the way, the tutorial here (http://www.easymodrewrite.com/) is actually quite good, and before you continue with my post, it's worth a read so you understand how to configure the Apache server, etc., which isn't part of my post.  My post just fills in the missing bits that make everything else magically work.  So the site above is pretty good, but it still didn't fill in ALL the gaps necessary for me to get a working rewrite rule.

The Super Magic Secret Sauce

Ok, now that you have read the easymodrewrite site and configured your Apache server (you did that, right?), it is time to make it all come together.

I opted to enable overrides so that I can use .htaccess files throughout my site.  I like this idea mainly because I can change settings without restarting the Apache server.  Do whichever makes you feel warm and fuzzy.
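For reference, enabling overrides happens in the main server configuration, not in the .htaccess file itself.  Here is a minimal sketch of what that might look like.  The filesystem path "/var/www/html/myapp" is purely an assumption for illustration, so substitute wherever your app actually lives:

```apache
# Goes in the main server config (httpd.conf or a virtual host file), NOT in .htaccess.
# "/var/www/html/myapp" is a placeholder path used only for illustration.
<Directory "/var/www/html/myapp">
    # mod_rewrite directives in .htaccess fall under the FileInfo override class.
    AllowOverride FileInfo
    # Per-directory rewriting needs FollowSymLinks (or SymLinksIfOwnerMatch) enabled.
    Options +FollowSymLinks
</Directory>
```

This one change does need a server reload, but after that you can edit the .htaccess file freely without touching the server again.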

Using this URL as the example:

www.mysite.com/myapp/index.php/do/something/awesome

The file we want to remove from the URL ("index.php") resides in the "/myapp" folder, so the changes we want to make are in the same directory.  Note: Don't be daft, if your URL is in a different directory, substitute your own directory, filename, etc.

In the "/myapp" directory, create an ".htaccess" file.

This file should contain the following lines:

RewriteEngine  on
RewriteBase /myapp/
RewriteRule ^(.*)$ index.php/$1 [PT,L]

What, that's it??  To quote Tolkien's Lord of the Rings, "It is a strange fate we should suffer so much fear and doubt… over so small a thing."

Yes, these three lines get the job done.  Let's look at each of these lines:

RewriteEngine  on

This turns on the engine that allows URLs to be rewritten.  Yeah, you need it.

RewriteBase /myapp/


Ok, here is the first magic tidbit.  The RewriteBase should match the path of where you are, and it should include the trailing slash, exactly like you see above.  If you are working in the ROOT directory of your site or app, then the RewriteBase should just be "/".

Clarification for the confused:  If you are in the "/myapp" directory, then the .htaccess file is in the same directory, and the RewriteBase also specifies the same directory.
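To make that concrete, here is a rough sketch of how the RewriteBase line would change depending on where the .htaccess file sits.  The "/apps/blog" directory is a made-up example, not something from my setup:

```apache
# Each fragment below belongs to a DIFFERENT .htaccess file; they are
# shown together only for comparison.

# .htaccess in the site root:
RewriteBase /

# .htaccess in the "/myapp" directory (the case used in this post):
RewriteBase /myapp/

# .htaccess in a hypothetical nested directory "/apps/blog":
RewriteBase /apps/blog/
```

In every case the RewriteEngine and RewriteRule lines stay the same; only the base changes.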

RewriteRule ^(.*)$ index.php/$1 [PT,L]

Ok this last line does the heavy lifting.  This rule is broken into 3 parts:

^(.*)$

This part defines the pattern we are matching, i.e. what the URL looks like to the outside world.  It is a regular expression, or "regex."  (If you are already confused, head on over to http://blog.themeforest.net/screencasts/regular-expressions-for-dummies/ for some help).  Note that because the rule lives in an .htaccess file inside our directory, Apache strips the directory part of the URL before matching, so we don't have to account for that part in the pattern (the RewriteBase is what gets put back in front of the result).  Now all we care about is what comes AFTER "/myapp/".

The "^" char anchors the beginning of the URL, and the "$" anchors the end.  The ".*" in between matches any run of characters, and the parentheses capture whatever it matched, which here is the entire remaining URL.  This is regex 101 stuff, so if you are lost on that, head back to the URL I mentioned just above for a refresher.

This match rule essentially says "match all URLs" -- so anything that comes in will get caught by this rule.

index.php/$1

The second part of the RewriteRule says how to transform the URL.  The "$1" is a back-reference to whatever was in the parentheses from what we matched.  There was one set of parentheses, so you refer to it as $1.

Side note for those who are completely with me so far:  If you have a matching rule with more than one set of parentheses, then each subsequent parentheses pair is a new back-reference, so $1 for the first, $2 for the second, etc.
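As a quick illustration of multiple back-references, here is a hypothetical rule (not part of my actual setup) with two sets of parentheses.  The "view" segment and the URL shape are invented just for the example:

```apache
# Hypothetical rule for illustration only: two capture groups, two back-references.
# $1 = the first group (a lowercase word), $2 = the second group (a number).
# A request for "article/42" (relative to the RewriteBase) would be rewritten
# to "index.php/article/view/42".
RewriteRule ^([a-z]+)/([0-9]+)$ index.php/$1/view/$2 [PT,L]
```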

This rule says (in plain English): Take any URL that comes in and put "index.php/" in front of it.

[PT,L]

This last part was particularly troublesome for me.  You may need to tinker with the combination of flags to work in your particular environment.

PT in this scenario stands for "pass-through" and enables other parts of the Apache server to continue processing the request in its altered (modified) form.  The "L" signifies that it is the last rule.  As you can see, flags are comma delimited.  (According to Apache's flag documentation, PT already implies L, so listing both is harmless but redundant.)  For a real brain cruncher, you can look through all the available flags on Apache's site here: http://httpd.apache.org/docs/current/mod/mod_rewrite.html (scroll down to the RewriteRule directive reference).

One Complication:

One thing I haven't gone into is the "RewriteCond" directive.  This post is meant to start you in the right direction, so I would be a hypocrite if I didn't give you extra tools to help you on your way.  In the above example, every request gets processed, including images, CSS files, directories, etc., which may not be desirable.

The way to handle this is to use RewriteCond to specify the conditions under which your RewriteRule applies.  For that you will need to do some additional research here: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritecond

Once you have that figured out, you can selectively rewrite URL requests by checking whether the requested item is a real file or directory, or matches any other criteria.  It is a minor addition to the above once you get over the hurdle of understanding the relationship between the regex rules and the working directory.
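To give you a head start, here is a sketch of the commonly used "not a real file, not a real directory" pair of conditions added to the rule from this post.  Treat it as a starting point rather than the definitive answer for your environment:

```apache
RewriteEngine on
RewriteBase /myapp/
# Only rewrite when the request does NOT map to an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and does NOT map to an existing directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Both conditions apply only to the RewriteRule that immediately follows them.
RewriteRule ^(.*)$ index.php/$1 [PT,L]
```

With this in place, images, CSS files and other real files are served directly.  It should also help with another wrinkle: in an .htaccess file the rewritten request can be run through the rules a second time, and a match-everything pattern may end up matching its own output.  Since the rewritten URL points at the real index.php file, the first condition stops it from being rewritten again.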

Putting It All Together:

Once you put things together, every URL request that comes in gets "index.php/" prepended to it and is then passed through to whatever process handles it next, so the whole thing works transparently to your application.

The real key to getting things to work is understanding how the settings relate to the current working directory, and how URLs are interpreted to match the regular expressions.  After you get it to work once, it's not that hard to duplicate.

This technique works great for PHP sites and CMSs, and I've personally used it with Railo and some frameworks such as FW/1, Taffy and Mura.

If this was helpful, let me know about it!
