In this post I explain how the mod_rewrite module of Apache2 works.
As an example we take the mod_rewrite part of the .htaccess file delivered with Drupal; other CMSs have something similar in their .htaccess file:
# Various rewrite rules.
<IfModule mod_rewrite.c>
RewriteEngine on
# Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
Rewrite engine processing
The rewrite engine loops through the rulesets. A ruleset is a RewriteRule together with the RewriteCond directives placed directly above it; having a RewriteCond is not mandatory. The engine first matches the Pattern of the RewriteRule against the requested path. Only when the pattern matches does it check the corresponding conditions, and the RewriteCond directives then decide whether the rule is actually executed.
So first the RewriteRule is matched and only after that the corresponding RewriteCond? That is the opposite of the order in which you read the .htaccess file! That is true, and it is good to know when you want to modify the rewrite rulesets.
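The evaluation order described above can be annotated on the Drupal ruleset itself. This is only a sketch: the step numbers in the comments are added for illustration and are not part of the file delivered with Drupal. Note that Apache comments have to be on their own line.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Step 2: checked only after the pattern in step 1 has matched.
  RewriteCond %{REQUEST_FILENAME} !-f
  # Step 3: checked only if step 2 was true (conditions are AND-ed by default).
  RewriteCond %{REQUEST_FILENAME} !-d
  # Step 1: the pattern ^(.*)$ is matched against the requested path first.
  # Step 4: the substitution is applied only if steps 2 and 3 were both true.
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
```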
RewriteRule
The rewrite engine starts with the RewriteRule directive. The arguments to this directive are:
RewriteRule Pattern Substitution [flags]
In our example the Pattern is ^(.*)$. This Pattern means:
^ | start of line anchor |
$ | end of line anchor |
() | groups the enclosed text and captures it as a backreference; the first group is available as $1 |
. | any single character |
* | zero or more repetitions of the preceding element |
.* | any text, including the empty string (zero or more characters) |
The Substitution index.php?q=$1 in our example means:
The part “index.php?q=” is dictated by Drupal. Drupal expects the path to the content in the q argument, which is passed to index.php after the ?.
The $1 is the path to the content; it is filled with the text captured by the parentheses “(.*)” in the Pattern.
Our example also has some flags between square brackets []. [L] means ‘Last rule’: after this rule the rewrite processing has to stop. [QSA] means ‘Query String Append’: this flag forces the rewriting engine to keep the original query string and combine it with the query string part of the substitution instead of replacing it.
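To make the effect of $1 and [QSA] concrete, here is a worked trace for one request. The URL /node/1?page=2 is a hypothetical example, and the trace assumes the .htaccess file sits in the document root and that no real file or directory named node/1 exists.

```apache
# Request:             /node/1?page=2
# Matched against:     node/1   (in .htaccess the leading slash is stripped)
# $1 becomes:          node/1
# Result with QSA:     index.php?q=node/1&page=2
# Result without QSA:  index.php?q=node/1   (the original page=2 is dropped)
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```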
RewriteCond
After the RewriteRule pattern has matched, the corresponding conditions are checked. The arguments to this directive are:
RewriteCond TestString CondPattern
In our example %{REQUEST_FILENAME} and %{REQUEST_URI} are used as TestString. The notation %{NAME_OF_VARIABLE} refers to a server variable.
The server variables used:
REQUEST_FILENAME | The full local filesystem path to the file or script matching the request |
REQUEST_URI | The path part of the requested URL, without the domain name (example: /index.html) |
The CondPatterns used contain the following items:
! | Inverts the result of the condition |
-d | True if the TestString is an existing directory on the server |
-f | True if the TestString is an existing regular file on the server |
!-d | If the request is for a real directory on the server, **do not** execute the corresponding rule |
!-f | If the request is for a real file on the server, **do not** execute the corresponding rule |
!=/favicon.ico | If the requested path is exactly /favicon.ico, **do not** execute the corresponding rule |
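As a sketch of how the three conditions work together, the comments below trace four requests through the Drupal ruleset. The request paths are hypothetical examples; whether a given file or directory really exists depends on your installation.

```apache
# /misc/drupal.js (existing file):      !-f is false, so the rule is skipped
#                                       and Apache serves the file directly.
# /sites/         (existing directory): !-f is true, !-d is false, rule skipped.
# /favicon.ico    (missing file):       !-f and !-d are true, but
#                                       !=/favicon.ico is false, rule skipped.
# /node/1         (no file, no dir):    all three conditions are true, so the
#                                       request is rewritten to index.php?q=node/1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```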
Making modifications to the rewrite rules
If you want to change the current rewrite behaviour of the .htaccess file, you can modify it. All modifications to the rewrite engine have to be placed after the RewriteEngine on directive and before the next </IfModule>. Remember that you can have multiple RewriteRule directives and that they are executed in the order in which they appear. After a rule with [L] has been applied, execution stops.
<IfModule mod_rewrite.c>
RewriteEngine on
...
</IfModule>
We now describe some modifications.
Remove www for Drupal multi-site
The default Drupal file contains an example that redirects users to the site without the ‘www.’ prefix.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
But this only works for a single domain, example.com. What to do if you have a Drupal multi-site configuration?
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1%{REQUEST_URI} [L,R=301]
The construction used here is that the part of the domain name without ‘www.’, matched by (.+), is backreferenced as %1.
Remark 1: This redirect has the [L] flag on its RewriteRule, so execution stops there. But the redirect makes the browser request the URL again without the ‘www.’, and on that request the rule with index.php is executed.
Remark 2: If you use this multi-site redirect in a subdirectory like example.com/drupal, place the RewriteBase directive above these directives.
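A minimal sketch of the subdirectory case from Remark 2, assuming Drupal is installed in /drupal; the host name site-a.example in the trace is made up for illustration. Because the redirect target is an absolute URL, RewriteBase mainly matters for relative substitutions such as the index.php rule that follows later in the file.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on
  # Assumption: Drupal lives in the /drupal subdirectory.
  RewriteBase /drupal

  # Host: www.site-a.example, request: /drupal/node/1
  #   %1             = site-a.example  (captured by the RewriteCond)
  #   %{REQUEST_URI} = /drupal/node/1
  #   redirect to    http://site-a.example/drupal/node/1
  RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
  RewriteRule ^(.*)$ http://%1%{REQUEST_URI} [L,R=301]
</IfModule>
```

Note the difference between the two kinds of backreference: %1 refers to a group captured in a RewriteCond, while $1 refers to a group captured in the RewriteRule pattern.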
Redirection after you remove the language prefix
Suppose you had a language prefix after your domain name and you want to redirect URLs with this prefix to the root of the domain. For example a redirect from example.com/en to example.com:
RewriteCond %{REQUEST_URI} ^/en/ [OR]
RewriteCond %{REQUEST_URI} ^/cs/ [OR]
RewriteCond %{REQUEST_URI} ^/nl/
RewriteRule ^../(.*)$ http://example.com/$1 [L,R=301]
Flag [OR]: combines a RewriteCond with the next one using a logical OR instead of the default AND.
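Without [OR] the three conditions would be AND-ed, and since a path cannot start with /en/, /cs/ and /nl/ at the same time, the rule would never fire. As an alternative sketch, the language test can be moved into the RewriteRule pattern itself, which removes the need for the conditions. This assumes the .htaccess file sits in the document root, so that the path seen by the rule starts directly with the language code.

```apache
# The first group captures the language prefix, so the rest of the
# path is now available as $2 instead of $1.
RewriteRule ^(en|cs|nl)/(.*)$ http://example.com/$2 [L,R=301]
```

With this rule a request for example.com/nl/about is redirected to http://example.com/about.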
You can find the documentation of mod_rewrite here: httpd.apache.org/docs/2.2/mod/mod_rewrite.html