Search Engine Friendly URLs with IIS and Classic ASP


Marcel Feenstra


A few years ago, I needed a Content Management System (CMS) for my site Voor Beginners and its English counterpart For Beginners. One of the requirements was that the CMS should use "search engine friendly" URLs. This is fairly easy to accomplish with Linux and Apache; however, another requirement was that the CMS should run on the Windows platform. In this article, I will show how you can "simulate" the effects of .htaccess and mod_rewrite using Microsoft's Internet Information Services (IIS) and classic ASP.

The Problem

A typical CMS stores its content in a database for easy maintenance. When an end user visits a web page, the content for that page must be retrieved from the database so that it can be displayed. So how does the system "know" which database record should be retrieved for a particular page? The answer is that the URL for that page contains a query string or "parameter" that uniquely identifies the content, e.g.:

http://www.example.com/showitem.php?id=12345

In this example, there is only a single parameter ("id"), but it is quite possible to have URLs with two or more parameters. For example, if you have a piece of clothing that is available in different colors and sizes, you could have something like:

http://www.example.com/showitem.php?id=12345&col=12&siz=34

Unfortunately, search engines have a problem indexing URLs with parameters (a.k.a. "dynamic URLs"), and that's especially true for URLs with multiple parameters. Therefore, if we want all of our pages to be indexed, we need a mechanism that hides the parameters from the search engines by presenting "static" URLs that look something like:

http://www.example.com/showitem/12345/12/34/
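For the "static" form to be indexed, the site's own pages have to link to it. A minimal sketch of a link-building helper follows; the function name FriendlyItemUrl and its parameter names are illustrative assumptions, not part of the original CMS.

```vbscript
' Hypothetical helper: builds the "static" form of a showitem link
' from the three parameter values.
Function FriendlyItemUrl(ID, Color, Size)
    FriendlyItemUrl = "/showitem/" & ID & "/" & Color & "/" & Size & "/"
End Function

' FriendlyItemUrl(12345, 12, 34) yields "/showitem/12345/12/34/"
```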

Two earlier articles by garrett and bheerssen show how you can achieve the desired effect with Apache using .htaccess and mod_rewrite. However, .htaccess and mod_rewrite are not available if you use IIS, so we need a "trick" to simulate their effect on the Windows platform...

Step One: 404 Error Handler

Let's use the ASP version of the earlier example with three parameters. In other words, when end users request the page:

http://www.example.com/showitem/12345/12/34/

we will act as if they had requested the page:

http://www.example.com/showitem.asp?id=12345&col=12&siz=34

The first thing to notice is that the "parameterless page" does not really exist, so when end users request it, they will actually trigger an error ("404 File Not Found").

This leads us to the idea that we can use a custom error handler to deal with the problem.

To specify a custom error handler for our web site in IIS, we go to the "Custom Errors" tab of the site's Properties. The default handler for the 404 error will be of type "File" and will point to a file called "404b.htm" somewhere in the Windows directory. We click on "Edit" to specify a new error handler. First, we change "Message type" from "File" to "URL". Next, we enter the URL of the ASP file that will act as our 404 error handler, written as a path that starts at the site root; i.e., we enter "/my404.asp" rather than "my404.asp". Finally, we click "OK" to confirm.

We have now stated that there will be a file called "my404.asp" in the root directory of our site that will deal with "file not found" errors, so our next step is to create one.

How do we know which (non-existent) file has been requested by an end user? Fortunately, that is something that we can easily find out by looking at "Request.QueryString".

If someone requests "http://www.example.com/showitem/12345/12/34/", Request.QueryString will contain "404;http://www.example.com:80/showitem/12345/12/34/", i.e., the error code "404" followed by a semicolon and the requested URL. (By the way, notice that the URL includes the port number ":80"!)
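As a rough sketch of how the handler could isolate the requested path from that string, the function below strips the "404;" prefix and then the scheme, host name, and port. The name RequestedPath is hypothetical, and the code assumes the query string has exactly the format described above.

```vbscript
' Hypothetical helper: returns the path part of the URL that IIS passes
' to a custom 404 handler, or "" if the format is not recognised.
Function RequestedPath(ByVal QS)
    Dim P
    RequestedPath = ""
    P = InStr(QS, ";")              ' end of the "404" prefix
    If P = 0 Then Exit Function
    QS = Mid(QS, P + 1)             ' "http://www.example.com:80/showitem/..."
    P = InStr(QS, "://")
    If P = 0 Then Exit Function
    P = InStr(P + 3, QS, "/")       ' first slash after scheme, host and port
    If P = 0 Then Exit Function
    RequestedPath = Mid(QS, P)
End Function

' Usage inside my404.asp:
Dim RQ, Path
RQ = Request.QueryString
Path = RequestedPath(RQ)
' For "404;http://www.example.com:80/showitem/12345/12/34/"
' Path is "/showitem/12345/12/34/"
```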

Now all we have to do is "parse" the URL to find the three "hidden" parameters, and then we can "translate" the requested URL into the actual URL that we will send back to the browser.

A first, extremely "naive" version of our code could be something like:

    Dim RQ, P, ID, Color, Size
    RQ = Request.QueryString
    P = InStr(RQ, "showitem/")
    If P > 0 Then
        RQ = Mid(RQ, P + 9) ' The string "showitem/" contains 9 characters!
        P = InStr(RQ, "/")
        ID = Left(RQ, P - 1)
        RQ = Mid(RQ, P + 1)
        P = InStr(RQ, "/")
        Color = Left(RQ, P - 1)
        RQ = Mid(RQ, P + 1)
        P = InStr(RQ, "/")
        Size = Left(RQ, P - 1)
        Response.Write "ID: " & ID & ", Color: " & Color & ", Size: " & Size
    End If

In reality, we would need much better error handling. What happens, for example, if the URL does not contain the required number of parameters, or if it does not contain a trailing slash?

For the sake of simplicity, we will respond to these cases by sending a status code of 404 to the browser and stopping further processing; we'll do the same when someone requests a completely unrelated (non-existent) page (e.g., http://www.example.com/nosuchpage.htm). This can be done with the following code:

    Dim RQ, P, ID, Color, Size, ErrorFound
    RQ = Request.QueryString
    ErrorFound = False
    P = InStr(RQ, "showitem/")
    If P > 0 Then
        RQ = Mid(RQ, P + 9) ' The string "showitem/" contains 9 characters!
        P = InStr(RQ, "/")
        If P > 0 Then
            ID = Left(RQ, P - 1)
            RQ = Mid(RQ, P + 1)
            P = InStr(RQ, "/")
            If P > 0 Then
                Color = Left(RQ, P - 1)
                RQ = Mid(RQ, P + 1)
                P = InStr(RQ, "/")
                If P > 0 Then
                    Size = Left(RQ, P - 1)
                Else
                    ErrorFound = True
                End If
            Else
                ErrorFound = True
            End If
        Else
            ErrorFound = True
        End If
    Else
        ErrorFound = True
    End If

    If Not ErrorFound Then
        Response.Write "ID: " & ID & ", Color: " & Color & ", Size: " & Size
    Else
        Response.Status = "404 File Not Found"
        Response.End
    End If
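The nested If blocks can be flattened by splitting the remainder of the URL on "/" and checking the number of pieces. The sketch below is one possible variant, not the code used on the original site. It is also deliberately stricter than the version above: it requires each of the three values to consist of digits only and rejects anything after the trailing slash.

```vbscript
Dim RQ, P, Parts, Digits, ErrorFound
Set Digits = New RegExp
Digits.Pattern = "^[0-9]+$"        ' one or more digits, nothing else

RQ = Request.QueryString
ErrorFound = True                  ' assume failure until the URL checks out

P = InStr(RQ, "showitem/")
If P > 0 Then
    ' Everything after "showitem/", e.g. "12345/12/34/"
    Parts = Split(Mid(RQ, P + 9), "/")
    ' "12345/12/34/" splits into four elements: "12345", "12", "34", ""
    If UBound(Parts) = 3 Then
        If Digits.Test(Parts(0)) And Digits.Test(Parts(1)) And _
           Digits.Test(Parts(2)) And Parts(3) = "" Then
            ErrorFound = False
        End If
    End If
End If

If ErrorFound Then
    Response.Status = "404 File Not Found"
    Response.End
End If

Response.Write "ID: " & Parts(0) & ", Color: " & Parts(1) & ", Size: " & Parts(2)
```

Restricting the values to digits also means that nothing unexpected can end up in the URL that is built from them in the next step.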

Step Two: Server.Transfer

So far, we have responded to a (well-formed) URL request by displaying the three parameters ID, Color, and Size. In reality, however, we want to return the page:

http://www.example.com/showitem.asp?id=12345&col=12&siz=34

This can easily be accomplished using Server.Transfer:

Server.Transfer "/showitem.asp?id=" & ID & "&col=" & Color & "&siz=" & Size

(We have to make sure, however, that the file "showitem.asp" itself uses absolute, rather than relative, URLs for graphics, style sheets, etc.; otherwise the browser will resolve them against the requested URL and look for them in a non-existent directory!)
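It is worth testing the Server.Transfer call on your own IIS version, since the transferred page shares the Request object of the original request. If the server turns out not to accept a query string in the Server.Transfer path, one possible workaround is to hand the values over in Session variables instead. The sketch below assumes that session state is enabled; the Session key names are hypothetical.

```vbscript
' --- In my404.asp, after ID, Color and Size have been parsed and validated ---
Session("item_id") = ID
Session("item_col") = Color
Session("item_siz") = Size
Server.Transfer "/showitem.asp"

' --- In showitem.asp ---
' Use the normal query string when the page is requested directly,
' and fall back to the values stored by my404.asp otherwise.
Dim ItemID
ItemID = Request.QueryString("id")
If ItemID = "" Then ItemID = Session("item_id")
```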

A Flexible Alternative

The example above deals with a single type of page (clothing items) with three parameters (ID, Color, and Size). Of course, we could expand the code so that it can handle different page types and (perhaps variable) numbers of parameters. As a result, we would be able to use URLs like:

http://www.example.com/showbook/9876/

to display information on books (that have no color or size, just an ID), or:

http://www.example.com/showitem/12345/12/34/5/

for clothing items that have a fourth parameter (e.g., material). However, as you can imagine, the required code could quickly get very messy and hard to debug...
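One way to keep such code manageable is to describe each page type by a list of parameter names and let a single function do the translation. The following is only a sketch under stated assumptions: the function names TargetFor and IsDigits, the parameter name "mat" for the material, and the rule that showitem needs at least three values are all illustrative.

```vbscript
' Hypothetical helper: True if S consists of digits only.
Function IsDigits(ByVal S)
    Dim Re
    Set Re = New RegExp
    Re.Pattern = "^[0-9]+$"
    IsDigits = Re.Test(S)
End Function

' Translates a path such as "/showbook/9876/" into "/showbook.asp?id=9876".
' Returns "" when the path does not match a known page type.
Function TargetFor(ByVal Path)
    Dim Parts, Names, MinCount, Count, I, Target
    TargetFor = ""
    Parts = Split(Path, "/")       ' "/showbook/9876/" -> "", "showbook", "9876", ""
    If UBound(Parts) < 3 Then Exit Function
    If Parts(UBound(Parts)) <> "" Then Exit Function   ' trailing slash required

    Select Case LCase(Parts(1))
        Case "showbook"
            Names = Array("id") : MinCount = 1
        Case "showitem"
            Names = Array("id", "col", "siz", "mat") : MinCount = 3
        Case Else
            Exit Function
    End Select

    ' Values sit between the page name and the empty last element.
    Count = UBound(Parts) - 2
    If Count < MinCount Or Count > UBound(Names) + 1 Then Exit Function

    Target = "/" & LCase(Parts(1)) & ".asp"
    For I = 0 To Count - 1
        If Not IsDigits(Parts(I + 2)) Then Exit Function
        If I = 0 Then Target = Target & "?" Else Target = Target & "&"
        Target = Target & Names(I) & "=" & Parts(I + 2)
    Next
    TargetFor = Target
End Function

' TargetFor("/showbook/9876/")          yields "/showbook.asp?id=9876"
' TargetFor("/showitem/12345/12/34/5/") yields "/showitem.asp?id=12345&col=12&siz=34&mat=5"
```

Adding a new page type then only requires one more Case branch.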

As I was thinking about a way to improve upon this idea, the following thought struck me. What if we were to use the entire query string (after some "basic cleaning", perhaps, like converting it to lower case and removing extraneous characters) to retrieve the associated content from a database; something like:

SQL = "SELECT * FROM MyContent WHERE MyTitle = '" & CleanQueryString & "'"

' ...
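Because the requested URL is supplied by the visitor, concatenating it into the SQL statement would leave the query open to SQL injection. A hedged sketch of the same lookup with a parameterized ADO command follows. ConnString is a placeholder for your own connection string, CleanPath is a hypothetical cleaning function, and the 255-character limit on MyTitle is an assumption.

```vbscript
Const adVarChar = 200, adParamInput = 1, adCmdText = 1

' Hypothetical cleaning step: lower-case the text and drop every character
' except letters, digits, "/", "_" and "-".
Function CleanPath(ByVal S)
    Dim Re
    Set Re = New RegExp
    Re.Global = True
    Re.Pattern = "[^a-z0-9/_-]"
    CleanPath = Re.Replace(LCase(S), "")
End Function

Dim RQ, P, CleanQueryString, ConnString, Conn, Cmd, RS
ConnString = "..."                 ' placeholder: supply your own connection string
CleanQueryString = ""

' Keep only the path part of "404;http://www.example.com:80/Some-Title/"
RQ = Request.QueryString
P = InStr(RQ, "://")
If P > 0 Then P = InStr(P + 3, RQ, "/")
If P > 0 Then CleanQueryString = CleanPath(Mid(RQ, P))   ' e.g. "/some-title/"

Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Open ConnString

Set Cmd = Server.CreateObject("ADODB.Command")
Set Cmd.ActiveConnection = Conn
Cmd.CommandType = adCmdText
Cmd.CommandText = "SELECT * FROM MyContent WHERE MyTitle = ?"
Cmd.Parameters.Append Cmd.CreateParameter("title", adVarChar, adParamInput, 255, CleanQueryString)

Set RS = Cmd.Execute
If RS.EOF Then
    ' No matching record: report a genuine "not found"
    Response.Status = "404 File Not Found"
Else
    ' ... display the content from RS here ...
End If

RS.Close
Conn.Close
```

The "?" placeholder lets the database driver handle quoting, so the cleaning step is no longer the only line of defence.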

This would provide us with a very flexible way to display content from our database! While I haven't (yet) implemented this idea myself, it may well be worth exploring... Happy coding!

Marcel holds a Master's degree in Comparative Literature from Utrecht University, one in International Relations from The Fletcher School (Tufts University), and an MBA from The Tuck School of Business at Dartmouth College. He is the owner of AfterImage Internet Consultants in The Hague, a company specializing in search engine optimization for SME in The Netherlands and Belgium.

In his spare time, he works on Voor Beginners, a collection of sites in Dutch containing information about a large (and growing) number of topics, and its English counterpart For Beginners; he is also an editor at the Open Directory Project. Other hobbies and interests include chess, computers, classical music, and literature. Marcel, his wife and their two daughters currently live in The Hague.


Evolt.org is an all-volunteer resource for web developers made up of a discussion list, a browser archive, and member-submitted articles. This article is the property of its author; please do not redistribute or use elsewhere without checking with the author.