Monday, October 29, 2007

Search engine friendly URL and navigation

A search engine friendly URL doesn't contain a question mark followed by a list of variables and their values. It is short and contains the keywords that best describe the page's content, separated by hyphens. This not only helps with rankings, it also helps visitors, especially those who bookmark pages.
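
As a rough illustration of "short, keyword-based, separated by hyphens", here is a minimal Python sketch that turns a page title into a URL fragment. The function name slugify, the word limit, and the small stop-word list are assumptions made for this example, not a standard.

```python
import re

# Illustrative filler words to drop; this list is an assumption, not a standard.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to"}

def slugify(title, max_words=5):
    """Turn a page title into a short, hyphen-separated URL fragment."""
    # Lowercase the title and keep only runs of ASCII letters and digits.
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Keep the descriptive words and cap the length to keep the URL short.
    keywords = [w for w in words if w not in STOP_WORDS]
    return "-".join(keywords[:max_words])

print(slugify("Vitamins and Minerals in Milk"))  # vitamins-minerals-milk
```

Capping the number of words is a design choice: it keeps the URL short even when the title is long.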

However, it's not always possible to avoid query strings. All the major search engines have learned to crawl dynamic pages, but there are limits:

· Search engine spiders dislike long and ugly URLs. Such URLs get indexed on very popular sites, but on small web sites spiders usually don't bother fetching the page.

· Links from dynamic pages seem to count less than links from static pages when it comes to ranking based on link popularity. Also, some crawlers don't follow links from dynamic pages more than one level deep.

· To reduce server load, search engine spiders crawl dynamic content more slowly than static pages. On large sites, it's pretty common that a huge number of dynamic pages buried in the 3rd linking level and below never get indexed.

· Most search engine crawlers ignore URLs with session IDs and similar stuff in the query string, to prevent the spiders from fetching the same content over and over in infinite loops. Search engine robots do not provide referrers and they do not accept cookies, thus every request gets a new session ID assigned. Each variant of a query string creates a new unique URL.

· Keywords in variables and their values are pretty useless for ranking purposes, if they count at all. If you find a page identified by the search term in its query string on the SERPs, in most cases the search term is present as visible or even invisible text too, or it was used as anchor text of inbound links.

· There are still search engine crawlers out there that refuse dynamic links.


Rules of thumb for search engine friendly URLs:

· Keep them short. Fewer variables gain more visibility.

· Keep your variable names short, but do not use 'ID' or composites of entities and 'ID'.

· Hide user tracking from search engine crawlers in all URLs appearing in (internal) links. That's tolerated cloaking, because it helps search engines. Make sure the page outputs useful default values when it gets requested without a session ID and the client does not accept cookies.

· Keep the values short. If you can, go for integers. Don't use UUIDs/GUIDs and similar randomly generated stuff in query strings if you want the page indexed by search engines. Exception: in forms enabling users to update your database use GUIDs/UUIDs only, because integers encourage users to play with them in the address bar, which leads to unwanted updates and other nasty effects.
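
The rule about hiding user tracking from crawlers could be sketched in Python roughly as follows. The parameter names in TRACKING_PARAMS, the user-agent tokens in CRAWLER_TOKENS, and the helper names are all assumptions for illustration; substitute whatever your application actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameter names; adjust to your application.
TRACKING_PARAMS = {"sid", "sessionid", "phpsessid"}
# A few crawler user-agent tokens; an illustrative, incomplete list.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")

def is_crawler(user_agent):
    """Rough check whether the user agent looks like a search engine spider."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)

def link_for(url, user_agent):
    """Return the URL to print in an internal link for this client."""
    if not is_crawler(user_agent):
        return url  # regular visitors keep their session parameters
    parts = urlsplit(url)
    # Drop tracking parameters, keep everything else in its original order.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(link_for("http://www.yourDomain.com/page.php?p=4711&sid=abc123",
               "Mozilla/5.0 (compatible; Googlebot/2.1)"))
# http://www.yourDomain.com/page.php?p=4711
```

With the session parameter removed, every crawler request sees the same URL for the same page instead of a new variant per visit.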

Consider providing static-looking URLs. For example, on Apache, use mod_rewrite to translate static URLs to script URLs + query string. Make sure your server does not send a redirect response (301/302) in that case. Alternatively, when you insert rows into a 'pages' database table, you can store a persistent file for each dynamic URL, which calls a script on request.

For example, the file behind a static URL like http://www.yourDomain.com/nutrition/vitamins-minerals-milk-4711.htm can include a script that parses the file name to extract the parameter(s) needed to call the output script. In this example the keywords were extracted from the page's title, and the page ID '4711' makes the URL unique within the domain's namespace.
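
Parsing the file name might look something like this Python sketch. It assumes file names of the form keywords-pageID.htm, as in the example URL; the script path /output.php and the variable name p are hypothetical placeholders, not names from any real site.

```python
import re
from urllib.parse import urlsplit

# Keywords, a hyphen, the numeric page ID, then ".htm" (assumed naming scheme).
STATIC_NAME = re.compile(r"^(?P<keywords>[a-z0-9-]+)-(?P<page>\d+)\.htm$")

def page_id_from_url(url):
    """Extract the trailing page ID from a static-looking URL, or None."""
    file_name = urlsplit(url).path.rsplit("/", 1)[-1]
    match = STATIC_NAME.match(file_name)
    return int(match.group("page")) if match else None

def script_url(url, script="/output.php", param="p"):
    """Translate the static URL into the (hypothetical) script URL + query string."""
    page = page_id_from_url(url)
    return None if page is None else f"{script}?{param}={page}"

static = "http://www.yourDomain.com/nutrition/vitamins-minerals-milk-4711.htm"
print(page_id_from_url(static))  # 4711
print(script_url(static))        # /output.php?p=4711
```

Only the page ID is needed to call the output script; the keywords in the file name are there for visitors and search engines.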

The ideal internal link looks like

keyword-phrase


http://www.lifewithoutcolour.blogspot.com/