Apr. 26th, 2012

The Absolute Moron's Guide to Search Engines

Quick, let's do a little thought exercise: when, say, Google returns a list of businesses in the area after you search for, say, “takeout,” how do those results get there? Did some drone at Google put them there, painstakingly searching through the vast void of the Internet just to find your local Chipotle? Of course not, that's ridiculous. That would take ages, and it's not efficient. For IT companies, time is money, and the longer it takes for users to get their search results back, the more money the company loses.

Search engines use specialized programs called web crawlers, or spiders, to trawl through the Series of Tubes ahead of time, cataloguing the keywords and content they find into an index. When you search, the engine matches the criteria you typed against that index. So if you aren't really sure what you're looking for, or how to phrase your search, you might not get the exact results you want. Another drawback to the search engine is that it can't read your mind, nor can it read the minds of website creators. If you search for “ice cream” to find a really good ice cream shop, most of the time it will work. But let's pretend there's a really good ice cream store called Snowflake, just to keep this explanation simple. This is the text of Snowflake's website:

Welcome to Snowflake, home of the world's best frozen treats! These milky, creamy confections are a delight to adults and kids alike. Stop by our location in Springfield, USA today and try some!
Tel: 123-456-7890

Notice that the body text of the website does not include the exact words “ice cream,” although a reasonable person could probably figure out that's what the site is talking about. But a web crawler isn't a person; it's a computer program, and computer programs can't do everything that a person can (duh). However, just because the body content of the website doesn't contain the words “ice cream” doesn't mean they aren't included anywhere in the site's code.
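If you want to see that literal-mindedness in action, here's a toy sketch in Python. It is nowhere near what a real search engine does; it just assumes the crawler keeps a list of which words appear on which page and only returns pages containing every word you typed. The domain names and the second page ("scoops.example") are made up for the demo.

```python
import re
from collections import defaultdict

# Hypothetical pages a crawler might have fetched ahead of time.
# The Snowflake text is the body copy quoted above; the Scoops page is invented.
PAGES = {
    "snowflake.example": (
        "Welcome to Snowflake, home of the world's best frozen treats! "
        "These milky, creamy confections are a delight to adults and kids alike. "
        "Stop by our location in Springfield, USA today and try some!"
    ),
    "scoops.example": "Scoops serves ice cream and sundaes in Springfield.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(pages):
    """Map each word to the set of pages whose visible text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return pages containing every word of the query (no synonyms, no guessing)."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(search(index, "ice cream"))      # only the page that literally says "ice cream"
print(search(index, "frozen treats"))  # Snowflake shows up for words it actually uses
```

Snowflake's page says “creamy,” not “cream,” and never says “ice,” so this toy index has no way to connect it to an “ice cream” search.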


Let me explain: HTML and its stricter XML-based cousin, XHTML, have something called “meta tags.” These tags don't show up anywhere on the visible page, but they matter. The description and keywords tags exist to describe the page to search engines and other programs, though how much weight a given engine puts on them varies. So theoretically, our fictional ice cream shop could never type the words “ice cream” in the visible text of its website, put the words in its meta tags instead, and still be indexed for them by Google or whatever.

Here's an example of meta tag code, which goes in the <head> portion of your website. The meta tags are the two lines that start with <meta.

<title>Snowflake Frozen Confectionery</title>
<meta name="description" content="Springfield, USA's best frozen milk confections">
<meta name="keywords" content="Snowflake, frozen confections, milk, freeze, sweets, confectionery, Springfield, treats, Snowflake shop">

Notice that in the above code, I never actually put the words “ice cream” in there. This, coupled with the lack of those same words in the actual, user-visible content of the page, means that a web crawler would not pick up on this site for a search for ice cream.
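Here's a rough Python sketch of how a program could read those tags, using the standard library's html.parser. The class and function names (HeadReader, head_mentions) are mine, and the plain substring check is an assumption for the demo, not how any real search engine scores a page. FIXED_HEAD is a hypothetical version of the same code with “ice cream” added to the keywords.

```python
from html.parser import HTMLParser

# The <head> snippet from the example above.
HEAD = """
<title>Snowflake Frozen Confectionery</title>
<meta name="description" content="Springfield, USA's best frozen milk confections">
<meta name="keywords" content="Snowflake, frozen confections, milk, freeze, sweets, confectionery, Springfield, treats, Snowflake shop">
"""

# Hypothetical fix: the same snippet with "ice cream" added to the keywords.
FIXED_HEAD = HEAD.replace("frozen confections,", "frozen confections, ice cream,")


class HeadReader(HTMLParser):
    """Collects the <title> text and the content of named <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def head_mentions(head_html, phrase):
    """True if the phrase appears in the title, description, or keywords."""
    reader = HeadReader()
    reader.feed(head_html)
    reader.close()
    haystack = " ".join(
        [reader.title, reader.meta.get("description", ""), reader.meta.get("keywords", "")]
    ).lower()
    return phrase.lower() in haystack


print(head_mentions(HEAD, "ice cream"))           # False
print(head_mentions(HEAD, "frozen confections"))  # True
print(head_mentions(FIXED_HEAD, "ice cream"))     # True
```

One extra keyword is all it takes for this sketch to find the words, even though nothing a visitor can see on the page has changed.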

Now, the web developers for Snowflake Frozen Confectionery would be really dumb to leave “ice cream” out of the meta tags like this, and if they're that incompetent, I kind of doubt that Snowflake Frozen Confectionery would be in business much longer. But again, this is for demonstration purposes.

What prompted me to write this? In the past few days, I've seen about nine different articles bitching about how Siri doesn't find the exact things the phone's owner wants when they want them, and a lot of these issues could be resolved if people actually knew how the technology they're bitching about works.