Reverse Zone


Reverse Zone, weblog on urban planning, sustainability, and technology.

Martin Laplante


Tue, 11 Aug 2009

Lateral Semantic Indexing

Like everyone else with a web site, I get articles from reputable sources that explain to me how to make web sites rank highly in the search engines. I find most of them hilarious, because so many people make their living entirely from incorrect, misunderstood, fourth-hand information about search engine algorithms. Then, in perfect sincerity, they sell you incantations to protect your web site from the evil eye, or to put some sort of spell on it that will fool the jaded Googlebot.

The latest hilarious advice, based on a misunderstanding of real algorithms, told me about "lateral semantic indexing" and urged me to use a lot of plurals or related variations of keywords, the more the better. I don't know who first read an article about latent semantic indexing, got the first word wrong, and then wrote an article about his misunderstanding of the concept. These people all steal from each other, and we get a written form of what used to be called an oral tradition.

I won't give a lesson on the math of LSI; I am sure you can all look up reliable sources. If your eyes glaze over when you read about tf-idf or singular value decomposition, the math boils down to this: if a page is written in perfectly normal prose, using the same words and style that other people use when discussing a particular topic, the search engine will rank it more highly than a page with an unnatural distribution of words, for instance one that repeats the same word over and over again. It is the same way that you and I can tell a crazy person (or a politician with talking points) apart from a sane person.
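For those who prefer code to hand-waving, here is a minimal sketch of that intuition. It is not LSI: it skips tf-idf weighting and the singular value decomposition entirely, and it is certainly not what any real search engine runs. It only compares the plain word-frequency profile of a page to the average profile of a few pages on the same topic, using cosine similarity. The three-sentence corpus and the two test pages are made up for illustration, and with raw frequencies common words like "the" carry a lot of the weight.

```python
import math
from collections import Counter


def tf_vector(text):
    """Relative word frequencies of a lower-cased, whitespace-split text."""
    words = text.lower().split()
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * b.get(w, 0.0) for w, x in a.items())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average of several sparse vectors: the 'typical' word profile."""
    total = Counter()
    for v in vectors:
        for w, x in v.items():
            total[w] += x
    return {w: x / len(vectors) for w, x in total.items()}


# Hypothetical pages written normally about one topic.
corpus = [
    "the court upheld the rights of the tenant under the lease",
    "a tenant has legal rights when the landlord breaks the lease",
    "the landlord and the tenant signed a lease in court",
]
topic = centroid([tf_vector(page) for page in corpus])

# One page in normal prose, one stuffed with keyword variations.
natural = "the tenant asked the court to enforce the lease"
stuffed = "lease lease leases lease tenant lease leases lease lease"

natural_score = cosine(tf_vector(natural), topic)
stuffed_score = cosine(tf_vector(stuffed), topic)

print(f"natural: {natural_score:.2f}  stuffed: {stuffed_score:.2f}")
```

The page in normal prose lands much closer to the topic's average profile than the keyword-stuffed one. Real LSI does this comparison in a reduced space obtained from the SVD of a weighted term-document matrix, but the direction of the effect is the point here.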

It's true that a sane person will sometimes use a term in the plural, with the ratio of plural to singular varying with the subject and the exact meaning of the word. For instance, if you are using the term "right" in a document that discusses a legal matter, the plural form will probably come up often. If you're using it in a political sense, there will be no plural even though it's a noun, but words like "conservative" or "left" are likely to be there. So their advice about using other forms as often as possible is very likely to get the search engine to dismiss your text as psychotic ravings.

There is a bit of a problem when everyone starts using SEO techniques. The "average" principal word vectors calculated from actual internet pages will start to be heavily weighted toward repetition of keywords. If the insane ravings that the SEO pseudo-experts recommend become the norm, then people who write normally will start to look like the crazy ones from the point of view of search ranking algorithms.
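A toy sketch of that drift, under the same kind of loud assumptions as any toy: plain word frequencies instead of a real tf-idf and SVD pipeline, an invented handful of pages, and an arbitrary ratio of stuffed pages to normal ones. It measures how close a normal page and a stuffed page sit to the average word profile, first when the corpus is all normal prose, then after stuffed pages have flooded it.

```python
import math
from collections import Counter


def tf(text):
    """Relative word frequencies of a lower-cased, whitespace-split text."""
    words = text.lower().split()
    return {w: c / len(words) for w, c in Counter(words).items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * b.get(w, 0.0) for w, x in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(texts):
    """Average word profile of a list of pages."""
    total = Counter()
    for t in texts:
        for w, x in tf(t).items():
            total[w] += x
    return {w: x / len(texts) for w, x in total.items()}


# Hypothetical pages: three written normally, three keyword-stuffed.
normal_pages = [
    "the tenant asked the court to enforce the lease",
    "a landlord must give the tenant notice before ending a lease",
    "the court ruled that the lease was still valid",
]
stuffed_pages = [
    "lease leases lease tenant lease leases lease lease",
    "cheap lease best lease lease leases lease tenant lease",
    "lease lease lease leases lease leases tenant lease",
]

# Two new pages to judge against the "norm".
probe_normal = tf("the landlord told the tenant that the lease would end")
probe_stuffed = tf("lease leases lease lease tenant leases lease lease")

# The norm before SEO advice spreads, and after stuffed pages
# outnumber normal ones five to one (an arbitrary ratio).
before = centroid(normal_pages)
after = centroid(normal_pages + stuffed_pages * 5)

for label, norm in (("before", before), ("after", after)):
    print(label,
          f"normal={cosine(probe_normal, norm):.2f}",
          f"stuffed={cosine(probe_stuffed, norm):.2f}")
```

Before the flood, the normal page is the one that looks typical. Afterwards the ordering flips, and the stuffed page is the one that matches the norm.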

