Reverse Zone
Tue, 08 Jan 2008
Wikia Search Launches, Minus the Unique Features
Wikia Search, the latest Google Killer, has launched in alpha with much hype. What launched seems to have none of the features the hype is about. Wisdom of crowds? Socially driven search? Not there in the alpha, as far as I can tell.
But what is there is interesting. They recycle some well-known components: good old Grub, a distributed crawler that is now apparently open source and is so annoying that webmasters regularly ban it from their web sites, and Lucene/Nutch, a relatively unsophisticated open source search engine. Ho hum, just another amateur search engine startup.
But Wikia does some unique things which I quite appreciate. It lets you download the source code for the search engine. And for every search, it lets you peek at most of the calculations and weights that produce the ranking of the web pages. The algorithm is pretty standard tf-idf stuff, but it tells you the term frequencies and the document frequencies it is using. For instance, on one page of one of my sites it used document frequencies like "24", while the ranking of a different site was based on tens of thousands of documents for the same term. It tells you all the factors it considers and all of the weights and exponents. For instance, the tf-idf score of search terms found in the title is raised to the power of 1.5, while the score for the url is raised to an amazing power of 4, and the score for a keyword in the hostname to a power of 2.
Now, this "explain" facility does not explain the entire ranking. There are some unexplained differences between the explained and the actual ranking, and there is some ability for community members to participate. Sounds interesting. When I look at the participation so far, it seems pretty idiosyncratic. Lots of open source type sites receive a favourable bias. The input is signed, and the signers include Jimmy Wales. Just for fun, I decided to give a boost to the site of a complete stranger whose site ranks poorly and looks terrible but has good content.
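To make the field weighting above concrete, here is a minimal sketch in Python. It is not Wikia's or Nutch's actual code. The exponents (1.5 for the title, 4 for the url, 2 for the hostname) are the ones I saw in the "explain" output. The tf-idf variant (tf times the natural log of N/df), the idea of summing the fields, and all function names are my own assumptions for illustration.

```python
import math

# Exponents as reported by the "explain" output described in the post.
# Fields not listed here are assumed (hypothetically) to use exponent 1.
FIELD_EXPONENTS = {"title": 1.5, "url": 4.0, "host": 2.0}


def tf_idf(term_freq, doc_freq, num_docs):
    """Textbook tf-idf: tf * ln(N / df). The engine's real variant may differ."""
    if term_freq <= 0 or doc_freq <= 0 or num_docs <= 0:
        return 0.0
    return term_freq * math.log(num_docs / doc_freq)


def field_score(field, term_freq, doc_freq, num_docs):
    """Raise the tf-idf score of one field to that field's exponent."""
    exponent = FIELD_EXPONENTS.get(field, 1.0)
    return tf_idf(term_freq, doc_freq, num_docs) ** exponent


def page_score(field_term_freqs, doc_freq, num_docs):
    """Combine fields by summing them (an assumption, not confirmed)."""
    return sum(
        field_score(field, tf, doc_freq, num_docs)
        for field, tf in field_term_freqs.items()
    )


if __name__ == "__main__":
    # Hypothetical index size; document frequencies echo the post's
    # "24" versus "tens of thousands" observation.
    n_docs = 1_000_000
    page = {"title": 1, "url": 1, "host": 0}
    print("rare term  :", page_score(page, 24, n_docs))
    print("common term:", page_score(page, 30_000, n_docs))
```

Two things stand out when you play with it. The same term is worth far more when the engine thinks only 24 documents contain it than when it thinks tens of thousands do. And as long as the base score is above 1, the url exponent of 4 swamps everything else, which is exactly the kind of thing a keyword-stuffed url can exploit.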
Google Killer? Not by a long shot. I wouldn't trust a search engine that is so easy for people like me to manipulate. The algorithms are still too rudimentary to be used in public. It doesn't have basic protection against SEO techniques, and I'm not sure that relying on people with time on their hands to manually re-rank queries is a reliable and scalable solution. Still, it gives some interesting insights into why some sites rank highly in other search engines.
Tags: Wikia Search, Search Engines, Information Retrieval