Monday, December 01, 2008

Google Breaks Speed Record for Processing Data Online



Google has recently announced new strides in its race to process the world's information: it sorted 1 terabyte of data across 1,000 computers in only 68 seconds. The previous record for the same amount of data (1 TB) used 910 computers and took 209 seconds. Sorting one petabyte took Google 6 hours and 2 minutes, and the sorted data was written to 48,000 hard drives!
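To put those numbers in perspective, here is a quick back-of-the-envelope calculation of the throughput they imply. It is only a sketch: I'm assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), and the function name `throughput` is my own. No machine count is given above for the petabyte run, so only its aggregate rate is computed.

```python
# Rough throughput implied by the sort times quoted above.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).
TB = 10**12
PB = 10**15


def throughput(bytes_sorted, seconds, machines=None):
    """Return (aggregate bytes/s, per-machine bytes/s or None)."""
    aggregate = bytes_sorted / seconds
    per_machine = aggregate / machines if machines else None
    return aggregate, per_machine


# New record: 1 TB on 1,000 computers in 68 seconds.
new_agg, new_per = throughput(TB, 68, 1000)
# Previous record: 1 TB on 910 computers in 209 seconds.
old_agg, old_per = throughput(TB, 209, 910)
# Petabyte run: 6 hours 2 minutes; machine count is not given in the post.
pb_seconds = 6 * 3600 + 2 * 60
pb_agg, _ = throughput(PB, pb_seconds)

print(f"New 1 TB record: {new_agg / 1e9:.1f} GB/s total, {new_per / 1e6:.1f} MB/s per machine")
print(f"Old 1 TB record: {old_agg / 1e9:.1f} GB/s total, {old_per / 1e6:.1f} MB/s per machine")
print(f"Wall-clock speedup on 1 TB: {209 / 68:.2f}x")
print(f"1 PB run: {pb_agg / 1e9:.1f} GB/s total over {pb_seconds} seconds")
```

Worth noticing: the petabyte run sustains a higher aggregate rate than the terabyte run even though it takes hours, which suggests far more hardware was involved.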

Google also acknowledges that it used triple redundancy, with each piece of data written to 3 hard drives. Could it be that with less redundancy this would all be much quicker? Not necessarily, since storing may not be part of the sort that was timed (storing the data could still have been ongoing after the sort had completed).
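For a feel of what triple redundancy means in storage terms, here is a small sketch. It rests on assumptions of mine that Google's announcement, as summarized here, does not confirm: that all three copies of the sorted petabyte land on those 48,000 drives, that the data is spread evenly, and that 1 PB = 10^15 bytes.

```python
# Storage footprint of the sorted petabyte under triple redundancy.
# Assumptions (mine, not confirmed): all copies sit on the 48,000 drives,
# data is spread evenly, and 1 PB = 10**15 bytes.
PB = 10**15
COPIES = 3
DRIVES = 48_000

raw_bytes = PB * COPIES                 # total bytes written with 3 copies
per_drive_3x = raw_bytes / DRIVES       # bytes per drive with 3 copies
per_drive_1x = PB / DRIVES              # bytes per drive with a single copy

print(f"Total written: {raw_bytes / 1e15:.0f} PB")
print(f"Per drive, 3 copies: {per_drive_3x / 1e9:.1f} GB")
print(f"Per drive, 1 copy:   {per_drive_1x / 1e9:.1f} GB")
```

Under those assumptions each drive takes on three times the writes it would with a single copy, which is why the redundancy question is a fair one even if the writes overlap with the sort itself.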

What's next? I guess the exabyte would be a huge step towards instant live data sorting...

Friday, May 30, 2008

The Rarest Words - Semantic SEO Project by Lone Russian Programmer

Semantic SEO: a new wave of extreme skiing? A daring court stenographer? Nope. It's a new website that challenges the way information is categorized.

It's interesting when you come across a beta website like this. The site, therarestwords.com, scans whatever website you enter for words it deems opportunities from a search engine marketing perspective. While these rare words are indeed "rare" and could in theory be opportunities, the reality may be quite different.

That being said, the site is fun and can be quite useful, though most of its usefulness depends on its future applications. My favorite feature is the "SEO fight", where you compare one site to another to see who wins the round for unique and rare words. This is a pretty gameable idea, knowing how SEOs tend to take opportunities like this and make them their own. Why do I say this? Simple. The site encourages users to add terms that describe the sites or to edit what is already described. Simply click and edit any floating box to tag words and add descriptions, thus boosting "rare" words and adding to the database of terms and descriptions that therarestwords.com has for use on collected websites.
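I have no idea how therarestwords.com actually scores anything, so treat the following as a toy illustration of the general idea and not as the site's method. It assumes a word is "rare" if it appears on a page but never in some reference text, and it decides an "SEO fight" by counting distinct rare words. The names `rare_words` and `seo_fight` and the sample texts are all made up for the example.

```python
# Toy "rare words" comparison. This is NOT how therarestwords.com works
# (its method is not public); it only illustrates the general idea.
import re
from collections import Counter


def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def rare_words(page_text, reference_counts, max_reference_count=0):
    """Distinct page words seen at most max_reference_count times in the reference."""
    return {w for w in words(page_text) if reference_counts[w] <= max_reference_count}


def seo_fight(page_a, page_b, reference_counts):
    """Return 'A', 'B', or 'draw' depending on who has more distinct rare words."""
    a = len(rare_words(page_a, reference_counts))
    b = len(rare_words(page_b, reference_counts))
    if a == b:
        return "draw"
    return "A" if a > b else "B"


# Hypothetical reference text standing in for "everything else on the web".
reference = Counter(words("the cat sat on the mat the dog sat on the log"))
page_a = "the cat saw a quokka and an axolotl"
page_b = "the dog sat on the mat"

print(sorted(rare_words(page_a, reference)))
print(seo_fight(page_a, page_b, reference))
```

Even this toy version shows why the idea is gameable: stuff a page with words nobody else uses and you win the fight, whether or not anyone ever searches for them.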

Another reason SEOs will take advantage of this tool is that they have an incentive. The top 50 websites with the largest collections of rarest words get a spot on the site's homepage and a free link of the kind that Google, Yahoo! and MSN adore.

I originally came across this article by TechCrunch and got interested enough to try the tools.

I do like the site, and it is impressive that one lone Russian programmer built this, but I'm not quite sure how feasible the concept is just yet. Try it out and let me know what you think. Oh, and by the way, the Russian programmer has chosen to lie low, without revealing his name or any details about himself... Sensationalism? Anticipation or suspense building? Who knows.