YuviSense: Codin Kid

Yuvi, a 17-year-old wannabe geek from India.

Humor in the Making: Scraping Alexa

February 17, 2007 | 11:52 am

I was looking through Alexa’s HTML, trying to scrape out the site rank, and I found this:

 

<!--Did you know? Alexa offers this data programmatically. Visit http://aws.amazon.com/awis for more information about the Alexa Web Information Service.--><awT@Gf.X2><Email:><budf@opif.org><budf@opif.org>4</budf@opif.org></budf@opif.org></Email:></awT@Gf.X2><RO4><Reach per>3</Reach per></RO4><Reach><tqy><Traffic><DwE@Gg.aB>9</DwE@Gg.aB></Traffic></tqy></Reach><zxja><u5><u5><y3e5>,</y3e5></u5></u5></zxja><tprp>8</tprp><Page Views rank:><Reach per>4</Reach per></Page Views rank:><sss><Rank><RO4><pyp>1</pyp></RO4></Rank></sss></span>

Humor? The mangled, tangled spaghetti of tag soup here would confuse most HTML parsers and certainly choke every XML parser, but of course, for the determined scraper, regexes are always there to the rescue. :D
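For the curious, here is a minimal sketch of the regex route in Python. It assumes that every digit left over once the tags are stripped belongs to the rank, which holds for the fragment quoted above but might not hold if the live page mixes in decoy digits. `SNIPPET` and `extract_rank` are names made up for this illustration, and the comment text in `SNIPPET` is shortened.

```python
import re

# The fragment quoted above, split across lines for readability.
SNIPPET = (
    "<!--Did you know? Alexa offers this data programmatically.-->"
    "<awT@Gf.X2><Email:><budf@opif.org><budf@opif.org>4</budf@opif.org>"
    "</budf@opif.org></Email:></awT@Gf.X2>"
    "<RO4><Reach per>3</Reach per></RO4>"
    "<Reach><tqy><Traffic><DwE@Gg.aB>9</DwE@Gg.aB></Traffic></tqy></Reach>"
    "<zxja><u5><u5><y3e5>,</y3e5></u5></u5></zxja>"
    "<tprp>8</tprp>"
    "<Page Views rank:><Reach per>4</Reach per></Page Views rank:>"
    "<sss><Rank><RO4><pyp>1</pyp></RO4></Rank></sss></span>"
)


def extract_rank(fragment: str) -> int:
    """Strip comments and tag-like runs, then read the remaining digits as one number."""
    # Drop HTML comments first so their text cannot leak into the result.
    without_comments = re.sub(r"<!--.*?-->", "", fragment, flags=re.DOTALL)
    # Remove anything shaped like a tag: '<', a run of non-'>' characters, '>'.
    text = re.sub(r"<[^>]*>", "", without_comments)
    # Discard the thousands separator and any other non-digit characters.
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError("no digits found in fragment")
    return int(digits)


if __name__ == "__main__":
    print(extract_rank(SNIPPET))  # prints 439841
```

No parser is involved, so the bogus tag names never get a chance to upset anything: the pattern only cares about angle brackets.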

P.S. I would have used the web service, but it costs money. Once I start making money, I’ll gladly pay for it, but till then…

Categories
Tech

3 responses

Anand Sankaran | February 17, 2007 | 1:45 pm

:~). What can stop a determined geek? Nothing. :)

Aswin Anand | February 18, 2007 | 11:30 am

Happy birthday :D
