YuviSense: Codin Kid

Yuvi, a 17 year old wannabe geek from India.

StatBot: Analysing Digg Part II - Sites on the Frontpage

October 2, 2007 | 5:43 pm

Update: Digg this here.

This is Part II of the Analysis of Digg, where I’m going to analyze the sites that are dugg the most. Note that when I say ‘most dugg sites’, I mean the number of times the site has been to Digg’s frontpage. Part I is here.

But before you start

I’m seriously short of processing power, and am not able to run much linguistic analysis because of that. You can help me earn a new computer by buying little tidbits of analysis for your blog (or Digg) from me. Very reasonable, small fees. See the bottom of the post for more information. Thanks folks!

Diversity

The 61,608 stories on the Digg frontpage are spread across 14,338 sites (or ‘high level domains’, as Jeff Clark, from whom I stole this method, likes to call them) at an average of just 4.3 stories per site. That first looked like incredible diversity to me. And, make no mistake, it is incredible diversity. But it’s not as diverse as it seems. Here, look at this chart:

image001

The Top 100 sites (a mere 0.7% of all sites) make up 41% of the stories! So, a relatively small number of sites make up a large number of frontpage stories, forming some sort of a “core”, while a very large number of sites (14,238 sites, or 99.3%) contribute the rest. A good balance, I’d say.
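For anyone who wants to reproduce the per-site counts, here is a minimal Python sketch of grouping story URLs by ‘high level domain’. It is not the scraper used for this post: the function names, the sample URLs and the tiny hand-picked list of two-part suffixes are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, hand-picked suffix list; a real run would need a much fuller one.
TWO_PART_SUFFIXES = {"co.uk", "com.au"}

def site_of(url):
    """Return a rough 'high level domain' for a story URL."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Keep three labels when the host ends in a known two-part suffix, else two.
    keep = 3 if ".".join(labels[-2:]) in TWO_PART_SUFFIXES else 2
    return ".".join(labels[-keep:])

def stories_per_site(urls):
    """Count how many story URLs fall under each site."""
    return Counter(site_of(u) for u in urls)

# Made-up example URLs, not real Digg submissions.
sample_urls = [
    "http://googleblog.blogspot.com/2007/09/example.html",
    "http://someblog.blogspot.com/example.html",
    "http://news.bbc.co.uk/2/hi/technology/example.stm",
    "http://www.engadget.com/2007/10/01/example/",
]
counts = stories_per_site(sample_urls)
print(counts.most_common())
```

Note that a rule this simple lumps every BlogSpot blog under blogspot.com, and it would need extra cases for hosts such as abcnews.go.com that the tables in this post list separately.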

Here’s the chart splitting up sites by the number of stories they have contributed to the front page:

image003

As you can see, 71% of the sites that make it to Digg’s frontpage make it there only once, while 25% make it 2-10 times. Only 4% ever managed to get to the frontpage more than 25 times, and 1% more than 100 times. Excel is being optimistic here: just 79 sites got more than 100 stories on the frontpage. If your site does make it to the Digg frontpage once, there’s a 71% chance that it won’t get there again :)
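A split like this can be sketched in a few lines of Python. The bucket edges used here (1, 2-10, 11-100, over 100) are an assumption, since the chart’s exact ranges are not spelled out in the text, and the counts are toy numbers rather than Digg data.

```python
from collections import Counter

def bucket(stories):
    """Map a site's frontpage story count to a (hypothetical) bucket label."""
    if stories == 1:
        return "1"
    if stories <= 10:
        return "2-10"
    if stories <= 100:
        return "11-100"
    return "over 100"

def bucket_shares(stories_per_site):
    """Return the percentage of sites that falls into each bucket."""
    buckets = Counter(bucket(n) for n in stories_per_site.values())
    total_sites = sum(buckets.values())
    return {label: 100 * count / total_sites for label, count in buckets.items()}

# Toy input: site name -> number of frontpage stories.
toy_counts = {
    "a.example": 1,
    "b.example": 1,
    "c.example": 1,
    "d.example": 5,
    "e.example": 150,
}
shares = bucket_shares(toy_counts)
print(shares)
```

Because each share is rounded before it is charted, the slices can add up to slightly more or less than 100%.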

Also, here’s another chart, showing how much the sites with just 1 story contribute vs. the bigger ones:

image005

So, that 1% of sites (just 79 sites, actually: Excel is bad at math), each with more than 100 stories on the frontpage, makes up 39% of the frontpage, while the 71% of sites that contributed just one story make up 16% of the frontpage. On the whole, this is a fairly divided pie, and things are pretty well balanced at Digg. The frontpage is neither monopolized by a few sites nor scattered thinly across the web.

  • A total of 14,338 sites have been on Digg’s frontpage at least once.
  • The top 100 sites (just 0.7% of all sites) make up 41% of Digg’s frontpage stories.
  • 71% of the sites that make it to the frontpage never get another story on the frontpage.
  • There are a total of 79 sites with more than 100 stories hitting the frontpage.
  • Those 79 “mainstream” sites contribute 39% of all of Digg’s frontpage stories.
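The concentration figures in this list come down to two small calculations, sketched below in Python. The helper names and the toy counts are assumptions for illustration; only the last line uses numbers quoted in this post.

```python
def top_n_share(story_counts, n):
    """Fraction of all stories contributed by the n biggest sites."""
    ranked = sorted(story_counts, reverse=True)
    return sum(ranked[:n]) / sum(ranked)

def single_story_share(story_counts):
    """Fraction of all stories that come from sites with exactly one story."""
    singles = sum(1 for count in story_counts if count == 1)
    return singles / sum(story_counts)

# Toy data: stories per site for nine made-up sites (100 stories in total).
toy = [50, 30, 10, 5, 1, 1, 1, 1, 1]
print(f"top 2 sites: {top_n_share(toy, 2):.0%} of stories")
print(f"single-story sites: {single_story_share(toy):.0%} of stories")

# The same kind of check with the site totals quoted in this post.
print(f"100 of 14,338 sites is {100 / 14338:.1%} of all sites")
```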

Top Ten Sites on the frontpage

Here is the list of the top ten sites that made it to the frontpage:

Rank  Site             Stories
1     blogspot.com     1224
2     yahoo.com        1217
3     arstechnica.com  1155
4     engadget.com     897
5     cnn.com          871
6     com.com          811
7     bbc.co.uk        748
8     nytimes.com      731
9     wired.com        646
10    youtube.com      616

Note that this ranking is not exact: for example, all the BlogSpot blogs are lumped together! Let’s examine each one in detail:

Hosted Blogs (BlogSpot, Wordpress, Typepad)

As you should’ve guessed by now, it’s ALL the BlogSpot blogs combined together that take the first spot. A total of 789 BlogSpot blogs made it to the frontpage. The top 3 most dugg BlogSpot blogs are the Google Blog, the Google Operating System Blog, and the old Digg Blog. However, most of the BlogSpot stories which are Dugg come from those almost unknown blogs. Here’s the chart showing that:

image007

As you can see, the top 10 ranked BlogSpot blogs (top 13 actually, as 10th place was split between Labnol, GoNext & TopMac at just 9 stories) contribute only 22% of the total. Diversity here, but it also means that if your BlogSpot blog gets to the frontpage for the first time, then there’s a 78% chance that it has gotten there for the last time as well. This is just a tad higher than the overall just-a-single-story percentage of 71%. Also, each story gets an average of 925 Diggs, which is about 160 diggs higher than the overall average of 763 diggs per story. So, even if you get there only once, you get more than the average number of Diggs!

Here’s a chart comparing Blogger to the other free hosted blogging sites, Wordpress and Typepad. Windows Live Spaces isn’t included because there are only 8 stories ever from MSN Spaces, while MySpace has 11, most of which link to services and announcements about the site rather than to actual MySpace pages (I don’t really consider them comparable to Blogger or Wordpress though).

image009

As you can see, BlogSpot is several times as big as Wordpress and Typepad. I think this is primarily because Blogger is older than the other two, though the fact that both Wordpress & MovableType can be self-hosted more easily may also contribute to this. However, Wordpress.com blogs get an average of 1005 diggs a post, which is higher than that of BlogSpot, while TypePad gets a much lower 740 Diggs per post. Here’s that comparison chart:

image011

Both Blogger and Wordpress are well above average, while Typepad is just a tad below it. Heck even Seth Godin didn’t make it to the frontpage once!
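The ‘Diggs per story’ figure used throughout this post is just a per-site average. Here is a minimal Python sketch of that calculation; the record layout, the function name and the sample numbers are assumptions for illustration, not the actual data or code behind the charts.

```python
from collections import defaultdict

def diggs_per_story(stories):
    """Average diggs per story for each site.

    `stories` is an iterable of (site, diggs) pairs, one pair per frontpage story.
    """
    totals = defaultdict(lambda: [0, 0])  # site -> [total diggs, story count]
    for site, diggs in stories:
        totals[site][0] += diggs
        totals[site][1] += 1
    return {site: total / count for site, (total, count) in totals.items()}

# Made-up sample records, not real Digg numbers.
sample = [
    ("blogspot.com", 100),
    ("blogspot.com", 300),
    ("typepad.com", 150),
]
averages = diggs_per_story(sample)
print(averages)
```

Comparing each site’s value against the average over all stories is what separates the ‘above average’ hosts from the ‘below average’ ones.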

Yahoo.com

Yahoo News makes up 1015, or 83%, of the 1217 stories from Yahoo. The rest are spread out among Yahoo Business, Yahoo Sports, Yahoo Finance, and a bunch of stuff that Yahoo gave a burst of life to and then promptly forgot (like Yahoo Pipes, which has 3 stories to it). I can’t really think of any graph to put up here.

Yahoo.com has a dismal 643 Diggs per story, well below the average.

ArsTechnica

Ars Technica, with 1,155 stories on the frontpage, is actually the individual site with the most stories on Digg’s frontpage! Here’s the chart showing where on ArsTechnica the stories are coming from:

image013

The majority of them come from the News section, while a good part comes from the Journals section as well. Only a small number of featured articles are present here, though that is probably because featured articles aren’t really that frequent. However, it gets a lower than average 704 diggs per story. Breaking up the Diggs per Story by section:

image015

Features and Guides get far fewer Diggs per Story than average, while News and Journals are just about average.

Engadget

Engadget, the most linked-to blog in the world, comes in at No. 4 with 897 stories and a slightly-lower-than-average 747 diggs per story. It’s just a tad lower than the average though; I guess normal variation accounts for it. Diggers do love Gadgets!

Comparison to Gizmodo

Gizmodo, the third most linked-to blog, is also pretty high on the list, at No. 13 with 512 stories and a much higher than average 954 Diggs per story. That means it gets to the frontpage fewer times than Engadget, but when it does, it gets more Diggs. Here’s a chart showing this visually:

image017

So, yep, while Gizmodo gets to the frontpage fewer times than Engadget, it gets a comparatively larger number of Diggs per story.
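To put the two numbers together, multiplying stories by Diggs per story gives a rough total for each blog. This is only back-of-the-envelope arithmetic in Python using the figures quoted above; since the averages are rounded, the totals are estimates rather than exact counts.

```python
# (stories on the frontpage, average diggs per story), as quoted in this post
sites = {
    "engadget.com": (897, 747),
    "gizmodo.com": (512, 954),
}

# Estimated total diggs = stories * average diggs per story
estimated_totals = {
    site: stories * average for site, (stories, average) in sites.items()
}

for site, total in estimated_totals.items():
    print(f"{site}: roughly {total:,} diggs in total")
```

On these estimates Engadget still collects more Diggs overall, even though Gizmodo does better per story.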

CNN, News.com, BBC, NYTimes, Wired

Mainstream Media. Comparing the large number of stories that these sites actually churn out, the number of them that made it to the Digg frontpage is relatively small. Here’s a chart showing their relative popularity:

image019

CNN.com has the most stories, while Wired has the highest number of Diggs per Story. This is expected, since Wired is quite a lot geekier than the other ones. However, even Wired has fewer Diggs per story than the average, and quite a bit fewer than Gizmodo and some of the other high ranking sites (YouTube, for example). Note that the difference is quite negligible in the case of Wired, but pretty big in the others, with News.com a good 160 Diggs below the average. I think it’s just that Digg is more biased towards technology, something which I will analyze in Part III.

YouTube

YouTube made it to the top ten even though I didn’t include the Videos section. It’s got 616 stories at a well-above-average 997 diggs a story! However, most of those are older ones: Excluding the video section, only 54 YouTube videos made it to the frontpage this year. Just 6 made it in September. So, yeah, this is mostly a leftover, as most videos are now posted (correctly) to the videos section.

Still, it beats the competition handsomely. Google Video is the closest, with 200 stories, but MSN Soapbox & Revver have only 1 each! This is just nitpicking though: a better comparison would take these numbers from the Video section, data for which I unfortunately do not have. If there’s enough interest, I’ll do Digg’s video section separately, okay?

Other Geeky Sites in the Top Fifty 

Rank  Site              Stories
11    google.com        522
12    msn.com           519
13    gizmodo.com       512
16    flickr.com        447
21    techcrunch.com    305
22    kotaku.com        284
23    zdnet.com         283
24    destructoid.com   270
26    apple.com         259
27    joystiq.com       259
28    ign.com           249
30    appleinsider.com  242
33    consumerist.com   229
35    livescience.com   227
36    lifehacker.com    214
40    nasa.gov          193
41    torrentfreak.com  192
44    wordpress.com     184
46    wikipedia.org     175
47    macrumors.com     173
48    tuaw.com          173

Google & MSN

Most of the stories pointing to Google.com come from Google Video, though there are 14 stories linking to just the frontpage, most of which promptly ask the reader to ignore the story link. There’s even one about a “new search engine called Google” and one about “all Google servers have crashed”. Comments about Digg itself also seem to have Google in the Story Link. In what I might call “funny”, a guy offered $100 to a “Random Digger” who diggs the story at a pre-selected position. I don’t really know if he kept up the offer though :) There is also a long list of things that Google once started and then abandoned (Google Pages has 2 stories, Google Pack has 3, etc). Surprisingly (at least for me), Google Groups has only 6 stories to it. Even Google Code has more (7 stories). Google.com gets an about average 785 diggs per story, though I don’t think it really says anything because the Google.com ‘brand’ is so diluted, at least here.

MSN is more like Yahoo than Google here: most of their links come from MSNBC, their news network. 434 stories, or 83% of those 519 stories, are from MSNBC. The rest are scattered around content like MSN Health, Slate, Encarta, MSN Movies, etc. MSN has an above average 831 diggs per story, which means absolutely nothing.

Flickr

Photos. Of the 447 links to Flickr, 2 are to the Flickr Blog and one is to the announcement about the improved uploading feature. The rest are links to pictures, though I cannot determine whom they belong to because most of the links point directly to the image rather than to the Flickr page. Duh.

I’ll do an analysis of just the pictures posted to Digg separately.

Gaming (Kotaku, Joystiq, IGN & Destructoid)

Four gaming sites in there. Diggers are gaming freaks :) Here’s a chart comparing Kotaku, Joystiq, IGN and Destructoid:

image021

Kotaku has the most stories, followed by Destructoid and Joystiq. IGN is a bit behind, but still, note that the difference between IGN and Kotaku is just 35 stories. However, looking at the Diggs per Story:

image023

Here, Kotaku and Joystiq take the lead, while the not-a-blog IGN is left behind. Harsher blog Destructoid takes a good hit as well. In fact, while all of them have a less-than-average number of Diggs per story, Destructoid seems to be quite a bit less popular than the others, with almost 200 diggs fewer than the average.

Apple (Apple.com, AppleInsider, MacRumors, TUAW)

Four Apple related sites, including Apple.com. Here’s the comparison chart:

image025

So, the amount of ‘official’ stuff from Apple is pretty small compared to the combined number of stories coming from the Apple-related blogs. AppleInsider seems to be the most popular among them. They do have a lot of people doing PR for them for free, don’t they? :)

image027

So, while there might not be many stories from Apple.com, those that do get posted get dugg heavily. In fact, at 1090, it has the second highest number of Diggs per Story, just slightly behind another very popular site (see below).

Wordpress.com

All the Wordpress.com blogs are bunched together here. There are actually 140 blogs from Wordpress.com that made it to the frontpage, with Biosingularity, Ubuntu Blog & Robert Scoble sharing the top spot with six stories each. However, most of the links come from single blogs: 115 of those 140 blogs were on the frontpage just once. Note that this might under-represent many popular Wordpress blogs which are on their own domain, and hence are not counted. I covered Wordpress up there with Blogger; go have a look again if you want to.

nasa.gov

The most dugg part of NASA was, of course, the Astronomy Picture of the Day, being on the frontpage 32 times, followed by press releases from various parts of NASA. It’s spread out all around NASA, really. Also, NASA stories get more diggs than average, with 982 diggs per story.

Wikipedia.org

The Wikipedia article on Digg made it to the frontpage 5 times, while the Diggnation article and the article about made-up words in The Simpsons each made it two times. A lot of awesome articles are listed here, like the ones about unusual deaths, songs deemed ‘inappropriate’ after Sep 11 & community currencies in the United States. Heck, they even have a list of ‘unusual’ articles! Also, Wikipedia has the highest number of diggs per story among the top 100 sites, with 1108 diggs per story, just 18 above Apple.com’s 1090 diggs per story. This, I think, is because Wikipedia is itself so big and varied that anything that is interesting enough to make it to the frontpage is interesting enough to get a lot of Diggs as well.

Others

Techcrunch is pretty high up in the list, but has a way-less-than-average 647 diggs per story. More Techcrunch stories are dugg than all of ZDNet’s blogs and content combined, though ZDNet has a slightly higher 695 diggs per story. That’s still below the average. Also, don’t forget the awesomeness that is Lifehacker: it’s pretty high in the list too, with 214 stories at a well above average 954 diggs per story. By comparison, lifehack.org, which is similar to Lifehacker, gets only 36 stories, but a higher 1476 diggs per story. I think this is because if something from lifehack is interesting enough to make it to the frontpage, it is interesting enough to get a lot more Diggs as well!

Other Non-Geek Sites in the Top Fifty

Rank  Site                Stories
14    news.com.au         466
15    reuters.com         457
17    washingtonpost.com  421
18    physorg.com         381
19    thinkprogress.org   345
20    rawstory.com        335
25    businessweek.com    264
29    guardian.co.uk      244
31    crooksandliars.com  235
32    treehugger.com      231
33    consumerist.com     229
34    theinquirer.net     228
35    livescience.com     227
37    abcnews.go.com      199
38    espn.go.com         195
39    usatoday.com        195
42    theregister.co.uk   191
43    newscientist.com    190
45    timesonline.co.uk   184
49    breitbart.com       170
50    forbes.com          167

I am not going to analyze any of these, since it turns out I don’t read any of them. So, does anyone else want to try? Also, anyone wanting to do an analysis of the political views most seen on the frontpage is welcome: just shoot me an email or leave a comment, and I’ll help you with any data you need, for free.

For Fun

Here, I do just the type of stuff that’ll carry you straight into a flame war:

IBM.com vs. Microsoft.com vs. Sun.com vs. Apple.com vs. Linux.com

image029

Apple.com beats everyone here by a very wide margin. Microsoft.com comes in second with less than half as many stories as Apple.com, while Sun.com somehow manages to have just 21 stories on the frontpage. Heck, even IBM.com did better than Sun.com :)

image031

Here too, Apple.com beats everyone handsomely, but somehow Sun.com manages to get more Diggs per story than IBM.com. Still, both IBM.com and Sun.com have less than average diggs per story, while Microsoft.com has higher than average Diggs per story. Apple’s Diggs per story is almost off the charts :)

What does this mean? Looks like IBM and Sun are not exactly hot at Digg. Like we needed two charts to tell that :P

Suggestions welcome for adding stuff here

I am so brain-dead I could not think of anything more, or better, for this. Any ideas, folks?

Trivia

  • Amazon ranks 160th, with 46 stories to the frontpage, but has a really poor 447 diggs per story.
  • There are absolutely zero stories on the frontpage from Java.com
  • Linux.com actually has 3 more stories than Microsoft.com
  • There are 5 links to Digg competitor Reddit, at just 410 diggs per story. (After all, Reddit is not supposed to be popular on Digg.)
  • There are 56 stories linking to Digg.com itself, at a mammoth 1913 diggs per story!
  • MSDN Blogs has 26 stories at 1083 diggs per story, while Sun Blogs has only 13 stories with a comparatively dismal 646 diggs per story. Sun sure isn’t popular at Digg.
  • Valleywag, the tech gossip blog, has 63 stories at a higher than average 1058 diggs per story, while celebrity gossip blog Gawker has only 5 stories at a dismal 681 diggs per story. Diggers would rather read about Jason Calacanis or Dave Sifry than about, say, Britney Spears.

What’s Next?

In Part III, I’ll analyse the users whose stories got dugg the most, and probably compare them with the Digg Top 100 Users List. It’ll take some more time though, as school is starting tomorrow.

Found this useful? Want a custom analysis?

Found this useful? You can help me write more stuff like this. You see, the main technical bottleneck for me, besides school, is my computer. She’s dying piece by piece now. First her graphics card died, and then the AGP slot itself (I tried 3 graphics cards, they all black out intermittently). And the on-board graphics I am now on is slowly dying as well. The noisy Pentium 4 2.4 GHz single core machine with 1 gig of RAM is not enough for me. I have a lot of experiments in mind, but they all require more powerful hardware.

So, dear people-who-read-this, please consider helping me buy a newer, faster computer. The specs are up here and I estimate it’ll cost $1500. There are several ways you can do this:

  • Donate to this at my ChipIn page here. Or use the widget on the sidebar or at the bottom. I’ll name a part of my computer after everyone who donates!
  • Get your questions about Digg answered. Want to know something more specific about the Digg front page (like, how many stories in the Apple category had Microsoft in the description, or vice versa)? Ask! A simple question about Digg costs $5-$20, depending on the complexity, and I can also do a more complete analysis for a reasonable price (hey, I’m just trying to work my way to a new computer, okay?). For example, asking about the total number of stories in the Apple category without a word starting with ‘i’ in the description or title would cost you $10, while an analysis of the relative popularity of your favorite Linux distros or languages would cost about $35. It’s easily negotiable. Contact me via email (yuvipanda@gmail.com).
  • Get me to do custom analysis for your blog. I’ll accept most, and for a very good, double digit price, I’ll do an analysis of your blog as well as give you the data in a machine readable form if the inner-geek in you wants to do something more with it.
  • PayPal me directly. My PayPal email is yuvipanda@gmail.com.

Categories
StatBot, Uncategorizable

7 responses

StatBot: TechCrunch Data Analysis | YuviSense: Codin Kid | October 11, 2007 | 5:10 pm

[…] few weeks is to not write like an ad copy drone. Being precise when needed to be helps. My last two Digg Analysis went over 3000 words. Not good. So, I’m writing this analysis of TechCrunch, topper […]

Jeffro2pt0 | October 18, 2007 | 11:28 pm

How did you manage to come up with all of this data?

wikipedia » StatBot: Analysing Digg Part II - Sites on the Frontpage | October 25, 2007 | 6:34 am

[…] Read the rest of this great post here […]

Vikram | November 8, 2007 | 10:02 pm

Wow! Superb analysis! Spun this story! I was wondering whats the source of this superb data?
Great Post!

75 Suggestions, Best Practices & Resources for Digg - KoMarketing Associates | November 13, 2007 | 3:10 pm

[…] duplicate things that have not succeeded (research what has succeeded in making the Digg home […]

Блог о сети » Blog Archive » Аттрактив #5 | January 23, 2008 | 6:13 am

[…] a ranking of this list by “diggability” and by appearances on Digg’s front page. Much of the list is quite expected, but I’ll try […]

yuvipanda | March 21, 2008 | 5:20 pm

I built a custom scraper, folks :)
