StatBot: Analysing Digg Part II - Sites on the Frontpage
October 2, 2007 | 5:43 pm
Update: Digg this here.
This is Part II of the Analysis of Digg, where I’m going to analyze the sites that are dugg the most. Note that when I say ‘most dugg sites’, I mean the number of times the site has been to Digg’s frontpage. Part I is here.
But before you start
I’m seriously short of processing power, and can’t run many linguistic analyses because of that. You can help me earn a new computer by buying little tidbits of analysis for your blog (or Digg) from me, for very reasonable, small fees. See the bottom of the post for more information. Thanks, folks!
Diversity
The 61,608 stories on the Digg frontpage are spread out across 14,338 sites (or ‘high level domains’, as Jeff Clark, from whom I stole this method, likes to call them) at an average of just 4.29 stories per site. That first looked like incredible diversity to me. And, make no mistake, it is incredible diversity. But it’s not as diverse as it seems. Here, look at this chart:
The Top 100 sites (a mere 0.7% of the total) make up 41% of the stories! So, a relatively small number of sites account for a large share of frontpage stories, forming some sort of “core”, while a very large number of sites (the other 14,238, or 99.3%) contribute the rest. A good balance, I’d say.
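Here’s a quick sanity check of those numbers, as a minimal sketch. The figures are the ones quoted in this post; the variable names are just mine. Note that 100 out of 14,338 sites is about 0.007 as a fraction, which is roughly 0.7% when written as a percentage.

```python
# Figures quoted in the post.
total_stories = 61_608
total_sites = 14_338
top_n = 100

# Average number of frontpage stories per site.
stories_per_site = total_stories / total_sites

# The top 100 as a fraction of all sites, and the same thing as a percentage.
top_fraction = top_n / total_sites
top_percent = 100 * top_fraction

print(f"{stories_per_site:.2f} stories per site")
print(f"top {top_n} sites = {top_fraction:.4f} of all sites = {top_percent:.2f}%")
```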
Here’s the chart splitting up sites by the number of stories they have contributed to the front page:
As you can see, 71% of sites that make it to Digg’s frontpage make it there only once, while 25% make it 2-10 times. Only 4% ever managed to get to the frontpage more than 25 times, and 1% over 100 times. Excel is being optimistic here: it’s just 79 sites (about 0.55%) that got more than 100 stories on the frontpage. If your site does make it to the Digg frontpage once, there’s a 71% chance that it won’t go there again.
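A breakdown like this can be sketched by bucketing each site’s story count and counting the buckets. The story counts below are made up for illustration (they are not the real Digg data), and the bucket edges are my assumption, not necessarily the ones used for the chart. The last line shows why Excel’s “1%” is generous.

```python
from collections import Counter

def bucket(story_count: int) -> str:
    """Map a site's frontpage story count to a bucket label (assumed edges)."""
    if story_count == 1:
        return "1"
    if story_count <= 10:
        return "2-10"
    if story_count <= 100:
        return "11-100"
    return "100+"

# Hypothetical story counts per site, for illustration only.
stories_by_site = {
    "a.example": 1,
    "b.example": 1,
    "c.example": 4,
    "d.example": 37,
    "e.example": 250,
}

histogram = Counter(bucket(n) for n in stories_by_site.values())
for label in ("1", "2-10", "11-100", "100+"):
    share = 100 * histogram[label] / len(stories_by_site)
    print(f"{label:>6}: {histogram[label]} sites ({share:.0f}%)")

# 79 of 14,338 sites is well under 1%; rounding to whole percents hides that.
over_100_percent = 100 * 79 / 14_338
print(f"{over_100_percent:.2f}%")
```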
Also, here’s another chart, showing how much the sites with just 1 story contribute vs. the bigger ones:
So, that 1% of sites (just 79 sites, actually: Excel is bad at math) with more than 100 stories on the frontpage makes up 39% of the frontpage, while the 71% of sites that contributed just one story make up 16% of it. On the whole, this is a fairly divided pie, and things are pretty balanced at Digg. The frontpage is neither monopolized by a few sites nor scattered thinly across the web.
- A total of 14,338 sites have been on Digg’s frontpage at least once.
- The top 100 sites (just 0.7%) make up 41% of Digg’s frontpage stories.
- 71% of the sites that make it to the frontpage never get another story on the frontpage.
- There are a total of 79 sites with more than 100 stories hitting the frontpage.
- Those 79 “mainstream” sites contribute 39% of all of Digg’s frontpage stories.
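The single-story share follows directly from the other numbers, since a site that appears once contributes exactly one story. A rough check, using only figures quoted in this post (the percentages are rounded, so the results are approximate):

```python
# Figures quoted in the post.
total_stories = 61_608
total_sites = 14_338

# 71% of sites appear exactly once, so they contribute one story each.
single_story_sites = 0.71 * total_sites
single_story_share = 100 * single_story_sites / total_stories

# The 79 biggest sites contribute about 39% of all stories.
big_sites = 79
big_site_stories = 0.39 * total_stories
avg_stories_per_big_site = big_site_stories / big_sites

print(f"single-story sites: {single_story_share:.1f}% of frontpage stories")
print(f"big sites: about {avg_stories_per_big_site:.0f} stories each on average")
```

The first number lands between 16% and 17%, which agrees with the 16% slice quoted above.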
Top Ten Sites on the frontpage
Here is the list of the top ten sites that made it to the frontpage:
| Rank | Site | Stories |
| --- | --- | --- |
| 1 | blogspot.com | 1224 |
| 2 | yahoo.com | 1217 |
| 3 | arstechnica.com | 1155 |
| 4 | engadget.com | 897 |
| 5 | cnn.com | 871 |
| 6 | com.com | 811 |
| 7 | bbc.co.uk | 748 |
| 8 | nytimes.com | 731 |
| 9 | wired.com | 646 |
| 10 | youtube.com | 616 |
Note that this ranking is not exactly precise: for example, all the BlogSpot blogs are lumped together! Let’s examine each one in detail:
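To show why subdomains end up glued together, here is a minimal sketch of grouping story links by site. This is not my actual code: the two-label suffix set is a tiny hand-picked assumption (a real version would want the full Public Suffix List), and the URL paths are made up. Unlike the tables in this post, this naive rule would also merge abcnews.go.com and espn.go.com into a single go.com.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: a tiny hand-picked set of two-label suffixes, not a complete list.
TWO_LABEL_SUFFIXES = {"co.uk", "com.au"}

def site_of(url: str) -> str:
    """Reduce a URL to its 'high level domain', e.g. foo.blogspot.com -> blogspot.com."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Keep three labels when the host ends in a known two-label suffix.
    keep = 3 if ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])

# Illustrative links only; the paths are invented.
urls = [
    "http://googleblog.blogspot.com/2007/09/some-post.html",
    "http://example-blog.blogspot.com/",
    "http://news.bbc.co.uk/some/story.stm",
    "http://news.com.com/some-story.html",
]

counts = Counter(site_of(u) for u in urls)
for site, n in counts.most_common():
    print(site, n)
```

The same rule explains the odd-looking com.com entry: news.com.com collapses to its last two labels.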
Hosted Blogs (BlogSpot, Wordpress, Typepad)
As you should’ve guessed by now, it’s ALL the BlogSpot blogs combined that take the first spot. A total of 789 BlogSpot blogs made it to the frontpage. The top 3 most dugg BlogSpot blogs are the Google Blog, the Google Operating System Blog, and the old Digg Blog. However, most of the BlogSpot stories that get dugg come from almost unknown blogs. Here’s the chart showing that:
As you can see, the top 10 ranked BlogSpot blogs (top 13, actually, as 10th place was split between Labnol, GoNext & TopMac, with just 9 stories) contribute only 22% of the total. There’s diversity here, but it also means that if your BlogSpot blog gets to the frontpage for the first time, there’s a 78% chance that it has gotten there for the last time as well. This is just a tad higher than the overall just-a-single-story percentage of 71%. Also, each story gets an average of 925 diggs, which is about 160 higher than the overall average of 763 diggs per story. So, even if you get there only once, you get more than the average number of diggs!
Here’s a chart comparing Blogger to the other free hosted blogging sites, Wordpress and Typepad. Windows Live Spaces isn’t included because there are only 8 stories ever from MSN Spaces, while MySpace has 11, most of which link to services and announcements about the site rather than to actual MySpace pages (I don’t really consider them comparable to Blogger or Wordpress anyway).
As you can see, BlogSpot is several times as big as Wordpress and Typepad. I think this is primarily because Blogger is older than the other two, though the fact that both Wordpress & MovableType can be self-hosted more easily probably contributes to this as well. However, Wordpress.com blogs get an average of 1005 diggs a post, which is higher than BlogSpot’s, while TypePad gets a much lower 740 diggs per post. Here’s that comparison chart:
Both Blogger and Wordpress are well above average, while Typepad is just a tad below it. Heck, even Seth Godin didn’t make it to the frontpage once!
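For anyone wondering how a “diggs per story” figure is put together, it is just total diggs divided by the number of frontpage stories, per site. A minimal sketch, with made-up records (only the 763 overall average comes from this post):

```python
from collections import defaultdict

# Hypothetical (site, diggs) records, one per frontpage story; not real data.
stories = [
    ("blogspot.com", 900),
    ("blogspot.com", 950),
    ("wordpress.com", 1005),
    ("typepad.com", 740),
]

# Overall average quoted in the post.
OVERALL_DIGGS_PER_STORY = 763

# Accumulate [total diggs, story count] per site.
totals = defaultdict(lambda: [0, 0])
for site, diggs in stories:
    totals[site][0] += diggs
    totals[site][1] += 1

diggs_per_story = {site: d / n for site, (d, n) in totals.items()}

# Print sites from highest to lowest average, with the gap to the overall average.
for site, avg in sorted(diggs_per_story.items(), key=lambda kv: -kv[1]):
    gap = avg - OVERALL_DIGGS_PER_STORY
    print(f"{site}: {avg:.0f} per story ({gap:+.0f} vs. overall)")
```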
Yahoo.com
Most of Yahoo’s stories come from Yahoo News, which makes up 1015, or 83%, of the 1217 stories from Yahoo. The rest are spread out among Yahoo Business, Yahoo Sports, Yahoo Finance, and a bunch of stuff that Yahoo gave a burst of life to and then promptly forgot (like Yahoo Pipes, which has 3 stories to it). I can’t really think of any graph to put up here.
Yahoo has a dismal 643 diggs per story, well below the average.
ArsTechnica
Ars Technica, with 1,155 stories on the frontpage, is actually the individual site with the most stories on Digg’s frontpage! Here’s the chart showing where on Ars Technica the stories come from:
The majority of them come from the News section, while a good part comes from the Journals section as well. Only a small number of featured articles are present here, though that is probably because featured articles aren’t really that frequent. However, Ars gets a lower-than-average 704 diggs per story. Breaking the diggs per story down by section:
Features and Guides get quite a bit fewer diggs per story than average, while News and Journals are just about average.
Engadget
Engadget, the most linked-to blog in the world, comes in at No. 4 with 897 stories at a slightly-lower-than-average 747 diggs per story. It’s just a tad lower than the average though; I guess normal variation accounts for it. Diggers do love gadgets!
Comparison to Gizmodo
Gizmodo, the third most linked-to blog, is also pretty high on the list, at No. 13 with 512 stories at a much-higher-than-average 954 diggs per story. That means it gets to the frontpage fewer times than Engadget, but when it does, it gets more diggs. Here’s a chart showing this visually:
So, yep, while Gizmodo gets to the frontpage fewer times than Engadget, it gets a comparatively larger number of diggs per story.
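One more way to look at the two is a rough estimate of total diggs, multiplying stories by diggs per story. This is only a sketch using the rounded averages quoted above, so the totals are approximate:

```python
# (frontpage stories, average diggs per story), as quoted in the post.
sites = {
    "engadget.com": (897, 747),
    "gizmodo.com": (512, 954),
}

# Estimated total diggs = stories * average diggs per story (approximate,
# because the averages are rounded).
estimated_total = {name: stories * avg for name, (stories, avg) in sites.items()}

for name, total in sorted(estimated_total.items(), key=lambda kv: -kv[1]):
    print(f"{name}: about {total:,} diggs in total")
```

By this estimate Engadget still collects more diggs overall (roughly 670,000 vs. 488,000), simply because it reaches the frontpage so much more often.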
CNN, News.com, BBC, NYTimes, Wired
Mainstream Media. Compared to the large number of stories that these sites actually churn out, the number that made it to the Digg frontpage is relatively small. Here’s a chart showing their relative popularity:
CNN.com has the most stories, while Wired has the highest number of diggs per story. This is expected, since Wired is quite a lot geekier than the others. However, even Wired has fewer diggs per story than the average, and quite a few fewer than Gizmodo and some of the other high-ranking sites (YouTube, for example). Note that the difference is negligible in the case of Wired but pretty big for the others, with News.com a good 160 diggs below the average. I think it’s just that Digg is biased towards technology, something which I will analyze in Part III.
YouTube
YouTube made it to the top ten even though I didn’t include the Videos section. It’s got 616 stories at a well-above-average 997 diggs a story! However, most of those are older ones: Excluding the video section, only 54 YouTube videos made it to the frontpage this year. Just 6 made it in September. So, yeah, this is mostly a leftover, as most videos are now posted (correctly) to the videos section.
Still, it beats the competition handsomely. Google Video is the closest, with 200 stories, but MSN Soapbox & Revver have only 1 each! This is just nitpicking though: a better comparison would take these numbers from the Video section, data for which I unfortunately do not have. If there’s enough interest, I’ll do Digg’s video section separately, okay?
Other Geeky Sites in the Top Fifty
| Rank | Site | Stories |
| --- | --- | --- |
| 11 | google.com | 522 |
| 12 | msn.com | 519 |
| 13 | gizmodo.com | 512 |
| 16 | flickr.com | 447 |
| 21 | techcrunch.com | 305 |
| 22 | kotaku.com | 284 |
| 23 | zdnet.com | 283 |
| 24 | destructoid.com | 270 |
| 26 | apple.com | 259 |
| 27 | joystiq.com | 259 |
| 28 | ign.com | 249 |
| 30 | appleinsider.com | 242 |
| 33 | consumerist.com | 229 |
| 35 | livescience.com | 227 |
| 36 | lifehacker.com | 214 |
| 40 | nasa.gov | 193 |
| 41 | torrentfreak.com | 192 |
| 44 | wordpress.com | 184 |
| 46 | wikipedia.org | 175 |
| 47 | macrumors.com | 173 |
| 48 | tuaw.com | 173 |
Google & MSN
Most of the stories pointing to Google.com come from Google Video, though there are 14 stories linking to just the frontpage, most of which promptly ask the reader to ignore the story link. There’s even one about a “new search engine called Google” and one claiming that “all Google servers have crashed”. Comments about Digg itself also seem to have Google in the story link. And, in what I might call “funny”, a guy offered $100 to a “Random Digger” who diggs the story at a pre-selected position. I don’t really know if he kept up the offer though.
There is also a long list of things that Google once started and then abandoned (Google Pages has 2 stories, Google Pack has 3, etc.). Surprisingly (at least for me), Google Groups has only 6 stories to it. Even Google Code has more (7 stories). Google.com gets an about-average 785 diggs per story, though I don’t think that really says anything, because the Google.com ‘brand’ is so diluted, at least here.
MSN is more like Yahoo than Google here: most of its links come from MSNBC, its news network. 434 stories, or 83% of those 519 stories, are from MSNBC. The remainder is scattered around content like MSN Health, Slate, Encarta, MSN Movies, etc. MSN has an above-average 831 diggs per story, which means absolutely nothing.
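The “news share” of both portals can be checked with one small helper. This is just a sketch over the counts quoted in this post; the function name is mine.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole

# Counts quoted in the post: news stories vs. all stories for each portal.
yahoo_news_share = share(1015, 1217)
msnbc_share = share(434, 519)

print(f"Yahoo News: {yahoo_news_share:.1f}% of yahoo.com stories")
print(f"MSNBC:      {msnbc_share:.1f}% of msn.com stories")
```

Both come out between 83% and 84%, so the two portals really are very alike in how much of their Digg presence is plain news.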
Flickr
Photos. Of the 447 links to Flickr, 2 are to the Flickr Blog and one is to the announcement about the improved uploading feature. The rest are links to pictures, though I cannot determine whom they belong to, because most of the links point directly to the image rather than to the Flickr page. Duh.
I’ll do an analysis of just the pictures posted to Digg separately.
Gaming (Kotaku, Joystiq, IGN & Destructoid)
Four gaming sites in there. Diggers are gaming freaks.
Here’s a chart comparing Kotaku, Joystiq, IGN and Destructoid:
Kotaku has the most stories, followed by Destructoid and Joystiq. IGN is a bit behind, but still, note that the difference between IGN and Kotaku is just 35 stories (249 vs. 284). However, looking at the diggs per story:
Here, Kotaku and Joystiq take the lead, while the not-a-blog IGN is left behind. Harsher blog Destructoid takes a good hit as well. In fact, while all of them have a lower-than-average number of diggs per story, Destructoid seems to be quite a bit less popular than the others, at almost 200 diggs below the average.
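Going back to the story counts for a moment, the gap between the four gaming sites is small. A tiny sketch over the counts listed in the table above:

```python
# Frontpage story counts from the table above.
gaming = {
    "kotaku.com": 284,
    "destructoid.com": 270,
    "joystiq.com": 259,
    "ign.com": 249,
}

# Find the leader, then compute how far each site trails it.
leader, leader_count = max(gaming.items(), key=lambda kv: kv[1])
gaps = {site: leader_count - count for site, count in gaming.items()}

for site, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{site}: {gap} stories behind {leader}")
```

IGN, the last of the four, trails Kotaku by only 35 stories.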
Apple (Apple.com, AppleInsider, MacRumors, TUAW)
Four Apple-related sites, including Apple.com. Here’s the comparison chart:
So, the amount of ‘official’ stuff from Apple is pretty small compared to the combined number of stories coming from the Apple-related blogs. AppleInsider seems to be the most popular among them. Apple does have a lot of people doing PR for them for free, don’t they?
So, while there might not be many stories from Apple.com, those that do get posted get dugg heavily. In fact, at 1090, it has the second-highest number of diggs per story, just slightly behind another very popular site (see below).
Wordpress.com
All the Wordpress.com blogs are bunched together here. There are actually 140 blogs from Wordpress.com that made it to the frontpage, with Biosingularity, Ubuntu Blog & Robert Scoble sharing the top spot with six stories each. However, most of the links come from one-off blogs: 115 of those 140 blogs were on the frontpage just once. Note that this might under-represent many popular Wordpress blogs which are on their own domain, and hence are not counted. I covered Wordpress up there with Blogger; go have a look again if you want to.
nasa.gov
The most dugg part of NASA was, of course, the Astronomy Picture of the Day, being on the frontpage 32 times, followed by press releases from various parts of NASA. It’s spread out all around NASA, really. Also, NASA stories get more diggs than average, with 982 diggs per story.
Wikipedia.org
The Wikipedia article on Digg made it to the frontpage 5 times, while the Diggnation article and the article about made-up words in The Simpsons made it twice each. A lot of awesome articles are listed here, like the ones about unusual deaths, songs deemed ‘inappropriate’ after Sep 11 & community currencies in the United States. Heck, they even have a list of ‘unusual’ articles! Also, Wikipedia has the highest number of diggs per story among the top 100 sites, at 1108 diggs per story, just 18 above Apple.com’s 1090. This, I think, is because Wikipedia is itself so big and varied that anything interesting enough to make it to the frontpage is interesting enough to get a lot of diggs as well.
Others
Techcrunch is pretty high up in the list, but has a way-below-average 647 diggs per story. However, more Techcrunch stories are dugg than all of ZDNet’s blogs and content combined, while ZDNet has a slightly higher 695 diggs per story. That’s still below the average though. Also, don’t forget the awesomeness that is Lifehacker: it’s pretty high in the list too, with 214 stories at a well-above-average 954 diggs per story. By comparison, lifehack.org, which is similar to Lifehacker, gets only 36 stories, but a higher 1476 diggs per story. I think this is because if something from lifehack.org is interesting enough to make it to the frontpage, it is interesting enough to get a lot more diggs as well!
Other Non-Geek Sites in the Top Fifty
| Rank | Site | Stories |
| --- | --- | --- |
| 14 | news.com.au | 466 |
| 15 | reuters.com | 457 |
| 17 | washingtonpost.com | 421 |
| 18 | physorg.com | 381 |
| 19 | thinkprogress.org | 345 |
| 20 | rawstory.com | 335 |
| 25 | businessweek.com | 264 |
| 29 | guardian.co.uk | 244 |
| 31 | crooksandliars.com | 235 |
| 32 | treehugger.com | 231 |
| 33 | consumerist.com | 229 |
| 34 | theinquirer.net | 228 |
| 35 | livescience.com | 227 |
| 37 | abcnews.go.com | 199 |
| 38 | espn.go.com | 195 |
| 39 | usatoday.com | 195 |
| 42 | theregister.co.uk | 191 |
| 43 | newscientist.com | 190 |
| 45 | timesonline.co.uk | 184 |
| 49 | breitbart.com | 170 |
| 50 | forbes.com | 167 |
I am not going to analyze any of these, since it turns out I don’t read any of them. So, does anyone else want to try? Also, anyone wanting to do an analysis of the political views most seen on the frontpage is welcome: just shoot me an email or leave a comment, and I’ll help you with any data you need, for free.
For Fun
Here, I do just the type of stuff that’ll carry you straight into a flame war:
IBM.com vs. Microsoft.com vs. Sun.com vs. Apple.com vs. Linux.com
Apple.com beats everyone here by a very wide margin. Microsoft.com comes in second with fewer than half as many stories as Apple.com, while Sun.com somehow manages to have just 21 stories on the frontpage. Heck, even IBM.com did better than Sun.com.
Here too, Apple.com beats everyone handsomely, but somehow Sun.com manages to get more diggs per story than IBM.com. Still, both IBM.com and Sun.com have fewer diggs per story than average, while Microsoft.com has more than average. Apple’s diggs per story is almost off the charts.
What does this mean? Looks like IBM and Sun are not exactly hot at Digg. Like we needed two charts to tell us that.
Suggestions welcome for adding stuff here
I am so brain-dead I could not think of anything more, or better, for this. Any ideas, folks?
Trivia
- Amazon ranks 160th, with 46 stories to the frontpage, but has a really poor 447 diggs per story.
- There are absolutely zero stories on the frontpage from Java.com.
- Linux.com actually has 3 more stories than Microsoft.com.
- There are 5 links to Digg competitor Reddit, at just 410 diggs per story. (After all, Reddit is not supposed to be popular on Digg.)
- There are 56 stories linking to Digg.com itself, at a mammoth 1913 diggs per story!
- MSDN Blogs has 26 stories at 1083 diggs per story, while Sun Blogs has only 13 stories with a comparatively dismal 646 diggs per story. Sun sure isn’t popular at Digg.
- Valleywag, the tech gossip blog, has 63 stories at a higher-than-average 1058 diggs per story, while celebrity gossip blog Gawker has only 5 stories at a dismal 681 diggs per story. Diggers would rather read about Jason Calacanis or Dave Sifry than about, say, Britney Spears.
What’s Next?
In Part III, I’ll analyse the users whose stories got dugg the most, and probably compare them with the Digg Top 100 Users List. It’ll take some more time though, as school is starting tomorrow.
Found this useful? Want a custom analysis?
Found this useful? You can help me write more stuff like this. You see, the main technical bottleneck for me, besides school, is my computer. She’s dying piece by piece now. First her graphics card died, and then the AGP slot itself (I tried 3 graphics cards; they all black out intermittently). And the onboard graphics I am now on is slowly dying as well. The noisy Pentium 4 2.4 GHz single-core machine with 1 gig of RAM is not enough for me. I have a lot of experiments in mind, but they all require more powerful hardware.
So, dear people-who-read-this, please consider helping me buy a newer, faster computer. The specs are up here and I estimate it’ll cost $1500. There are several ways you can do this:
- Donate to this at my ChipIn page here. Or use the widget on the sidebar or at the bottom. I’ll name a part of my computer after everyone who donates!
- Get your questions about Digg answered. Want to know something more specific about the Digg front page (like, how many stories in the Apple category had Microsoft in the description, or vice versa)? Ask! A simple question about Digg costs $5-$20, depending on the complexity, and I can also do a more complete analysis for a reasonable price (hey, I’m just trying to work my way to a new computer, okay?). For example, asking about the total number of stories in the Apple category without a word starting with ‘i’ in the description or title would cost you $10, while an analysis of the relative popularity of your favorite Linux distros or languages would cost about $35. It’s easily negotiable. Contact me via email (yuvipanda@gmail.com).
- Get me to do custom analysis for your blog. I’ll accept most, and for a very good, double digit price, I’ll do an analysis of your blog as well as give you the data in a machine readable form if the inner-geek in you wants to do something more with it.
- PayPal me directly. My PayPal email is yuvipanda@gmail.com.
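To give an idea of what a “simple question” looks like, here is a minimal sketch of the Apple-category example from the list above. The record layout (title, description, category) and the sample stories are assumptions made up for illustration; they are not my actual data format.

```python
from dataclasses import dataclass

@dataclass
class Story:
    # Hypothetical record layout for one frontpage story.
    title: str
    description: str
    category: str

def count_matching(stories: list[Story], category: str, keyword: str) -> int:
    """Count stories in a category whose description mentions a keyword (case-insensitive)."""
    needle = keyword.lower()
    return sum(
        1
        for story in stories
        if story.category == category and needle in story.description.lower()
    )

# Made-up sample stories, for illustration only.
sample = [
    Story("New iPod", "Apple announces a new iPod", "Apple"),
    Story("Office for Mac", "Microsoft updates Office for the Mac", "Apple"),
    Story("Vista sales", "Microsoft reports Vista numbers", "Microsoft"),
]

apple_mentions_microsoft = count_matching(sample, "Apple", "microsoft")
print(apple_mentions_microsoft)
```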
StatBot: TechCrunch Data Analysis | YuviSense: Codin Kid | October 11, 2007 | 5:10 pm
[…] few weeks is to not write like an ad copy drone. Being precise when needed to be helps. My last two Digg Analysis went over 3000 words. Not good. So, I’m writing this analysis of TechCrunch, topper […]
Jeffro2pt0 | October 18, 2007 | 11:28 pm
How did you manage to come up with all of this data?
wikipedia » StatBot: Analysing Digg Part II - Sites on the Frontpage | October 25, 2007 | 6:34 am
[…] Read the rest of this great post here […]
Vikram | November 8, 2007 | 10:02 pm
Wow! Superb analysis! Spun this story! I was wondering what’s the source of this superb data?
Great Post!
75 Suggestions, Best Practices & Resources for Digg - KoMarketing Associates | November 13, 2007 | 3:10 pm
[…] duplicate things that have not succeeded (research what has succeeded in making the Digg home […]
Блог о сети » Blog Archive » Аттрактив #5 | January 23, 2008 | 6:13 am
[…] a ranking of this list by “diggability” and by appearances on Digg’s front page. Much of the list is quite expected, but I will try […]
yuvipanda | March 21, 2008 | 5:20 pm
I built a custom scraper, folks :)