Carnegie Mellon Study Ranks Most Informative Blogs October 30, 2007 8:12 AM
I am uber-uber lazy. Links plz.
posted by hydrophonic at 8:24 AM on October 30, 2007 [1 favorite]
michellemalkin.com informative?
posted by mathowie (staff) at 8:28 AM on October 30, 2007 [1 favorite]
You don't expect me to copy and paste all those, do you!?
posted by Alvy Ampersand at 8:29 AM on October 30, 2007
LAME.
their guiding question is: "Which blogs should one read to be most up to date, i.e., to quickly know about important stories that propagate over the blogosphere?" Huh? What blogs should you read to efficiently learn about what blogs are talking about?
Their algorithm uses number of posts, number of inlinks and outlinks from other blogs, and number of outlinks in general. So the resulting list is not the blogs that are the "most informative," but seems to be the blogs that are most connected to others. So it's a popularity contest. Kinda explains how Malkin came in #5. In fact, there sure seem to be a lot of conservative blogs at the top of the list.
In all, seems more like a quantification of echo-chamber value than quality of content.
posted by googly at 8:30 AM on October 30, 2007 [1 favorite]
hydrophonic writes "I am uber-uber lazy. Links plz."
- Instapundit
- Don Surber
- Science & Politics
- Watcher of Weasels
- Michelle Malkin
- National Journal's Blogometer
- The Modulator
- BloggersBlog.com
- Boing Boing
- Atrios
- A Blog for All
- Gothamist
- mparent777
- TFS Magnum
- Alliance of Free Blogs
- anglican.tk
- Micropersuasion
- Pajamas Media
- BlogHer
- The Jawa Report
- Soccer Dad
- Nose on Your Face
- Ahistoricality
- The Anchoress
- AmericaBlog
- SFist
- TBogg
- HorsePigCow
- Why Homeschool
- The Daou Report
- Sisu
- MetaFilter
- Megite
- LAist
- Captain's Quarters
- Shakesville
- Guy Kawasaki
- Lucy by Lucy
- Blue Star Chronicle
- Official Google Blog
- The Glittering Eye
- asterisco.paradigma.pt
- Read/WriteWeb
- Hullabaloo
- The Conservative Cat
- Phillyist
- The Social Customer Manifesto
- The Next Net
- Gateway Pundit
- Crooks and Liars
- Right Wing News
- 10,000 Birds
- O'Reilly Radar
- Cowboy Blob
- Business Opportunities Weblog
- DCist
- Creating Passionate Users
- Citizens For Legitimate Government
- What About Clients?
- Rough Type
- The Unofficial Apple Weblog
- Dans la cuisine d'Audinette
- The London Fog
- Bostonist
- Seattlest
- Austinist
- Indian Writing
- Power Line
- Firedoglake
- Blog d'Elisson
- Rhymes With Right
- Written World
- The Jeff Pulver Blog
- blog d'eMeRY
- Hugh MacLeod's gapingvoid
- Catymology
- Hugh Hewitt
- Lifehacker
- jordoncooper.com
- Econbrowser
- A Socialite's Life
- Gates of Vienna
- NevilleHobson.com
- Waxy.org
- A Life Restarted
- The Volokh Conspiracy
- See Also...
- Dr. Sanity
- Mudville Gazette
- www.saysuncle.com
- Privacy Digest
- Londonist
- Shanghaiist
- Catholic and Enjoying It
- Single Serve Coffee
- Jeremy Zawodny's blog
- ScienceBlogs
- Basic Thinking Blog
- Scobleizer
posted by Mitheral at 8:30 AM on October 30, 2007 [4 favorites]
The list with each entry being hyperlinked (BTW -- it's the first hyperlink in the FPP).
posted by ericb at 8:32 AM on October 30, 2007
googly writes "In fact, there sure seem to be a lot of conservative blogs at the top of the list. "
Also the list is very American content heavy. I wonder if it is proportional to internet consumption or if the US is disproportionately represented in the blogosphere.
posted by Mitheral at 8:35 AM on October 30, 2007
I am kind of interested by their second take on "most influential", which penalizes blogs with large numbers of posts. Lot of lesser-known, disproportionately influential, smaller sites on there.
posted by ormondsacker at 8:42 AM on October 30, 2007
A recent Carnegie Mellon study used higher mathematics to answer the question: if you want to be informed about what the entire blogosphere is talking about, but you can only read 100 blogs (out of the millions available), which blogs should you read?
Based on that, how did single serve coffee make it on to the list?
posted by birdlady at 8:47 AM on October 30, 2007
Also, as is the custom here, here is a GreaseMonkey Script to turn plain text urls into links.
posted by blue_beetle at 8:48 AM on October 30, 2007
Oh, I was joking, but thanks, Mitheral and blue_beetle.
posted by hydrophonic at 9:06 AM on October 30, 2007
How come Metafilter is seventeen places behind anglican.tk, which is nothing more than a spam site?
posted by Aloysius Bear at 9:11 AM on October 30, 2007
Number 33 is dead to me.
posted by found missing at 9:34 AM on October 30, 2007
On a quick scan, seems very weighted towards the US and US-centric concerns, which to my mind would make it less informative for American readers than venturing further afield. One of the things I enjoy most about MeFi is reading about things I'd not even heard of before.
posted by Abiezer at 10:22 AM on October 30, 2007
In practice, the cost of reading a blog is not simply proportional to the number of posts, since we also need to navigate to the blog (which takes constant effort per blog). Hence, a combination of unit and NP cost is more realistic.
Someone should tell them about RSS feeds.
posted by roofus at 10:29 AM on October 30, 2007
Instapundit, informative?
What a load of crap this list is.
honorable exceptions: BB, Atr, MeFi, C&L, FdL
posted by Bletch at 10:46 AM on October 30, 2007
cuteoverload was ROBBED.
posted by Ambrosia Voyeur at 10:59 AM on October 30, 2007 [3 favorites]
I have a hard time taking Carnegie Mellon seriously. It sounds too much like a lecherous lush asking to cop a feel.
posted by srboisvert at 11:30 AM on October 30, 2007
I have a hard time taking Carnegie Mellon seriously because I graduated from there. Most people seem to take them seriously though.
posted by ludwig_van at 12:02 PM on October 30, 2007
Number 3 is dead.
From Bora Zivkovic over at Number 3:
"And how useful it is to read a dead blog - this one you are at right now, my old blog ranked #3? I abandoned it in June 2006. I occasionally use it for testing stuff or for Google-bombing ;-) If you want to read a really useful blog, go check my current blog, not this one!
How useful it is to rank blogs according to the 2006 data anyway - that is eons ago in Internet time?
This must have been some fuzzy math. I hope the blogosphere responds with a big laugh."
And indeed, the "algorithm" used seems focused on Google-bombing and log-rolling.
posted by 3.2.3 at 12:08 PM on October 30, 2007
So the list is bogus? Phew. For a moment there I thought MetaFilter might actually be informative.
posted by Deathalicious at 12:46 PM on October 30, 2007
One of my first thoughts upon reading this was "where did they get their data?" So I skimmed the paper and the expanded tech report, which says:
Here we are interested in blogs that actively participate in discussions, we biased the dataset towards the active part of the blogosphere, and selected a subset from the larger set of 2.5 million blogs of [8].
[8] turned out to be this paper, which was published in 2005. That list of blogs was created by getting lists of updated blogs from "centralized services," and the "services include the update lists from: blogrolling.com, weblogs.com, diaryland.com, livejournal.com, xanga.com, blo.gs and myspace.com." It's implied that the data was collected in 2004, but it's also implied that the research might be ongoing.
Still, if we assume that the blog list came directly from the data for this paper, the authors of the current study were working off a list that was two years old. Two years is a very long time as far as weblogs are concerned. Their methodology excludes blogs that had died before 2006, but it wouldn't include any blogs that weren't indexed by one of the source services in 2004. In total, their dataset consisted of 45,000 blogs. As far as testing their outbreak detection algorithm goes, this is probably fine, but you can't really say that it identifies the "most informative blogs" period. At best, it identifies the "most informative" blogs within that set (though there are still the issues mentioned by googly and 3.2.3 above).
The authors seem to understand this, saying "In this work we are not explicitly modeling the spread of information over the network, but rather consider cascades as input to our algorithms." Unfortunately, that remark is easy to miss, and even if they had advocated more caution about generalizing their results (which they should have, IMHO) it's easy for important details like that to get lost in translation. Case in point: the linked blog post from Bloggers Blog in turn links to this post from Data Mining, which says: "It must be noted that this work is a theoretical exploration - the dataset mined to create the list is not a live corpus of blogs; thus some of the blogs may be stale or even abandoned." Bloggers Blog left that bit out, perhaps because it didn't really fit the tone of a "we're in the top 10!" post.
This comment ended up being a lot longer than I intended, but as a researcher myself I tend to get passionate about these sorts of things.
posted by I Said, I've Got A Big Stick at 2:21 PM on October 30, 2007
I am uber-uber lazy. Links plz.
If you use Firefox, the single best extension ever (copying and extending functionality from IE2/Maxthon, which kind of sucks now) allows you to drag and drop anything from a page -- text, links, urls, images, whatever -- and customize what happens when you drop them (like search, define, save, go to, go to in a background tab, save to your fave bookmarking tool, and a million other things, directionally). ROCK.
posted by stavrosthewonderchicken at 5:02 PM on October 30, 2007
That extension looks like it's worth trying or you could copy paste the source of the page.
posted by Mitheral at 6:09 PM on October 30, 2007
I laugh out loud at the idea that Boing Boing could be better than metafilter at anything...
...well anything other than showcasing the latest disney-themed, electronic, do-it-yourself, automated, crypto-zoological, automated copyright rant machine.
posted by milarepa at 7:38 PM on October 30, 2007 [2 favorites]
Isn't the owner of waxy.org, number 85 on the list, a mefite? I seem to recall some MeFi statistic being hosted at waxy.org.
posted by rjs at 1:52 PM on October 31, 2007
Of course. Everybody's a member of Metafilter, or was. Then again, once a MeFite, always a MeFite.
waxpancake, aka Andy Baio.
posted by stavrosthewonderchicken at 10:33 PM on October 31, 2007
http://instapundit.com
http://donsurber.blogspot.com
http://sciencepolitics.blogspot.com
http://www.watcherofweasels.com
http://michellemalkin.com
http://blogometer.nationaljournal.com
http://themodulator.org
http://www.bloggersblog.com
http://www.boingboing.net
http://atrios.blogspot.com
http://lawhawk.blogspot.com
http://www.gothamist.com
http://mparent7777.livejournal.com
http://wheelgun.blogspot.com
http://gevkaffeegal.typepad.com/the_alliance
http://www.anglican.tk
http://www.micropersuasion.com
http://pajamasmedia.com
http://blogher.org
http://mypetjawa.mu.nu
http://reddit.com
http://soccerdad.baltiblogs.com
http://www.thenoseonyourface.com/the_nose_on_your_face
http://ahistoricality.blogspot.com
http://theanchoressonline.com
http://americablog.blogspot.com
http://www.sfist.com
http://tbogg.blogspot.com
http://www.horsepigcow.com
http://whyhomeschool.blogspot.com
http://daoureport.salon.com
http://sisu.typepad.com/sisu
http://www.metafilter.com
http://www.megite.com
http://www.laist.com
http://www.captainsquartersblog.com/mt
http://shakespearessister.blogspot.com
http://blog.guykawasaki.com
http://tryinotocomeundone.blogstream.com
http://bluestarchronicles.blogspot.com
http://googleblog.blogspot.com
http://theglitteringeye.com
http://asterisco.paradigma.pt
http://www.readwriteweb.com
http://digbysblog.blogspot.com
http://www.conservativecat.com
http://www.phillyist.com
http://www.socialcustomer.com
http://business2.blogs.com/business2blog
http://gatewaypundit.blogspot.com
http://www.crooksandliars.com
http://www.rightwingnews.com
http://www.10000birds.com
http://radar.oreilly.com
http://cowboyblob.blogspot.com
http://www.business-opportunities.biz
http://www.dcist.com
http://headrush.typepad.com/creating_passionate_users
http://www.legitgov.org
http://www.whataboutclients.com
http://www.roughtype.com
http://www.tuaw.com
http://aude91.canalblog.com
http://thelondonfog.blogspot.com
http://www.bostonist.com
http://www.seattlest.com
http://www.austinist.com
http://indianwriting.blogspot.com
http://powerlineblog.com
http://firedoglake.blogspot.com
http://elisson1.blogspot.com
http://rhymeswithright.mu.nu
http://ragnell.blogspot.com
http://pulverblog.pulver.com
http://mry.blogs.com/les_instants_emery
http://www.gapingvoid.com
http://catymology.blogspot.com
http://hughhewitt.com
http://www.lifehacker.com
http://www.jordoncooper.com
http://www.econbrowser.com
http://socialitelife.com
http://gatesofvienna.blogspot.com
http://www.nevillehobson.com
http://www.waxy.org/links
http://aliferestarted.blogspot.com
http://volokh.com
http://library.coloradocollege.edu/steve
http://drsanity.blogspot.com
http://www.mudvillegazette.com
http://www.saysuncle.com
http://www.privacydigest.com
http://www.londonist.com
http://www.shanghaiist.com
http://markshea.blogspot.com
http://www.singleservecoffee.com
http://jeremy.zawodny.com/blog
http://www.scienceblogs.com
http://www.basicthinking.de/blog
http://scobleizer.wordpress.com
posted by jessamyn (staff) at 8:19 AM on October 30, 2007