tag clouds for insight and knowledge. May 4, 2007 12:52 PM
Inspired by this thread, and in an effort to help us know each other better, would it be possible to have tag clouds of everything we've ever written on metafilter - not just the tags from our posts - displayed in our profile?
Wouldn't that just result in a big cloud of "I" "a" "the" "and" and so forth?
posted by CKmtl at 12:59 PM on May 4, 2007
You mean a cloud of all the tags of all threads we've commented in and/or created?
Would be interesting, yes.
posted by and hosted from Uranus at 1:00 PM on May 4, 2007
I would like to have a parts of speech cloud for my profile please.
posted by thirteenkiller at 1:01 PM on May 4, 2007
Don't bother with all that coding mumbo-jumbo. Here you go:
meh Iraq asshat circumcision cockpunch cat douchebag yawn Bush
posted by ND¢ at 1:07 PM on May 4, 2007
You'd have to come up with a big list of little words to ignore, or use something like Yahoo's term extraction service to pick out keywords. But crunching that much text for thousands of users would be a bit resource intensive no matter which way you go.
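The "big list of little words to ignore" approach can be sketched roughly like this. It is only a minimal illustration, not how the site would actually do it: the `STOP_WORDS` set, the `word_counts` helper, and the sample comments are all made up for the example, and a real stop list would be far longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"i", "a", "the", "and", "of", "to", "in", "it", "is", "that"}

def word_counts(comments, stop_words=STOP_WORDS):
    """Count words across a user's comments, skipping anything in the stop list."""
    counts = Counter()
    for text in comments:
        # Lowercase the text and treat runs of letters/apostrophes as tokens.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts

# Made-up sample data standing in for one user's comment history.
sample = ["I like the cat and the dog.", "The cat is a menace."]
top = word_counts(sample).most_common(1)
```

The counting itself is cheap per comment; the cost pb mentions comes from running it over every comment by every user.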
posted by pb (staff) at 1:10 PM on May 4, 2007 [1 favorite]
ND¢ - You left out "batshitinsane."
posted by Joey Michaels at 1:26 PM on May 4, 2007
Man. I like that. Putting it on my big list of Things To Do With The DB Eventually Maybe, that's for sure.
posted by cortex (staff) at 1:38 PM on May 4, 2007
I don't like to post negative comments, but I don't like this idea. If you're curious about a person's posting history, comments, etc., just check the profile.
Derail: Matt's time would be better spent creating MeRe (Metafilter Reviews). Just my very humble opinions.
posted by snsranch at 1:41 PM on May 4, 2007
Oh, and that should be pronounced, Mee Ree, not Mee Reh.
posted by snsranch at 1:42 PM on May 4, 2007
Why can't we stick with the current heuristics we have to help us relate to one another? Why does anyone need good information to make bad judgements about those with whom we disagree? Let's just keep on with the half-remembered grievances and square holes we already have winnowed out for anyone we don't really know (or even care to know) crossing our path. If you need a crib, ND¢ has given you a good start.
posted by carsonb at 1:49 PM on May 4, 2007
Maybe this can be a side project for someone. I don't think it would need to be updated constantly. Once a day or, failing that, once a week or month would work, too.
posted by Dave Faris at 1:51 PM on May 4, 2007
(I should maybe clarify that what I like is the idea of the analysis itself. The autoposting on everybody's profile, not so much, though it might be a fun optional widget.)
posted by cortex (staff) at 1:57 PM on May 4, 2007
Actually, just looking over my posts, I already see a fly in the ointment. A lot of my posts include quotes from websites, news stories, or other posters, which would really skew the results.
posted by Dave Faris at 1:59 PM on May 4, 2007
Huh, yep, you're right about that fly, Dave Faris. And how would y2karl's tooltip text be read or imaged in a tag cloud? This is probably a nifty thing if it can be accomplished by magic, but maybe not so nifty that it needs to be engineered.
posted by cgc373 at 2:16 PM on May 4, 2007
I fail to see anything interesting about this and I'm a data nerd myself. The reason tagclouds are interesting for presidential speeches is that they are focused deliveries and this helps pull out that focus and make it more obvious. A scattershot summation of comments which span thousands of different topics seems utterly useless.
posted by vacapinta at 2:17 PM on May 4, 2007
I miss GeneFilter.
posted by monju_bosatsu at 2:39 PM on May 4, 2007
I'd be embarrassed to see a graphical depiction of how frequently I use the word "fucktard". So let's do this thing!
posted by Mister_A at 2:41 PM on May 4, 2007
I miss GeneFilter.
Me too. That was a thousand times better than this could ever be. Bring back GeneFilter!
posted by languagehat at 2:51 PM on May 4, 2007
A scattershot summation of comments which span thousands of different topics seems utterly useless.
Maybe, but it'd reveal some fun idiosyncrasies if done right, and give a reasonable view into topics of interest.
It seems like one way to make the analysis more interesting is to get away from the idea of emphasizing words in a cloud only according to raw frequency, and instead highlight those words which each user uses more often than a normalized mefite baseline.
So we count the total number of occurrences of token "terror" in all mefi comments, and the total number of comments, and get our mefi-specific word frequency. And then compare each specific mefite's frequency for that word to the baseline, and highlight the outliers. (Would take some futzing with thresholds to get it looking good, especially accounting for mefites with small total word counts, but that's all just polish.)
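A rough sketch of that baseline comparison, under some loudly stated assumptions: the function name `distinctive_words`, both thresholds, and the sample counts are invented for illustration, and frequencies are normalized by total word count rather than by number of comments, which is one of several reasonable choices.

```python
from collections import Counter

def distinctive_words(user_counts, site_counts, min_user_count=3, ratio_threshold=2.0):
    """Return (word, ratio) pairs where the user's relative frequency is at
    least ratio_threshold times the site-wide relative frequency."""
    user_total = sum(user_counts.values())
    site_total = sum(site_counts.values())
    results = []
    for word, n in user_counts.items():
        if n < min_user_count:
            continue  # too few uses to call it an outlier (the small-sample problem)
        site_freq = site_counts.get(word, 0) / site_total
        if site_freq == 0:
            continue  # word missing from the baseline; skip rather than divide by zero
        ratio = (n / user_total) / site_freq
        if ratio >= ratio_threshold:
            results.append((word, ratio))
    # Most distinctive words first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Made-up counts: a site-wide baseline and one user's totals.
site = Counter({"terror": 10, "cat": 40, "the": 950})
user = Counter({"terror": 5, "cat": 4, "the": 91})
outliers = distinctive_words(user, site)
```

With these numbers only "terror" clears the threshold, since the user's rate for "cat" and "the" is at or below the baseline. The two threshold parameters are where the "futzing" would happen.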
And yeah, it's a silly thing, but silly things are awesome.
posted by cortex (staff) at 2:57 PM on May 4, 2007
MetaFilter: yeah, it's a silly thing, but silly things are awesome.
posted by cgc373 at 3:02 PM on May 4, 2007
And then compare each specific mefite's frequency for that word to the baseline, and highlight the outliers.
That sounds good. It would catch all the people who write "dependant" instead of "dependent" as well as all the people who use in-jokes way too much.
posted by vacapinta at 3:12 PM on May 4, 2007
I'd just like even to see a list of tags for my own comments. Not any kind of nitty gritty text crunching like you guys seem to be talking about here, but just a list of tags attached to posts which I've commented on. AFAIK, as things stand you can only see the top few tags for things where you're the original poster, right?
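For what it's worth, the no-text-crunching version is a simple tally. Here is a minimal sketch, assuming the tags are available as a mapping from thread ID to tag list; `comment_tag_counts`, `tags_by_thread`, and the sample data are hypothetical, not anything the site exposes.

```python
from collections import Counter

def comment_tag_counts(commented_thread_ids, tags_by_thread):
    """Tally the tags on every thread a user has commented in,
    counting each thread once no matter how many comments they left."""
    counts = Counter()
    for thread_id in set(commented_thread_ids):
        counts.update(tags_by_thread.get(thread_id, []))
    return counts

# Made-up data: thread ID -> tags, and the threads one user commented in.
tags_by_thread = {
    1: ["tagclouds", "profile"],
    2: ["badminton"],
    3: ["profile", "database"],
}
commented = [1, 3, 3]
tag_counts = comment_tag_counts(commented, tags_by_thread)
```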
posted by juv3nal at 3:15 PM on May 4, 2007
I think it's a pretty nifty idea. I wonder how much mine would reveal about me, that even I don't know. Like: "Hey, I had no idea that I used the term 'pig-fucker' that often. What does that say about me?.."
I say bring it on. Self examination should always be a source of amusement.
posted by quin at 3:44 PM on May 4, 2007
But crunching that much text for thousands of users would be a bit resource intensive no matter which way you go.
Okay, how about just giving me everything I've written and let me do the crunching?
posted by timeistight at 3:50 PM on May 4, 2007
What we need is a Firefox extension that generates a popup tag cloud when you mouse over a user name. I have no idea how to produce such magical goodness, but someone here can do it.
posted by LarryC at 4:06 PM on May 4, 2007
Uh oh. I can see that whole topless phase o' mine is going to continue to haunt me...
posted by miss lynnster at 5:34 PM on May 4, 2007
What we need is a Firefox extension that generates a popup tag cloud when you mouse over a user name
Would an extension which lets you select and set various tags for individual users be sufficient? 'Cuz yesterday I was reading through AskMe and ran across a virulently stupid remark and thought, "hmmm, now I remember, this person has idiot trended for a while now and it isn't even me." Which made me think creating an extension to tag users per post might be handy to better track different users' credibility or interests or whatever. And then I took a nap and got over it. But it's still an idea.
Otherwise, an automatic tag cloud extension as you describe would probably need to accrete tags over a period of time due to the multiple page hits/lookups necessary to grow it, as post history is slowly mined out. You also couldn't do it for every single member simultaneously without carting around a prohibitive amount of data in the browser session, but there are ways around that, e.g. limiting to particular members or day counts. Possible. Just hard.
posted by mdevore at 6:02 PM on May 4, 2007
OK, OK, OK! Yes! You caught me! Fine! Christ! Sure, Seventeen year old girls shows up a lot in the cloud! So what? I'm just being honest, unlike the rest of you horny old goats (I saw you looking, don't deny it).
And I have no idea how pubic hair, ferret-legging, and where-is-the-rainbow-party got in the cloud, but it's a LIE propagated by the CABAL to DISCREDIT me.
posted by maxwelton at 6:31 PM on May 4, 2007
Interesting. I wonder what the algorithm behind it was? It doesn't look like it's plugged into a language parser... looks like it's:
Grab a handful of comments
Split them by periods, randomly pick one of the resulting segments (i.e., a sentence) from each
Concatenate the sentences, in random order
The difficult part is the first step of course.
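The three guessed steps can be sketched as follows. This only illustrates the guess above, not whatever the original tool actually did; `shuffle_sentences`, its parameters, and the sample comments are invented, and the "difficult" first step is faked by passing the comments in as a list.

```python
import random

def shuffle_sentences(comments, sample_size=3, rng=None):
    """Grab a handful of comments, pick one period-delimited segment
    from each, and join the picks in random order."""
    rng = rng or random.Random()
    # Step 1: grab a handful of comments.
    picked = rng.sample(comments, min(sample_size, len(comments)))
    sentences = []
    for comment in picked:
        # Step 2: split on periods and pick one non-empty segment.
        segments = [s.strip() for s in comment.split(".") if s.strip()]
        if segments:
            sentences.append(rng.choice(segments))
    # Step 3: concatenate in random order.
    rng.shuffle(sentences)
    return (". ".join(sentences) + ".") if sentences else ""

# Made-up comments standing in for a user's history.
comments = ["A one. A two.", "B one.", "C one. C two. C three."]
mashup = shuffle_sentences(comments, sample_size=3, rng=random.Random(0))
```

Splitting on periods is crude (it breaks on "e.g." and on URLs), which may be part of the charm.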
posted by Firas at 7:50 PM on May 4, 2007
When Web 3.0 arrives next month, clouds are going to start to look ridiculous.
posted by jbickers at 7:59 PM on May 4, 2007
I've looked at tag clouds from both sides now.
I really don't know tag clouds at all.
Still, it's a really neat idea, even if I'd be kind of scared to see my results.
posted by Alvy Ampersand at 8:28 PM on May 4, 2007
Fuck fucking farfegnugen Federov filibuster Faberge Holocaust infinitesmimal Appallonia Prince
posted by klangklangston at 8:38 PM on May 4, 2007
The lack of BIG tag makes that less funny; just imagine where appropriate.
posted by klangklangston at 8:39 PM on May 4, 2007
"Appallonia" is worth the price of a cloud on its own, klangklangston.
posted by cgc373 at 9:33 PM on May 4, 2007
Firas: google "Markov Chains". Probabilistic regurgitation of source material according to a frequency table of expected co-occurrences. Neat shit.
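A minimal word-level sketch of the idea, assuming the simplest possible version (first-order, whitespace tokens); `build_chain`, `generate`, and the sample text are made up for illustration. The "frequency table" here is just a list of followers per word, so a word that follows more often gets picked more often.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the source.
    Repeats are kept, so frequent followers are proportionally likelier."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=10, rng=None):
    """Walk the chain from a start word, choosing each next word at random
    from the recorded followers of the current one."""
    rng = rng or random.Random()
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never had a follower in the source
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Made-up source text.
chain = build_chain("silly things are awesome and silly things are fun")
babble = generate(chain, "silly", length=5, rng=random.Random(1))
```

Every adjacent pair in the output appeared somewhere in the source, which is why the results read as almost-plausible.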
posted by cortex (staff) at 11:08 PM on May 4, 2007
OK, so... is there any reason the MeFi database should not be made publicly available? Like the Wikipedia database. Is it full of secret information that is not already visible across the site? Could that not be easily stripped out before distributing, say, monthly snapshots?
This tag cloud idea is exactly the kind of really-time-consuming-but-kinda-interesting thing which is crying out for a third party with too much time on his hands to do.
posted by hoverboards don't work on water at 2:42 AM on May 5, 2007
"Is it full of secret information that is not already visible across the site?"
It has Matt's comments about all the users plastered across so only he and Jessamyn can see (sorry cortex, you're too new), in order to keep us straight.
Allowing us to see how the MeFi sausage is made would lead to tears.
posted by klangklangston at 6:34 AM on May 5, 2007
I've always been for a db dump of some sort. Matt was talking about it as a possibility months back, in a thread in the grey, but I don't think we quite got him to critical mass on making it happen.
It's got to be a hefty chunk of data, though, and once we're talking about handing over essentially 8 years of what mefi is, arguing toward a more closed, opt-in, state your intentions release starts to sound like a good idea.
Is it full of secret information that is not already visible across the site?
There's stuff that would need to be scrubbed for general release, I think.
(sorry cortex, you're too new)
So you think, Mister 'doesn't play well with others; see medical report viz. sexual difficulties; rated C-14 on bannability scale; lousy at badminton'.
posted by cortex (staff) at 7:27 AM on May 5, 2007
You're obviously not seeing the real info, cortex— You'd know that my sexual difficulties are directly related to being lousy (also indecent) at badminton. Or "bad mitten," as I like to think of it.
posted by klangklangston at 7:50 AM on May 5, 2007
I've always been for a db dump of some sort.
Goody!
Matt was talking about it as a possibility months back, in a thread in the grey, but I don't think we quite got him to critical mass on making it happen.
It's got to be a hefty chunk of data, though, and once we're talking about handing over essentially 8 years of what mefi is, arguing toward a more closed, opt-in, state your intentions release starts to sound like a good idea.
Okay, where do I sign up?
posted by timeistight at 9:51 AM on May 5, 2007
It's the fastest moving ball of any sport!
posted by klangklangston at 1:55 PM on May 5, 2007
I think pb and matt should work on all the cool things they were discussing in podcast #6. Especially the admin ponies.
posted by terrapin at 2:04 PM on May 5, 2007
Oh, and that should be pronounced, Mee Ree, not Mee Reh.
posted by snsranch at 1:42 PM on May 4
Oh thank you kindly, I had just finished saying May ray in my head...
posted by infini at 6:47 PM on May 5, 2007
Why yes, I think it should be easy for anybody who's interested to keep even closer tabs on us and everything we post here. Why don't we share our credit card numbers and ATM PINs too? Dave Daris, you first!
posted by davy at 12:15 AM on May 10, 2007
Take your meds.
posted by Dave Faris at 4:58 AM on May 10, 2007
posted by cgc373 at 12:56 PM on May 4, 2007