Tweetclouds reveal even more about people
I just finished another cut-n-paste-a-thon. In February it focused on scraping out a couple of months’ worth of blog posts and
pasting them into Many Eyes to create “real” tag clouds based on actual content instead of the author’s bookmarking.
This time, I was scraping a month’s worth of Tweets. You can blame my natural curiosity and JP Rangaswami, whose idea it was.
I’ve said it before: someone needs to build an application that does this automatically. We need something that looks at our posts, tweet streams, and links and outputs things like tag clouds (based on what we’re writing) and blog rolls (based on what we’re reading or linking to), and that potentially pulls out things like resources. For example, Jeremiah Owyang barfs out so many links and stuff to read that trying to go back and find them when I have time is nearly impossible.
How I did it
There was nothing glamorous about this. Ultimately, it was tons of cutting and pasting. I didn’t include timestamps or other people’s names. Yes, including other people’s names could have been interesting in that it could show relationships, but since I wanted to compare tweets with blogs, I wanted to focus on the substance of the conversations, not the relationships. So, I opened each person’s archive and pasted a month’s worth of tweets into a plain text file. That was then uploaded to Many Eyes. I must have forgotten that a month’s worth of tweets–especially for these people–is a hell of a lot of content.
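For anyone who wants to skip the cut and pasting, here is a rough sketch of the same cleanup in Python. It is not what I ran (I pasted by hand and let Many Eyes do the counting). It assumes you already have each tweet as a plain string with no timestamp, and the function names and the stop-word list are made up for illustration. Many Eyes does its own filtering, which may differ.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "it",
              "for", "on", "at", "i", "my"}

def clean_tweet(text):
    """Drop links and @names, lowercase the rest, and return a list of words."""
    text = re.sub(r"https?://\S+", " ", text)  # links
    text = re.sub(r"@\w+", " ", text)          # other people's names
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def cloud_counts(tweets, n=1):
    """Count n-word phrases: n=1 for a one-word cloud, n=2 for a two-word cloud.

    Stop words are removed before pairing, so a two-word phrase may join
    words that had a stop word between them in the original tweet.
    """
    counts = Counter()
    for tweet in tweets:
        words = [w for w in clean_tweet(tweet) if w not in STOP_WORDS]
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts
```

Calling `cloud_counts(tweets).most_common(50)` gives the words a one-word cloud would size largest, and `cloud_counts(tweets, n=2)` does the same for two-word phrases.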
Who I chose
Before someone comes and slaps me with a gender stick–yes, these are all dudes. I chose only three people because I knew it would be a lot of work. Obviously, I was going to include JP, since it was his idea. Dennis Howlett also raised his hand to be included. That left one more spot, and since Jeremiah is a Tweet-a-saurus Rex and someone I know sticks to industry-focused Tweets, I arbitrarily chose him since I knew there could be a correlation to his blog.
JP Rangaswami
Interestingly, JP’s blog (Jan/Feb 08) was very focused on the industry and his tweet stream was much more about his life. Although JP’s love of music appears in his blog, it’s front and center in his Twitter stream. “Listening” refers to whatever music is playing while he’s at his computer. I liked how he had a lot of action words like “listening,” “watching” (typically cricket), and “reading.” As well, his deep family orientation shines with references to “family,” “daughter,” “Hope” (his daughter and her injury this month), and how happy he is to be “back” after a lot of travel. I found it apropos that JP’s tweet stream was almost musical–a bit like the pauses between his significant blog notes. It was interesting that there was less of a pattern in the two-word cloud. I think it shows that he’s posting a range of commentary. In comparison with the other two people I looked at, JP had the fewest Twitter conversations (“@” replies).

Jeremiah Owyang
A month’s worth of Jeremiah’s tweets equals a full day of cutting and pasting. Perhaps more if you’re not already a level 5 keystroker like I’ve become. Jeremiah’s tweet clouds were incredibly consistent with his blog. I’m not sure I’m surprised, though as someone who reads his stream I expected some deeper perspective to show up. Jeremiah’s insane curiosity and enthusiasm did pop out, however, with words like “interesting” and “great” clearly poking out. A hint of his power of promotion was also evident in his two-word cloud, with “blogger dinner” taking prominence. It was interesting that, unlike JP and Dennis, Jeremiah actually did have a pattern in his two-word cloud. I think it shows how topically focused he is. Jeremiah had a balance of content and conversation, with “@replies” composing about 50% of his tweets.

Dennis Howlett
Dennis wins the award for the most conversations. Nearly 100% of his tweets were replies, so the guy is very much a part of the conversations. I found it interesting that his tweets were laden with opinionated words like “good,” “crap,” “dead,” “hell,” etc., but that his two-word cloud revealed the most, with “optimizing efficiency” and “personal productivity” as examples.
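The reply percentages for all three people could be worked out with something like the sketch below. It assumes a tweet counts as a reply when it starts with an @name, which is my guess at a reasonable rule and not a record of how the numbers in this post were tallied. The function name is made up for illustration.

```python
def reply_share(tweets):
    """Fraction of tweets that start with an @name (assumed here to be replies)."""
    if not tweets:
        return 0.0
    replies = sum(1 for t in tweets if t.lstrip().startswith("@"))
    return replies / len(tweets)
```

Run it on the raw tweets, before any @names are stripped out for the clouds, or every share comes back as zero.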

Things people have said about this post
Sam,
Great Post. I continue to enjoy your work with tag clouds. On another note, there is a way to view all of any Twitter user’s link tweets. Building on your example of Jeremiah Owyang, I cloned and modified an existing Yahoo Pipe to create a filtered list of @jowyang ’s Tweets that contain a URL.
You can view it here: http://pipes.yahoo.com/pipes/pipe.info?_id=a23eee0fac5c42ef3f1acd4df6149c85
I know a bunch of social psychologists who would be highly intrigued. Great idea Sam. A question in regards to the clouds: as someone who has not used Tweetclouds yet, there look to be three clouds per person. What are the reasons for the three clouds and what is the differential in the data per tweetcloudset?
Sam, thanks a lot. I feel you’ve done well with the range of people you’ve chosen, the three of us have quite different styles. I need to think about the implications; I wouldn’t go so far as to say there are conclusions to draw, but I found the exercise insightful. thanks for doing it.
@Sharon- Sorry, i could see how it’s confusing. The first cloud is actually from the post I did analyzing these people’s blogs. The last two clouds are from me looking at their tweets (the only difference is whether it’s one word or two word tweet clouds).
……if only there was a way to track how many times a user changes their icon/pic/avatar… you’d be surprised how much of an obsession that is with some people…… “of course” i’m not talking about someone from this post
;0
-b
Re: ““real” tag clouds based on actual content instead of …” A noble sentiment!
A couple of days ago (yesterday?) Peter Westwood released a cloud of contributors to WordPress 2.5. Brilliantly parsimonious, yes? (I would have left the numbers off *shrug*.) What I saw as absent is how the cloud could be used as an interface / portal to richer information. (For me most everything is a dashboard … “mandala theory”, doncha know.)
I visited TagCrowd.com … *plop-plop fizz-fizz* after providing the URL for my project’s quasi-bibliography I became the proud father of a sweet/primitive data-based cloud. (I’ve tweaked the font sizes there; TagCrowd really should offer min/max as a configuration item.)
All of this is great fun. But … where’s the beef?
It’s cute. And it’s totally dead-ended. (Transform a “silo” into a planter so as to allow the tree to shoot up out of it?)
I’m reminded of my first sessions using PsychLit; with so much functionality, there just has to be a way of having it jump up and make toast. I optimized my PsychLit searches and was handsomely rewarded.
Looking at clouds, pondering VRML and visualizations of multivariate analysis, I can’t help thinking that we’re just one step away from some very richly interactive information.
My point is this: ATM clouds seem to me to be data, rather than information. I suspect we will be handsomely rewarded when we transform them by enriching them with another layer of interactivity. Or two.
I can see a way of using them as ?what? a partial product … the interim step in a process that finds who’s like who, or which document is like which others … comparing clouds in a cloudy sky, if you will. (“CloudySky” … a good name for a software suite?)
Thanks for including me in on this, sorry it stole your day away from you.
[…] is already very large. It matters little as you can still @jowyang and I’ll see your message. Sam Lawrence did analysis on my Twitter behavior (even on a Sunday) and noticed that about half of my tweets are @replying to […]
check out http://tweetclouds.com it does something very similar to this.
Liked this when I first saw it, and when I saw tweetclouds.com I immediately thought of it. John K’s beaten me to the comment though. I’d tweeted about tweetclouds, so I’ve added a pointer back here.