I just finished another cut-n-paste-a-thon. In February it focused on scraping out a couple of months’ worth of blog posts and
pasting them into Many Eyes to create “real” tag clouds based on actual content instead of the author’s bookmarking.
This time, I was scraping a month’s worth of Tweets. You can blame my natural curiosity and JP Rangaswami, whose idea it was.
I’ve said it before: someone needs to build an application that does this automatically. We need something that looks at our posts, tweet streams, and links and outputs things like tag clouds (based on what we’re writing) and blog rolls (based on what we’re reading or linking to), and potentially pulls out things like resources. For example, Jeremiah Owyang barfs out so many links and stuff to read that trying to go back and find them when I have time is nearly impossible.
How I did it
There was nothing glamorous about this. Ultimately, it was tons of cutting and pasting. I didn’t include timestamps or other people’s names. Yes, including other people’s names could have been interesting in that it could show relationships, but since I wanted to compare tweets with blogs, I wanted to focus on the substance of the conversations, not the relationships. So, I opened each person’s archive and pasted a month’s worth of tweets into a plain text file. That file was then uploaded to Many Eyes. I must have forgotten that a month’s worth of tweets–especially for these people–is a hell of a lot of content.
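For anyone who would rather not do this by hand, here is a minimal Python sketch of the same steps: drop the names and links, then count single words and two-word phrases. It is not how Many Eyes works internally, just an approximation under stated assumptions. The function names, the tiny stopword list, and the rule that a two-word phrase is any pair of adjacent non-stopwords are all my own choices for illustration.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list; a real run would want a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it",
             "for", "on", "at", "i", "my"}

def clean_tweet(text):
    """Strip links and @names, lowercase, and return the remaining words."""
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop other people's names
    return re.findall(r"[a-z']+", text.lower())

def cloud_counts(tweets, top=10):
    """Return the most common single words and two-word phrases."""
    words = Counter()
    pairs = Counter()
    for tweet in tweets:
        tokens = clean_tweet(tweet)
        words.update(t for t in tokens if t not in STOPWORDS)
        # Keep only adjacent pairs where neither word is a stopword.
        pairs.update(
            (a, b) for a, b in zip(tokens, tokens[1:])
            if a not in STOPWORDS and b not in STOPWORDS
        )
    return words.most_common(top), pairs.most_common(top)
```

Feeding it a list with one tweet per string returns two ranked lists, which is roughly the raw material behind a one-word cloud and a two-word cloud.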
Who I chose
Before someone comes and slaps me with a gender stick–yes, these are all dudes. I chose three people because I knew it would be a lot of work. Obviously, I was going to include JP, since it was his idea. Dennis Howlett also raised his hand to be included. That left one more spot, and since Jeremiah is a Tweet-a-saurus Rex and someone I know keeps his Tweets focused on the industry, I arbitrarily chose him since I knew there could be a correlation to his blog.
JP Rangaswami
Interestingly, JP’s blog (Jan/Feb 08) was very focused on the industry and his tweet stream was much more about his life. Although JP’s love of music appears in his blog, it’s front and center in his Twitter stream. “Listening” refers to whatever music is playing while he’s at his computer. I liked how he had a lot of action words like “listening,” “watching” (typically cricket), and “reading.” As well, his deep family orientation shines through with references to “family,” “daughter,” and “Hope” (his daughter and her injury this month), and how happy he is to be “back” after a lot of travel. I found it apropos that JP’s tweet stream was almost musical–a bit like the pauses between his significant blog notes. It was interesting that there was less of a pattern in the two-word cloud. I think it shows that he’s posting a range of commentary. In comparison with the other two people I looked at, JP had the fewest Twitter conversations (“@” replies).

Jeremiah Owyang
A month’s worth of Jeremiah’s tweets equals a full day of cutting and pasting. Perhaps more if you’re not already a level 5 keystroker like I’ve become. Jeremiah’s tweet clouds were incredibly consistent with his blog. I’m not sure I’m surprised, though as someone who reads his stream I felt like there would have been some deeper perspective. Jeremiah’s insane curiosity and enthusiasm did pop out, however, with words like “interesting” and “great” clearly poking out. A hint of his power of promotion was also evident in his two-word cloud, with “blogger dinner” taking prominence. It was interesting that, unlike JP and Dennis, Jeremiah actually did have a pattern in his two-word cloud. I think it shows how topically focused he is. Jeremiah had a balance of content and conversation, with “@” replies making up about 50% of his tweets.

Dennis Howlett
Dennis wins the award for the most conversations. Nearly 100% of his tweets were replies, so the guy is very much a part of the conversations. I found it interesting that his tweets were laden with opinionated words like “good,” “crap,” “dead,” “hell,” etc., but that his two-word cloud revealed the most, with “optimizing efficiency” and “personal productivity” as examples.
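The reply percentages above could also be worked out automatically. Here is a rough sketch, assuming a “reply” simply means a tweet that starts with an “@” name; that definition and the function name are my own assumptions, and the counting needs the raw tweets with the names still in them.

```python
def reply_share(tweets):
    """Fraction of tweets that start with '@', treated here as replies."""
    tweets = [t.strip() for t in tweets if t.strip()]  # ignore blank lines
    if not tweets:
        return 0.0
    replies = sum(1 for t in tweets if t.startswith("@"))
    return replies / len(tweets)
```

Multiply the result by 100 to get a percentage for each person’s month of tweets.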



