A couple of thoughts on microblogging.

07 January 2015

The thing about microblogging, meaning services that allow very short posts (around 140 characters) and disseminate them in the fashion of a broadcast medium, is that it lends itself to fire-and-forget posting. See something, post it, maybe attach a photograph or a link, and be done with it. If your goal is to get information out to lots of people at once, leveraging your social network is critical: Post something, a couple of the users following you repost it so that more people see it, a couple of their followers repost it in turn... like ripples on the surface of a pond, information propagates across the Net like radio waves through the air. Unfortunately, this also lends itself to people taking things at face value. By just looking at the text posted (say, the title of an article) without following the link and reading the article, it's very easy for people to let the title or the text mislead them. News sites call this clickbait, and they either use it quietly, because the goal is to get people to click in and see the ads rather than to publish decent articles, or they religiously swear against using it and put forth the effort to write articles that don't suck.

There is another thing that is worth noting: Microblogging sites like Twitter also carry out location-based trend analysis of what's being posted and offer each user a list of the terms that are statistically significant near them. It's a little tricky to get just the trending terms, but sometimes you can make an end run with the mobile version of the site. By default, trending terms are tailored to the user's history and perceived geographic location, but this can be turned off. It's very easy to glance at whatever happens to be trending, check out the top ten or twenty tweets, and not bother digging any deeper because that seems to be what's happening. However, that can be misleading in the extreme for several reasons. First of all, as mentioned earlier, trending terms are regional first and foremost - just because your neighborhood seems boring and quiet doesn't mean that the next town over isn't on fire and crying for help. Second, it's already known that regional censorship is being practiced to keep certain bits of information completely away from certain parts of the world without resorting to the "block the site entirely" censorship tactics used in some countries. Of course, the reverse is also true: It's possible to manipulate trends to make things pop to the surface, either to ensure that something gets seen (in the right way, possibly) or to push other terms off the bottom of the trending terms list.

For some time I've been writing and deploying bots that interface with Twitter's user API, the service they offer which makes it possible to write code that interacts with their back end directly, without having to write code that loads a page, parses the HTML, and susses out the stuff I'm interested in. Scraping pages that way is ugly, unreliable, and a real pain in the ass, and I'd much rather do it only as a last resort, if at all. Anyway, one of the things my bots do is interface with Twitter's API for trending terms in various places as well as Twitter's keyword search API, download anything that fits their criteria, and then run their own statistical analysis to see if anything interesting shakes out. If their sensor nets do see anything, I get paged in various ways depending on how serious the search terms are (ranging from "send an e-mail" to "generate speech and call me"). Sometimes it's the e-mails that wind up being the most interesting. Earlier this week my bots noticed the following:
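To make the "analyze, then page me" idea concrete, here is a minimal Python sketch. It is not the code my bots actually run. It assumes you already have a history of counts for a term (say, tweets per hour), treats anything three or more standard deviations above the average as interesting, and picks a notification channel from a per-term severity. The names spike_score, choose_channel, and CHANNELS, the threshold, and the three-step ladder are all assumptions made up for illustration.

```python
from statistics import mean, pstdev

# Hypothetical escalation ladder, from least to most intrusive.
CHANNELS = ["log", "email", "phone"]


def spike_score(history, current):
    """Return how many standard deviations `current` sits above the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any increase at all is off the scale.
        return 0.0 if current <= mu else float("inf")
    return (current - mu) / sigma


def choose_channel(score, severity, threshold=3.0):
    """Pick a channel for an alert, or None if the spike is not worth reporting.

    `severity` is an assumed per-term setting: 0 = curiosity, 2 = wake me up.
    """
    if score < threshold:
        return None
    # Clamp severity into the range of known channels.
    return CHANNELS[max(0, min(severity, len(CHANNELS) - 1))]


if __name__ == "__main__":
    hourly_counts = [4, 5, 6, 5, 4, 6]   # made-up history for one search term
    score = spike_score(hourly_counts, 20)
    print(choose_channel(score, severity=1))
```

A z-score is about the simplest test that could work here; it is easily fooled by terms that are naturally bursty, so treat it as a starting point rather than a recommendation.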

BNWATL03 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/iZh2S0mZOU created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL01 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/uG2DUtADnf created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL04 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/bTQ58lEe25 created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL02 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/PscML5TvfN created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL06 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/CZEjc5Vqf1 created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL09 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/vfv4Wbw80L created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL05 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/U5WY39ZrVf created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL07 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/S5k5VXJ3sZ created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL10 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/5bpYcPkARs created_at: Sat Jan 03 06:41:42 +0000 2015
BNWATL08 - Instead of shooting to kill, Oakland police fire bean bag rounds at suspect with replica handgun. http://t.co/4uE3IV7Nlb created_at: Sat Jan 03 06:41:42 +0000 2015

All of those tweets were posted simultaneously (down to the second, which is the limit of timekeeping resolution available to me) from accounts that have the same username (modulo the number at the end). It wouldn't be very hard to set up a couple of e-mail addresses, and scripts that automate Twitter account creation are floating around. Writing some code that tweets whatever you tell it to is pretty easy; here is a proof of concept written in Python that, at its heart, just uses the statuses/update API call. Twitter's URL shortening service t.co seems to dynamically generate new tiny URLs for the same link as long as it comes from different accounts, so that oddity is accounted for.
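The pattern in those tweets (same text apart from the t.co link, same second, usernames that differ only in a trailing number) is simple enough to check for mechanically. The following is a minimal Python sketch of one way to do it, assuming the tweets have already been downloaded as (screen name, text, created_at) tuples. The function names and the min_accounts cutoff are hypothetical, and this is an illustration rather than the analysis my bots perform.

```python
import re
from collections import defaultdict
from datetime import datetime

# Layout of created_at in the tweets above, e.g. "Sat Jan 03 06:41:42 +0000 2015".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"
TCO_LINK = re.compile(r"https?://t\.co/\w+")
TRAILING_DIGITS = re.compile(r"\d+$")


def username_stem(screen_name):
    """Drop trailing digits so BNWATL03 and BNWATL10 share the stem 'bnwatl'."""
    return TRAILING_DIGITS.sub("", screen_name).lower()


def normalize_text(text):
    """Replace per-account t.co links with a placeholder and collapse whitespace."""
    return " ".join(TCO_LINK.sub("<link>", text).split()).lower()


def find_clusters(tweets, min_accounts=3):
    """Group tweets by (username stem, normalized text, timestamp).

    Returns only the groups posted by at least `min_accounts` distinct accounts.
    """
    groups = defaultdict(set)
    for screen_name, text, created_at in tweets:
        posted = datetime.strptime(created_at, CREATED_AT_FORMAT)
        key = (username_stem(screen_name), normalize_text(text), posted)
        groups[key].add(screen_name)
    return {key: sorted(names) for key, names in groups.items()
            if len(names) >= min_accounts}


if __name__ == "__main__":
    headline = ("Instead of shooting to kill, Oakland police fire bean bag "
                "rounds at suspect with replica handgun. ")
    when = "Sat Jan 03 06:41:42 +0000 2015"
    sample = [
        ("BNWATL03", headline + "http://t.co/iZh2S0mZOU", when),
        ("BNWATL01", headline + "http://t.co/uG2DUtADnf", when),
        ("BNWATL04", headline + "http://t.co/bTQ58lEe25", when),
    ]
    for (stem, _text, posted), accounts in find_clusters(sample).items():
        print(stem, posted.isoformat(), accounts)
```

Replacing the link with a placeholder is what accounts for the t.co oddity: the shortened URLs differ per account, but the rest of the text does not.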

At least for the moment, I don't know whether Twitter's trending algorithms treat those ten accounts (which differ by only one or two characters) as one account, as suspect accounts, or as ten unrelated users. They change up their back end from time to time and I haven't heard anything about it. It wouldn't be too difficult to modify a sock puppet generator script to not generate sequential Twitter usernames - I've already seen that happen under other circumstances. It wouldn't be hard to modify the bots to spoof their geographic locations to evade detection of trend manipulation, nor would it be difficult to make them not post all at the same time. This time there doesn't seem to be anything particularly evil about it, merely shady: If you click on any of the links in those tweets you will find that they take you to the same site and the same blog post, where the same YouTube video is playing. To be honest, it looks like a first attempt at blackhat SEO, using an existing event's trending search terms to game the site's rankings in search engines.
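Those evasions defeat any check that relies on matching usernames or identical timestamps, but the text itself still has to be nearly the same for the trick to work. Here is a rough Python sketch of a looser check, assuming the same (screen name, text, created_at) tuples as input: it ignores usernames entirely and looks for many distinct accounts posting the same normalized text within a sliding time window. The window length, the min_accounts cutoff, and the function names are assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"
TCO_LINK = re.compile(r"https?://t\.co/\w+")


def normalize_text(text):
    """Replace t.co links with a placeholder and collapse whitespace."""
    return " ".join(TCO_LINK.sub("<link>", text).split()).lower()


def find_bursts(tweets, window=timedelta(minutes=10), min_accounts=5):
    """Find texts posted by many distinct accounts within `window` of each other.

    Returns {normalized text: sorted account names} for the largest burst
    of each text, keeping only bursts with at least `min_accounts` accounts.
    """
    by_text = defaultdict(list)
    for screen_name, text, created_at in tweets:
        posted = datetime.strptime(created_at, CREATED_AT_FORMAT)
        by_text[normalize_text(text)].append((posted, screen_name))

    bursts = {}
    for text, posts in by_text.items():
        posts.sort()
        start = 0
        best = set()
        for end in range(len(posts)):
            # Shrink the window from the left until it spans at most `window`.
            while posts[end][0] - posts[start][0] > window:
                start += 1
            accounts = {name for _, name in posts[start:end + 1]}
            if len(accounts) > len(best):
                best = accounts
        if len(best) >= min_accounts:
            bursts[text] = sorted(best)
    return bursts
```

The obvious weakness is that a headline shared by lots of real people looks the same to this check, so the output is a list of things for a human to look at, not a verdict.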

Now, for why I have bots running around doing nothing but watching for anomalies like this.

In many ways social media is the media outlet of the twenty-first century. Chances are, if anything happens you'll probably find out about it from friends on Facebook or Twitter or any number of other social networks extant today before it even merits a scrolling red-bar alert on a local television station. On one hand, this can mean that you find out about things more or less immediately, while you're in a position to act if you're able. On the other, the very short turn-around time of such a notification might mean that you leap before you look. It can also mean that large numbers of people can be hammered into having an opinion that they might not otherwise have. That might be only because getting slammed with way too many notifications in a short period of time pissed them off, or it might be due to the fact that a lot of people who seem to be much better informed broadcast strongly held opinions and speaking out against them doesn't seem worthwhile. Or it could just be that a flamewar in progress makes you not give a damn, as anyone who's ever read the comments on news articles on some of the bigger sites can attest (and, as we like to say, reading the comments causes cancer).

I didn't post about this a few years ago when it was new because people who had more time and energy were all over it (and more effectively, I'll admit), but among the privacy conscious it's been a fact of life that sock puppet software is being used to manipulate social media by starting flamewars that distract everyone, spreading dissenting opinions to move the discourse in a particular direction, or even pushing the discussion to one extreme or another. I've found no evidence that Ntrepid's contract has ended. There was a talk at Hack In The Box in 2014 by some researchers who work for Thinkst Applied Research in which they not only described how they forensically detected such manipulation happening right now, but also explained how they wrote their own software to successfully manipulate discussions on e-mail lists, comment threads, the popularity of articles on certain websites, and discussions on social networks, to prove that it could be done without being a shadowy government operation. As data scientist Gilad Lotan once said of Twitter, it's easy to mistake popularity for credibility. Or truthfulness.

Makes you wonder about the Cypherpunks mailing list, doesn't it? Or Hacker News.

Ultimately, what I'm trying to say is this: There are patterns of activity in your media of choice, be it the newspaper, television, magazines, radio, websites, or social networks. Sometimes those patterns arise organically because people do what they do. But sometimes those patterns are engineered, and it would behoove all of us to keep this firmly in mind because ultimately we're the ones being manipulated. Use discernment when following the dialogue surrounding a particular issue because you can never be certain of why something is or is not being said.