Journey's End

Jan 28
2010

Shorty Awards Audit

After reading the sordid tale of ballot stuffing via Twitter over at Bad Astronomy, I wondered if @mercola had the same "problem". I also wanted to know if it was affecting @DrRachie.


To that end I wrote a Python script to "audit" Shorty Awards votes. Given a username, the script scrapes shortyawards.com for voters, then hits each voter's twitter.com profile to generate a file containing 2 columns: username and number of updates. Users with deleted accounts will have -1 updates.


I have run the script for @mercola, and at the time of data collection (UTC 1100) this is the breakdown of where the votes came from:

  • deleted accounts: 348 (12.07%)
  • accounts with 1 tweet: 407 (14.11%)
  • accounts with 2 tweets: 288 (9.98%)
  • other accounts: 1838 (63.71%)
  • total: 2885
The discrepancy of 4 comes from users who somehow managed to have no tweets: I suspect the account was deleted, then recreated. These 4 users were: bugoff48, budsgirl54, tracyaustin, janesperr.
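
For anyone who wants to redo the breakdown above from the data file, here is a minimal sketch in Python 3. It is not the original script: it assumes each line of the file is a username and an update count separated by whitespace, and the function names `classify`, `breakdown`, and `report` are made up for illustration. Accounts with 0 updates get their own bucket, which is where the discrepancy shows up.

```python
from collections import Counter

def classify(updates):
    """Map an update count to one of the buckets used in the post."""
    if updates == -1:
        return "deleted"      # the script records deleted accounts as -1
    if updates == 0:
        return "no tweets"    # the odd accounts behind the discrepancy
    if updates == 1:
        return "1 tweet"
    if updates == 2:
        return "2 tweets"
    return "other"

def breakdown(lines):
    """Count voters per bucket from 'username updates' lines (assumed format)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue          # skip blank or malformed lines
        counts[classify(int(parts[1]))] += 1
    return counts

def report(counts):
    """Return {bucket: (count, percentage of all voters)}."""
    total = sum(counts.values())
    return {k: (n, round(100.0 * n / total, 2)) for k, n in counts.items()}
```

Calling `report(breakdown(open("votes.txt")))` would give the counts and percentages per bucket, where `votes.txt` is a hypothetical local copy of the data.
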
You might wonder why I took exception to users with 2 tweets. The following screenshots should suffice as an explanation:
I checked at random 10 users with only 2 tweets, and they were all people who created a twitter account for the express purpose of voting in the shorty awards, which is against the rules.
Personally, **I would say that only 64% of votes for @mercola are valid**. This puts him in the lead still, but only ~300 votes in front.
Feel free to do your own analysis of [the data](http://pastebin.com/f4565b739).
I am still running the script for @DrRachie, so I will update when it is done. In case you are wondering why it takes so long, that's because I am being nice and rate limiting my queries :)
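
For the curious, rate limiting can be as simple as enforcing a minimum gap between requests. The following is only a sketch of that idea in Python 3, not the code I actually ran: the `Throttle` class and its parameters are hypothetical, and the interval is whatever you consider polite.

```python
import time

class Throttle:
    """Enforce a minimum number of seconds between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)   # pause until the gap has elapsed
                now = self._clock()
        self._last = now
```

You would create one `Throttle(2.0)` (say) and call `wait()` just before each profile fetch, so no two requests go out less than 2 seconds apart.
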
**Update 1**: realised some users were showing up twice. Removed them, recalculated, re-linked data.
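
Removing the duplicates amounts to keeping the first row seen for each username. A small sketch of that step in Python 3, under the assumption that rows are `(username, updates)` pairs and that usernames should be compared case-insensitively (the `dedupe` helper is hypothetical, not lifted from my script):

```python
def dedupe(rows):
    """Keep the first (username, updates) row for each username, preserving order."""
    seen = set()
    unique = []
    for username, updates in rows:
        key = username.lower()   # assumption: compare usernames case-insensitively
        if key not in seen:
            seen.add(key)
            unique.append((username, updates))
    return unique
```
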
**Update 2**: @DrRachie's data is available! See the following.
OK, here is a breakdown of where @DrRachie's votes came from:
  • deleted accounts: 113 (6.50%)
  • accounts with 1 tweet: 41 (2.35%)
  • accounts with 2 tweets: 47 (2.70%)
  • other accounts: 1542 (88.42%)
  • total: 1744
Again there is a discrepancy, this time of a single user, Superpositional.
Just as I did for @mercola, I checked random accounts with 2 tweets. They all broke the rule. These accounts contained only tweets voting in the shorty awards.
**My personal opinion is that 88% of votes for @DrRachie are valid**, a percentage much higher than @mercola's.


Again, the [data is available](http://pastebin.com/m4d01a61a) for your own analysis.
What should be done about this, I hear you ask. Personally I am happy if @mercola and @DrRachie both have their vote count adjusted accordingly.
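
To make "adjusted accordingly" concrete, here is the arithmetic as a short Python 3 sketch. It assumes an adjustment that keeps only the "other accounts" bucket from each breakdown above; that is one possible reading, and the organisers might well adjust differently.

```python
# Totals and "other accounts" counts copied from the two breakdowns above.
votes = {
    "@mercola": {"total": 2885, "other": 1838},
    "@DrRachie": {"total": 1744, "other": 1542},
}

# Assumption: an adjusted count keeps only the "other accounts" votes.
adjusted = {name: v["other"] for name, v in votes.items()}

# Share of each candidate's votes that survives the adjustment, in percent.
valid_share = {
    name: round(100.0 * v["other"] / v["total"], 2) for name, v in votes.items()
}

# How far ahead @mercola remains after adjusting both counts.
lead = adjusted["@mercola"] - adjusted["@DrRachie"]
```

On these numbers the adjusted counts are 1838 and 1542, so @mercola would still lead by 296 votes.
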
**Update 3**: I am running the same analysis for 1st and 2nd place for \#music, to see if the same pattern holds. Those results will be in a new post.
**Update 4**: I should point out that I am aware both @mercola and @DrRachie received votes in multiple categories. But seeing as the majority of votes are in \#health, I feel it would be Too Much Effort to separate the votes out. Though if enough people complain, I will fix it.
**Update 5**: [Part 2](http://www.shuningbian.net/2010/01/shorty-awards-audit-part-2-exception-or.php) has been posted. It explores whether 64% valid votes is the exception or the rule.
Cheers,
Steve
ts=00:29 tags=[internet]