January 27, 2010

Shorty Awards Audit

After reading the sordid tale of ballot stuffing via Twitter over at Bad Astronomy, I wondered if @mercola had the same "problem". I also wanted to know if it was affecting @DrRachie.


To that end I wrote a Python script to "audit" Shorty Awards votes. Given a username, the script scrapes shortyawards.com for that user's voters, then hits each voter's twitter.com profile to generate a file containing 2 columns: username and number of updates. Users with deleted accounts are recorded as having -1 updates.


I have run the script for @mercola, and at the time of data collection (1100 UTC) this is the breakdown of where the votes came from:
  • deleted accounts: 348 (12.07%)
  • accounts with 1 tweet: 407 (14.11%)
  • accounts with 2 tweets: 288 (9.98%)
  • other accounts: 1838 (63.71%)
  • total: 2885


The four categories add up to 2881, not 2885. The discrepancy of 4 comes from users who somehow managed to have no tweets; I suspect these accounts were deleted, then recreated. The 4 users were: bugoff48, budsgirl54, tracyaustin, janesperr.
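For anyone who wants to reproduce the breakdown from the data file, here is a minimal sketch of the tally. It is not the script I ran; it assumes the file is tab-separated with one `username<TAB>updates` pair per line, and the function names and sample usernames are made up for illustration.

```python
import csv
import io
from collections import Counter


def classify(updates):
    """Map an update count to one of the categories used in this post."""
    if updates == -1:
        return "deleted"      # -1 marks a deleted account
    if updates == 0:
        return "no tweets"    # the source of the discrepancy
    if updates == 1:
        return "1 tweet"
    if updates == 2:
        return "2 tweets"
    return "other"


def breakdown(lines):
    """Tally voters per category from 'username<TAB>updates' lines."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        counts[classify(int(row[1]))] += 1
    return counts


def report(counts):
    """Print each category with its share of the total."""
    total = sum(counts.values())
    for label, n in counts.most_common():
        print(f"{label}: {n} ({100 * n / total:.2f}%)")
    print(f"total: {total}")


if __name__ == "__main__":
    # Hypothetical sample data, not real voters.
    sample = io.StringIO("alice\t-1\nbob\t1\ncarol\t2\ndave\t0\nerin\t350\n")
    report(breakdown(sample))
```

Keeping "no tweets" as its own category makes the discrepancy visible instead of silently folding those users into "other".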


You might wonder why I singled out users with 2 tweets. The following screenshots should suffice as an explanation:





I checked 10 users at random with only 2 tweets, and they had all created a Twitter account for the express purpose of voting in the Shorty Awards, which is against the rules.


Personally, I would say that only 64% of votes for @mercola are valid. This puts him in the lead still, but only ~300 votes in front.


Feel free to do your own analysis of the data.


I am still running the script for @DrRachie, so I will update when it is done. In case you are wondering why it takes so long, that's because I am being nice and rate limiting my queries :)
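Rate limiting here just means pausing between requests. A rough sketch of the idea, assuming a hypothetical `fetch` callable that looks up one user; the one-second default delay is an arbitrary choice, not the value I actually used:

```python
import time


def fetch_all(usernames, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(username) for each user, pausing between calls.

    `fetch` is a placeholder for whatever retrieves one profile.
    `sleep` can be swapped out so the loop is testable without waiting.
    """
    results = {}
    for i, name in enumerate(usernames):
        if i > 0:
            sleep(delay)  # wait before every request except the first
        results[name] = fetch(name)
    return results
```

With a few thousand voters and a delay of a second or so per request, a single run easily takes the better part of an hour, which is why the results trickle in.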


Update 1: realised some users were showing up twice. Removed them, recalculated, re-linked data.
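Removing the repeats is simple enough. A minimal sketch, assuming the duplicates are the same username listed more than once; since Twitter usernames are case-insensitive, it compares them in lowercase. The function name is mine, for illustration only.

```python
def dedupe(usernames):
    """Return usernames with repeats removed, keeping first-seen order."""
    seen = set()
    unique = []
    for name in usernames:
        key = name.lower()  # treat 'Bob' and 'bob' as the same account
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique
```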


Update 2: @DrRachie's data is available! See the following.


OK, here is a breakdown of where @DrRachie's votes came from:


  • deleted accounts: 113 (6.50%)
  • accounts with 1 tweet: 41 (2.35%)
  • accounts with 2 tweets: 47 (2.70%)
  • other accounts: 1542 (88.42%)
  • total: 1744


Again there is a discrepancy, this time of a single user, Superpositional.


Just as I did for @mercola, I checked random accounts with 2 tweets. They all broke the rule. These accounts contained only tweets voting in the shorty awards.


My personal opinion is that 88% of votes for @DrRachie are valid, a percentage much higher than @mercola's.
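To make the comparison concrete, here is the arithmetic behind those percentages, using the "other accounts" and total figures from the two breakdowns. It treats only the "other accounts" category as valid, which is my own judgment call rather than an official rule.

```python
# Figures taken from the two breakdowns in this post.
total_votes = {"@mercola": 2885, "@DrRachie": 1744}
other_votes = {"@mercola": 1838, "@DrRachie": 1542}


def valid_share(user):
    """Percentage of a user's votes that fall in the 'other accounts' category."""
    return 100 * other_votes[user] / total_votes[user]


def lead(counts):
    """How many votes @mercola is ahead of @DrRachie for the given counts."""
    return counts["@mercola"] - counts["@DrRachie"]


if __name__ == "__main__":
    for user in total_votes:
        print(f"{user}: {valid_share(user):.0f}% valid")
    print("raw lead:", lead(total_votes))       # 1141
    print("adjusted lead:", lead(other_votes))  # 296
```

So the lead shrinks from over a thousand votes to roughly 300 once the suspect accounts are excluded.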


Again, the data is available for your own analysis.


What should be done about this, I hear you ask. Personally I am happy if @mercola and @DrRachie both have their vote count adjusted accordingly.


Update 3: I am running the same analysis for 1st and 2nd place for #music, to see if the same pattern holds. Those results will be in a new post.


Update 4: I should point out that I am aware both @mercola and @DrRachie received votes in multiple categories. But since the majority of votes are in #health, I feel it would be Too Much Effort to separate the votes out. Though if enough people complain, I will fix it.


Update 5: Part 2 has been posted. It explores the question of whether 64% valid votes is the exception or the rule.


Cheers,
Steve



1 comment:

  1. Agreed, as long as the votes are adjusted, it's all good. Besides, this is still just nominations.. top 5 in the category advance to the final phase. At that point everyone votes again, but the votes aren't really meaningful and effectively can be vetoed by the Shorty Awards staff who will go through the tweets and websites of the nominees and decide a winner. What criteria they use, who knows.. I hope DrRachie wins though. :)
