The 2012 Obama campaign: Big Data Analysis and Gamification?
March 18, 2012
As the Republican presidential candidates continue to duke it out in contentious primary elections around the country, I’ve started to notice the increasingly public signs that the Obama campaign is gearing up for battle. Not surprisingly, I tend to focus on the Obama re-election team’s uses of digital technologies, where a number of shifts may result in important changes for both the voter-facing and internal components of Obama for America’s (OFA) digital operations. I started writing this post with the intent of reviewing some of the recent news coverage of the campaign, but it turned into a bit more of a long-form reflection about what the campaign’s approach to digital tools might mean for democracy.
OFA 2.0: Bigger, Faster, & Stronger (Data)
A fair amount of media coverage has suggested that the major technology-driven innovations within OFA and the Democratic party this election cycle are likely to consist of refined collection and analysis of vast troves of voter data, as opposed to the highly visible social media tools (such as My.BarackObama.com) that made headlines in 2008.
As Daniel Kreiss & Phil Howard elaborated a few years ago, database centralization and integration became core strategic initiatives for the Democratic National Committee after the 2000 election and for the Obama campaign in 2008. These efforts have been expanded in big ways during the build-up to the current campaign cycle.
According to the bulk of the (often quite breathless) reporting on the semi-secretive activities of the 2012 Obama campaign, the biggest and newest initiatives represent novel applications of the big data repositories gathered by the campaign and its allies in previous years. These include the imaginatively named “Project Narwhal” aimed at correlating diverse dimensions of citizens’ behavior with their voting, donation, and volunteering records. There is also “Project Dreamcatcher,” an attempt to harness large-scale text analytics to facilitate micro-targeted voter outreach and engagement.
For a vivid example of what these projects mean (especially if you’re on any of the Obama campaign email lists), check out ProPublica’s recent coverage comparing the text of different versions of the same fundraising email distributed by the campaign two weeks ago (the narrative is here and the actual data and analysis are here).
(Side note: in general, Sasha Issenberg’s coverage of these and related aspects of the campaigns for Slate is great.)
What’s Next: “Gamified” and Quasi-open Campaign App Development
As the Republicans sort out who will face Obama in November, OFA will, of course, roll out more social media content and tools. In this regard, last week’s release of the heavily hyped “The Road We’ve Traveled” on YouTube was only the beginning of the campaign’s more public-facing phase.
The polished, professional video suggests that OFA will build on all of the social media presence and experience it built during the last cycle as well as over the intervening years of Obama’s administration.
Less visible and less certain is whether any truly new social media tools or techniques will emerge from the campaign or its allies. Here, there are two recent initiatives that I think we might be talking about more over the course of the next six months.
The first of these started late last year, when OFA experimented with a relatively unpublicized initiative called “G.O.P. Debate Watch.” Aptly characterized by Jonathan Easley in The Hill as a “drinking game style fundraiser,” the idea was that donors committed to give money every time a Republican candidate uttered particular politicized keywords identified ahead of time (e.g. “Obamacare” or “Socialist”).
In its attempt to combine entertainment and a little bit of humor with small-scale fundraising, G.O.P. Debate Watch fits with a number of OFA’s other techniques aimed at using digital initiatives to lower the barriers to participation and engagement. At the same time, it incorporates much more explicit game-dynamics, setting it apart from earlier efforts and exemplifying the wider trend towards commercial gamification.
The second initiative, which only recently became public knowledge, has just begun with OFA opening a Technology Field Office in San Francisco last week.
The really unusual thing about the SF office is that it appears as though the campaign will use it primarily to try to organize and harness the efforts of volunteers who possess computer programming skills. This sort of coordinated, quasi-open tool-building effort is completely unprecedented, especially within OFA, which has historically pursued a secretive and closed model of innovation and internal technology development.
If the S.F. technology field office results in even one or two moderately successful projects – I imagine there will be a variety of mobile apps, games, and related tools that it will release between now and November – it may give rise to a wave of similar semi-open innovation efforts and facilitate an even closer set of connections between Silicon Valley firms and OFA.
Is This What Digital Democracy Looks Like?
I believe that the applications of commercial data-mining tools and gamification techniques to political campaigns have contradictory implications for democracy.
On the one hand, big data and social games represent the latest and greatest tools available for campaigns to use to try to engage citizens and get them actively involved in elections. Given the generally inattentive and fragmented state of the American electorate, part of me therefore believes that these efforts ultimately serve a valuable civic purpose and may, over the long haul, help to create a vital and digitally-enhanced civic sphere in this country.
At the same time, it is difficult to see how the OFA initiatives I have discussed here (and others occurring elsewhere across the U.S. political spectrum) advance equally important goals such as promoting cross-ideological dialogue, deliberative democracy, voter privacy, political accountability, or electoral transparency. (Along related lines, Dan Kreiss has blogged his thoughts about the 2012 Obama campaign and its embodiment of a certain vision of “the technological sublime.”)
All the database centralization, data mining, and gamified platforms for citizen engagement in the world will not make a dysfunctional democratic government any more accountable to its citizens, erase broken aspects of the electoral system, or generate a more deeply democratic and representative networked public sphere. Indeed, these techniques have generally been used to grow the bottom line of private companies with little or no concern for whether or not any broader public goods are created or distributed. Voters, pundits, President Obama, and the members of his campaign staff would all do well to keep that in mind no matter what happens this Fall.
Truth and conferences
March 11, 2012

Craig Newmark (with an assist from the Colbert-head-on-a-stick puppet) shares his feelings about what he'd like to tell people who use the Internet to spread nefarious lies and misinformation.
It’s been a busy week. I spent two days of it attending the Truthiness and Digital Media symposium co-hosted by the Berkman Center and the MIT Center for Civic Media. As evidenced by the heart-warming picture above, the event featured an all-star crowd of folks engaged in media policy, research, and advocacy. Day 1 was a pretty straight-ahead conference format in a large classroom at Harvard Law School, followed on day 2 by a Hackathon at the MIT Media Lab. To learn more about the event, check out the event website, read the twitter hashtag archive, and follow the blog posts (which, I believe, will continue to be published over the next week or so).
In the course of the festivities, I re-learned an important, personal truth about conferences: I like them more when they involve a concrete task or goal. In this sense, I found the hackathon day much more satisfying than the straight-ahead conference day. It was great to break into a small team with a bunch of smart people and work on achieving something together – in the case of the group I worked with, we wanted to design an experiment to test the effects of digital (mis)information campaigns on advocacy organizations’ abilities to mobilize their membership. I don’t think we’ll ever pursue the project we designed, but it was a fantastic opportunity to tackle a problem I actually want to study and to learn from the experiences and questions of my group-mates (one of whom already had a lot of experience with this kind of research design).
The moral of the story for me is that I want to use more hackathons, sprints, and the like in the context of my future research. It is also an excellent reminder that I want to do some reading about programmers’ workflow strategies more generally. I already use a few programmer tools and tactics in my research workflow (emacs, org-mode, git, gobby, R), but the workflow itself remains a kludge of terrible habits, half-fixes, and half-baked suppositions about the conditions that optimize my putative productivity.
Matt Salganik and Karen Levy (both of the Princeton Sociology Department) recently released a working paper about what they call “Wiki Surveys” that raises several important points regarding the limitations of traditional survey research and the potential of participatory online information aggregation systems to transform the way we think about public opinion research more broadly.
Their core insight stems from the idea that traditional survey research based on probability sampling leaves a ton of potentially valuable information on the table. This graph summarizes that idea in an extraordinarily elegant (I would say brilliant) way:

Figure 1 from Salganik and Levy (2012), which they title: "a schematic rank order plot of contributions to successful information aggregation systems on the Web."
Think of the plot as existing within the space of all possible opinion data on a particular issue (or set of issues). No method exists for collecting all the data from all of the people whose opinions are represented by that space, so the best you – or any researcher – can do is find a way to collect a meaningful subset of that data that will allow you to estimate some characteristics of the space.
The area under the curve thus represents the total amount of information that you could possibly collect with a hypothetical survey instrument distributed to a hypothetical population (or sample) of respondents.
Traditional surveys based on probability sampling techniques restrict their analysis to the subset of data from respondents for whom they can collect complete answers to a pre-defined subset of closed-ended questions (represented here by the small white rectangle in the bottom left corner of the plot). This approach loses at least two kinds of information:
- the additional data that some respondents would be happy to provide if researchers asked them additional questions or left questions open-ended (the fat “head” under the upper part of the curve above the white rectangle);
- the partial data that some respondents would provide if researchers had a meaningful way of utilizing incomplete responses, which are usually thrown out or, at best, used to estimate whether attrition from the study was random or not (this is the long “tail” under the part of the curve to the right of the white rectangle).
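To make the two kinds of loss concrete, here is a rough Python sketch. Everything in it is my own invention for illustration: the contribution counts, the function name `captured_share`, and the cutoff `k` do not come from Salganik and Levy. It treats each respondent as willing to answer some number of questions, and treats a traditional survey as keeping exactly `k` answers from each respondent who completes all `k` questions while discarding everyone else.

```python
# Hypothetical illustration only: these numbers are made up, not taken from the paper.

def captured_share(contributions, k):
    """Share of all available answers kept by a survey that asks exactly k
    closed-ended questions and throws out incomplete responses."""
    total = sum(contributions)
    # Respondents willing to answer at least k questions count as "complete".
    complete = [c for c in contributions if c >= k]
    # The survey keeps only k answers from each complete respondent.
    kept = k * len(complete)
    return kept / total

# Rank-ordered willingness to answer: a few heavy contributors, then a long tail.
contributions = [100, 40, 20, 10, 10, 5, 5, 2, 1, 1]
k = 5

share = captured_share(contributions, k)
# Answers beyond the k-th from complete respondents: the "head" of the curve.
head_lost = sum(c - k for c in contributions if c >= k)
# Answers from respondents who did not finish all k questions: the "tail".
tail_lost = sum(c for c in contributions if c < k)

print(f"share captured: {share:.1%}")
print(f"answers lost in the head: {head_lost}")
print(f"answers lost in the tail: {tail_lost}")
```

With a skewed distribution like this made-up one, most of what gets left behind sits in the head rather than the tail, but that balance depends entirely on the numbers you assume.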
Salganik and Levy go on to argue that many wiki-like systems and other sorts of “open” online aggregation platforms that do not filter contributions before incorporating them into some larger information pool illustrate ways in which researchers could capture a larger proportion of the data under the curve. They then elaborate some statistical techniques for estimating public opinion from the subset of information under the curve and detail their experiences applying these techniques in collaboration with two organizations (the New York City Mayor’s Office and the Organization for Economic Cooperation and Development, or OECD).
If you’re not familiar with matrix algebra and Bayesian inference, the statistical part of the paper probably won’t make much sense, but I encourage anyone interested in collective intelligence, surveys, public opinion, online information systems, or social science research methods to read the paper anyway.
Overall, I think Salganik and Levy have taken an incredibly creative approach to a very deeply entrenched set of analytical problems that most social scientists studying public opinion would simply prefer to ignore! As a result, I hope their work finds a wide and receptive audience.
Yet another $0.02 on SOPA/PIPA
January 23, 2012
I don’t have a lot to add to the excellent overviews and insightful commentary on the SOPA/PIPA debacle, but I thought I would round up a couple of thoughts as well as some of my favorite posts related to it.
SOPA and PIPA may be history for now, but you can be sure that they’ll be back in some form or another. As a result, the big question that interests me about this particular policy fight has to do with its implications for the distribution of political power around knowledge and technology policy.
The big story in this sense is that a quite substantial sub-population of the Internet’s most active users and most powerful organizations decided to black out their sites on Wednesday. The blackout left Reddit, Google, Wikipedia, Craigslist, AND MORE at least partially disabled for the better part of the day.
This more popular activism has been matched by aggressive lobbying, testifying, wheeling & dealing on the Hill by a staggering coalition of Silicon Valley companies.
Most of these companies, along with these large online collectives and communities, have only begun to find their political voices. Moments like these, when groups coalesce around particular common causes and realize that they wield immense collective power, can sometimes look really important after the fact when (say, twenty years from now) we’re living in a world where the MPAA and RIAA have continued to waste away and the bottom lines (and political arms) of the Googles, Facebooks, and Twitters of the world are doing even more heavy lifting in terms of national GDP and policy impact.
Will this be such a turning point? I think one of the biggest obstacles to long term transformation is the anti-political ideology that prevails among many Silicon Valley elites. By and large, many Silicon Valley companies would prefer to avoid public scrutiny, and would rather that outsiders not even understand what it is they are trying to create (much less regulate it or use it effectively). This is an unfortunate reality because it means that it will take a very long time for the Valley to really catch Hollywood when it comes to political muscle.
There has also been very little overlap, and there have been few effective attempts by Silicon Valley to harness the public opposition to Hollywood’s positions. Maybe the SOPA/PIPA experience can facilitate some organizational alliances and capacity building to fill that gap.
Read what other Berkman Center affiliates had to say about SOPA/PIPA this week.
In which the Internet out-twits me (again)
November 27, 2011
A little while ago, I thought that maybe I had an original idea. As usual, the Internet proved me wrong. This is almost certainly a common experience and may even be generalizable.
My idea was straightforward: having stumbled across a copy of Luc Sante’s NYRB Classics translation of Felix Feneon’s Novels in Three Lines, I believed that the book would make for a good twitter feed or something like that. To illustrate just one of the many reasons why Feneon and the Internet might get along, here is a portrait of Feneon painted by Paul Signac in 1890:
If you’re not familiar with Feneon, the French Wikipedia entry on him is a helpful start (and if, like me, you don’t really read French, the Google Translate version of the page is your friend). Other good places to read more about him and his work are this book review by Julian Barnes and this blogpost (authored by one “Mr. Whiskets”). Basically, Feneon was a Parisian anarchist, literature buff, translator, art critic, and journalist during the late 19th and early 20th centuries. Feneon’s “novels” were actually short “pointillist” reports of current events printed in the newspaper Le Matin. The texts are often funny, violent and ironic, maybe best characterized as a cross between the late, great “bus plunges off a cliff” NYT stories of yore; a poetic police blotter; and “metropolitan diaries” (sans smug New Yorker ‘tude). Here is another, somewhat more serious portrait of Feneon at work:
Never mind all that, though. In fact, anything of substance about Feneon is beside the point here. A few quick searches revealed that I was late to the Feneon party. There exist no fewer than thirteen (!) Feneon twitter accounts and two Feneon-themed tumblr’s. As you would expect, some of this nouveau-Feneon content is good and some of it is crap. However, the point stands that long before the idea was even a glimmer in my eye, the Internet had already found Feneon and had reproduced, translated, imitated, and remixed him.
All of this suggests something like a Feneon Principle, or at least a Feneon Corollary to Rule 34 (H/T to Mako for that one).
Somebody on the Internet has already tried to turn anything you can think of into a meme.
Falsifiable hypotheses and empirical evidence to follow…
Calling Bullsh*t on the Facebook Governance Vote
April 24, 2009
Well, Facebook users’ votes on the proposed Terms and Conditions are in – all 650,000 of them – and the company is pleased to report that 75% of the voters approved!
Hang on a moment, though – they only got 650,000 votes? I thought they wanted 30% of the Facebook user population to participate…
Since Facebook claims over 200,000,000 users – 650K is less than one third of one percent. Thirty percent would have been 60 million votes, not a measly 650 thousand.
That’s as if the United States held a national vote to reform the constitution and only the state of Montana voted…And then somebody described the election as a success.
In fact, since only 75% (or roughly 487,500) of the 650,000 voters actually approved of the new T and C, it is more accurate to say that less than one quarter of one percent of the Facebook population supports this proposal.
So the equivalent in a U.S. election would be if the entire population of Memphis, Tennessee voted in favor of amending the constitution; the population of Spokane, Washington voted against the amendment; and the rest of the country just sat it out on the sidelines.
Since Facebook spokes-persons seem to indicate that the company intends to accept this vote as a sufficient mandate for adopting the new T and C, they are turning my snarky twilight zone scenario into a reality.
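For anyone who wants to check the turnout arithmetic above, here is a minimal Python sketch. It assumes a round 200,000,000 users (Facebook only claims "over" that many), exactly 650,000 votes, and exactly 75% approval; the variable names are mine.

```python
# Back-of-the-envelope check of the vote arithmetic.
# Assumed inputs: round figures as reported, not exact counts.
users = 200_000_000      # Facebook's claimed user base (a lower bound)
votes = 650_000          # total votes cast
approval_pct = 75        # percent of voters approving the new terms
quorum_pct = 30          # participation threshold Facebook said it wanted

# Integer arithmetic keeps the vote counts exact.
quorum_votes = users * quorum_pct // 100
approving = votes * approval_pct // 100

turnout = votes / users
approving_share = approving / users

print(f"votes needed for a 30% quorum: {quorum_votes:,}")
print(f"votes in favor: {approving:,}")
print(f"turnout: {turnout:.3%}")
print(f"share of all users in favor: {approving_share:.3%}")
```

Under these assumptions, 75% of 650,000 comes to 487,500 approving votes, turnout stays under one third of one percent, and the approving share of all users stays under one quarter of one percent. Since the real user count was higher than 200 million, the true percentages would be smaller still.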
Here’s Facebook’s chart of the results (as re-published on the LA Times’ Technology blog):
…and here’s my chart of the same results (sorry for the fuzzy image – feel free to take 5 minutes and bake your own if you want a better one):
Such a woeful mockery would be even funnier if it weren’t so sad. Go, go, gadget, democracy!
I draw two conclusions:
1. Facebook has been hoist by its own petard and it probably deserves whatever it gets. This was a well-intentioned – but nevertheless naive – stunt from the beginning. It’s unfortunate that nobody at FB saw fit to back up all the rhetoric of user-generated revolution with a more meaningful participatory process.
2. Legitimate democracy is really, really hard. It doesn’t matter if it’s online or not. It’s not as simple as just holding a vote and hoping everyone will show up. It’s also not as simple as saying that the Facebook users were irresponsible because they didn’t show up. You have to build a culture of democracy in order to support democratic institutions like elections. That doesn’t happen overnight and it may be that a population like the users of Facebook isn’t sufficiently organized or engaged to begin that process.

The SF Chron reports that Gavin Newsom loudly declared his candidacy for governor via social media services like Facebook and Twitter yesterday.
In his speech Newsom promised “to spin CA to the future.”
Ugh.
Welcome to the sad reality of post-Obama politics in the U.S., where every candidate will succumb to the temptation to imitate the form and style of the OFA campaign without capturing the substance.
If Newsom is any indication, many of these candidates will fall flat on their faces – repeatedly – in the process.
Somebody get this man a new speech writer.
(updated: April 24, 2009 )

New Pew Survey: The Internet in Campaign 2008
April 16, 2009
The crazy-productive folks at Pew’s Internet and American Life project have a new survey published looking at The Internet’s Role in Campaign 2008.
There are a lot of fun results to mine for anybody interested in political news consumption, participation and engagement via the Internet. I still need to read it more closely, but some of my favorite sound-bites so far:
- ~20% of those surveyed posted political commentary or content online
- ~20% of those surveyed reported seeking news sources that challenged their point of view
- A handy chart comparing where self-identified democrats and republicans get their online news. Statistically significant differences are marked with a “^” (Hint: look at CNN, Fox, Radio, and the Internet). Caveat: see my methodological comments below before interpreting this too deeply.
- This staggering time-series graph illustrating the decline of newspapers as a primary source of political news over the past 10 years or so (respondents were only allowed to mention their top two sources of news)
On a methodological note, it’s interesting that the surveyors chose to conduct the survey via land-line telephones only.
Some of you might recall that Pew also published some really interesting data in the middle of the campaign season suggesting that cell-only voters are disproportionately young, democratic, and Internet users.
Despite the fact that the surveyors weighted their results to try to reflect the demographics of telephone users in the U.S. as a whole, I take that to imply that the numbers in this latest survey should provide a conservative estimate of the total Internet use in the population as a whole. At the same time, I think it undermines some of the comparisons between democratic and republican voters based on the land-line only data.
Google, Embeddedness, and De-facto Censorship
December 17, 2008
Chris Soghoian describes how he bumped up against Google’s questionable AdSense trademark enforcement policies.
Soghoian’s story is troubling and it exposes yet another way in which the structure of web traffic has positioned Google as a de-facto arbiter of all kinds of legal speech, political salience, and good taste. More broadly, it demonstrates how key actors and institutions exercise influence in the networked public sphere.
For more on that idea, check out Matthew Hindman’s research. In his new book, The Myth of Digital Democracy, Hindman makes a related argument in a number of different ways, not the least of which is his compelling notion of “Googlearchy.” I disagree with Matt on a number of substantive points, but the significance of his analysis is undeniable. His work complements more established models for thinking about how social structure circumscribes certain kinds of thought and action.
One of the fascinating aspects of the Internet is that powerful forms of social order & status originate in seemingly innocuous expressions of aggregated opinions (e.g. the PageRank algorithm). Hindman’s work takes on the notion that such aggregated opinions are somehow equivalent to a utopian radical democracy or a free market of ideas.
In this sense, his argument parallels the work of economic sociologists, many of whom have analyzed the importance of the “embeddedness” of economic markets. Simply put, the thesis behind the concept of embeddedness is that the sorts of decentralized, disaggregated behaviors that occur in market-like settings are always an extension of the social and cultural contexts in which they occur. It’s a relatively simple idea, but it violates one of the core assumptions of neo-classical economic theory: that markets are a free and accurate expression of individual actors expressing rational preferences for the enhancement of their own wealth and welfare.
Sociologists such as Viviana Zelizer have shown how the economists’ assumptions break down in markets for deeply valued cultural goods such as intimacy and adoption. More recently, a number of scholars (including Marion Fourcade – a professor of mine at Berkeley) have taken up the idea that financial markets are also expressions of (economists’) cultural preferences and not merely an aggregated form of pure rationality.
Considering Hindman’s work and the continuing emergence of experiences like Soghoian’s, I think there’s a case to be made that research on the embeddedness of search technology might be a promising topic. Granted, I don’t know if there are many “neo-classical” information theorists out there that would be willing to defend the straw-man position that search technology serves up knowledge in a pure and rational form.
The Change.gov question tool: web two-point-boring
December 16, 2008
President-Elect Obama’s team at Change.gov has posted their first batch of replies to a few of the most popular inquiries submitted via the ballyhooed question tool since it went live last week.
If you really want to read the responses, go ahead, knock yourself out. They’re just like the comment threads at DailyKos/LGF/Wonkette/BoingBoing/Lifehacker except they’re completely dry, soul-less, and snark-free.
Take this stirring exchange, for example:
Q: “What will you do to establish transparency and safeguards against waste with the rest of the Wall Street bailout money?” Diane, New Jersey
A: President-elect Barack Obama does not believe an economic crisis is an excuse for wasteful and unnecessary spending. As our economic teams works with congressional leadership to put together a plan, we will put in place reforms to ensure that your money in invested well. We will also bring Americans back into government by amending executive orders to ensure that communications about regulatory policymaking between persons outside government and all White House staff are disclosed to the public. In addition all appointees who lead the executive branch departments and rulemaking agencies will be required to conduct the significant business of the agency in public so that every citizen can see in person or watch on the Internet these debates.
Can you even remember the question after all that opacity? Turns out the White House press corps might not be out of a job after all.
Be honest, though, who’s actually surprised that the Obama team is sticking to their script and refusing to engage in precisely the sort of off-the-cuff banter that makes conversations on the Internet interesting? I was at an event with a few members of their new media team last week and these folks are at least as disciplined as a Bill Belichick offense.
For all the hoopla about the many wonderful ways in which Change.gov might transform the relationship between the POTUS and the rest of us, it’s going to take more than a few Rick-rolls before somebody mistakes this site for 4chan. (Dear Mr. President Elect, I video taped myself asking you a very important question: please watch it here!)
That said, the first idiot who celebrates the fact that someone in the transition team took the time to answer the question about legalizing pot ought to have their head examined. It may be Democracy in motion, but only in the Jeffersonian “mob rule” sense. I’m not one to romanticize the high-flown days of the republic of media gatekeeping, but this is just campaigning by other means and a waste of everybody’s time.
It may not matter, though, because unless the Change.gov team loosens up a bit (and opens the door to the risk of a mini scandal or two), I suspect people will quickly forget about this site after the inauguration.






