Matt Salganik and Karen Levy (both of the Princeton Sociology Department) recently released a working paper about what they call “Wiki Surveys” that raises several important points regarding the limitations of traditional survey research and the potential of participatory online information aggregation systems to transform the way we think about public opinion research more broadly.
Their core insight stems from the idea that traditional survey research based on probability sampling leaves a ton of potentially valuable information on the table. This graph summarizes that idea in an extraordinarily elegant (I would say brilliant) way:
Think of the plot as existing within the space of all possible opinion data on a particular issue (or set of issues). No method exists for collecting all the data from all of the people whose opinions are represented by that space, so the best you – or any researcher – can do is find a way to collect a meaningful subset of that data that will allow you to estimate some characteristics of the space.
The area under the curve thus represents the total amount of information that you could possibly collect with a hypothetical survey instrument distributed to a hypothetical population (or sample) of respondents.
Traditional surveys based on probability sampling techniques restrict their analysis to the subset of data from respondents for whom they can collect complete answers to a pre-defined subset of closed-ended questions (represented here by the small white rectangle in the bottom left corner of the plot). This approach loses at least two kinds of information:
- the additional data that some respondents would be happy to provide if researchers asked them additional questions or left questions open-ended (the fat “head” under the upper part of the curve above the white rectangle);
- the partial data that some respondents would provide if researchers had a meaningful way of utilizing incomplete responses, which are usually thrown out or, at best, used to make estimates about the characteristics of whether attrition from the study was random or not (this is the long “tail” under the part of the curve to the right of the white rectangle).
Salganik and Levy go on to argue that many wiki-like systems and other sorts of “open” online aggregation platforms that do not filter contributions before incorporating them into some larger information pool illustrate ways in which researchers could capture a larger proportion of the data under the curve. They then elaborate some statistical techniques for estimating public opinion from the subset of information under the curve and detail their experiences applying these techniques in collaboration with two organizations (the New York City Mayor’s Office and the Organization for Economic Cooperation and Development, or OECD).
If you’re not familiar with matrix algebra and Bayesian inference, the statistical part of the paper probably won’t make much sense, but I encourage anyone interested in collective intelligence, surveys, public opinion, online information systems, or social science research methods to read the paper anyway.
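To give a concrete (and deliberately simplified) feel for why partial responses become usable in a pairwise wiki survey, here is a toy sketch of my own, not the authors’ actual model. Each vote is a single pairwise comparison, so even a respondent who answers one question contributes data; a Beta(1, 1) prior shrinks items with few votes toward 0.5. The item names and vote data below are made up for illustration.

```python
from collections import defaultdict

def score_items(votes, prior_wins=1.0, prior_losses=1.0):
    """Estimate each item's win rate from pairwise votes.

    votes: list of (winner, loser) pairs, contributed by any number of
    respondents who each answered as many or as few questions as they liked.
    The prior acts like pseudo-counts, so sparsely voted items shrink
    toward 0.5 instead of swinging to 0 or 1.
    """
    wins = defaultdict(float)
    losses = defaultdict(float)
    for winner, loser in votes:
        wins[winner] += 1
        losses[loser] += 1
    items = set(wins) | set(losses)
    return {
        item: (wins[item] + prior_wins)
        / (wins[item] + losses[item] + prior_wins + prior_losses)
        for item in items
    }

# Hypothetical votes: three "respondents" who answered different numbers
# of questions -- none of this partial data has to be thrown out.
votes = [("bike lanes", "parking"), ("bike lanes", "tax cut"),
         ("parking", "tax cut"), ("bike lanes", "parking")]
scores = score_items(votes)
```

This is far cruder than the Bayesian machinery in the paper, but it illustrates the key point: because every pairwise answer stands on its own, the long “tail” of incomplete responses contributes information instead of being discarded.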
Overall, I think Salganik and Levy have taken an incredibly creative approach to a very deeply entrenched set of analytical problems that most social scientists studying public opinion would simply prefer to ignore! As a result, I hope their work finds a wide and receptive audience.
April 10, 2009
Tony Curzon Price has a thoughtful piece at Open Democracy in which he examines what he calls The G20’s sins of commission.
I’m interested in a whole bunch of angles that Curzon Price explores, but the money shot for all you global governance and development geeks out there is a graph he recycles from Paul Swartz at the Council on Foreign Relations’ Geo-Graphics blog:
Curzon Price goes on to use the graph to make an interesting (and important) claim about the implications of China’s newfound romance with the IFIs and global regulation.
I, on the other hand, thought it would be kind of fun to play with the graph to try to get a better sense of what may have driven these changes in the IMF’s role over time. Since Swartz doesn’t share the data or source for his graphic, I’m reduced to hacking around with the .jpg in the GIMP (which made for a really fun distraction during a meeting the other day). Apologies for the resulting visual clutter, but here’s the same graph with some new knobs and bits. The bigger dots correspond to the events that accompanied the biggest shifts:
- Margaret Thatcher elected: May 1979
- Black Monday: October 1987
- Berlin Wall taken down: November 1989
- Soviet Union Collapses: December 8, 1991
- Mexican Peso crisis: Dec 1994
- Asian Financial crisis: July 1997
- Brazil devalues the Real: Jan 1999
- Dot-com bubble bursts: March 10, 2000
- September 11, 2001
- Argentine debt default: Dec 2001
- US invades Iraq: March 20, 2003
- Brazil and Argentina pay off IMF debts: Dec. 2005
- Global Recession: October 2008
Some of the things I thought might correlate with sudden changes in the global weight of IMF lending – such as Black Monday (2); the Dot-com bubble burst (8); Argentina and Brazil paying off their debts (12) – didn’t seem to matter at all.
Others – such as Thatcher’s (and Reagan’s) election (1); the Mexican Peso crisis (5); and the 1-2 combo of the Asian (6) and Brazilian (7) financial crises – appear magnified when seen through this lens.
Most intriguing to me is the long steep slide that occurs following September 11, 2001 (9). My inclination is to explain that as the result of a perfect storm that combined the eroding credibility of the IMF (Joe Stiglitz, eat your heart out!) and a real estate derivative and petro-dollar fueled explosion of private lending world-wide. No matter how you slice it, though, there’s no denying that the world financial system has gone through some exceptionally dramatic changes in the last ten years.
Other than that, I don’t have a flashy Theory of Everything to explain all the data here. Heck, as I said, I don’t even have the data. Nevertheless, it’s fun to speculate.
December 17, 2008
Chris Soghoian describes how he bumped up against Google’s questionable ad-sense trademark enforcement policies.
Soghoian’s story is troubling, and it exposes yet another way in which the structure of web traffic has positioned Google as a de facto arbiter of all kinds of legal speech, political salience, and good taste. More broadly, it demonstrates how key actors and institutions exercise influence in the networked public sphere.
For more on that idea, check out Matthew Hindman’s research. In his new book, The Myth of Digital Democracy, Hindman makes a related argument in a number of different ways, not the least of which is his compelling notion of “Googlearchy.” I disagree with Matt on a number of substantive points, but the significance of his analysis is undeniable. His work complements more established models for thinking about how social structure circumscribes certain kinds of thought and action.
One of the fascinating aspects of the Internet is that powerful forms of social order & status originate in seemingly innocuous expressions of aggregated opinions (e.g. the PageRank algorithm). Hindman’s work takes on the notion that such aggregated opinions are somehow equivalent to a utopian radical democracy or a free market of ideas.
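Since PageRank is the example of “aggregated opinions” invoked here, a minimal power-iteration sketch may help make the idea concrete. This is the textbook simplification, not Google’s production algorithm; the link graph below is invented for illustration. Each page’s score is built from the scores of the pages linking to it, which is exactly the sense in which seemingly innocuous link choices aggregate into a structure of status.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of scores that sum to 1; a link acts as a small
    "vote" whose weight depends on the voter's own score.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    out = {p: links.get(p, []) for p in pages}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline from the "random surfer".
        new = {p: (1 - damping) / n for p in pages}
        for page in pages:
            targets = out[page]
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its mass evenly over all pages.
                for t in pages:
                    new[t] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical three-page web: "c" is linked to by both "a" and "b",
# so it accumulates the most status.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

Even in this toy graph, the ordering of scores is not a neutral tally of pages: it reflects who links to whom, which is precisely the social structure Hindman’s “Googlearchy” argument takes seriously.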
In this sense, his argument parallels the work of economic sociologists, many of whom have analyzed the importance of the “embeddedness” of economic markets. Simply put, the thesis behind the concept of embeddedness is that the sorts of decentralized, disaggregated behaviors that occur in market-like settings are always an extension of the social and cultural contexts in which they occur. It’s a relatively simple idea, but it violates one of the core assumptions of neo-classical economic theory: that markets are a free and accurate expression of individual actors expressing rational preferences for the enhancement of their own wealth and welfare.
Sociologists such as Viviana Zelizer have shown how the economists’ assumptions break down in markets for deeply valued cultural goods such as intimacy and adoption. More recently, a number of scholars (including Marion Fourcade – a professor of mine at Berkeley) have taken up the idea that financial markets are also expressions of (economists’) cultural preferences and not merely an aggregated form of pure rationality.
Considering Hindman’s work and the continuing emergence of experiences like Soghoian’s, I think there’s a case to be made that research on the embeddedness of search technology might be a promising topic. Granted, I don’t know if there are many “neo-classical” information theorists out there who would be willing to defend the straw-man position that search technology serves up knowledge in a pure and rational form.
October 23, 2008
My trusty RSS feeds have turned up two interesting recent posts on the subject of the Obama campaign and its implications for the future of governance in a networked society.
First, David Lazer, professor at Harvard’s Kennedy School and Director of the Program on Networked Governance, asks some big questions (emphasis added):
The lights are not going off on this operation. If Obama loses, the network provides him an instant infrastructure to run again. The more intriguing question to me, as a student of politics, is what happens if, as seems likely right now, he wins. There are inter-related political and strategic questions. On the political side, the question is how Obama might use the apparatus to help him govern. Does he directly appeal to his e-mail list to support his policy objectives? There are, on average, about four thousand politically active Obama supporters in each Congressional district–that could be a lot of letters to Members.
And a few lines down:
…On the strategic side, the question is to what extent does the apparatus continue to evolve to allow grassroots involvement, and to what extent does stuff flow up as well as down? In the long run, the only way that there will be some stickiness to the structure is if the people who have been involved can mobilize for local action, can connect to each other, and feel that their voices matter.
Meanwhile, Joshua-Michele Ross at O’Reilly interviews Jascha Franklin-Hodge (founder and CTO of Blue State Digital, or BSD), who offers some partial answers to many of the same questions.
I recommend reading the whole post (and watching the videos, if you’re more of a visual person or whatever), but here’s the bullet-point version of Ross’s claims if you absolutely insist (emphasis removed from the original):
- Online U.S. political communities will morph from a campaign fundraising role to a governing role.
- Rather than one centrally governed behemoth, MyBO is enabling a thousand small campaigns to flourish…This kind of swarm politics has generated enormous amounts of energy (and money) from ordinary citizens.
- Technology (infrastructure and know-how) will become a necessary core competence in all U.S. political campaigns…Campaigns that maintain or are able to tap into a continuity of software, infrastructure and human capital will have serious advantage.
- When lobbyist data, earmark data etc. is available in standard formats it will be a great leap forward for more transparency in government.
Responses 1-3 are in varying stages of already being true. Number 4, on the other hand, has a long way to go (although the folks at the Sunlight Foundation are plugging away on that front).
Whether or not Franklin-Hodge’s vision of digital democracy comes to fruition, the devil will be in the details. An underlying concern voiced by Lazer is how the nodes (citizens and groups) at the edges of U.S. politics might use digital networks to enhance traditional mechanisms of representation (politicians and political parties). I would build off this insight to ask both authors whether they think the architecture of the network and the technologies that run it will also play an important role in determining the fate of networked democracy. If so, how do we design networks to facilitate democratic practice?
As a number of folks have argued, the choice of particular platforms and standards will enable certain forms of civic engagement while foreclosing or devaluing others. Furthermore, just because voters could gain access to the same kinds of technologies doesn’t mean they’ll use them equally effectively or even in the same ways (check out Eszter Hargittai’s research on skillful Internet use if you want some really sobering examples).
All of this is to say that the prospect of a networked polis (like a networked public sphere) presents a number of problems and challenges that few (if any) societies have been able to resolve with earlier communications technologies or institutional formations. In the ancient Greek version of the polis, a narrow class of citizens (land-owning men of means) had the ability and the right to participate. While contemporary democracies have become more populist and inclusive, the reality is that the playing field remains wildly uneven in favor of the wealthy, the well-educated, and the well-connected.
If the future imagined by Franklin-Hodge, Lazer, and others indeed comes to pass, all the fiber optic cable in the world will not make the democratization of effective citizenship any less of an uphill battle.
August 25, 2008
A second large international survey has found that Danes are the world’s happiest people. The Der Spiegel article linked to above offers a few explanations, including a quasi-structuralist class analysis:
The strong social safety nets that cradle Danish citizens from birth until death are welcoming to foreigners, too. Kate Vial, a 55-year-old American expat who has lived and worked in Denmark for more than 30 years, passed up opportunities over the years to return to the U.S., choosing instead to raise her three children in Denmark. Vial knows she will never be rich, but says that she valued family, the ability to travel, and simple economic security above all else. “I just chose a simpler lifestyle, one where I could ride my bike all over and where I don’t have to make a great living to survive,” she says.
And a more culturalist version:
Some people attribute the prevailing attitude among Danes to something less tangible, called hygge (pronounced “hooga”). Danes say the word is difficult to translate — and to comprehend — but that it describes a cozy, convivial sentiment that involves strong family bonds. “The gist of it is that you don’t have to do anything except let go,” says Vial. “It’s a combination of relaxing, eating, drinking, partying, spending time with family.”
Gotta get me some of that hygge.
In the meantime, I’m sure a bumper crop of follow up studies will try to explain the results. Personally, I wonder what sorts of behavioral and political results stem from being happy. Are Danes more cooperative? Do they smile more? If you walk into a bar full of Danes and tell a bad joke, are they more likely to laugh?
Please share your own theories, questions, and dim-witted asides (along with any spare hygge you may have lying around the house) in the comments…
May 14, 2008
Copyright law guru William Patry takes a look at Robert Merton’s thoughts on the ownership of ideas.
The moral of the story: the notion of originality (and therefore ownership in some sense) is vastly overrated.
Patry links to a number of Merton’s essays as well as his ASA Presidential address. I’ll be following up since I’m long overdue to read a lot more Merton.
March 15, 2008
Zizek talks like he writes. Somehow his ideas emerge from amidst his frenetic gestures, thickly accented English, dirty jokes, Marxist psychoanalytic jargon, and references to pop culture. A few years ago I read his book, Iraq: The Borrowed Kettle, and came away with a distinct feeling of confused enjoyment. This evening was similar in that regard…
One of the most thought provoking claims Zizek made (or at least one of the last things he said – so it stuck in my mind) had to do with what he called “positive dogmatics.” What does he mean by that phrase? I’m not sure, but the example he used had to do with the American public discourse on torture.
In the face of the persistent post-9/11 arguments in favor of legalizing torture that have circulated among American intellectuals like Alan Dershowitz, Zizek argues that the best strategy for the left is to insist that there can be no debate on the issue. He contrasts this with the typical, politically correct, liberal response, which is to engage in a reasoned debate on the issue.
The problem with polite and reasoned debate, in this case, is that it signifies partial acceptance that the moral and juridical boundary against torture is subject to contestation. Once you open the debate, you invite open transgression. People may continue to believe that torture is not good, but they will consider it a matter of legal and personal opinion – which in American society means that it is pretty much okay.
From a political standpoint, Zizek claims that it is better to publicly refuse the debate in the first place – thereby foreclosing the issue – while privately recognizing that torture still might happen sometimes. This sounds a little ridiculous at first, but then why do I think he’s right?
Zizek’s argument hinges on an underlying claim about the political utility of customs. Customs – in the sense that they represent a code of polite fictions known to everyone in any given community – grease the wheels of society. While not “true” in a larger metaphysical sense, customs make it possible for us to coexist. Every day, we apologize without meaning it, hold doors for people we don’t care about, and smile at strangers whose views about the world we would find abhorrent (if only we knew them!). To be smart political subjects, we must learn to use these codes effectively: when it is better to smile and nod versus when it is better to give someone the finger. We must also learn when a stance of principled refusal – while dishonest in a sense – can serve a socially progressive purpose.
This is what I think Zizek has in mind when he argues that the left should claim the moral high ground in the torture debate by arguing that debate itself is not an option. This is not equivalent with not participating in the argument at all. Rather, it is vociferous denunciation of the legitimacy of the debate – which is a very powerful form of participation.
Such an inflexible moral stance is exactly the kind of position the American “left” never takes. Enamored of their ability to reason their way through an argument and out-analyze their opponents, Democratic politicians continually find themselves out-maneuvered rhetorically. John Kerry and Al Gore elevated this foible into an art form during their respective campaigns against our putative president.
Would this strategy help the Democrats look less wishy-washy in the debates about torture? What about the debates over domestic wire-tapping? It’s difficult to say for sure. What’s certain is that a similar tactic has helped the Obama campaign reap dividends in open primaries and swing states. Apparently drawn by Obama’s personal charisma and his ability to speak in terms of values and morals, centrist independents have boosted his numbers and helped him fend off Hillary’s attacks. Time and again, she has come off looking petty and scheming despite the legitimacy of some of her claims. Elections are not about truth or reason so much as they are about striking the right tone with the electorate and building a ground campaign that can reach out to undecided voters. Thus far, Obama has defeated Hillary on both fronts.
March 13, 2008
I have to preface this post with a confession: I once signed up to receive monthly newsletter emails from the Oxford English Dictionary. I still receive these emails and – unlike many of the emails that come to me from friends, family, colleagues and the communities in which I actively participate – I read the OED emails as soon as they show up in my inbox.
In my defense, the emails often provide a fascinating window onto the fast-paced world of dictionary-writing. Well, okay, maybe it’s not fast-paced. And maybe my girlfriend can’t believe how boring it is, but at least it’s fascinating to me…
This month’s edition made what I would consider a shocking announcement: for years, the OED editorial staff has revised content by proceeding alphabetically through the previous edition, adding new words and updating existing entries. The result, as is not hard to imagine, is that a lot of words that nobody knows get attention disproportionate to their usage. As of this most recent quarterly update, however, the editorial staff will begin to complement the old approach with a new method based on lexical frequency, semantic search data, and alphabetical clusters. This means that the words updated this time around look much more familiar: for example, “heaven,” “hell,” “fuck,” “computers,” “gay” and “free.”
It took a moment for me to realize the implications of this change. Although the newsletter doesn’t go into detail about the new selection process or the data on which it was based, it’s clear that the new technique suggests a radically different concept of language and information. Let me explain what I mean.
The old, alphabetical way of prioritizing updates ascribes an implicit equality to each word. This is reasonable from a bird’s eye perspective of the lexical universe: each word is, after all, equal to its peers in the sense that they all have a place in the dictionary. They are all words. However, the assumption of lexical equality falls apart quickly upon closer inspection of the way people use words in the world. Relatively few words dominate our everyday speech and writing patterns. As with many social phenomena, the frequency of word use in any given natural language follows a “Zipfian Distribution” – a.k.a. a power law. The new OED revision method attempts to take this reality into account. They will no longer leave it up to chance whether the most relevant, heavily used, and contentious words that populate everyday language undergo regular “check-ups.”
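The Zipfian pattern is easy to see for yourself. Here is a toy sketch on a made-up snippet of text (not the OED’s actual data): rank words by frequency and the top few ranks dominate the total. Real corpora show the same shape far more dramatically.

```python
from collections import Counter

def rank_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    words = text.lower().split()
    return Counter(words).most_common()

# Made-up toy text; even here, one function word swamps everything else.
text = ("the cat sat on the mat and the dog saw the cat "
        "so the dog and the cat ran to the mat")

ranked = rank_frequencies(text)
top_word, top_count = ranked[0]
total = sum(count for _, count in ranked)
```

In this tiny sample, “the” alone accounts for roughly a third of all tokens, while most words appear just once or twice. In a real corpus the drop-off from rank 1 follows an approximate power law, which is exactly why a frequency-weighted revision schedule spends editorial attention where readers actually look.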
Why am I making such a big deal out of this? Dictionaries – as well as their not-too-distant cousins, encyclopedias – reveal a lot about the way people think about themselves and the world around them. The original encyclopédie was a distinctive product of the Enlightenment. Edited by Denis Diderot and Jean le Rond d’Alembert, it was intended to be a comprehensive catalogue of the entirety of human knowledge. Around the same time (i.e. the 18th century), the idea arose to make dictionaries follow an alphabetical order. Prior to that they had been organized around discrete topics. By embracing the alphabet as an abstract system of categorization, lexicographers effectively accepted the notion that language use was not as important as a totalizing system of categorization. The shift represented an attempt to make the dictionary more encyclopedic in its scope and structure. Not surprisingly, the method of dictionary production and revision would come to follow the structure.
In adopting a revision system that takes natural language use patterns into account, the OED editors have, in one sense, recognized the limits of the encyclopedic project. While this should not suggest the end of the enlightenment or any such nonsense, it does suggest an interesting turning point in the evolution of human self-knowledge.