July 21, 2013
In a new paper, recently published in the open access journal PLOS ONE, Benjamin Mako Hill and I build on new research in survey methodology to describe a method for estimating bias in opt-in surveys of contributors to online communities. We use the technique to re-evaluate the most widely cited estimate of the gender gap in Wikipedia.
A series of studies have shown that Wikipedia’s editor-base is overwhelmingly male. This extreme gender imbalance threatens to undermine Wikipedia’s capacity to produce high quality information from a full range of perspectives. For example, many articles on topics of particular interest to women tend to be under-produced or of poor quality.
Given the open and often anonymous nature of online communities, measuring contributor demographics is a challenge. Most demographic data on Wikipedia editors come from “opt-in” surveys where people respond to open, public invitations. Unfortunately, very few people answer these invitations. Results from opt-in surveys are unreliable because respondents are rarely representative of the community as a whole. The most widely-cited estimate, from a large 2008 survey by the Wikimedia Foundation (WMF) and the United Nations University in Maastricht (UNU-MERIT), suggested that only 13% of contributors were female. However, the very same survey suggested that less than 40% of Wikipedia’s readers were female. We know, from several reliable sources, that Wikipedia’s readership is evenly split by gender — a sign of bias in the WMF/UNU-MERIT survey.
In our paper, we combine data from a nationally representative survey of the US by the Pew Internet and American Life Project with the opt-in data from the 2008 WMF/UNU-MERIT survey to come up with revised estimates of the Wikipedia gender gap. The details of the estimation technique are in the paper, but the core steps are:
- We use the Pew dataset to provide baseline information about Wikipedia readers.
- We apply a statistical technique called “propensity scoring” to estimate the likelihood that a US adult Wikipedia reader would have volunteered to participate in the WMF/UNU-MERIT survey.
- We follow a process originally developed by Valliant and Dever to weight the WMF/UNU-MERIT survey to “correct” for estimated bias.
- We extend this weighting technique to Wikipedia editors in the WMF/UNU data to produce adjusted estimates of the demographics of their sample.
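The steps above can be sketched in code. This is a minimal illustration of propensity-score weighting, not the paper’s actual estimation code: all of the data, variable names, and the single covariate (age) are hypothetical, and a real analysis would use the full set of shared survey questions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical covariates: a representative reference sample of readers
# (standing in for Pew) and an opt-in sample that skews younger.
ref_age = rng.normal(40, 12, 500)
optin_age = rng.normal(28, 8, 200)

# Pool the two samples; the outcome is membership in the opt-in sample.
X = np.concatenate([ref_age, optin_age]).reshape(-1, 1)
y = np.concatenate([np.zeros(500), np.ones(200)])

# Propensity score: estimated probability of having opted in, given covariates.
model = LogisticRegression().fit(X, y)
p = model.predict_proba(optin_age.reshape(-1, 1))[:, 1]

# Inverse-propensity weights, normalized to the opt-in sample size.
w = 1.0 / p
w *= len(optin_age) / w.sum()

# Any opt-in statistic can then be re-estimated with these weights,
# down-weighting the over-represented (here, younger) respondents.
female = rng.random(200) < 0.2  # hypothetical indicator variable
raw_estimate = female.mean()
adjusted_estimate = np.average(female, weights=w)
```

The key design choice is fitting the propensity model on the pooled reference and opt-in data, so that the weights express how over- or under-represented each opt-in respondent is relative to the reference population.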
Using this method, we estimate that the proportion of female US adult editors was 27.5% higher than the original study reported (22.7%, versus 17.8%), and that the total proportion of female editors was 26.8% higher (16.1%, versus 12.7%). These findings are consistent with other work showing that opt-in surveys tend to undercount women.
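The relative increases quoted above follow directly from the point estimates:

```python
# Relative increase of the adjusted estimates over the originals,
# using the figures reported in the paper.
us_increase = (22.7 - 17.8) / 17.8 * 100      # US adult editors
total_increase = (16.1 - 12.7) / 12.7 * 100   # all editors
```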
Overall, these results reinforce the basic substantive finding that women are vastly under-represented among Wikipedia editors.
Beyond Wikipedia, our paper describes a method that online communities can adopt to estimate contributor demographics from opt-in surveys more credibly than by relying on raw opt-in data alone. Advertising-intelligence firms like ComScore and Quantcast provide demographic data on the readership of an enormous proportion of websites. With these sources, almost any community can use our method (and source code) to conduct a similar analysis by: (1) surveying a community’s readers (or a random subset) with the same instrument used to survey contributors; (2) combining results for readers with reliable demographic data about the readership population from a credible source; (3) reweighting survey results using the method we describe.
Although our new estimates will not help us close the gender gap in Wikipedia or address its troubling implications, they give us a better picture of the problem. Additionally, our method offers an improved tool to build a clearer demographic picture of other online communities in general.
This weekend, Andrés and I attended the CrowdCamp Workshop at CHI in Austin, Texas. The workshop was structured a lot like a hackathon, with teams working to produce projects, papers, or research ideas.
The group I worked with coalesced around a proposal made by Niki Kittur, who suggested that we envision how crowdsourcing and distributed work contribute to solving grand challenges, such as economic inequality and the ongoing impact of the 2008 financial crisis.
We then spent the better part of the weekend outlining an ambitious set of scenarios and goals for the future of crowdwork.
While many moments of our conversation were energizing, the most compelling aspects derived from the group’s shared desire to imagine crowdwork and distributed online collaboration as potentially something more than the specter of alienated, de-humanized piece-work that it is frequently depicted to be.
To spur our efforts, we used a provocative thought experiment: what would it take for crowdwork to facilitate fulfilling, creative, and sustainable livelihoods for us or our (hypothetical or real) children?
Despite the limits of this framing, I think it opened up a discussion that goes beyond the established positions in debates about the ethics and efficiencies of paid crowdsourcing, distributed work, and voluntary labor online (all of which are, to some extent, encompassed under the concept of crowdwork in this case). It also helped us start imagining how we, as designers and researchers of crowdwork platforms and experiences, would go about constructing an ambitious research agenda on the scale of a massive project like the Large Hadron Collider.
If everything goes according to plan, this effort will result in at least a paper within the coming few weeks. Assuming that’s the case, our group will be sharing more details about the workshop and our vision of the future of crowdwork soon.
April 13, 2012
Zombie trade agreements: According to some documents acquired by the organization European Digital Rights (EDRi), it appears the G8 has decided to do a Dr. Frankenstein impression and reanimate some of the most thoughtless portions of ACTA’s Internet provisions. This latest instantiation of the ACTA agreement wants control over intellectual property, technology devices, network infrastructure, and YOUR BRAINS.
An awesome experiment on awards (published in PLoS ONE) by Michael Restivo and Arnout van de Rijt – both in the Sociology department at SUNY Stony Brook – shows that receiving an informal award (a barnstar) from a peer may have a positive effect on highly active Wikipedians’ contributions. The paper is only three pages long, but if you want to you can also read the Science Daily coverage of it.
Mako’s extensive account of his workflow tools is finally up on Uses This. The post is remarkable for many reasons. First of all, Mako puts more care and thought into his technology than anybody I know, so it’s great to see the logic behind his setup explained more or less in full. Secondly, I found it extra remarkable because I have been collaborating (and even living!) closely with Mako for a while now and I still learned a ton from reading the post. My favorite detail is unquestionably the bit about his typing eliciting a noise complaint while he was in college. As a rather loud typist myself, I have been subject to snark and snubbery from various quarters over the years, but I’ve never had anybody call the cops on me!
The Soviet Union lives on! But maybe not quite where you’d expect it. My friends and former Oakland neighbors Daniel Gallegos and Zhanara Nauruzbayeva have recently moved themselves and their incredible Artpologist project to New York. Upon arrival, they found themselves surrounded by a post-Soviet reality that most New Yorkers or Americans simply do not know exists at all, much less in the epicenter of finance capital. Their latest project, My American New York, chronicles this “post-Soviet America” through photos, stories, Daniel’s beautiful sketches, drawings, and paintings (e.g. the image at the top of this post), all wrapped up in a series of urban travelogues.
Philosophy Quantified: Kieran Healy has done a series of elegant and thoughtful guest posts on Leiter Reports in which he explores data from the 2004 and 2006 Philosophical Gourmet Report (PGR) surveys in an effort to generate some preliminary insights about the relationships between department status and areas of specialization.
Matt Salganik and Karen Levy (both of the Princeton Sociology Department) recently released a working paper about what they call “Wiki Surveys” that raises several important points regarding the limitations of traditional survey research and the potential of participatory online information aggregation systems to transform the way we think about public opinion research more broadly.
Their core insight stems from the idea that traditional survey research based on probability sampling leaves a ton of potentially valuable information on the table. This graph summarizes that idea in an extraordinarily elegant (I would say brilliant) way:
Think of the plot as existing within the space of all possible opinion data on a particular issue (or set of issues). No method exists for collecting all the data from all of the people whose opinions are represented by that space, so the best you – or any researcher – can do is find a way to collect a meaningful subset of that data that will allow you to estimate some characteristics of the space.
The area under the curve thus represents the total amount of information that you could possibly collect with a hypothetical survey instrument distributed to a hypothetical population (or sample) of respondents.
Traditional surveys based on probability sampling techniques restrict their analysis to the subset of data from respondents for whom they can collect complete answers to a pre-defined subset of closed-ended questions (represented here by the small white rectangle in the bottom left corner of the plot). This approach loses at least two kinds of information:
- the additional data that some respondents would be happy to provide if researchers asked them additional questions or left questions open-ended (the fat “head” under the upper part of the curve above the white rectangle);
- the partial data that some respondents would provide if researchers had a meaningful way of utilizing incomplete responses, which are usually thrown out or, at best, used only to assess whether attrition from the study was random (this is the long “tail” under the part of the curve to the right of the white rectangle).
Salganik and Levy go on to argue that many wiki-like systems and other sorts of “open” online aggregation platforms that do not filter contributions before incorporating them into some larger information pool illustrate ways in which researchers could capture a larger proportion of the data under the curve. They then elaborate some statistical techniques for estimating public opinion from the subset of information under the curve and detail their experiences applying these techniques in collaboration with two organizations (the New York City Mayor’s Office and the Organization for Economic Cooperation and Development, or OECD).
If you’re not familiar with matrix algebra and Bayesian inference, the statistical part of the paper probably won’t make much sense, but I encourage anyone interested in collective intelligence, surveys, public opinion, online information systems, or social science research methods to read the paper anyway.
Overall, I think Salganik and Levy have taken an incredibly creative approach to a very deeply entrenched set of analytical problems that most social scientists studying public opinion would simply prefer to ignore! As a result, I hope their work finds a wide and receptive audience.
February 19, 2012
Lin-sanity notwithstanding, this is a time of year when I always find myself wanting more as a sports fan in America. The memories of the Super Bowl and BCS Championship game have already started to fade; March Madness remains a long way off; pitchers and catchers have yet to report for Spring Training; and both the NBA and NHL have just passed the midpoint of their respective regular seasons. Add that it’s the middle of Winter (even an historically mild one), and these factors combine to make mid-February a less than thrilling few weeks.
Lately, I’ve partially solved my urge for non-stop sports entertainment by turning to leagues that have much less popularity and almost no visibility in mainstream U.S. media coverage.
First, during a brief trip to Brazil for a conference, I enjoyed watching some early round action in the Paulistão, or the elite soccer league of São Paulo state. With historically dominant teams like Corinthians, Santos, and Palmeiras, São Paulo boasts one of the most competitive state-level championships within Brazil and usually includes several young players who will become international superstars with household names within a few years (e.g. if you haven’t heard of Neymar yet, just be patient, the teenage phenom will likely figure prominently in the Brazilian national team’s efforts when the country hosts the World Cup in 2014).
Then, the week after I returned from Brazil, I spent a few afternoons watching the final games of the Serie del Caribe, an international tournament that wraps up the Winter leagues in the Dominican Republic, Mexico, Puerto Rico, and Venezuela. The games were tight and competitive, and they included a number of Major League players who seemed to have returned home either as triumphant stars or to hone their skills among Latin America’s most competitive leagues.
Despite the fact that you’ll never see your local ESPN network cover either of these events, both have a ton of history behind them and tremendous fan-bases (ESPN’s Brazilian and regional Latin American affiliates cover both). They are also extraordinarily competitive and played at a very high skill level.
Latin American soccer and baseball are not the only options. There are also a whole range of winter sports that never show up on U.S. television schedules until the Olympics. In other words, the only thing preventing you from watching terrific, exciting sporting events in the middle of the annual mid-Winter lull is the fact that you would probably either need to pay an inordinate sum for satellite coverage or seek out unauthorized streams on websites that serve sketchy advertisements and malware along with the game.
At the risk of making a very Ethan Zuckerman-esque point, the Internet makes it theoretically trivial to solve this problem, but that theoretical triviality only underscores a much bigger problem in the way our attention is distributed and canalized by a combination of cultural habits and incumbent media networks. In other words, maybe you’d be more likely to watch Neymar and Santos take on Palmeiras if either your local television network would carry it or if you could easily find a high quality stream broadcasting in English (I also enjoy watching these things online because I get to listen to Portuguese and Spanish language announcers). Indeed, as long as somebody is streaming a broadcast of any of these games anywhere around the world, there’s no practical reason that it isn’t possible to watch that stream anywhere else. But for a whole variety of reasons that I don’t fully understand, that just doesn’t happen yet.
My point is that American sports fans live in a media ecosystem that has not yet figured out what to do with its (long) tail. There has to be a better, less monopolistic solution than satellite and cable providers charging high rates for access to particular sports packages or leagues. This model ensures that only existing fans who are willing to pay to watch teams they already like will ever subscribe to such services, condemning these sports and teams to continued obscurity. Instead, it would be great to see some affordable way for fans to take advantage of existing Internet streams to experiment with new sports, new leagues, and new cultures by tuning into otherwise less popular or less well-known events when their hometown favorites are not in season.