An interesting article at OpenDemocracy – a new and improved attempt to define the journalistic component of what Yochai Benkler has called “the networked public sphere.”

Back at the Berkman@10 Conference after lunch and watching the Q&A following a discussion of networked cooperation between Yochai Benkler and Jimmy Wales.

Meanwhile, over on Twitter & the conference backchannel, people seem to be getting bent out of shape about the fact that Wales made a comment implying that he thought “crowdsourcing” was a bad way to think about networked collaboration. It sounded to me like what Wales meant was that the way some folks in the private sector talk about crowdsourcing (hypothetical sample quote: it’s like outsourcing to India, only cheaper…) misses the point of cooperative social action entirely.

I’m sympathetic to that idea (for many reasons). First and foremost because the possibility of distributed peer-production of knowledge (like Wikipedia) raises a number of fundamental questions about the nature of business and markets. In theory, we’ve wound up developing markets to resolve a large-scale collective action problem: the distribution of goods.

As Esther Dyson is currently clarifying in her comment, the big problem is that many folks in the private sector just don’t understand that networked collaboration offers an alternative way to build social institutions of exchange. Instead, some folks think it’s just another tool for profit-making and (in the worst instances) rent-extraction.

What if networked cooperation made it so that scarcity were not a precondition of successful (i.e. profitable) exchange? What if we could do better than markets (as we currently know them)?

That’s the promise of the commons.

Everybody and their cousin’s got a link to Duncan Watts’s recent NYTimes Magazine piece on cumulative advantage. It’s a nice bit of public sociology and an interesting application of experimental methods to understanding how social networks function.

The argument also has implications for the ongoing debates about the nature of the networked public sphere. Watts’s results suggest that there may be some merit to the position of folks like U Chicago’s Cass Sunstein, who claims that big media has historically promoted civic virtues by exposing us to ideas we wouldn’t encounter by googling alone. If, as Watts says, people’s interests are over-determined by knowledge of what is popular, then search algorithms predicated on popularity (like “PageRank”) could produce a feedback mechanism that stifles diversity in public debate. To adopt Matthew Hindman’s phrase, we’d be left with “googlearchy.”

Yochai Benkler has disagreed with both of these views for a while [full disclosure: I currently work as a research assistant for Benkler], in part because they both idealize the state of public discourse prior to the creation of the Internet. Watts’s data could just as easily be turned around to make the claim that traditional print media and television – much like the recording industry – were giving us a false impression of popularity (or importance) that only reflected the prejudices of a handful of editors and industry executives. By disseminating these perspectives widely, the big media therefore imposed an elitist politics and outlook on the public as a whole.

I’ll have to do some more thinking and reading to figure out where I fall on this issue – but for the moment, studies like Watts’s shed important light on the complex nature of social networks and reputation, and on the role of information in society.
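For what it’s worth, the popularity-feedback dynamic at the heart of this debate is easy to sketch in code. Below is a toy simulation of my own (not Watts’s actual MusicLab experiment, and all numbers are made up): people pick items in proportion to each item’s current popularity, and small early differences snowball into large gaps.

```python
import random

def simulate_popularity_feedback(n_items=50, n_people=10_000, seed=42):
    """Toy 'rich-get-richer' dynamic: each person chooses an item with
    probability proportional to its current popularity (plus one, so an
    unpopular item is never impossible to pick)."""
    random.seed(seed)
    counts = [0] * n_items
    for _ in range(n_people):
        weights = [c + 1 for c in counts]          # popularity feedback
        winner = random.choices(range(n_items), weights=weights)[0]
        counts[winner] += 1
    return sorted(counts, reverse=True)

if __name__ == "__main__":
    counts = simulate_popularity_feedback()
    top_share = sum(counts[:5]) / sum(counts)
    print(f"The top 5 of 50 identical items capture {top_share:.0%} of all choices")
```

Since every item starts out identical, whatever ends up on top got there through luck amplified by feedback – which is roughly the point Watts is making about cultural markets.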

I have to preface this post with a confession: I once signed up to receive monthly newsletter emails from the Oxford English Dictionary. I still receive these emails and – unlike many of the emails that come to me from friends, family, colleagues and the communities in which I actively participate – I read the OED emails as soon as they show up in my inbox.

In my defense, the emails often provide a fascinating window onto the fast-paced world of dictionary-writing. Well, okay, maybe it’s not fast-paced. And maybe my girlfriend can’t believe how boring it is, but at least it’s fascinating to me…

This month’s edition made what I would consider a shocking announcement: for years, the OED editorial staff has revised content by proceeding alphabetically through the previous edition, adding new words, and updating existing entries. The result, as it’s not hard to imagine, is that a lot of words that nobody knows get attention disproportionate to their usage. As of this most recent quarterly update, however, the editorial staff will begin to complement the old approach with a new method based on lexical frequency, semantic search data, and alphabetical clusters. This means that the words updated this time around look much more familiar: for example, “heaven,” “hell,” “fuck,” “computers,” “gay,” and “free.”

It took a moment for me to realize the implications of this change. Although the newsletter doesn’t go into detail about the new selection process or the data on which it was based, it’s clear that the new technique suggests a radically different concept of language and information. Let me explain what I mean.

The old, alphabetical way of prioritizing updates ascribes an implicit equality to each word. This is reasonable from a bird’s-eye perspective of the lexical universe: each word is, after all, equal to its peers in the sense that they all have a place in the dictionary. They are all words. However, the assumption of lexical equality falls apart quickly upon closer inspection of the way people use words in the world. Relatively few words dominate our everyday speech and writing patterns. As with many social phenomena, the frequency of word use in any given natural language follows a “Zipfian distribution” – a.k.a. a power law. The new OED revision method attempts to take this reality into account. The editors will no longer leave it to chance whether the most relevant, heavily used, and contentious words that populate everyday language undergo regular “check-ups.”
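As a rough back-of-the-envelope illustration (my own sketch under the textbook idealization that a word’s frequency is inversely proportional to its rank – not a claim about the OED’s actual corpus data):

```python
def zipf_top_share(vocab_size=50_000, top=100, exponent=1.0):
    """Under an idealized Zipf's law, the word of rank r occurs with
    frequency proportional to 1 / r**exponent.  Return the share of all
    word occurrences accounted for by the `top` most common words."""
    weights = [1 / r**exponent for r in range(1, vocab_size + 1)]
    return sum(weights[:top]) / sum(weights)

if __name__ == "__main__":
    # With a 50,000-word vocabulary, roughly 46% of all usage comes from
    # just the 100 most frequent words under this idealization.
    print(f"Top 100 words: {zipf_top_share():.0%} of all usage")
```

Updating the dictionary in strict alphabetical order gives none of that skew any weight; a frequency-informed revision schedule does.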

Why am I making such a big deal out of this? Dictionaries – as well as their not-too-distant cousins, encyclopedias – reveal a lot about the way people think about themselves and the world around them. The original Encyclopédie was a distinctive product of the Enlightenment. Edited by Denis Diderot and Jean le Rond d’Alembert, it was intended to be a comprehensive catalogue of the entirety of human knowledge. Around the same time (i.e. the 18th century), the idea arose to make dictionaries follow an alphabetical order; prior to that, they had been organized around discrete topics. By embracing the alphabet as an abstract system of categorization, lexicographers effectively accepted the notion that actual language use mattered less than a totalizing system of classification. The shift represented an attempt to make the dictionary more encyclopedic in its scope and structure. Not surprisingly, the method of dictionary production and revision would come to follow that structure.

In adopting a revision system that takes natural language use patterns into account, the OED editors have, in one sense, recognized the limits of the encyclopedic project. While this should not be taken as the end of the Enlightenment or any such nonsense, it does suggest an interesting turning point in the evolution of human self-knowledge.

This morning’s WaPo story on the “National Data Exchange” (N-DEx), the new DoJ system for integrating intelligence information gathered across federal, state, and local law enforcement agencies, makes for a disturbing, albeit thought-provoking, read.

First off, the fact that a system like this is going online for use by organizations that have so blatantly violated privacy and human rights in the interests of national security in the past makes me uneasy, to say the least. There is no transparency here – beyond the private contractors hired to design and maintain the system – and (thanks to post-9/11 changes foisted on our legal system) you can bet there will be minimal oversight from Congress, the courts, or civil society organizations to defend against abuses. Yikes.

The larger question here concerns the value of such a system and the approach to law enforcement it represents. In the article, we read:

“A guy that’s got a flat tire outside a nuclear facility in one location means nothing,” said Thomas E. Bush III, the FBI’s assistant director of the criminal justice information services division. “Run the guy and he’s had a flat tire outside of five nuclear facilities and you have a clue.”

Now this all sounds lovely in the abstract, but for me it conjures up images of Amaznode, except the DoJ search engine will probably generate connections between people, places, and actions as opposed to just books.

So what’s the big deal? Well, guilt by association is one thing, but guilt by data-connection is another. Numerous people far better informed than I am have written at length about the dangers of applying data-mining techniques to public security. With so much information, the risk of “false positive” connections becomes extremely high. Think of this in terms of a few analogies: How often do you really want to buy the books that Amazon recommends to you? How frequently do Google’s ads fail to pertain to the actual content or intentions behind your search? Reading between the lines of Thomas E. Bush III’s statement above, it does not take a lot of imagination to see that the success of a system like N-DEx (or its already operational cousin, Coplink – see the article for details) will hinge on its ability to help officials avoid these kinds of problems.
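To see how quickly false positives swamp true ones at this scale, here is a back-of-the-envelope calculation with purely illustrative numbers of my own (the accuracy figures and population size are assumptions, not anything reported in the WaPo article):

```python
def share_of_flags_that_are_real(population, real_threats, hit_rate, false_alarm_rate):
    """Base-rate arithmetic: when the thing you are screening for is rare,
    even a very 'accurate' system flags mostly innocent people."""
    true_positives = real_threats * hit_rate
    false_positives = (population - real_threats) * false_alarm_rate
    return true_positives / (true_positives + false_positives)

if __name__ == "__main__":
    # Illustrative assumptions: 300 million records, 1,000 genuine threats,
    # a system that catches 99% of them with only a 0.1% false-alarm rate.
    share = share_of_flags_that_are_real(300_000_000, 1_000, 0.99, 0.001)
    print(f"Share of flagged people who are actual threats: {share:.2%}")
```

Under those (generous) assumptions, fewer than one in three hundred flagged individuals would be an actual threat – everybody else is just the guy who happened to get a flat tire.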

The number of ways in which our everyday lives connect us to electronic records and surveillance will only continue to grow for the foreseeable future (RFID, IP addresses, biometrics, etc.), which makes me wonder how far we are from encryption tools that will extend the capacity for anonymous browsing into our everyday lives.

I just got back from hearing Kieran Healy’s presentation of his latest research at the Harvard Economic Sociology Seminar. In this new project, he seeks to apply the performativity thesis – that so-called calculative technologies do not merely describe the world, but reformat it in their own image – beyond studies of financial markets and economics. The talk elaborated a preliminary sketch of how the rise of social network theory might have followed a comparable pattern through its subsequent formalization in the Internet.

Since this is still unpublished and very much work-in-progress, I won’t say more about the details of Kieran’s argument; however, the talk got me thinking about a broader issue that has not attracted sufficient attention in the existing work on performativity within economic sociology.

Without question, the most well-substantiated work in this area is Donald MacKenzie’s An Engine Not a Camera (2006). MacKenzie does an excellent job methodically documenting the impact of finance theory on financial markets during the latter half of the twentieth century. Yet, despite MacKenzie’s attention to detail and thorough support for his theoretical claims, it is his theory that proves somewhat unsatisfying in the end. Performativity theory’s greatest weakness – as it currently stands – remains its inability to explain why certain calculative technologies (theories, formulae, models, etc.) gain greater purchase on the world than others. This is important for several reasons, but the one that most interests me has to do with the role of power and institutions.

Ultimately, all ideas – like the people who create and apply them – are not created equal. Within the context of MacKenzie’s work, the Black-Scholes-Merton options pricing model gained traction in the world partly on the basis of its scientific merits (precision, validity, elegance, etc.) and partly on the basis of its political and social position. Performativity theory pays attention to the first of these factors at the expense of the second.

Without an adequate analysis of the role of power and institutions in shaping the outcomes of a given scientific field, it is very difficult to understand why one model of the world wins out over the competition. If you can’t explain that, the value of performativity as an analytical tool becomes limited.

To put the problem in concrete terms, I’ll borrow an example that did come up in Healy’s talk and which has frequently been a topic of concern for scholars of the Internet. How do we explain the success of Google? Did Larry and Sergey merely find the correct algorithm that most accurately described the emerging social space of the web? Clearly not. It would be more precise to say that they applied a set of theories about how information and knowledge function in the world and formalized those theories into the famous “PageRank” algorithm. PageRank went on to radically transform the way that people use and understand the Internet, spawning an entire sub-industry of consultants seeking to game the formula (an area of work known as Search Engine Optimization, or SEO for short). Ta-da! Performativity in action!
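For readers who have never looked under the hood, here is a minimal sketch of the idea as Brin and Page originally published it – a page’s score depends recursively on the scores of the pages linking to it – computed by simple power iteration on a made-up four-page web. This is an illustration of the textbook algorithm, not of whatever Google actually runs today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Minimal power-iteration PageRank on a dict mapping each page to the
    list of pages it links to.  A page's rank is the chance that a 'random
    surfer' who mostly follows links (and occasionally jumps to a random
    page) ends up there."""
    pages = list(links)
    n = len(pages)
    ranks = {p: 1 / n for p in pages}
    for _ in range(iterations):
        new_ranks = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                     # dangling page: spread evenly
                for p in pages:
                    new_ranks[p] += damping * ranks[page] / n
            else:
                for target in outlinks:
                    new_ranks[target] += damping * ranks[page] / len(outlinks)
        ranks = new_ranks
    return ranks

if __name__ == "__main__":
    # A tiny, made-up link graph: heavily linked-to pages rise to the top.
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

The performativity story is that once this scoring rule became the de facto gatekeeper of attention, webmasters began writing pages for the algorithm – which is precisely what the SEO industry does.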

Not so fast. I’d wager that Larry and Sergey’s success – and therefore the success of their spiffy little algorithm – had everything to do with the institutional setting and power-dynamics of the field within which they worked at the time. The Stanford University Computer Science Department and the Silicon Valley of the late 1990s need a larger place in this story if we want to understand where Larry and Sergey’s ideas came from, how they got the money for their start-up, and what sorts of problems they ran into (or not) along the way.

Somebody will tell this story well at some point – whether they do so in the peculiar language of sociological theory is not really important. What matters is that they recognize that without both the calculative technology and the institutional context you ultimately can’t explain very much about the current shape of the Network Society.