Wikipedia, probability and community

I’m sure I’m far from the only person to have a conflicted relationship with Wikipedia. Yes, I know that at least one prominent study found that the user-produced encyclopedia is about as accurate as the venerable Britannica. I also know that Wikipedia can be mind-bogglingly wrong — sometimes only for a few minutes or a few days, until an adult (I’m talking about maturity, not age) undoes someone else’s vandalism. But that’s not much consolation if you’re the victim of that bad information.

I tell my students that Wikipedia can be a great starting point, but that they should use it to find more-authoritative sources of information, not cite it in their papers. As for me, well, I’ve been known to link to Wikipedia articles, but I try to be careful, and I try to keep it to a minimum.

This week’s New York Times Magazine includes a worthwhile story by Jonathan Dee on the emergence of Wikipedia as a news source. Dee reports on a small army of activists (one is just 16) who jump in with summaries of major news events even as they are unfolding. These activists come across as admirably dedicated to the idea of fair, neutral content; many look for vandalism after a major news event takes place, such as the death of the Rev. Jerry Falwell, a favorite target of those who opposed his homophobic, right-wing views. (Yes, if I wrote that on Wikipedia, someone would edit those descriptions out.) But I would still have a nagging sense that something might be very wrong.

So what is the real difference between Wikipedia and a more traditional encyclopedia such as the Britannica? It’s not just the notion that anonymous and pseudonymous amateurs write and edit Wikipedia articles, whereas Britannica relies on experts. That’s certainly part of it, although if that were the entire explanation, Wikipedia would be worthless.

The more important difference is the idea of community-based, bottom-up verification (Wikipedia) versus authority-based, top-down verification (Britannica). Each has its purpose. The question is why the community-based model works — or at least works often enough that Wikipedia is a worthwhile stop on anyone’s research quest.

To that end, I want to mention a couple of ideas I’ve run across recently that help explain Wikipedia. The first is a 2006 book by Wired editor Chris Anderson, “The Long Tail,” in which he suggests that the accuracy of Wikipedia is based on probability theory rather than direct verification. The more widely read a Wikipedia article is, the more likely it is to be edited and re-edited, and thus be more accurate and comprehensive than even a Britannica article. But you never know. Anderson writes (I’m quoting from the book, but this blog post captures the same idea):

Wikipedia, like Google and the collective wisdom of millions of blogs, operates on the alien logic of probabilistic statistics — a matter of likelihood rather than certainty. But our brains aren’t wired to think in terms of statistics and probability. We want to know whether an encyclopedia entry is right or wrong. We want to know that there’s a wise hand (ideally human) guiding Google’s results. We want to trust what we read.

When professionals — editors, academics, journalists — are running the show, we at least know that it’s someone’s job to look out for such things as accuracy. But now we’re depending more and more on systems where nobody’s in charge; the intelligence is simply “emergent,” which is to say that it appears to arise spontaneously from the number-crunching. These probabilistic systems aren’t perfect, but they are statistically optimized to excel over time and large numbers. They’re designed to “scale,” or improve with size. And a little slop at the microscale is the price of such efficiency at the macroscale.

Anderson is no Wikipedia triumphalist. He also writes: “[Y]ou need to take any single result with a grain of salt. Wikipedia should be the first source of information, not the last. It should be a site for information exploration, not the definitive source of facts.”

Right now I’m reading “Convergence Culture” (2006), by Henry Jenkins, director of the Comparative Media Studies Program at MIT. Jenkins, like Anderson, offers some insight into that clichéd phrase “the wisdom of the crowd,” and why it often works. Jenkins quotes the philosopher Pierre Lévy, who has said of the Internet, “No one knows everything, everyone knows something, all knowledge resides in humanity.” Jenkins continues:

Lévy draws a distinction between shared knowledge, information that is believed to be true and held in common by the entire group, and collective intelligence, the sum total of information held individually by the members of the group that can be accessed in response to a specific question. He explains: “The knowledge of a thinking community is no longer a shared knowledge for it is now impossible for a single human being, or even a group of people, to master all knowledge, all skills. It is fundamentally collective knowledge, impossible to gather together into a single creature.” Only certain things are known by all — the things the community needs to sustain its existence and fulfill its goals. Everything else is known by individuals who are on call to share what they know when the occasion arises. But communities must closely scrutinize any information that is going to become part of their shared knowledge, since misinformation can lead to more and more misconceptions as new insight is read against what the group believes to be core knowledge.

Jenkins is writing not about Wikipedia but about an online fan community dedicated to figuring out the winners and losers on CBS’s “Survivor.” But the parallels to Wikipedia are obvious.

I’ve sometimes joked that the madness of the mob must turn into the wisdom of the crowd when you give everyone a laptop. The Jenkins/Lévy model suggests something else: “shared knowledge” defines a mob mentality; “collective intelligence” is the wisdom of the crowd. At its best, that is what drives Wikipedia.

14 thoughts on “Wikipedia, probability and community”

  1. stephen

    That was really interesting. I’m surprised by how much I use Wikipedia. Maybe too much. But it is extremely quick and convenient, and surprisingly good. I certainly don’t treat it as an end-all, authoritative source, but think of how many school papers and reports have been written using it as the exclusive or primary source! That’s scary.

  2. Michael Corcoran

    Dan, your approach is pretty much mine. I think Wikipedia is amazingly valuable. Yes, it is valuable as a starting point and is not fit for citation in academic or journalistic work (although, depending on the context, I think it is fine to link on a blog); but I don’t see why there is so much animosity towards it in some circles.

  3. Cody Pomeray

    I’m a Wikipedia fan as well. It’s just so easy and valuable, warts and all. What I found most interesting about today’s post is that it immediately brought my mind to the issue of religion vs. science, as in probability (i.e. wikis) representing quantum physics vs. authoritarianism (i.e. encyclopedias) representing religious determinism. Hmmm.

  4. David

    Discussions that compare accuracy for Wikipedia v Britannica miss something. There are a large number of noncontroversial, obscure topics about which I know nothing. Britannica won’t have anything. But I can trust Wikipedia to give me the 50-cent definition. Yeah, sometimes I’ll get misinformation, but without Wikipedia I’d often get nothing (or use whatever else shows up when I Google).

    E.g., just now over on PeaceBang’s blog, somebody mentioned Asatru. What the heck is that? Wikipedia gave me a good explanation – it’s a kind of Nordic paganism. Even if some of the details in that article are wrong, I trust it to give me the basic picture. Without Wikipedia, I’d have bupkis.

  5. mike_b1

    Stephen nails it: it is extremely quick and convenient. Also, it contains way more pop culture entries than Britannica.

  6. Anonymous

    Although I regularly use Wikipedia as a starting point for several purposes, there is an additional reason not to cite to it in an academic paper. A major reason is that an entry in Wiki can change from one day to the next, so what may be “there” the day the paper is written may not be “there” the next day. For each entry, Wiki has a page that shows the changes made to the page, and some identification of the computer from which the change is made, but it is probable that most people do not know that.

    I use Wiki for (a) things that I pretty much already know (science and math, not controversial, but want a concise quotation for) and (b) to remind me of things that I had previously read, but did not recall (also not controversial).

    Britannica has its own problems, but it does not have the problem that people can change it willy-nilly. Still, if I were using Britannica as a cite, I’d cite to the date of publication of the issue of Britannica.

    –raj

  7. Don (no longer) Fluffy

    . . . and, of course, no “expert” was ever biased. Remember, most history of war, for instance, was written by the winners.

  8. Anonymous

    For what it is worth, Dan expressed my sentiments exactly. It is great for general info though. If you’re curious, you can look up Bigfoot, the Concertgebouw concert hall, presidential election results, and the War of the Worlds radio broadcast. And no one needs to know that you are that crazy to look at all those things.

  9. man who is admittedly a wikipedia fan

    I think the greater “problem” (if you can call it that) with Wikipedia is not with us, it’s with “the kids.”

    (Note: yes, I’m overgeneralizing here, but bear with me…the overall point is valid.)

    Many of the “millennial” generation (and younger) that I work with rarely listen to the radio, never read a newspaper or magazine, and have never been in a library. Too often they haven’t quite “gotten” the concept that you can’t completely trust ANYTHING you read online. Some things you can mostly trust, but nothing’s 100% completely trustworthy.

    That level of judgment also applies to Wikipedia, just even more so. And too often they feel Wikipedia is the starting AND ending point for research, and they don’t understand why it’s so important that it not be.

    Dan, of course, is doing his part…but the “problem” is endemic. I suppose you could say it’s too many years of kids knowing more about the internet than their parents, which implies a rendering of judgment that…while possibly true…is somewhat unfair.

    Am I imagining things…or is it really true that with each passing year, there are fewer and fewer sources of information you can really trust? 😦

  10. Elizabeth

    I am a librarian, and my first visits to Wikipedia were based on what I had read in the press. I didn’t see how the thing could possibly work, I was offended by the very concept, and I visited in search of egregiously bad examples. I was almost disappointed to find so much of it was so very good. I found errors and gaps, but I could contribute and help improve things.

    One great thing about Wikipedia is that its process is so transparent that it helps people think about general issues of accuracy, objectivity and quality. No source can be trusted absolutely, not even the venerable Encyclopedia Britannica. The professional news media certainly makes errors, despite all that professional editing. Here’s an interesting example from the BBC: “BBC replaces Parrot story after much-deserved squawking.”

    If you want to learn more about how and why Wikipedia works as well as it does, visit the Community Portal page, and scroll down to look at all the current projects and collaborations and requests for assistance. Wikipedia has been very successful in attracting a self-organizing corps of dedicated volunteers (like the ones mentioned in the New York Times article) who really work at improving it.

  11. Lou

    I like Wikipedia very much, but for trivia/curiosity rather than serious academic research. For instance, if I watch HBO’s Rome and don’t recall ever reading about the original war between Octavius Caesar and Marc Antony, I could go to Wikipedia to check up on it quickly. Or if I read a biography of Marie Antoinette and want to see what other sources say, I can go to Wikipedia and skim various sources.

  12. Anonymous

    A book you should read is David Weinberger’s “Everything is Miscellaneous,” which is a very insightful look at how the digital world is changing the shape of knowledge. He writes about Wikipedia quite insightfully.

  13. Anonymous

    Dan, your post caused a consensus to develop on an important topic. Nobody is ranting. What happened?
