Developing Algorithms To Prevent Citizen Journalism From Being Gamed: Lessons From Google and Digg

Is there a risk that citizen journalism can be gamed by “PR flacks and unqualified hacks”? Adam Weinstein, writing in Mother Jones, thinks so.

Unfortunately, he casts the issue as the risk that economically burdened newsrooms will trade expensive quality journalism for no-cost, untrustworthy content. He doesn’t look at the very real risk that a more open journalistic process can be more easily gamed, or at how that gaming can be prevented.

That got me thinking about what citizen/networked journalism might learn from the battles that systems like Digg (which accepts anonymous users) and Google (which interprets open linking on the web) have waged against gaming.

By forcing their beleaguered staffs to depend on outsiders for content, then running the content without much editorial oversight, newspapers may be taken in by crackpots and sly marketers who make Jayson Blair look like a grade-school plagiarist. Lobbyists and spin doctors have already taken notice of the new model. Take the South Dakota Politics and Daschle V. Thune blogs, which influenced the Gannett-owned Sioux Falls Argus Leader’s coverage of the state’s 2004 United States Senate race; eventually, the bloggers were found to be on the payroll of just-elected Republican Senator John Thune. “Got a story you can’t convince a mainstream reporter to run?” wrote Chris Suellentrop in Slate. “Leak it anonymously to a blog on your payroll. Then get a local reporter to write a story on the controversial, gossipy, local political blog. Soon everyone in town will be talking about the story you leaked to the blog…. And no one will know that the blog post was a paid placement until after the election.”

The risk that political reporting can be manipulated by political operatives acting as bloggers, for example, is very real. But Weinstein seems more interested in impugning the integrity of every newsroom experimenting with “networked journalism.”

If you could convince me that crowdsourcing and mojos and information centers weren’t about cost cutting or lazy journalism, I’d be all for them. The blogosphere and the 24/7 news cycle are realities, and editors and reporters have a lot of ink-stained baggage to dump if they want to thrive in the new-media world. But that doesn’t mean that bean-counting publishers must recruit mercenary bloggers or convert their cub reporters into untrained, overworked, self-editing news tickers.

I attended Jeff Jarvis’ Networked Journalism Summit last fall, and I can tell you that there was no one advocating “cost cutting or lazy journalism,” nor were there any “bean-counting publishers” seeking to destroy the integrity of journalism — which is not to say this can’t or hasn’t already happened in some instances. But it’s a cynical red herring at best, which distracts from the real issue — the potential for gaming of an open networked/citizen journalism system, and how the problem can be addressed.

Of course, it’s also a canard to argue that the traditional journalistic process was ever immune from gaming, whether by savvy flacks on the outside or Jayson Blairs on the inside. But at least in the command-and-control editorial model, there were processes in place designed to prevent, as much as possible, the manipulation of facts, dissemination of disinformation, and other efforts to advance agendas.

That’s not to say that “objectivity” has ever been easy, particularly when the ideal of balanced coverage conflicts with the imperative to establish “facts” — for example (click through to see the video, it’s an eye-opener):

But as if it weren’t hard enough already, opening up the journalistic process to a network presents a whole new set of challenges, which Weinstein does legitimately highlight, albeit with unconstructive polemic.

To address the counter-argument that there isn’t really a risk of gaming in networked journalism, consider that the web has pretty conclusively demonstrated that any open system that can be gamed will be gamed. (Blog comment spam is another example.)

Fortunately, there are a lot of lessons to be learned outside the sphere of traditional journalism that can guide the new practice of networked journalism as it learns how to combat the new potential for manipulation. In particular, newsrooms should study the efforts by open networked systems like Digg and Google to stem the endless tide of gaming.

There are tons of articles out there on the topic (just do an ironic search on Google):

  • I Bought Votes on Digg
  • “Undetectable” spam: this is an example of a post by Matt Cutts, head of Google’s anti-spam team. I won’t go so far as to suggest that his blog be required reading for newsrooms working on networked journalism projects, but they ought to peruse it just to get a feel for what it means to combat the manipulation of open systems.

The basic principle that sites like Google and Digg apply to combat gaming is that there are identifiable strategies for manipulating the system, and if you can identify markers of those strategies, then you can design “algorithms” to detect and block manipulative behavior. These algorithms do not always eliminate gaming, but they can reduce it.

So what would be the equivalent of such anti-gaming algorithms in networked journalism? Journalists have always had the responsibility of vetting the sources for their stories — so now they will have to take on the responsibility of vetting “sources” who are actually writing some of the content.

Take Weinstein’s example of the Tallahassee blogger writing about urban development who previously ran a PR campaign for Wal-Mart — Weinstein complains that the Democrat, which published her blog posts, didn’t disclose this. If it’s because they didn’t know, that seems like a pretty correctable problem.

Algorithm: Does blogger background = PR work for special interests related to blogging topics? If YES, then add disclosure and forbid blogging on topics with a conflict of interest.

Will this algorithm work perfectly? Of course not. But it’s not like journalists don’t know how to do background checks.
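For newsrooms that want to see the rule written down, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Contributor` record, the `vet_post` function, and the topic labels are hypothetical, and the hard part (actually finding out a blogger's background) still belongs to a human doing the background check.

```python
from dataclasses import dataclass, field


@dataclass
class Contributor:
    """Hypothetical record an editor fills in after a background check."""
    name: str
    # Topics on which the contributor has done paid PR or advocacy work.
    pr_topics: set[str] = field(default_factory=set)


def vet_post(contributor: Contributor, post_topics: set[str]) -> dict:
    """Apply the rule: a PR background that overlaps the post's topics is a conflict."""
    conflicts = sorted(contributor.pr_topics & post_topics)
    if conflicts:
        # Conflict found: require disclosure and block the post on those topics.
        return {"allow": False, "disclose": True, "conflicts": conflicts}
    # No overlap between PR background and post topics: nothing to flag.
    return {"allow": True, "disclose": False, "conflicts": []}
```

A blogger whose record lists PR work on "retail development" would be blocked from posting on that topic and flagged for disclosure, while a post on an unrelated topic would pass. Whether a given newsroom wants to block or merely disclose is exactly the kind of policy choice the code leaves open.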

Some algorithms will be points of serious contention — should the “algorithm” block bloggers who contribute to political campaigns? How about those who have worked for a political campaign? What if the blogger discloses all conflicts of interest?

Yeah, well, maybe newsrooms should read Matt Cutts’ blog.

But already there are examples being documented of how the network can manipulate journalists — here’s one from a MediaShift piece on The Benefits and Pitfalls of Using Social Media for Reporting:

Such is the case of Bilawal Bhutto Zardari, son of slain former Pakistani Prime Minister Benazir Bhutto. Back in the old days, after a major event like Bhutto’s assassination, a reporter might jump through hoops trying to score an interview with a member of her family. These days it seems that at least some reporters can’t be bothered, and turned immediately to social networks to see if they could gain insight into his thoughts.

Several big news companies — among them London’s Telegraph and Agence France Presse — lifted quotes about Islam from her son’s Facebook profile. The only problem was that the profile was fake. In this case, traditional media in all its experience didn’t know that social media sources can be a minefield, and it exploded in their faces. Had the “joke” not been discovered sooner or the fake quotes more inflammatory, this could have had serious political implications.

Then there’s the recent President of Facebook incident in France.

But instead of using these instances, as Weinstein does, to argue that networked journalism can never work, newsrooms should use them to “program” journalism’s anti-manipulation algorithm.

What every newsroom or journalist practicing networked journalism should do is develop a list of all known instances of gaming, and also come up with other possible risks — then they need to develop algorithms to try to protect against these.
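One way to picture such a list is as a set of named checks, each written after a known incident or an anticipated risk, that every piece of outside content is run against. The sketch below is only an illustration under stated assumptions: the field names (`source_type`, `identity_verified`, `paid_by_subject`, `anonymous`, `independent_sources`) and the three checks are hypothetical, loosely modeled on the fake-profile and paid-blogger cases mentioned earlier, and a real list would be longer and maintained by editors.

```python
from typing import Callable

# A check takes a description of the content and returns True if it looks risky.
Check = Callable[[dict], bool]

# Hypothetical checklist: each entry pairs a human-readable warning with a test.
CHECKS: list[tuple[str, Check]] = [
    (
        "quote taken from an unverified social media profile",
        lambda item: item.get("source_type") == "social_profile"
        and not item.get("identity_verified", False),
    ),
    (
        "contributor paid by a party to the story",
        lambda item: item.get("paid_by_subject", False),
    ),
    (
        "anonymous tip with no independent source",
        lambda item: item.get("anonymous", False)
        and item.get("independent_sources", 0) < 1,
    ),
]


def review(item: dict) -> list[str]:
    """Return the warnings that apply; an empty list means no known pattern matched."""
    return [label for label, check in CHECKS if check(item)]
```

An empty result does not mean the content is trustworthy, only that it matched none of the patterns the newsroom has recorded so far. Each new incident becomes a new entry in the list.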

So what about the likelihood that some of these algorithms will fail as part of the learning process? The advantage that news organizations have over systems like Google and Digg is that they can disclose their algorithms, which Google and Digg can’t do because it would give gamers a roadmap.

But news organizations can be transparent. Want to publish content from bloggers who donate to political campaigns? Disclose to readers that you’re doing it, ask the bloggers to disclose the contributions, and leave the rest to readers — they are pretty smart. Given all the right information, they can figure it out for themselves.

Here’s a final piece of advice: don’t look for black and white in this issue, as Weinstein does. Citizen journalism isn’t all good or all bad. The risk of gaming the network can be reduced, although not eliminated.

Traditional journalism resisted manipulation, although not always successfully. But in its best moments, it accomplished a lot in the public good. The same can be true of networked journalism.