Is Google's Indexing Of News Sites Copyright Infringement?

The issue of whether Google’s indexing of news sites constitutes copyright infringement is likely to receive more attention following a report that Google has entered into licensing agreements with several large UK news groups, similar to the licensing deals that Google has made with the Associated Press in the U.S. and Agence France-Presse in France. Duncan Riley at TechCrunch is predicting the end of news indexing as we know it, but I’m not so sure.

What’s odd about how this debate has played out is the focus on Google News rather than Google’s main web search results — it’s now particularly odd in light of Google’s new Universal Search, which integrates Google News results into Google’s main web results. But the reality is that news stories have always appeared in the main Google results, which is significant because:

  • Google has been running ads next to indexed news stories for years, even though it hasn’t yet done so on Google News
  • News sites receive ever-increasing traffic from Google
  • Many news organizations, including the New York Times, are aggressively optimizing for search

One of the lead stories in the New York Times today is about the immigration bill. If you Google “immigration bill,” the first result is indeed a block of Google News results, but the rest of the top results are from news sites, including the Washington Post, Fox News, CNN, the LA Times, the New York Times, MSNBC, and CBS News.

Google is selling ads alongside these excerpts from news sites, but look at who the third advertiser is:

[Image: immigration-bill-search-sponsored-links.jpg]

The New York Times’ SEO effort puts the issue of Google’s relationship with news organizations into perspective. In fact, the New York Times recently got caught violating Google’s webmaster rules by getting its own search results pages, hosted on the domain query.nytimes.com, to rank in Google. For a while, the search results page for “sex” at query.nytimes.com was ranking in the top five for a Google search on “sex,” although it has now been removed after SEOs accused the Times of spamming Google. Danny Sullivan has a great write-up.

As John Andrews observed:

We knew it was coming, and we knew the New York Times was “getting” SEO. And it didn’t take long. The King of Content is now dominating the Google SERPs across a wide swath of the keyword space, via the re-published, re-purposed, New York Times Archives. Each “article” is re-purposed on a clean, CSS-driven text page, clearly dated TODAY and not-so-clearly labeled as “originally published” back in 1997, 1998, or whatever all the way back to 1981. Of course cross-referenced, categorized, sub-categorized, ad-infinitum.

Here’s an excerpt from the New York Times’ Q1 2007 earnings call:

As of March, the Times Company was the 12th most visited parent company on the web in the United States with 43.5 million unique visitors, up 12% from March of 2006, according to Nielsen NetRatings. Traffic growth has been accelerating as we optimize our website for search.

And here’s what New York Times Chairman Arthur Sulzberger had to say about the importance of search at the recent New York Times shareholders meeting:

Moreover, About is having a powerful effect on our Company by providing NYT.com, Boston.com, IHT.com and our regional sites with critical, digital expertise. This includes optimizing content so that it is more visible to search engines, which leads to significant increases in traffic and thereby makes our online pages more profitable.

Compare that with:

Sam Zell, the tycoon behind the recent purchase of US group Tribune, indicated that he might challenge Google’s use of his company’s content, and Daily Telegraph editor Will Lewis recently attacked Google during a speech at the Ifra conference in Paris.

“Companies such as Google and Yahoo are seeking to build a business model on the back of our own investment without recognition. All media companies need to be on guard for this,” said Lewis.

How can news sites claim that Google’s “copyright infringement” is damaging their businesses when the New York Times, the paper of record, is managing to profit from its inclusion in the Google index?

The real issue here has nothing to do with Google or any other aggregator — the problem for news organizations is that online ad revenue, even for optimized online news businesses, will never replace declining print ad revenue. Newspaper print ad revenue was based on monopoly pricing, and those monopoly market conditions are now a thing of the past.

Google is walking a fine line between quietly cutting deals to make the “issue” go away and publicly fighting it, as it chose to do with Viacom’s copyright lawsuit over YouTube.

One way Google could deal with news sites crying copyright infringement: cut deals to include content from those news sites in Google News, then exclude those sites from the main Google search results. That would force all search traffic to go through Google News to discover those news sites, thus compounding the disintermediation.

But the ball isn’t really in Google’s court — search and other aggregators are increasingly the gatekeepers to online content, and it’s up to news organizations to decide whether they want to be a thriving part of the online ecosystem or an island in a networked media world.