Google Book Search

The Google Book Search Project is an audacious attempt to make the print world as searchable as the web is today. To accomplish this transition from molecules to bits, Google must undertake the laborious process of manually scanning each individual book and digitizing it in a format that can be stored in a database and indexed by a search engine. As its initial foray into this quest, Google has embarked on a two-pronged approach to acquire books for the database.

The first prong is the Partner Program, an uncontroversial program which involves an agreement between Google and a publisher in which Google scans the books under the agreement, shows several pages of the book to users in response to searches, and shares advertising revenues derived therefrom with the publisher.

The second prong is the Library Project, a controversial program which involves Google cutting deals with, initially, several influential libraries to scan their collections into the database. For these books, in response to searches, users are only shown a snippet of text surrounding the search term, along with links to other websites where the user can purchase the book.

The master plan, presumably, is to partner with as many publishers as are willing under the Partner Program, and suck up the rest of the world's books under the Library Project. However, Google has also offered a third fate for books: opt out. Publishers can notify Google of books that they do not want included in the database, and Google will only provide standard bibliographic information (or nothing at all) to users in response to searches.
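The three treatments described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the names Program, snippet and search_result, the character-based radius, and the shape of the returned result are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Program(Enum):
    PARTNER = "partner"  # publisher agreement: several pages are shown
    LIBRARY = "library"  # scanned from a library: only a snippet is shown
    OPT_OUT = "opt_out"  # publisher opted out: bibliographic data only


def snippet(text: str, term: str, radius: int = 40) -> Optional[str]:
    """Return the text within `radius` characters of the first match of `term`."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - radius)
    end = min(len(text), pos + len(term) + radius)
    return text[start:end]


def search_result(program: Program, title: str, text: str, term: str,
                  radius: int = 40) -> Optional[dict]:
    """Decide what a searcher sees for one book (hypothetical policy model)."""
    excerpt = snippet(text, term, radius)
    if excerpt is None:
        return None  # the book does not match the query at all
    if program is Program.OPT_OUT:
        # Assumption: show the title only; the source notes it may be nothing.
        return {"title": title, "view": "bibliographic"}
    if program is Program.LIBRARY:
        return {"title": title, "view": "snippet", "excerpt": excerpt}
    return {"title": title, "view": "pages"}
```

For example, search_result(Program.LIBRARY, "Some Book", book_text, "fox") would return the title plus a short excerpt around the first occurrence of "fox", while the same call with Program.OPT_OUT would return the title alone.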

The following vanity search shows the responses you would receive under each of the three scenarios.

And here is what a snippet looks like:

So in summary, Google Book Search is a 21st-century attempt to resurrect the Royal Library of Alexandria, hopefully now impervious to Roman torches and other special interests.

The Controversy

As indicated above, the Library Project is controversial, as evidenced by two lawsuits filed against Google by stakeholders who object to Google scanning books without permission, namely book publishers. In a position not unlike Viacom's in the YouTube lawsuit, the book publishers' position is that the burden of determining whether or not a book can be scanned is Google's (Google needs to ask, or more likely pay a fee), not the publishers' (they should not have to be proactive to keep their books out of the database).

As the basis for this position, the publishers point to the exclusive right of reproduction granted under the Copyright Act, which on the face of it seems eminently reasonable. Google, after all, is copying a book and storing that copy in its database. There is also some mention in the complaints of the display of snippets in response to searches, but such de minimis use is unlikely to gain traction, and consequently the focus is on the reproduction of the work in the database.

So in real terms, exactly how bad is it for the publishers? The two complaints filed against Google are remarkably non-specific about what their downside is in this deal. The Guild complaint alleges that Google's act causes injury through:

1) continued infringement
2) depreciation in the value and ability to license and sell their works
3) lost profits and/or opportunities
4) damage to their goodwill and reputation

Given the popularity of Google, having Google present a sentence or two from a book to a potential buyer, as a relevant response to a query that the buyer actively posed, would only serve to appreciate the value of the book, increase the publishers' ability to sell the book, increase profits and opportunities as potential buyers mine the long tail now available to them, and increase the publishers' goodwill and reputation as buyers have much more concrete and compelling reasons to buy.

The only other reference to injury in the Guild complaint is that Google's use will cause irreparable harm by depriving them of both the right to control the reproduction and/or distribution of their copyrighted Works and to receive revenue therefrom. And here is revealed the true reason for their discontent: given that the economic arguments cut against them, the only remaining harm is the deprivation of the right to control the reproduction.

In the McGraw-Hill case, it is alleged that Google's continuing and future infringements are likely to usurp Publishers' present and future business relationships and opportunities for the digital copying, archiving, searching and public display of their works. The Google Library Project, and similar unrestricted and wide-spread conduct of the sort engaged in by Google, whether by Google or others, will result in a substantially adverse impact on the potential market for Publishers' books.

The second claim, of adverse market impact, is probably quite incorrect, as indicated above. The first claim goes to the issue of Google interloping on McGraw-Hill's own future Book Search Project, but this is speculative in the extreme.

The Purpose of Copyright

This, more than any other copyright case in recent memory, raises the question of what the purpose of copyright is. When weighing the obvious social and cultural benefit of Google Book Search against the statutory right to control reproduction, it may be helpful to pop up a level and map the competing objectives against the Constitutional intention.

In this country at least, we have a fundamentally utilitarian view of copyright. Unlike other countries that may have a more romantic view of the author and his intrinsic right to control his works, America doesn't. America doesn't grant exclusive rights to authors because it thinks they have some inherent entitlement to them, but because it is betting that if it does, authors will enrich society with more creative works. When America was founded, it was a full-on copyright-pirate nation. It copied books from abroad because it needed the information and culture, but British authors were not compensated because there was no pragmatic reason for doing so. It was not until the American IP market was somewhat mature that it became expedient to protect it, which necessitated reciprocal protection. The enrichment of society is the goal; the reward of authors with exclusive rights is a means to that end.

The courts have stated this explicitly. In Fox Film Corp. v. Doyal, 286 U.S. 123, 127 (1932), Chief Justice Hughes commented on the exclusive rights granted by Congress: "The sole interest of the United States and the primary object in conferring the monopoly lie in the general benefits derived by the public from the labors of authors."

Then in United States v. Paramount Pictures, Inc., 334 U.S. 131 (1948), the Supreme Court confirmed that "The copyright law, like the patent statutes, makes reward to the owner a secondary consideration."

And more recently, the court noted in Whelan v. Jaslow, "We must remember that the purpose of the copyright law is to create the most efficient and productive balance between protection (incentive) and dissemination of information, to promote learning, culture and development."

The Mission

The above is all a rather long-winded way of saying that Google's fair use argument is all but assured of prevailing. The copyright law, propelled by the nature of its own genesis, cannot determine otherwise.

In this instance, Google's mission statement, "to organize the world's information and make it universally accessible and useful," is completely convergent with the Constitution's intent to promote the progress of science and the useful arts. And, of course, we can confirm empirically the validity of Google's fair use argument using the Fair Use Visualizer, which, as shown below, calculates an impressive score of 73.