The Internet Archive loses its appeal to lend e-books without permission

The Library of Alexandria via Wikimedia Commons

The Internet Archive has lost again in its bid to continue offering access to e-books for free and without any compensation to publishers or authors.

The U.S. Court of Appeals for the Second Circuit, based in New York, ruled on Wednesday that U.S. District Court Judge John Koeltl had acted correctly in finding for four major book publishers who sued the Archive for copyright infringement. Emma Roth has the story at The Verge.

The nonprofit Archive is one of the most useful corners of the internet, offering free access to web pages that otherwise would have disappeared and working with copyright holders to keep defunct publications such as The Boston Phoenix, one of my former haunts, available for viewing.

But the Archive chose a very odd hill to defend by insisting that it had a right to offer e-books without paying for a license from publishers, as libraries typically must do. The Archive claimed that it was in compliance with copyright law because it limited e-book borrowing to correspond with physical books that it had in its collection or that were owned by one of its partner libraries. That’s not the way it works, though.

Please support this free source of news and commentary by becoming a member for just $5 a month. You’ll receive a weekly newsletter with exclusive early content and other goodies.

As the appeals court’s decision observes, public and academic libraries must purchase licenses for e-books even though they also hold physical copies of those books. “Critically, IA [the Internet Archive] and its users lack permission from copyright holders to engage in any of these activities,” according to the decision. “They do not license these materials from publishers, nor do they otherwise compensate authors in connection with the digitization and distribution of their works.”

The Archive claimed a fair-use exception to copyright law, a four-part test that the courts apply to determine whether copyrighted material can be used without permission. The court ruled in favor of the publishers on all four tests, mainly because the Archive had copied books in their entirety rather than just excerpts and because that practice could harm the potential market for those books. The decision concludes with some fairly harsh language:

IA asks this Court to bless the large scale copying and distribution of copyrighted books without permission from or payment to the Publishers or authors. Such a holding would allow for widescale copying that deprives creators of compensation and diminishes the incentive to produce new works. This may be what IA and its amici prefer, but it is not an approach that the Copyright Act permits.

The Archive has responded by removing some 500,000 books from its online library, explaining:

We understand that this is a devastating loss for our patrons, and we are fighting back through the courts to restore access to these books. Fortunately, other countries and international library organizations are moving to support controlled digital lending. We appreciate your patience and understanding as we fight this long battle.

I’m not sure what legal steps are available to the Archive other than appealing to the Supreme Court. Given that both Judge Koeltl and the Court of Appeals simply applied existing copyright law in a straightforward manner, it’s hard to imagine that the Supremes would be interested unless they possess some previously undetected enthusiasm for upending the law in its entirety.

My views should not be taken as a value judgment. The folks at the Internet Archive have always been among the good guys of digital culture — one of the last pure outposts from the early days of internet idealism, along with Wikipedia and very few others. The giant book publishers simply want to maximize their profits, and authors are not going to benefit from Wednesday’s decision outside a few bestselling behemoths at the top. Journalist Dan Gillmor put it this way on Mastodon:

Others have said this, but the Internet Archive’s appeals-court loss to Big Publishing is a disaster for everyone but the cartel of companies and a tiny number of A list authors.

The publishers will tolerate libraries only as long as they can control everything about how books can be loaned. If public libraries were being invented today, the cartel would make their core functions illegal.

The problem, though, is that it is the job of judges to apply the law, not offer a critique of capitalism.

There’s nothing in The Verge story or the Court of Appeals’ decision specifying what penalties the Archive will have to pay. I hope there are none. And though it’s probably too much to hope that the publishers will rethink their approach to e-books in their moment of triumph, they really ought to make some changes.

Digital distribution should have led to an increase in the availability of knowledge. Instead, it’s led to a regime of top-down control that is more restrictive than what prevails in the world of physical books. Try lending an e-book to a friend. That may be one of the reasons that e-books are declining in popularity while physical books are on the upswing.

All of this is playing out at a time when artificial intelligence companies are being sued for gobbling up vast quantities of text without permission. As Kate Knibbs writes for Wired:

The new verdict arrives at an especially tumultuous time for copyright law. In the past two years there have been dozens of copyright infringement cases filed against major AI companies that offer generative AI tools, and many of the defendants in these cases argue that the fair use doctrine shields their usage of copyrighted data in AI training. Any major lawsuit in which judges refute fair use claims are thus closely watched.

Needless to say, AI companies like OpenAI, Meta and their ilk have far more power and resources at their disposal than a struggling nonprofit like the Internet Archive.


The Andrea Estes saga leads the list of most-read Media Nation posts in 2023

Photo (cc) 2020 by Busdriver666

It’s time once again to take a look at the state of Media Nation and share the most-read posts of the past year. It’s a little complicated this year — in late July, I moved the blog from WordPress.com to WordPress.org, and the numbers for January through July look different when compared to August through December. It seems to be an apples-and-oranges problem, but I can’t put my finger on it. Given that, I’m going to list the top five for the first seven months and the top five for the last five months. Presumably it will be easier to figure it out next year.

January-July 2023

1. Andrea Estes has left the Globe following an error-riddled story about the MBTA (May 4). One of The Boston Globe’s top investigative reporters was fired after the paper erroneously reported that three top managers at the MBTA were living in distant locations when in fact they were in the Boston area. Six others really were working remotely. The Globe has still not disclosed what went wrong, and, by fall, Estes was working at the Plymouth Independent, a well-funded nonprofit with some prominent Globe alumni.

2. Liz Cheney for speaker (Jan. 3). With the dysfunctional House Republicans unable to agree on a speaker, I suggested that a bipartisan coalition turn to Cheney, a hard-right conservative who had nevertheless endeared herself to some Democrats with her service on the House committee that investigated the role played by Donald Trump and others in the failed insurrection of Jan. 6, 2021.

3. An ombudsman could have explained what went wrong with the Globe’s MBTA story (April 28). Following a lengthy correction to Andrea Estes’ story about the MBTA, I urged that the Globe, as well as other news organizations, bring back the ombudsman’s position, something that nearly all news organizations had abandoned over the past 10 years. Sometimes called the public editor, the ombudsman’s role is to act as a reader advocate and look into problems with coverage, standards, tone and other matters.

4. Globe editor Nancy Barnes tells her staff she’s working to unravel the MBTA fiasco (May 4). We’re still waiting — although, to be fair, Estes’ decision to file a union grievance may make it difficult to go public with any information about what went wrong, and who was to blame, in that botched MBTA story.

5. Why the Internet Archive’s copyright battle is likely to come to a very bad end (March 21). We all love the Internet Archive. In my view, though, it’s heading down a very bad road, claiming the right to copy and lend books without first reaching a licensing agreement with the publishers, as every other library does. Early indications were that the courts would not look kindly upon the Archive’s arguments, and I doubt that’s going to change. There are many negative observations I could make about copyright law, but it is the law.

August-December 2023

1. The late Matthew Stuart’s lawyer blasts the Globe (Dec. 6). After The Boston Globe published its massive overview of the 1989 Carol Stuart case, Nancy Gertner, who had been the late Matthew Stuart’s lawyer, took to GBH Radio (89.7 FM) and blasted the Globe for suggesting that Matthew may have been directly involved in the fatal shooting of Carol Stuart, the wounding of her husband, Charles Stuart, or both. (A brief synopsis: Charles Stuart, who had planned the murder, blamed the shootings on “a Black man,” turning the city upside-down for weeks, and then finally jumped to his death off the Tobin Bridge as police were moving in.) Several days after Gertner’s remarks, Globe columnist Adrian Walker, who worked closely on the project and narrated the accompanying podcast, appeared on GBH to defend the Globe’s reporting and assert that the paper did not draw any conclusions about Matthew Stuart’s role.

2. The Globe announces expanded regional coverage of Greater Boston (Sept. 6). The Boston Globe is among a tiny handful of regional newspapers that are growing and hiring — and the paper took another step in September by announcing more coverage in Cambridge, Somerville and the suburbs. The Globe already has bureaus in Rhode Island and New Hampshire. Good news all around, although it’s no substitute for detailed coverage of local government, schools, development and the like. Some communities are now being well-covered by startup news outlets, most of them nonprofit; others, though, have little or nothing.

3. A devastating portrayal of Elon Musk raises serious questions about capitalism run amok (Aug. 23). The world’s richest person was unavoidable in 2023, mainly for his destruction of Twitter, the plaything he bought the previous fall. Ronan Farrow, writing in The New Yorker, took a deep dive into Musk’s life and career, describing him as an out-of-control egomaniac with scant regard for safety at SpaceX and Tesla, his grandiosity fed by what may be his overindulgence in ketamine. Walter Isaacson’s biography of Musk got more attention, but Farrow delivered the goods.

4. More evidence that Woodrow Wilson was among our very worst presidents (Oct. 9, 2022). Why this post from 2022 popped up is a mystery to me, but it’s nevertheless heartening to see that Wilson’s reputation continues to disintegrate. I shared a New York Times review of a Wilson biography by Adam Hochschild. The reviewer, Thomas Meaney, wrote that the book deals mainly with Wilson’s “terror campaign against American radicals, dissidents, immigrants and workers,” one that “makes the McCarthyism of the 1950s look almost subtle by comparison.” And let’s not forget that Wilson was also a vicious racist.

5. Nobel winner weighs in on a shocking police raid against a newspaper: ‘It’s happening to you now’ (Aug. 12). One of several posts I wrote about a police raid of the offices of the Marion County Record in rural Kansas as well as the homes of the publisher and a city official. Publisher Eric Meyer’s mother, Joan Meyer, still involved in the paper at the age of 97, died the next day, apparently because of stress. “It’s happening to you now,” said Maria Ressa, the Filipino journalist who won the 2021 Nobel Peace Prize for her courageous resistance to her own country’s authoritarian regime. The ostensible reason for the police department’s thug-like action involved supposedly confidential driver’s records belonging to a local restaurateur; more likely, it involved the paper’s investigation of Police Chief Gideon Cody’s alleged misconduct at his previous job. Two months after the raid, Cody resigned.

This might be my final post of 2023. Thank you, as always, for reading. And I wish all of you health and happiness in the year ahead.


A federal judge delivers an easily predicted rebuke to the Internet Archive

Photo (cc) 2020 by Nenad Stojkovic

Good story, bad headline. “The Internet Archive has lost its first fight to scan and lend e-books like a library,” proclaims The Verge. In fact, U.S. District Court Judge John Koeltl ruled Friday that if the Archive wants to lend e-books, it must do so like a library — by purchasing a license and thus compensating publishers and authors for their work.

The only surprise is that Koeltl acted just a few days after a hearing in his New York courtroom. Then again, maybe it shouldn’t be a surprise. When an argument is as wrong-headed as the one advanced by the Archive, there is no reason for justice not to be swift and certain. It was ludicrous for the folks at the Archive to believe they could simply scan books they own and lend them out. I mean, why didn’t the local public library think of that?

Brewster Kahle, the Archive’s founder, has this to say:

Libraries are more than the customer service departments for corporate database products. For democracy to thrive at global scale, libraries must be able to sustain their historic role in society — owning, preserving, and lending books.

This ruling is a blow for libraries, readers, and authors and we plan to appeal it.

Good grief. Certainly there’s a critique to be made of the restrictive manner in which publishers make e-books available to libraries. But simply ignoring the law is not a smart strategy for dealing with that. I just hope the Archive is able to survive this incredibly wrong-headed gamble.

Earlier, with more background:

Why the Internet Archive’s copyright battle is likely to come to a very bad end

The Library of Alexandria via Wikimedia Commons.

Simply as a matter of copyright law, I’m afraid that the Internet Archive — one of the most valuable corners of the internet — is about to fall off a cliff, taking with it our access to countless old websites, newspapers and other content.

Let me explain. On Monday, a federal judge in Manhattan heard opening arguments in a lawsuit brought by four major book publishers who argue that the Internet Archive is violating copyright law by digitizing books in its possession and lending them for free. Blake Brittain reports for Reuters that the proceedings did not appear to go well for the Archive, with U.S. District Judge John Koeltl asking “pointed questions.”

“You avoid the question of whether the library has the right to reproduce the book that it otherwise has the right to possess, which is really at the heart of the case,” Koeltl reportedly told the Archive’s lawyer, Joe Gratz. “The publisher has a copyright right to control reproduction.” Yikes.

The Archive ramped up its lending during the COVID-19 pandemic and has not cut back even though life has more or less returned to normal. The Archive argues that it’s doing what any library does — it’s lending books that it owns, and it’s controlling how many people can borrow a book at any given time. In other words, it’s not simply making electronic versions of its books available for mass download. That may show some desire to act responsibly on the Archive’s part, but that doesn’t make it legal.

By contrast, a library typically buys one or more hard copies of a book and lends them out, or buys the right to lend e-books to its patrons. The operative word in both cases is “buys.” Money changes hands. Publishers and authors are compensated. Buying a hard copy of a book, digitizing it without any additional payment, and then lending it out is illegal, regardless of whether the lending is controlled or not. I find it kind of stunning that the Archive would put its entire free service at risk over such an obviously wrong stand.

“If this conduct is normalized, there would be no point to the Copyright Act,” Maria Pallante, chief executive of the Association of American Publishers, told (free link) Erin Mulvaney and Jeffrey A. Trachtenberg of The Wall Street Journal. Indeed, the Journal story notes that Google won its own legal battle over Google Books only by limiting what you can find to snippets of books, not the entire text.

I should point out that the Archive is not without some powerful friends of its own. The Electronic Frontier Foundation is providing legal assistance. In addition, Inside Higher Ed published a commentary written by a number of Archive supporters who argue that the Archive is a legitimate library, and that its “controlled digital lending” system, which limits lending to one user at a time, is covered by the fair use provision of copyright law.

“The argument that the Internet Archive isn’t a library is wrong,” according to the Inside Higher Ed essay. “If this argument is accepted, the results would jeopardize the future development of digital libraries nationwide.”

Oh, and by the way: Inside Higher Ed limits users to five free articles a month before you have to pay for a subscription — which, of course, it has every right to do.

I looked up my own books and found that two of the three, “Little People” (2003) and “The Wired City” (2013), are available for borrowing. I don’t mind. Whatever economic value they had has long since expired, and if someone would like to read them for free without using a traditional library, that’s fine. But I certainly would have objected during the first couple of years after they were published. Rodale paid me a decent advance for “Little People,” which funded the time off I took in order to research and write it. “The Wired City” was published by the University of Massachusetts Press, an academic publisher that survives from sales to libraries, both in hard copy and electronic form.

The Internet Archive is a godsend. Just recently I used it to look up the original version of a New York Times editorial that prompted Sarah Palin’s unsuccessful libel suit. The Archive has also digitized nearly every print edition of The Boston Phoenix through an arrangement with Northeastern University, which holds the copyright thanks to the generosity of Stephen Mindich, the late publisher. Along with Wikipedia, the Archive is one of the last uncorrupted places on the internet.

Ideally I’d like to see the Archive work out an arrangement with the book publishers that might limit but not shut down its book-lending program. My fear, though, is that this is headed for a very bad end.

A disappeared alt-weekly highlights the challenge of saving digital archives

Paul Farhi of The Washington Post has an amazing story (free link) about The Hook, an alternative weekly that used to publish in Charlottesville, Virginia. Its online archives disappeared after they were sold to a mystery buyer. Circumstantial evidence suggests that the buyer was a litigious deep-pockets guy who wanted to make invisible The Hook’s reporting about a sexual-assault case he was involved in years earlier.

Keeping online archives active and usable is a real challenge. Though what happened to The Hook was pretty unusual, it’s not unheard-of for valuable digital resources simply to disappear. Fortunately, the defunct alt-weekly I worked for, The Boston Phoenix, is available online through Northeastern University and the Internet Archive. You can find the Phoenix here.

It’s even more of a problem when the resource was digital-only and there was no print component that can be saved on microfilm. For instance, Blue Mass Group, a progressive political website that was a big deal in Massachusetts at one time, has been seeking a new digital home as the last of the co-founders, Charley Blandy, prepares to leave. Charley writes: “Plans are afoot for the site to be thoroughly crawled and archived. It won’t just disappear. The site will stay up, at least for a while, but for the purpose of archiving, commenting and posting will be disabled on 12/31/22.”

These resources need to be saved.

After a long delay, most of The Boston Phoenix print archives are now online

The Boston Phoenix’s archives have taken a giant step closer to becoming accessible and usable.

A few weeks ago I learned from Giordana Mecagni, the head of special collections and university archivist at Northeastern, that a deal had been struck with the Internet Archive to make print editions of the Phoenix available — and searchable — online. On Wednesday, it became official. Caralee Adams has the details at the Internet Archive’s blog.

I’m really thrilled that this has happened. I was on staff at the Phoenix from 1991 to 2005, most of that time as the media columnist, and I continued to write for the paper occasionally up until it closed in 2013. Two years later, the Phoenix’s founder and publisher, Stephen Mindich, donated the archives to Northeastern, a gift I helped arrange.

Unfortunately, Stephen died in 2018, and the hopes we all had of digitizing the collection stalled out. A couple of years ago there was talk of a grant proposal, but that didn’t go anywhere, either. So what happened? Adams explains:

As it turns out, the Internet Archive owned the master microfilm for the Phoenix and it put the full collection online in a separate collection: The Boston Phoenix 1973-2013. Initially, the back issues were only available for one patron to check out at a time through Controlled Digital Lending. Once Northeastern learned about the digitized collection, it extended rights to the Archive to allow the Phoenix to be downloaded without controls.

“All of a sudden it was free to the public. It was wonderful,” Mecagni told Adams. “We get tons and tons of research requests for various aspects of the Phoenix, so having it available online for free for people to download is a huge help for us.”

I’ve been playing with the new collection the last few weeks, and though it’s not perfect, it’s a big step forward. It encompasses papers starting in 1973, when Mindich, the publisher of a competing alt-weekly called Boston After Dark, acquired The Phoenix and renamed it The Boston Phoenix, up until the closing in March 2013.

There are some significant gaps; there appear to be no issues from 2011 or ’12, and just 33 from 2010, for instance. (I’ll bet there are ways of fixing that. I know that the Boston Public Library has the Phoenix in its microfilm collection, and perhaps it’s more complete than what the Internet Archive has.) And Boston After Dark, the pre-Mindich Phoenix, and The Real Paper, which was founded by former staff members of The Phoenix following the 1973 acquisition, are all absent as well.

But this is a huge, huge step forward. As Carly Carioli, the last editor of the Phoenix, told Adams: “It’s a dream come true. The Phoenix was invaluable in its own time, and I think it will be invaluable for a new generation who are just discovering it now.”

Giordana Mecagni deserves huge thanks. From the beginning, she has understood the value of the Phoenix. This is a big step forward for her vision as well.

That link, once again, is right here. Enjoy!

The demise of Adobe Flash broke the 9/11 web. But it’s just the tip of a bigger issue.

This is an important story — not just because some crucial 9/11 coverage has been lost or even because the demise of Adobe Flash means that parts of the internet are now broken. Rather, it illustrates that the internet is, in many ways, an ephemeral medium, meaning that we simply can’t preserve and archive our history the way we could during the print era.

Clare Duffy and Kerry Flynn report for CNN.com that The Washington Post, ABC News and CNN itself are among the news organizations whose interactive presentations in the aftermath of 9/11 no longer work properly.


As they recount, Flash was a real advance in the early days of the web, an important step forward for video and interactive graphics. But the late Steve Jobs, criticizing Flash’s security flaws, decreed that Apple’s iPhone and iPad would not run Flash. At that point the platform began to crumble, and Adobe pulled support for it at the end of 2020.

Duffy and Flynn write that some efforts are under way to use Flash emulators in order to bring some old content back to life. Adobe, which is worth $314 billion, ought to spend a few nickels to help with that effort.

More broadly, though, the problem with Flash illustrates how the internet decays over time. Link rot is an ongoing frustration — you link to something, go back a year or five later, and find that the content has moved or been taken down. Publications go out of business, taking their websites with them. Or they change content-management systems, resulting in new URLs for everything.

We’re all grateful for the work that the Internet Archive does in preserving as much as it can. Here, for instance, is the home page of The New York Times on the evening of Sept. 11, 2001.

But what’s available online isn’t nearly as complete as what’s in print. For the moment, at least, we can still go to the library and look at microfilm of print editions for publications that pay little attention to preserving their digital past. It won’t be too many years, though, before digital is all we’ve got.