
Meta claims torrenting pirated books isn't illegal without proof of seeding
arstechnica.com
Defending a bad ratio

Meta's copyright defense may hinge on court ignorance of torrenting terminology.

Ashley Belanger, Feb 20, 2025 3:02 pm

A peer who downloads more data than they upload on torrenting networks is known as a "leech," sucking up data without contributing to the swarm. Credit: phototrip | iStock / Getty Images Plus

Just because Meta admitted to torrenting a dataset of pirated books for AI training purposes, that doesn't necessarily mean that Meta seeded the file after downloading it, the social media company claimed in a court filing this week.

Evidence instead shows that Meta "took precautions not to 'seed' any downloaded files," Meta's filing said. Seeding refers to sharing a torrented file after the download completes, and because there's allegedly no proof of such "seeding," Meta insisted that authors cannot prove Meta shared the pirated books with anyone during the torrenting process.

Whether or not Meta actually seeded the pirated books could make a difference in a copyright lawsuit from book authors including Richard Kadrey, Sarah Silverman, and Ta-Nehisi Coates. Authors had previously alleged that Meta unlawfully copied and distributed their works through AI outputs, an increasingly common complaint that so far has barely been litigated.
But Meta's admission to torrenting appears to add a more straightforward claim of unlawful distribution of copyrighted works through illegal torrenting, which has long been considered established case law.

Authors have alleged that "Meta deliberately engaged in one of the largest data piracy campaigns in history to acquire text data for its LLM training datasets, torrenting and sharing dozens of terabytes of pirated data that altogether contain many millions of copyrighted works." Separate from their copyright infringement claims opposing Meta's AI training on pirated copies of their books, authors alleged that Meta torrenting the dataset was "independently illegal" under California's Computer Data Access and Fraud Act (CDAFA), which allegedly "prevents the unauthorized taking of data, including copyrighted works."

Meta, however, is hoping to convince the court that torrenting is not in and of itself illegal, but is, rather, a "widely-used protocol to download large files." According to Meta, the decision to download the pirated books dataset from pirate libraries like LibGen and Z-Library was simply a move to access "data from a 'well-known online repository' that was publicly available via torrents."

To defend its torrenting, Meta has basically scrubbed the word "pirate" from the characterization of its activity. The company alleges that authors can't claim that Meta gained unauthorized access to their data under CDAFA. Instead, all they can claim is that "Meta allegedly accessed and downloaded datasets that Plaintiffs did not create, containing the text of published books that anyone can read in a public library, from public websites Plaintiffs do not operate or own."

While Meta may claim there's no evidence of seeding, there is some testimony that might be compelling to the court.
Previously, a Meta executive in charge of project management, Michael Clark, had testified that Meta allegedly modified torrenting settings "so that the smallest amount of seeding possible could occur," which seems to support authors' claims that some seeding occurred. And an internal message from Meta researcher Frank Zhang appeared to show that Meta allegedly tried to conceal the seeding by not using Facebook servers while downloading the dataset to "avoid" the "risk" of anyone "tracing back the seeder/downloader" from Facebook servers. Once this information came to light, authors asked the court for a chance to depose Meta executives again, alleging that new facts "contradict prior deposition testimony."

Torrenting terminology may confuse court

How successful Meta's torrenting defense will be is still up in the air, but authors pointed out that even if Meta somehow managed to avoid seeding any of the torrented books, the social media giant still participated in an "online piracy ring." Further, in a footnote, the authors told the court that "IP pirates like Meta also upload or share files with others during (leeching) and after (seeding) downloading." Additionally, TorrentFreak noted that Meta "taking precautions is not the same as preventing" seeding.

Authors will likely push to persuade the court that merely by torrenting the file, Meta made "pirated works available to other users worldwide" while making it clear that even Meta can't claim to have prevented all seeding. A lawyer representing the authors declined to comment on whether ongoing discovery may surface more evidence to help prove the seeding claims.
Lack of evidence could be a problem since TorrentFreak suggested the torrenting terminology may be foreign to the court, potentially muddying what authors feel otherwise is a straightforward claim that Meta allegedly knew it was violating laws by torrenting the pirated books.

Meta has been silent so far on claims about sharing data while "leeching" (downloading) but told the court it plans to fight the seeding claims at summary judgment.

At this time, Meta has moved to dismiss the authors' CDAFA claim as being preempted by copyright law, but unsurprisingly, the authors told the court that they strongly disagree.

"Had Meta bought Plaintiffs' works in a bookstore or borrowed them from a library and then trained its LLMs on them without a license, it would have committed copyright infringement, but no CDAFA violation," the authors alleged. "Meta's decision to bypass lawful acquisition methods and become a knowing participant in an illegal peer-to-peer piracy network provides the 'extra element' and is 'qualitatively different' to establish an independent CDAFA violation."

Authors further linked the CDAFA claim to their copyright infringement claim opposing Meta's AI training. They alleged that by torrenting their works "from pirated databases in lieu of executing lawful licensing arrangements, Meta not only deprived Plaintiffs of that licensing revenue, but it also deprived Plaintiffs of additional revenue they could have generated from 'other users worldwide' because Meta simultaneously made the copyrighted works available to download by any interested Internet user in the process of acquiring Plaintiffs' data" to train AI.

Meta did not immediately respond to Ars' request for comment.

Ashley Belanger, Senior Policy Reporter
Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.