Raw Story Media v. OpenAI, Inc. was a case filed in the Southern District of New York involving the generative artificial intelligence app ChatGPT. The court dismissed the case before the parties even began discovery.
A group of news media companies sued OpenAI and its affiliated entities, the makers of ChatGPT. The organizations alleged that thousands of their journalistic works were scraped from the internet, stripped of their copyright management information (CMI), and incorporated into at least three of OpenAI's training sets.
Plaintiffs contended that their articles had been downloaded and incorporated into ChatGPT's database with their CMI removed. CMI, as defined in the Digital Millennium Copyright Act (DMCA), includes identifying information such as 1) the author, 2) the title, and 3) the notice of copyright.
Defendants moved to dismiss the plaintiffs' complaint for lack of Article III standing. "Standing" addresses whether a plaintiff has the right to bring a lawsuit. The court noted that under Article III of the U.S. Constitution (the article that establishes the federal judiciary), a plaintiff has standing only if there is a concrete injury, even when the plaintiff's claim rests on a statutory violation (here, an alleged violation of the DMCA). It's not enough to say that someone's rights have been violated; the plaintiff must show an actual, not theoretical, injury.
Without standing, the court wouldn’t have subject matter jurisdiction over the case and would not be able to adjudicate the dispute.
The plaintiffs admitted that they didn't know whether a copy of their work had actually been disseminated by ChatGPT. Despite that problem, they tried to draw analogies to buttress their contention that the unauthorized removal of CMI from their work gave rise to a concrete injury. They claimed that the removal of CMI was closely related to copyright infringement and that, at common law, interference with a property interest is itself a concrete injury. (A mere trespass, for example, is enough to establish an injury.)
The court didn't buy any of the plaintiffs' analogies. It explained that section 1202 of the DMCA, the provision that protects an author against removal of CMI, protects only against that removal itself, and the mere removal of identifying information has no close historical analogue in the common law.
Most importantly, the plaintiffs did not allege any actual adverse effects stemming from the alleged violation. There was no concrete harm, and therefore no standing, because the plaintiffs couldn't point to any instance in which their content was actually used by ChatGPT. Absent a concrete injury, the plaintiffs lacked Article III standing, so the damages portion of the case was dismissed.
The media companies also sought injunctive relief: an order requiring the defendants to remove all copies of the plaintiffs' copyrighted works from which the CMI had been removed, claiming there was a substantial risk that ChatGPT would reproduce their work in the future.
The defendants successfully countered that there were insufficient facts to support the plaintiffs' claim of such a risk. The ChatGPT companies argued that scraping the internet captures massive amounts of information from innumerable sources on almost any given subject, and reiterated that facts and information are not protectable. At best, the court found, ChatGPT would synthesize the information in its repository into a response, but "Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of the plaintiffs' articles seems remote."
Although there are third-party statistics indicating that earlier versions of ChatGPT generated responses that contained significant amounts of plagiarized content, the plaintiffs did not plausibly show that there was a substantial risk that the current version of ChatGPT would generate a plagiarized response from any of the plaintiffs’ articles.
The judge granted dismissal with leave to file an amended complaint but expressed doubt that the plaintiffs would be able to bring a damages claim that could survive the court's analysis. The judge noted that rather than seeking to redress the alleged injury (the removal of CMI), what the plaintiffs really wanted was to prevent the defendants from using their articles to develop ChatGPT without compensating them. That type of injury, if properly pleaded, might satisfy the injury requirement.
This case is further evidence that protecting your content is a full-time job. And according to this trial court, a plaintiff can't protect their work from generative AI solely by claiming that the voracious software removed the plaintiff's authorship and other identifying information from the work. Depending on which side of the content-generation divide you sit on, this case was either a victory for computer-generated progress or scary stuff for content providers.