AI vs. Copyright: Media Takes OpenAI to Court

Initial Complaint

Earlier this year, Raw Story Media, Inc. and AlterNet Media, Inc. filed a lawsuit against OpenAI in the Southern District of New York, claiming that the company used their journalistic content to train ChatGPT without permission. The plaintiffs argued that this violated the U.S. Constitution’s Copyright Clause, which they maintained was designed to protect creative works, and that OpenAI’s approach not only disregarded those protections but also undermined their investments in quality journalism.

The complaint centered on a choice OpenAI faced in its AI training process: it could have preserved the copyright management information (CMI) attached to the works, as the Digital Millennium Copyright Act (DMCA) requires, or removed it entirely, generating outputs that could be seen as plagiarizing the original works without crediting their authors. Spoiler alert: it opted for the latter. One striking claim noted that “nearly 60% of the responses provided by Defendants’ GPT-3.5 product… contained some form of plagiarized content.” As a result, the plaintiffs sought significant compensation for the alleged infringements, either as statutory damages or as the full amount of their losses. On top of that, they sought an injunction barring OpenAI from using their copyrighted material in the future, emphasizing that their journalistic investments should be respected rather than diminished by AI practices.

OpenAI’s Defense

In response, OpenAI filed a motion to dismiss the lawsuit, contending that the plaintiffs had not shown a concrete injury in fact, an essential requirement for standing in federal court. The court looked to Article III of the U.S. Constitution, which defines the limits of federal jurisdiction, including the concept of “standing,” and made clear that the plaintiffs needed to demonstrate an injury that was “concrete, particularized, and actual or imminent.” To establish standing, the court explained, a plaintiff must show three elements: (1) an injury in fact, (2) a causal connection between the injury and the defendant’s conduct, and (3) a likelihood that a favorable judicial decision would remedy the injury.

The court also underscored the importance of a “close relationship” between the alleged injury and a harm that has traditionally served as grounds for lawsuits in American courts. It inquired, “What makes a harm concrete for the purposes of Article III?” suggesting that the plaintiffs had to identify a “close historical or common-law analogue” for their claimed injury.

Decision

Ultimately, District Judge Colleen McMahon concluded that the plaintiffs failed to demonstrate a concrete injury linked to the alleged removal of CMI. Because they did not plausibly allege that their copyrighted works were actually disseminated by OpenAI’s ChatGPT, the court dismissed both their claims for damages and their request for an injunction.

Future Implications

McMahon stated, “Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs’ articles seems remote. And while Plaintiffs provide third-party statistics indicating that an earlier version of ChatGPT generated responses containing significant amounts of plagiarized content, Plaintiffs have not plausibly alleged that there is ‘substantial risk’ that the current version of ChatGPT will generate a response plagiarizing one of Plaintiffs’ articles.”

This ruling could give OpenAI a significant advantage: it can continue incorporating training data without fear of repercussions so long as it can argue that the likelihood of generating plagiarized content is slim, or that the current iteration of ChatGPT has been updated. The decision may also shape whether publishers can establish standing to sue over AI training practices in the future.