Copyright Infringement for Artificial Intelligence Software

Artificial Intelligence in Copyright Infringement.

In Thomson Reuters v. Ross Intelligence, a federal judge denied summary judgment and ordered a jury trial on copyright infringement and fair use in the training of an AI system.

Ruling

In a landmark ruling on September 25, 2023, a federal judge in Delaware denied summary judgment in a dispute over a legal research software company’s use of copyrighted material to train its artificial intelligence system. The federal court ruled that it will be up to a jury to decide whether Ross Intelligence infringed Thomson Reuters’ copyrights by copying material from its legal research platform, Westlaw. The case has been closely watched by legal experts, as it could set a precedent for how copyright law applies to artificial intelligence (AI).

In his ruling, Judge Stephanos Bibas found that there were genuine factual disputes that could not be resolved on summary judgment. For example, he noted that the parties disagreed about the extent to which Ross Intelligence copied the headnotes and whether its use of the headnotes was transformative.

Judge Bibas also held that the fair use defense was a jury question. He explained that the four factors courts consider when evaluating fair use – the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for or value of the copyrighted work – are all factual questions.

Background

Thomson Reuters alleged that Ross Intelligence copied its copyrighted headnotes, which are summaries of legal concepts in court judgments. Ross Intelligence argued that it fairly used the headnotes to train its AI platform, which is used by lawyers to research legal issues.

Plaintiff owns the Westlaw database and organizes judicial opinions according to Westlaw’s “Key Number System,” adding headnotes that briefly summarize the relevant points of law that appear in each opinion. Plaintiff had a copyright registration for its Key Number System and headnotes. Defendant sought to create a “natural language search engine” using machine learning and artificial intelligence. In the court’s description of Ross’s system, “[u]sers would ask questions and its search engine would spit out quotations from judicial opinions,” with “no commentary necessary.”

To develop its AI system, Ross needed content to train it on. At first, it sought a license to use Westlaw, but Plaintiff refused. So Ross turned to a third-party contractor to create memos with answers to legal questions that a lawyer would ask. This “Bulk Memo Project” resulted in approximately 25,000 question-and-answer sets. The third-party contractor created the memos both manually and, for a time, with the help of a text-scraping bot.

The contractor also sent Ross a list of 91 legal topics from Westlaw’s Key Number System. Ross admits that it “considered” these topics when creating its own set of 38 topics for use in an experiment, but says it ultimately abandoned that project. Finally, the contractor sent Ross 500 judicial opinions, including Westlaw’s headnotes, key numbers, and other annotations. Ross claimed it did nothing with these opinions.

Thomson Reuters sued, contending that the questions in Ross’s Bulk Memo Project were nothing more than Westlaw headnotes with question marks at the end. Ross responded that the headnotes “influenced” the questions but that lawyers had ultimately drafted them instead of copying them. The parties brought a total of five motions and cross-motions for summary judgment, each addressed to discrete issues.

The jury trial is scheduled to begin in May 2024. The outcome of the case could have significant implications for the development and use of AI in the legal industry.

Copyright Infringement

In its summary judgment motion, Defendant argued that because Plaintiff had just one copyright registration comprising hundreds of thousands of headnotes, copying only a few thousand headnotes was not infringement. The court rejected this argument, noting that the copyright in a compilation extends to each of the copyrightable elements in that compilation. Because “[h]eadnotes are just short written works, authored by Thomson Reuters…, they could receive standalone, individual copyright protection.” However, the court found a genuine dispute of fact as to whether the headnotes follow the uncopyrightable judicial opinions so closely as to be unoriginal, a question that a jury would need to resolve.

On the issue of copying, the court held that Ross had copied portions of the Westlaw headnotes, both because Ross had admitted some copying and because Thomson Reuters had shown access and probative similarity to the database. However, Judge Bibas went on to hold that substantial similarity of protected expression was a question for the jury: he could not decide as a matter of law whether there were similarities in copyrightable expression, especially in light of conflicting expert testimony.

Fair Use

The parties brought cross-motions for summary judgment on fair use, which the court analyzed under the four statutory factors.

As to the first factor, the court found that Ross’s use was, as a matter of law, commercial. Refusing to “overread” Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, 143 S. Ct. 1258 (2023), however, the court nonetheless considered the question of “transformative use” highly relevant. Among other arguments, Ross contended that it had engaged only in intermediate copying to reverse engineer, which a number of previous decisions have found to be fair use. The court refused to apply a rigid rule that intermediate copying is always transformative, instead holding: “It was transformative intermediate copying if Ross’s AI only studied the language patterns in the headnotes to learn how to produce judicial opinion quotes.” Thus, the issue of transformative use was a question for the jury.

On the second fair use factor, the nature of the copyrighted work, the court held that because headnotes are not at the core of intended copyright protection, this factor tended to weigh in favor of fair use. However, the court also found this issue to be a jury question because of the uncertainty as to the headnotes’ originality. As to the third fair use factor, the amount and substantiality of the portion used, a disputed issue of fact existed because it was unclear how much of the copyrighted material Ross actually took.

Finally, on the fourth factor, harm to Thomson Reuters’ potential market, the court rejected Plaintiff’s argument that Ross was a direct competitor. Seizing on language from Google LLC v. Oracle Am., Inc., 141 S. Ct. 1183 (2021), Judge Bibas gave great weight to conflicting evidence as to whether Ross’s AI platform had a “public benefit.”

Implications for AI development

The Thomson Reuters v. Ross Intelligence case is one of the first major copyright lawsuits involving AI, and its outcome could shape how AI is developed and used in the legal industry.

If Thomson Reuters is successful, it could become more difficult for AI companies to use copyrighted materials to train their AI systems. This could slow the development of AI in the legal industry and make it more expensive for AI companies to develop and maintain their products.

On the other hand, if Ross Intelligence is successful, it could send a message that AI companies can fairly use copyrighted materials to train their AI systems. This could accelerate the development of AI in the legal industry and make AI products more affordable for lawyers and other users.

The case is also being watched closely by other industries that are developing and using AI. The outcome of the case could have implications for AI development and use in a wide range of industries.

Conclusion

The Thomson Reuters v. Ross Intelligence case is a landmark case that could have a significant impact on the development and use of AI in the legal industry and other industries. The outcome of the case will be closely watched by legal experts and businesses alike.