What Issues Arise When AI Uses Copyrighted Works?

Author: Scarinci Hollenbeck, LLC

Date: September 11, 2023


Questions surrounding artificial intelligence (AI) and copyright are evolving quickly. The key issues involve content produced by “generative AI” computer programs (discussed below): whether that content is entitled to copyright protection, and whether training and using these programs infringes existing copyrights.

Stand-up comedian Sarah Silverman is one of many content creators who have filed lawsuits alleging that AI platforms were trained on their copyrighted works without authorization or license from the rights holders. Silverman, along with authors Christopher Golden and Richard Kadrey, contends that defendants OpenAI and Meta Platforms copied the authors’ published books to train their AI products ChatGPT and LLaMA “without consent, without credit, and without compensation.”

How Generative AI Works

OpenAI and Meta Platforms both offer AI software products known as large language models (LLMs). Rather than being programmed by software engineers, large language models are “trained” by copying massive amounts of text and extracting expressive information from that text. As the U.S. Patent and Trademark Office (USPTO) has described, this process “will almost by definition involve the reproduction of entire works or substantial portions thereof.” OpenAI, for example, acknowledges that its programs are trained on “large, publicly available datasets that include copyrighted works” and that this process “necessarily involves first making copies of the data to be analyzed.”

Once properly “trained,” platforms like ChatGPT and LLaMA allow users to enter text prompts. The AI platforms then attempt to respond with a coherent and fluent response that closely mimics human language. To produce text outputs, LLMs rely on information extracted from their training datasets, along with patterns and connections drawn from the data. For example, if an LLM is prompted to generate writing in the style of a certain author, the LLM would construct and generate content based on patterns and connections it learned from analysis of that author’s work within its training data. Importantly, a user can also ask ChatGPT or LLaMA to summarize a copyrighted book, and the programs will do so based on the training data they have acquired.
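To make the idea of “patterns and connections drawn from the data” more concrete, the following is a deliberately simplified sketch. It is not how ChatGPT or LLaMA actually work; those systems use large neural networks, while this toy model merely counts which word follows which. The function names and the sample sentence are hypothetical and chosen only for illustration. The point it shows is that the program must first read (that is, copy) the training text, and that its output is then driven entirely by what it recorded from that text.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Read the training text and count how often each word follows another."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        model[current_word][next_word] += 1
    return model


def generate(model, prompt_word, length=5):
    """Extend a one-word prompt by repeatedly choosing the most frequent follower."""
    output = [prompt_word]
    for _ in range(length):
        followers = model.get(output[-1])
        if not followers:
            # The model never saw anything after this word, so it stops.
            break
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical training text standing in for a much larger dataset.
training_text = "the cat sat on the mat and the cat sat on the rug"
model = train_bigram_model(training_text)
result = generate(model, "cat", length=3)
print(result)  # cat sat on the
```

Even in this toy version, the generated phrase can only be assembled from word sequences that appeared in the training text, which is the intuition behind the plaintiffs’ argument that accurate output about a work implies the work was in the training data.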

Copyright Infringement Lawsuits Against AI Platforms

In the lawsuits, Plaintiffs Silverman, Golden, and Kadrey maintain that they did not consent to the use of their copyrighted books as training material for ChatGPT or LLaMA. They further allege that the LLMs are themselves infringing derivative works, made without the plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act.

According to their complaint, ChatGPT provided accurate summaries of the plaintiffs’ books when prompted, which demonstrates that the program was trained using their copyrighted works. “Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works—something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works,” their complaint against OpenAI states. The suit further alleges that “at no point did ChatGPT reproduce any of the copyright management information Plaintiffs included with their published works.”

Both suits were filed in federal district court in California and seek class-action status. They allege claims of copyright infringement and violations of section 1202(b) of the Digital Millennium Copyright Act (DMCA), as well as common law claims of unjust enrichment, unfair competition, and negligence. For example, the lawsuit against Meta argues that the company “breached its duties by negligently, carelessly, and recklessly collecting, maintaining and controlling [theirs] and [others’] infringed works and engineering, designing, maintaining and controlling systems – including LLaMA – which are trained on [theirs] and [others’] infringed Works without their authorization.”

While OpenAI and Meta Platforms have not yet officially responded to the lawsuits, the AI platforms will likely raise a fair use defense. As discussed in prior articles, fair use is determined on a case-by-case basis and requires evaluation of the following four factors:

1. The purpose and character of the use (including whether it is transformative, commercial, non-profit, or educational);
2. The nature of the copyrighted work;
3. The amount and substantiality of the portion used; and
4. The effect of the use upon the potential market for the copyrighted work.

In a recent report, the Congressional Research Service noted that AI companies have previously argued that their training processes constitute fair use and are therefore non-infringing, writing:

Some stakeholders argue that the use of copyrighted works to train AI programs should be considered a fair use under these factors. Regarding the first factor, OpenAI argues its purpose is “transformative” as opposed to “expressive” because the training process creates “a useful generative AI system.” OpenAI also contends that the third factor supports fair use because the copies are not made available to the public but are used only to train the program. For support, OpenAI cites The Authors Guild, Inc. v. Google, Inc., in which the U.S. Court of Appeals for the Second Circuit held that Google’s copying of entire books to create a searchable database that displayed excerpts of those books constituted fair use.

Of course, fair use analysis requires courts to weigh all four fair use factors, and the plaintiffs will likely contend several factors tip the scale in their favor. For example, they may argue that ChatGPT and LLaMA are commercial products, which weighs against fair use under the first statutory factor. They may also argue that by providing summaries of the books, the programs undermine the market for the original works, weighing against fair use under the fourth factor.

Key Takeaway

Artificial intelligence, particularly generative AI, raises novel and complex copyright issues. In addition to the question of whether generative AI programs infringe copyrights in existing works, the availability of copyright protection for AI-generated works also remains unsettled. Because cases involving generative AI are in their infancy, we are unlikely to find answers to many of these copyright issues in the short term. In the meantime, this area of copyright law warrants close monitoring by content owners as well as AI platform creators and users, and Scarinci Hollenbeck remains at the forefront of this issue.

If you have questions, please contact us

If you have any questions or if you would like to discuss the matter further, please contact me, Albert J. Soler, or the Scarinci Hollenbeck attorney with whom you work, at 201-896-4100.

Practices: Intellectual Property, Copyright
Locations: Red Bank, NJ, New York City, Little Falls, NJ

No aspect of the advertisement has been approved by the Supreme Court. Results may vary depending on your particular facts and legal circumstances.
