Demystifying licensing debates: Should GenAI developers pay to train their models on copyright protected content?

Generative AI (GenAI) models rely on vast amounts of original content for training, raising fundamental questions about licensing and intellectual property rights. In this article, Jorge Padilla and Kadambari Prasad [1] draw on their experience analysing licensing disputes across industries to break down the debate into distinct questions that parties and policy makers must address. They argue that while the principles guiding these debates are well established, the answers vary depending on market dynamics, making a clear economic framework essential for navigating the discussion.
This paper is based on ongoing research. The views expressed in this paper are the views of the authors only and do not necessarily represent the views of Compass Lexecon, its management, its subsidiaries, its affiliates, its employees or its clients.
Introduction
Consumers already flock to content provided by generative AI (“GenAI”) models – such as the GPT series (OpenAI), Gemini (Google DeepMind), the Claude series (Anthropic) and LLaMA (Meta). Nonetheless, this is a nascent industry; the value of the content that GenAI will provide in the future is expected to soar further, which has contributed in part to the dramatic rise in the stock prices of the “Magnificent Seven”: Nvidia, Tesla, Amazon, Apple, Alphabet/Google, Microsoft and Meta.[2]
This content did not emerge from a vacuum. Developers train their models on the original content that others create – whether that is text, audio, images or data. That training allows the models to “learn” the patterns and structures they require to produce new content. So, the value of what GenAI provides to consumers partly depends on the value of the original content that others provide to it.
Although some of the original content that GenAI learns from is freely available, much of it is not. Often high-quality original content is proprietary, protected by intellectual property rights such as copyright. Currently, there is an intense debate about whether those property rights affect the developers of GenAI, and if so, in what way.
The issues in this licensing debate are complex but they are not new. Although they can be difficult to unpick when looking at GenAI in isolation, the same questions and issues appear in the licensing disputes that occur in other industries, including those between the innovators and implementers of wireless communication technology, which is protected by standard essential patents (“SEPs”);[3] news publishers and operators of search engines or social media platforms;[4] and between telecommunications operators and large traffic originators such as Netflix, YouTube or TikTok.[5]
In this article, we draw on our experience analysing licensing disputes in the context of cellular SEPs to demystify the debate about licensing copyrighted material to the developers of GenAI models. We do so by separating “the debate” into distinct questions that parties and policy makers must address.
Although the questions are common to both these sectors, the answers vary. At first glance, that may appear troubling, but it isn’t. The economic principles that one should apply to the facts are the same in each case. But the conclusions that one reaches from applying those principles vary, as the facts and circumstances that one applies vary. It is critical to consistently apply the relevant principles to the facts, whatever they are. It is an error to consistently reach the same conclusion, regardless of what the facts are.
Question 1: do GenAI developers need to compensate original content owners for using their property?
Whether or not the users of intellectual property must pay its owners is a legal question, not an economic one. For that reason, the main arguments currently being made against licensing in the context of GenAI and copyright protected content are formalistic.[6]
Economics, however, can explain the situations where the users of intellectual property should pay to use it, on the basis that it would be beneficial for them to do so, and conversely, that it would be harmful, for the parties, the market and for consumers, if they did not.
The “free-rider” debate – what incentives do content providers need to create and provide access to their content?
In general, paying to use intellectual property is beneficial for parties and consumers in two circumstances:
- Using the intellectual property adds value to the user’s product; and
- Paying for that use incentivises parties to develop and supply intellectual property where it adds value.
Circumstance 1: when does intellectual property add value to a user’s product?
The fact that someone uses intellectual property strongly indicates that doing so adds value to their product or service, making it more attractive to consumers than it would be without it.
For example, this is clear in the case of licences for the SEPs that protect cellular wireless communication technology. Consumers value a phone with fast download speeds more than they would value the same phone with slower speeds. Therefore, phone manufacturers benefit from using the latest licensed cellular technology because it provides the enhanced functionality that increases the value of their phones, the amount consumers are willing to pay for them, and phone manufacturers’ profits.[7]
In the specific case of GenAI, it should be uncontroversial that training models on original content increases the quality and value of the new content that those models can produce. If developers were unable to access that original content, their models would have inferior training and – much like a poorly trained human – the content they produce would be less valuable to consumers. That is why developers train their models on copyrighted material; if it didn’t add any value, they wouldn’t bother.
Circumstance 2: why does “free-riding” damage incentives to provide value for others?
Our default position should be that when users pay licence fees, they both incentivise the development of intellectual property that adds value and incentivise owners to share their property where it adds value.
This basic point is crucial in licensing debates, as the prospect of “free-riding” threatens incentives to supply. Developing valuable intellectual property is hard, risky and expensive. But using intellectual property that someone else has already developed is relatively easy, cheap and difficult to prevent. That is why intellectual property rights were created: to incentivise people to develop valuable ideas, by assuring them that the law would protect them from free-riding.
The threat of free-riding is clear in technology sectors, where patent-protected innovations can take decades, billions of dollars, and many failed attempts to invent. But it is also true in the context of creative industries, as the investment required to develop successful content can be upfront, risky and sunk. Creative endeavours do not have any guarantees of success before they are offered to the public. Indeed, many of them do not earn any revenues. Therefore, content creators (or, rather, the companies that invest in developing those creators) have to be given the assurance that they will be able to protect their successful investments from free-riders. Copyright is the intellectual property right that provides them with that assurance and incentive.
Are there exceptions that make free-riding permissible in the case of GenAI?
Intuitively then, we would expect it to be beneficial for developers to pay for the original content they use to train their GenAI models. Supporting that intuition, some of the economic literature has formally modelled the logic behind it, demonstrating that licensing negotiations would create the strongest incentives for producers of original content to invest.[8]
Despite that, the importance of incentivising investment in developing intellectual property often gets obscured in licensing debates. Typically, the difficulty people have is that the intellectual property being discussed has already been successfully brought to market at the point the terms of any licence would be negotiated. That means the incentive that was required to generate that property gets undervalued, if not taken for granted entirely. We have seen this mistake in the telecommunications industry,[9] and can already see a similar mistake in logic appearing in the debate about licensing copyrighted material to GenAI developers.
The main economic argument made in favour of allowing “free use” is that licensing revenues from GenAI developers are unnecessary to incentivise the development of content protected by copyright. The reasoning is this: the original content that currently exists was developed before producers anticipated its use by GenAI developers, and so those pre-existing revenue streams are already sufficient to develop original content. Any further payment for this particular and additional use of that content is not required, the argument claims; it would simply increase costs for GenAI developers and provide an unanticipated windfall to copyright owners.[10]
This argument is flawed because it ignores the fact that prices provide forward-looking incentives, i.e., payments that content providers receive today, set their incentives to develop and provide access to the content that users, including GenAI models, will need tomorrow.
The importance of forward-looking incentives manifests in two ways.
First, incentives are not binary. They can be too low, which leads to too little innovation, or too high, which leads to too much, beyond what consumers would find valuable. If GenAI offers no remuneration for the value it receives, then it provides no specific or additional incentive to serve that particular use of original content. It simply free-rides on the spillover effect that payments by others provide for the specific value they receive. Even if we assume that those legacy revenue streams remain, then (a) there is no reason why latecomers should get a free ride while others continue to pay their way, and (b) content providers will be undercompensated compared with the total value their content provides, meaning they will underinvest in developing new content.
Second, forward-looking incentives are resilient to changing circumstances, as prices reward content wherever it adds value and in proportion to the value it adds. With that approach, incentives remain as long as content adds value somewhere, even if its traditional applications and revenue streams erode. This is particularly important in the case of copyright material, as it is likely that content created by GenAI will start to compete with the traditional revenue streams that currently incentivise creators to develop original content. Therefore, the traditional revenue streams that incentivised the original content that GenAI developers used to train their current models may not be sufficient to incentivise the development of the content they will need to train their models on in future.
In contrast, if markets operate under the basic principle that each party that uses intellectual property will pay for the benefits they receive, in proportion to those benefits, then the incentives in that market should be both sufficient and flexible. Parties will develop content that they expect to add value, and they will provide access to it where it adds value.
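The underinvestment point can be illustrated with a stylised sketch. It assumes, purely for illustration, that content produced with investment x is worth V(x) = 2·sqrt(x) to all of its users combined, that the investment costs x, and that the creator captures only a share of that value. The functional form, the shares and the function names are hypothetical; they are not taken from the literature cited in this article.

```python
import math


def content_value(investment: float) -> float:
    """Assumed total value of the content to all users: V(x) = 2 * sqrt(x)."""
    return 2.0 * math.sqrt(investment)


def chosen_investment(captured_share: float) -> float:
    """Investment that maximises captured_share * V(x) - x for the creator.

    The first-order condition captured_share / sqrt(x) = 1 gives
    x = captured_share ** 2.
    """
    if not 0.0 <= captured_share <= 1.0:
        raise ValueError("captured_share must lie between 0 and 1")
    return captured_share ** 2


# Hypothetical shares: 1.0 means every user pays in proportion to its benefit;
# lower shares stand for uses (such as model training) that pay nothing.
for share in (1.0, 0.75, 0.5):
    x = chosen_investment(share)
    print(f"share={share:.2f} investment={x:.4f} total_value={content_value(x):.4f}")
```

Under these assumptions, a creator who captures half of the value its content provides invests a quarter of what it would invest if every user paid in proportion to its benefit, and the total value created halves.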
Question 2: Should the terms of access to intellectual property be left to market forces?
The answer to our first question is usually straightforward: users of intellectual property should pay for the benefits it provides them with. This second question is often more complicated: should the terms for access to intellectual property be left to market forces, or should the rights of licensors or licensees be constrained in some ways to achieve better outcomes?
When can market forces determine reasonable incentives for themselves?
Typically, parties can be left to determine reasonable and mutually beneficial licences for themselves, but not always. If there is a market failure, one party may have excessive leverage, allowing them to extract terms that favour themselves, but harm incentives to either develop or use intellectual property, which ultimately hurts consumers. Economics, however, helps identify those situations and choose the appropriate level and form of intervention.
It is uncontroversial that market transactions between willing buyers and willing sellers, each with competitive alternatives to an agreement, typically lead to outcomes that give both the right incentives to participate, which aligns with the market’s long-term interests.
Parties can typically determine by themselves a price that reasonably balances their incentives to cooperate with each other. For three reasons, that may be challenging in the case of GenAI. Firstly, the precise value that the content adds to any particular use case is difficult to determine. Secondly, GenAI has many different use cases, and copyright-protected material no doubt adds more value to some than to others. Thirdly, GenAI is a nascent industry where the range of use cases is changing and expanding.
Nonetheless, with clear property rights, limited transaction costs, and symmetric information about the value the content adds (even if that information is uncertain), parties will agree mutually beneficial terms in the light of their competitive alternatives.[11] That is because, ultimately, it is in the interests of both sides to agree a price that incentivises the other to contribute, rather than select one of its alternatives to the agreement. If they do not, they will both lose out in the long run.
However, that scenario breaks down where there are market failures, such as externalities, a lack of alternatives (which creates excessive market power), information asymmetry or excessive transaction costs. In the presence of failures, it may be better to constrain the behaviour of licensors or licensees. Two common questions raised in such cases are:
- should licensing be mandatory; and
- should licensors or licensees be mandated to negotiate access terms collectively?
We address these in turn below.
The “mandatory licence” debate – when should (licensed) access to copyrighted material be mandatory?
The essence of intellectual property is that it provides the owner with the right to exclude others from using it without consent. However, where that right gives a licensor enough leverage to extort potential users of its property, it is typically constrained by a mandatory licence requirement. The licensor may receive reasonable compensation in return for providing access to its property, but it may not refuse to grant access altogether.
Whether or not that right should be protected or constrained depends on how that would affect the parties’ incentives to agree mutually beneficial terms. Ultimately, ensuring balance between their bargaining power is what matters. Depending on the circumstances, mandatory licences can either correct a power imbalance or introduce one.
When is the right to exclude potential users useful?
The purpose of the right to exclude is not to restrict the use of intellectual property. On the contrary, it is intended to encourage both the creation and proliferation of intellectual property – it achieves this by granting the owner of intellectual property the same leverage that a supplier of physical property would have in a negotiation: the ability to restrict supply until the parties agree mutually beneficial terms.
Unlike physical property, intellectual property is not inherently excludable. Unless a court enforces intellectual property rights, parties negotiate a price while the user already enjoys the benefits of using it. This is problematic because there is little to keep the user honest in a negotiation. If it has unconstrained access to the benefits of using the property whether it has agreed to mutually acceptable terms or not, then its only incentive is to pay as little as possible for them, if it pays at all. The negotiation is imbalanced.
In contrast, when the IP owner can exclude the user, such that neither party can benefit until an agreement is reached, then they both have an incentive to agree mutually beneficial terms as soon as possible. The negotiation is less imbalanced.
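The contrast between the two negotiations above can be sketched with a simple bargaining model. The sketch assumes symmetric Nash bargaining, in which each side receives what it would keep without an agreement plus half of the surplus that agreement creates. That model, the figure of 10 and the function name are illustrative assumptions, not part of the argument itself.

```python
def nash_bargaining_price(value: float, user_fallback: float,
                          owner_fallback: float = 0.0) -> float:
    """Licence fee under symmetric Nash bargaining (an assumed model).

    value          -- benefit the user gets from the property under a licence
    user_fallback  -- benefit the user keeps if no agreement is reached
    owner_fallback -- payoff the owner keeps if no agreement is reached
    """
    # Surplus created by agreeing, over and above both fallbacks.
    surplus = max(value - user_fallback - owner_fallback, 0.0)
    # Each side receives its fallback plus half of the surplus; the owner's
    # payoff is the fee itself.
    return owner_fallback + surplus / 2.0


VALUE = 10.0  # hypothetical benefit of using the content

# Owner can exclude: the user gets nothing until terms are agreed.
fee_with_exclusion = nash_bargaining_price(VALUE, user_fallback=0.0)

# Owner cannot exclude: the user enjoys the full benefit either way.
fee_without_exclusion = nash_bargaining_price(VALUE, user_fallback=VALUE)

print(fee_with_exclusion, fee_without_exclusion)
```

In this sketch the fee falls from half of the value to zero once the user keeps the full benefit regardless of the outcome, which is the imbalance that an enforceable right to exclude is meant to remove.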
When is restricting the right to exclude potential users useful?
The main circumstance where mandatory licences are necessary is when users have no alternative to using the intellectual property because it is an “essential input”. In that case, the user has no ability to walk away from unfavourable or exploitative terms. In this scenario, permitting the licensor’s right to exclude would give it excessive leverage in the negotiation – meaning that it could impose a price that exceeds the value of the benefits that access to its property provides to the user.
We can see the need for mandatory licences in the cellular communication industry. There, the technology is an essential input, not because the technology itself is irreplaceable, but because it is adopted as a market-wide standard. Once any particular technology is adopted, all alternative technologies that could have been implemented instead cease to be viable. Crucially, the implementers of the standard are “irreversibly committed” to taking a licence before they agree the specific terms of that licence. So, they may have to accept a licence on terms they would not have agreed to beforehand, or worse, be excluded if the licensor also competes with them downstream and seeks to raise its rivals’ costs.
This risk is known as “the hold-up problem”. The concern is not only that implementers can be held to ransom or excluded by extortionate rates; it is also that no company with foresight would put itself in such a position and so a competitive market fails to emerge in the first place.
To address this risk, licensors of SEPs are typically required to commit to guaranteeing access to their technology on Fair, Reasonable, and Non-Discriminatory (“FRAND”) terms. In the context of an essential input, therefore, a mandatory licence balances parties’ leverage, because it places them in the same position: neither side can walk away.
There are no market failures in training GenAI that would justify mandatory licensing
The circumstances between GenAI developers and the owners of copyright protected content appear to be importantly different from the situation for SEPs.
Original content is not an essential input to which GenAI developers are irreversibly committed. Developers can walk away from licensors that demand excessive terms, either in favour of alternative providers of content or by opting not to use copyright-protected content at all. As such, licensors lack the ability to impose terms that exceed the value their content provides to users. In this context, therefore, there is no excessive bargaining power that mandatory licensing would be required to address.
Furthermore, in GenAI licensing, there is a risk that mandatory licensing would exacerbate an imbalance in negotiation power, not correct it. Mandatory licensing invariably weakens the power of the licensor, leading to lower rates in subsequent agreements. Yet, given who the companies developing GenAI models are, broadly speaking they already have a lot of leverage in negotiations to prevent excessive rates.
The “licensing pool” debate – should the terms of licensed access be granted collectively or bilaterally?
The second concern relates to who the licence negotiations should be between. Licensing is typically bilateral: between a single company that owns intellectual property and a single company that wants to use it. In that situation, the two parties negotiate the scope and terms of the licence for themselves. However, intellectual property owners can also license collectively, forming a single “pool” that parties can license from, and users of intellectual property can also license collectively from a single licensor.
However, the crucial issue is not which is preferable – they are each preferable in different circumstances. The issue is whether collective licensing should be required, even if owners of the intellectual property prefer to license bilaterally.
When is each approach preferable?
Bilateral negotiations have a clear benefit that should not be given up lightly: they allow parties the flexibility to tailor the scope of a licence, and its terms and conditions, to their specific circumstances. This is particularly useful where the value of the licensed content, and the interests of the parties themselves, differ between negotiations. It matters less where the quality of the licensed material, the value it provides to licensees, and the parties’ circumstances and requirements tend to be similar in each case.
We can already see that tailoring contracts to individuals’ circumstances would be desirable in the context of licensing copyright to the developers of GenAI models. Copyright holders, and the quality of content, differ substantially from each other. As such, the terms that one of them considers reasonable will not necessarily suit another.
The main appeal of collective licensing is that it reduces transaction costs. In the context of GenAI training, proponents of collective licensing worry that transaction costs would be prohibitively high if developers had to agree a licence with each and every copyright holder individually – which might constitute thousands of separate negotiations.[12] In that case, pooling the intellectual property and licensing it collectively reduces the transaction costs and makes access to it more affordable.
When can parties decide for themselves whether to license bilaterally?
Typically, bilateral licensing is the default, but parties will choose to license collectively if it is in their interests to do so. They do not need to be compelled. For example, small independent copyright holders agree to license collectively because it is in their interests to reduce transaction costs that would otherwise deter potential licensees. In contrast, large copyright holders tend to license bilaterally, as they have sufficient scale to get the benefits of tailoring licences without excessive drag from transaction costs.
Therefore, absent a market failure, we should expect companies to voluntarily adopt the best approach for their circumstances, where they are left free to do so.
When should collective licensing be mandatory?
The only reason for collective licensing to be compulsory is where it addresses a market failure. This, for instance, is a risk where intellectual property is not only fragmented, but also where:
- each fragment of the intellectual property is an essential input; and
- the licences to each one are complements – meaning the demand for one fragment falls as the price of another increases (as opposed to competing substitutes, where the demand for one rises as the price of a rival fragment increases).
This problem with essential complements is clear with licences for SEPs. First, as described above, implementers of a technology standard – such as 5G or Wi-Fi – require every patented technology that is essential to the standard. Second, the patents are strict complements, rather than substitutes. The result is that there is a risk of “royalty stacking”, where the sum of the prices that each patent owner seeks individually will add up to more than even an aggressive monopolist would seek if it licensed all the complements together. In an extreme case, the sum of the bilateral negotiations may exceed a price that any licensee can afford to pay, resulting in market failure.
In that case, the licensors should form a pool. By licensing the complements collectively, they avoid charging more than licensees can afford to pay. Despite that risk, however, collective licensing is not compulsory in the case of SEPs: the threat of royalty stacking is not so great that there is a market failure. Cellular SEP owners choose to collectivise in some markets – such as the automotive industry [13] – but not in others. Licensing in the smartphone market, for example, is mostly bilateral. The decision depends on how the benefits of tailoring and reducing transaction costs interact in each specific market.
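The royalty stacking logic can be made concrete with a textbook complements-pricing sketch. It assumes, for illustration only, that the number of licensees falls linearly with the total royalty (quantity = intercept minus total royalty), that licensors have no marginal costs, and that every licensor's patents are strictly essential. The demand form, the numbers and the function names are hypothetical.

```python
def best_response(demand_intercept: float, others_total: float) -> float:
    """Royalty r that maximises r * (demand_intercept - r - others_total)."""
    return max((demand_intercept - others_total) / 2.0, 0.0)


def independent_total_royalty(demand_intercept: float, n_licensors: int) -> float:
    """Sum of royalties when each licensor prices on its own.

    In the symmetric equilibrium every licensor charges
    demand_intercept / (n_licensors + 1), which is a best response to the
    royalties charged by the other n_licensors - 1 licensors.
    """
    if n_licensors < 1:
        raise ValueError("n_licensors must be at least 1")
    per_licensor = demand_intercept / (n_licensors + 1)
    return n_licensors * per_licensor


def pooled_total_royalty(demand_intercept: float) -> float:
    """Total royalty a single pool (a monopolist over all patents) would set."""
    return best_response(demand_intercept, 0.0)


INTERCEPT = 100.0  # hypothetical royalty level at which demand reaches zero
for n in (1, 2, 4, 9):
    stacked = independent_total_royalty(INTERCEPT, n)
    pooled = pooled_total_royalty(INTERCEPT)
    print(f"licensors={n} stacked={stacked:.1f} pooled={pooled:.1f} "
          f"licensees_served={INTERCEPT - stacked:.1f}")
```

With more than one independent licensor, the stacked total exceeds the pool's price and fewer licensees are served, because each licensor ignores the effect of its own royalty on the demand for the others' patents.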
There are no market failures in GenAI that would justify mandatory collective licensing
In the specific context of training GenAI models, there is no market failure to address.
“Royalty stacking” is not even a potential concern. Firstly, as described above, developers of GenAI do not require a licence from all copyright holders. Each portfolio of content would be separately useful and add value to a GenAI model, but no portfolio of content is essential. Secondly, copyright holders’ portfolios are not complements; they are competing substitutes. If a content creator demands terms that a developer is not willing to pay, then it can choose to take a licence from a rival content creator in the same industry, as neither has a monopoly on the patterns that developers need their GenAI model to identify and learn. In turn, that competition should encourage each rival to reduce its demands to a reasonable level.
Further, compelling the providers of substitutes to license collectively may harm developers more than it helps them. Even if it reduces transaction costs, it also removes the competitive tension between competing licensors, grouping them into a monopoly. In the absence of any clear benefits, mandating collective licensing could therefore represent a type of anti-competitive infringement that is prohibited under Article 101 of the Treaty on the Functioning of the European Union.[14] In the context of SEPs, collective bargaining would address the problem of royalty stacking, and not reduce licensees’ bargaining power as they already have no alternative to taking a licence for each complementary SEP portfolio. In the context of GenAI developers, mandatory collective licensing has no market failure to address, and it may introduce a problem where none previously existed, reducing developers’ ability to counter terms they are unwilling to pay.
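The effect of competition between substitute portfolios can be sketched as a ceiling on what any one licensor can charge. The sketch assumes, for illustration only, that a developer needs just one portfolio, that it picks whichever leaves it the larger net benefit, and that a compulsory pool leaves it with no alternative outside the pool. The values and function names are hypothetical.

```python
from typing import Sequence


def price_ceiling_with_rival(own_value: float, rival_value: float,
                             rival_price: float) -> float:
    """Highest fee a licensor can charge before the developer switches.

    The developer stays only if own_value - fee is at least the net benefit
    it would get from the rival portfolio.
    """
    rival_net_benefit = max(rival_value - rival_price, 0.0)
    return max(own_value - rival_net_benefit, 0.0)


def price_ceiling_for_pool(portfolio_values: Sequence[float]) -> float:
    """Highest fee a compulsory pool of substitutes can charge.

    With no alternative outside the pool, the developer will pay up to the
    value of the best portfolio in it.
    """
    return max(portfolio_values)


# Two hypothetical substitute portfolios worth 10 and 9 to a developer.
competitive_ceiling = price_ceiling_with_rival(10.0, 9.0, rival_price=0.0)
pooled_ceiling = price_ceiling_for_pool([10.0, 9.0])
print(competitive_ceiling, pooled_ceiling)
```

Under these assumptions, rivalry limits the stronger licensor to the incremental value of its content over the alternative, whereas pooling the two substitutes raises the ceiling to the full value of the better portfolio.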
Summary
Whatever the industry, it is very rare that parties and consumers are better off when the users of intellectual property don’t pay for the benefits that it enables and provides. That is true in industries that depend on innovators to develop and share technology protected by patents. It is also true in industries that depend on creative industries to produce and share content protected by copyright. Relying on a “free lunch” is not a good thing; it means there will not be enough food.
Once that point is accepted, licensing debates get more complicated. Each of the debates discussed here matters. Coming to the wrong conclusion might benefit a particular party in the short term but ultimately harms the users and providers of intellectual property in the long term, as well as the consumers of the products that depend on that intellectual property.
Although the answers to those questions may differ in each market, the economic principles that determine those answers are the same: essentially, it is the value created for consumers, and balancing parties’ incentives to participate in creating that value, that matter.
How that is best achieved will vary. But unless there is a market failure that fundamentally distorts the parties’ relative bargaining power, they will typically work out the best approach in their circumstances for themselves.
References
[1] Jorge Padilla is a Senior Managing Director at the economic consultancy Compass Lexecon and a Senior Fellow of the GW Innovation and Competition Lab, George Washington University, and CEMFI in Madrid. Kadambari Prasad is a Vice President at Compass Lexecon. Jorge Padilla and Kadambari Prasad represent clients who have an interest in the issues discussed in this paper. This paper is based on ongoing research. It does not necessarily represent the views of Compass Lexecon or Compass Lexecon’s clients. Jorge Padilla and Kadambari Prasad thank the Compass Lexecon Research Team for their comments and support preparing this paper.
[2] See, for example, Sebastian, A. (2024). The Magnificent Seven stocks: still a great opportunity or overpriced and set to fall? The Times. https://www.thetimes.com/money..., and Archer, C. (2025). What’s next for the Magnificent 7 stocks? IG. https://www.ig.com/uk/trading-...
[3] See, for example, Saaskilahti, P. and Tuffin, A. (2024). Validating that royalties inferred from “comparable” SEP licences are FRAND. Compass Lexecon. https://www.compasslexecon.com..., and Saaskilahti, P. and Tuffin, A. (2023). What the ex-ante benchmark reveals about the reasonable price for SEP licences. Compass Lexecon. https://www.compasslexecon.com...
[4] See, for example, Padilla, J., Nilausen, L. & Tuffin, A. (2024). A primer on the value exchange between news publishers and search engines. Available at SSRN: https://ssrn.com/abstract=4867...
[5] See, for example, Duquesne, G. & Nardini, C. (2023). Data traffic and network infrastructure investment: the debate. Compass Lexecon. https://www.compasslexecon.com..., and Padilla, J., Vasas, Z., & Condorelli, D. (2023). Another look at the debate on the ‘Fair Share’ proposal: an economic viewpoint. Compass Lexecon. https://www.compasslexecon.com...
[6] For instance, developers may be able to use protected content without a licence if training generative AI models on that material is (a) covered by existing exceptions, such as the Text and Data Mining (“TDM”) exception in the EU; (b) permitted by ‘fair use’; or (c) outside the scope of copyright, as training could be classified a ‘non-expressive’ act – meaning that the models only use copyrighted content to get better at recognising the patterns they need to produce new content, rather than expressing or distributing the original content itself.
[7] See “The benefits of technology itself” in Saaskilahti, P. and Tuffin, A. (2023). What the ex-ante benchmark reveals about the reasonable price for SEP licences. Compass Lexecon. https://www.compasslexecon.com...
[8] Gans, J. S. (2024). Copyright policy options for generative artificial intelligence (Working paper No. 32106). National Bureau of Economic Research. https://doi.org/10.3386/w32106
[9] See “The benefits of technology itself” in Saaskilahti, P. and Tuffin, A. (2023). What the ex-ante benchmark reveals about the reasonable price for SEP licences. Compass Lexecon. https://www.compasslexecon.com... which discusses the view expressed, for instance, in Melamed, D. and Shapiro, C. (2018) How Antitrust Law Can Make FRAND Commitments More Effective. Yale Law Journal https://faculty.haas.berkeley.... that “typically, a new technology is licensed only after it has been developed, whether or not it has been included in an industry standard. By the time the owner of the new technology negotiates licenses with users, the owner has already incurred various R&D expenses. This is common in the development of products of all types. In effect, technology developers make speculative investments. Technology developers typically bear a risk that, having made a speculative investment, their technology will not be sufficiently compensated by an arms’ length market bargain to provide an attractive return on investment.”
[10] See Lemley, M. A., & Casey, B. (2021). Fair learning. Texas Law Review, 99(4), 743–785. https://texaslawreview.org/wp-..., and Martens, B. (2024). Economic arguments in favour of reducing copyright protection for generative AI inputs and outputs (Working Paper No. 09/2024). Bruegel. https://www.bruegel.org/workin...
[11] This reflects the general principle of the Coase Theorem, outlined in Coase, R. H. (1960). The problem of social cost. Journal of Law and Economics, 3, 1-44. https://doi.org/10.1086/466560
[12] See Lemley, M. A., & Casey, B. (2021). Fair learning. Texas Law Review, 99(4), 743–785. https://texaslawreview.org/wp-..., and Martens, B. (2024). Economic arguments in favour of reducing copyright protection for generative AI inputs and outputs (Working Paper No. 09/2024). Bruegel. https://www.bruegel.org/workin...
[13] For example, Avanci’s single global license “Avanci Vehicle” covers the majority of cellular SEPs used by connected vehicles. https://www.avanci.com/vehicle...
[14] European Commission: Guidelines on the application of Article 101 of the Treaty on the Functioning of the European Union to technology transfer agreements, 28 March 2014, para 253.