Grand Committee
My Lords, I support Amendments 204, 205 and 206, to which I have attached my name. In doing so, I declare my interest as someone with a long-standing background in the visual arts and as an artist member of the Design and Artists Copyright Society.
These amendments, tabled and superbly moved by my noble friend and supported by the noble Lords, Lord Stevenson and Lord Clement-Jones, seek to address a deep crisis in the creative sector whereby millions upon millions of creative works have been used to train general-purpose or generative AI models without permission or pay. While access to data is a fundamental aspect of this Bill, which in many cases has positive and legitimate aims, the unauthorised scraping of copyright-protected artworks, news stories, books and so forth for the use of generative AI models has significant downstream impacts. It affects the creative sectors’ ability to grow economically, to maximise their valuable assets and to retain the authenticity that the public rely on.
AI companies have used artists’ works in the training, development and deployment of AI systems without consent, despite this being a requirement under UK copyright law. As has been said, the narrow exception to copyright for text and data mining for specific research purposes does not extend to AI models, which have indiscriminately scraped creative content such as images without permission, simply to build commercial products that allow users to generate their own versions of a Picasso or a David Hockney work.
This amendment would clarify the steps that operators of web crawlers and general-purpose AI models must take to comply with UK copyright law. It represents a significant step forward in resolving the legal challenges brought by rights holders against AI companies over their training practices. Despite high-profile cases arising in the USA and the UK over unauthorised uses of content by AI companies, the reality is that individual artists simply cannot access judicial redress, given the prohibitive cost of litigation.
DACS, which represents artists’ copyright, surveyed its members and found that they were not technophobic or against AI in principle but that their concerns lay with the legality and ethics of current AI operators. In fact, 84% of respondents would sign up for a licensing mechanism to be paid when their work is used by an AI with their consent. This amendment would clarify that remuneration is owed for AI companies’ use of artists’ works across the entire development life cycle, including during the pre-training and fine-tuning stages.
Licensing would additionally create the legal certainty needed for AI companies to develop their products in the UK, as the unlawful use of works creates a litigation risk which deters investment, especially from SMEs that cannot afford litigation. DACS has also been informed by its members that commissioning clients have requested artists not to use AI products in order to avoid liability issues around its input and output, demonstrating a lack of trust or uncertainty about using AI.
This amendment would additionally settle ongoing arguments around whether compliance with UK copyright law is required where AI training takes place in other jurisdictions. By affirming its applicability where AI products are marketed in the UK, the amendment would ensure that both UK-based artists and AI companies are not put at a competitive disadvantage due to international firms’ ability to conduct training in a different jurisdiction.
One of the barriers to licensing copyright is the lack of transparency over what works have been scraped by AI companies. The third amendment in this suite of proposals, Amendment 206, seeks to address this. It would require operators of web crawlers and general-purpose AI models to be transparent about the copyright works they have scraped.
Currently, artists and creators face significant challenges in protecting their intellectual property rights in the age of AI. While tools such as Spawning AI’s “Have I Been Trained?” attempt to help creators identify whether their work has been used in AI training datasets, these initiatives provide only surface-level information. Creators may learn that their work was included in training data, but they remain in the dark about crucial details—specifically, how their work was used and which companies used it. This deeper level of transparency is essential for artists to enforce their IP rights effectively. Unfortunately, the current documentation provided by AI companies, such as data cards and model cards, falls short of delivering this necessary transparency, leaving creators without the practical means to protect their work.
Amendment 206 addresses the well-known black box issue that currently plagues the AI market, by requiring the disclosure of information about the URLs accessed by internet scrapers, information that can be used to identify individual works, the timeframe of data collection and the type of data collected, among other things. The US Midjourney litigation is a prime example of why this is necessary for UK copyright enforcement. It was initiated only after a leak revealed the names of more than 16,000 non-consenting artists whose works were allegedly used to train the tool.
Creators, including artists, should not find themselves in a position where they must rely on leaks to defend their intellectual property rights. By requiring AI companies to regularly update their own records, detailing what works were used in the training process and providing this to rights holders on request, this amendment could also create a vital cultural shift towards accountability. This would represent an important step away from the “Move fast and break things” culture pervasive amongst the Silicon Valley-based AI companies at the forefront of AI development, and a step towards preserving the gold-standard British IP framework.
Lastly, I address Amendment 205, which requires operators of internet crawlers and general-purpose AI models to be transparent about the identity and purpose of their crawlers, and not to penalise copyright holders who choose to deny scraping for AI by down-ranking their content in, or removing their content from, a search engine. Operators of internet crawlers that scrape artistic works and other copyright-protected content can obscure their identity, making it difficult and time-consuming for individual artists and the entities that represent their copyright interests to identify these uses and seek redress for illegal scraping.
Inclusion in search-engine results is crucial for visual artists, who rely on the visibility these provide for their work to build their reputation and client base and generate sales. At present, web operators that choose to deny scraping by internet crawlers risk the down-ranking or even removal of their content from search engines, as the most commonly used tools cannot distinguish do-not-train protocols from general do-not-crawl instructions added to a site. This amendment would ensure that artists who choose to deny scraping for AI training are not disadvantaged by current technical restrictions and do not lose out on the exposure generated by search engines.
Finally, I will say a few words about the Government’s consultation launched yesterday, because it exposes a deeply troubling approach to creators’ IP rights, as has already been said so eloquently by the noble Baroness. For months, we have been urged to trust the Government to find the right balance between creators’ rights and AI innovation, yet their concept of balance has now been revealed for what it truly is: an incredibly unfair trade-off that gives away the rights of hundreds of thousands of creators to AI firms in exchange for vague promises of transparency.
Their proposal is built on a fundamentally flawed premise—promoted by tech lobbyists—that there is a lack of clarity in existing copyright law. This is completely untrue: the use of copyrighted content by AI companies without a licence is theft on a mass scale, as has already been said, and there is no objective case for the new text and data-mining exception. What we find in this consultation is a cynical rebranding of the opt-out mechanism as a rights reservation system. While they are positioning this as beneficial for rights holders through potential licensing revenues, the reality is that this is not achievable, yet the Government intend to leave it to Ministers alone to determine what constitutes
“effective, accessible, and widely adopted”
protection measures.
This is deeply concerning, given that no truly feasible rights reservation system for AI has been implemented anywhere in the world. Rights holders have been unequivocal: opt-out mechanisms—whatever the name they are given—are fundamentally unworkable in practice. In today’s digital world, where content can be instantly shared by anyone, creators are left powerless to protect their work. This hits visual artists particularly hard, as they must make their work visible to earn a living.
The evidence from Europe serves as a stark warning: opt-out provisions have failed to protect creators’ rights, forcing the EU to introduce additional transparency requirements in the recent AI Act. Putting it bluntly, simply legalising unauthorised use of creative works cannot be the answer to mass-scale copyright infringement. This is precisely why our proposed measures are crucial: they will maintain the existing copyright framework whereby AI companies must seek licences, while providing meaningful transparency that enables copyright holders to track the use of their work and seek proper redress, rather than blindly repeating proven failures.
My Lords, I speak in support of my noble friend Lady Kidron’s amendments. I declare an interest as a visual artist, and of course visual creators, as my noble friend Lord Freyberg has very well described, are as much affected by this as musicians, journalists and novelists. I am particularly grateful to the Design and Artists Copyright Society and the Authors’ Licensing and Collecting Society for their briefings.
A particular sentence in the excellent briefing for this debate by the News Media Association, referred to by my noble friend Lady Kidron, caught my eye:
“There is no ‘balance’ to be struck between creators’ copyrights and GAI innovation: IP rights are central to GAI innovation”.
This is a crucial point. One might say that data does not grow on a magic data tree. All data originates from somewhere, and that will include data produced creatively. One might also say that such authorship should be seen to precede any interests in use and access. It certainly should not be something tagged on to the end, as an afterthought. I appreciate that the Government will be looking at these things separately, but concerns of copyright should really be part of any Bill where data access is being legislated for. As an example, we are going to be discussing the smart fund a bit later in an amendment proposed by the noble Lord, Lord Bassam, but I can attest to how tricky it was getting that amendment into a Bill that should inherently be accommodating these interests.
My Lords, I support my noble friend Lady Kidron’s Amendment 211, to which I have put my name. I speak not as a technophobe but as a card-carrying technophile. I declare an interest as, for the past 15 years, I have been involved in the development of algorithms to analyse NHS data, mostly from acute NHS trusts. This is possible under current regulations, because all the research projects have received medical research ethics approval, and I hold an honorary contract with the local NHS trust.
This amendment is, in effect, designed to scale up existing provisions and make sure that they are applied to public sector data sources such as NHS data. By classifying such data as sovereign data assets, it would be possible to make it available not only to individual researchers but to industry—UK-based SMEs and pharmaceutical and big tech companies—under controlled conditions. One of these conditions, as indicated by proposed new subsection (6), is to require a business model where income is generated for the relevant UK government department from access fees paid by authorised licence holders. Each government department should ensure that the public sector data it transfers to the national data library is classified as a sovereign data asset, which can then be accessed securely through APIs acting
“as bridges between each sovereign data asset and the client software of the authorized licence holders”.
In the time available, I will consider the Department of Health and Social Care. The report of the Sudlow review, Uniting the UK’s Health Data: A Huge Opportunity for Society, published last month, sets out what could be achieved through linking multiple NHS data sources. The Academy of Medical Sciences has fully endorsed the report:
“The Sudlow recommendations can make the UK’s health data a truly national asset, improving both patient care and driving economic development”.
There is little difference, if any, between health data being “a truly national asset” and “a sovereign asset”.
Generative AI has the potential to extract clinical value from linked datasets in the various secure data environments within the NHS and to deliver a step change in patient care. It also has the potential to deliver economic value, as the application of AI models to these rich, multimodal datasets will lead to innovative software products being developed for early diagnosis and personalised treatment.
However, it seems that the rush to generate economic value is preceding the establishment of a transparent licensing system, as in proposed new subsection (3), and the setting up of a coherent business model, as in proposed new subsection (6). As my noble friend Lady Kidron pointed out, the provisions in this amendment are urgently needed, especially as the chief data and analytics officer at NHS England is reported as having said, at a recent event organised by the Health Service Journal and IBM, that the national federated data platform will soon be used to train different types of AI model. The two models mentioned in the speech were OpenAI’s proprietary ChatGPT model and Google’s medical AI, which is based on its proprietary large language model, Gemini. So, the patient data in the national federated data platform being built by Palantir, which is a US company, is, in effect, being made available to fine-tune large language models pretrained by OpenAI and Google—two big US tech companies.
As a recent editorial in the British Medical Journal argued:
“This risks leaving the NHS vulnerable to exploitation by private technology companies whose offers to ‘assist’ with infrastructure development could result in loss of control over valuable public assets”.
It is vital for the health of the UK public sector that there is no loss of control resulting from premature agreements with big tech companies. These US companies seek privileged access to highly valuable assets which consist of personal data collected from UK citizens. The Government must, as a high priority, determine the rules for access to these sovereign data assets along the lines outlined in this amendment. I urge the Minister to take on board both the aims and the practicalities of this amendment before any damaging loss of control.
My Lords, I support Amendment 211 moved by my noble friend Lady Kidron, which builds on earlier contributions in this place made by the noble Lords, Lord Mitchell, Lord Stevenson, Lord Clement-Jones, and myself, as long ago as 2018, about the need to maximise the social, economic and environmental value that may be derived from personal data of national significance and, in particular, data controlled by our NHS.
The proposed definition of “sovereign data assets” is, in some sense, broad. However, the intent to recognise, protect and maximise their value in the public interest is readily inferred. The call for a transparent licensing regime to provide access to such assets and the mention of preferential access for individuals and organisations headquartered in the UK also make good sense, as the overarching aim is to build and maintain public trust in third-party data usage.
Crucially, I fully support provisions that would require the Secretary of State to report on the value and anticipated financial return from sovereign data assets. Identifying a public body that considered itself able or willing to guarantee value for money proved challenging when this topic was last explored. For too long, past Governments have dithered and delayed over the introduction of provisions that explicitly recognise the need to account for and safeguard the investment made by taxpayers in data held by public and arm’s-length institutions and associated data infrastructure—something that we do as a matter of course where the tangible assets that the National Audit Office monitors and reports on are concerned.
In recent weeks, the Chancellor of the Exchequer has emphasised the importance of recovering public funds “lost” during the Covid-19 pandemic. Yet this focus raises important questions about other potential revenue streams that were overlooked, particularly regarding NHS data assets. In 2019, Ernst & Young estimated that a curated NHS dataset could generate up to £5 billion annually for the UK while also delivering £4.6 billion in yearly patient benefits through improved data infrastructure. This raises the question: who is tracking whether these substantial economic and healthcare opportunities are being realised? Who is ensuring that these projected benefits—both financial and clinical—are actually flowing back into our healthcare system?
As we enter the age of AI, public discourse often fixates on potential risks while overlooking a crucial opportunity—namely, the rapidly increasing value of publicly controlled data and its potential to drive innovation and insights. This raises two crucial questions. First, how might we capitalise on the upside of this technological revolution to maximise the benefits on behalf of the public? Secondly, and more specifically, how will Parliament effectively scrutinise any eventual trade deal entered into with, for example, the United States of America, which might focus on a more limited digital chapter, in the absence of either an accepted valuation methodology or a transparent licensing system for use in providing access to valuable UK data assets?
Will the public, faced with a significant tax burden to improve public services and repeated reminders of the potential for data and technology to transform our NHS, trust the Government if they enable valuable digital assets to be stripped today only to be turned tomorrow into cutting-edge treatments that we can ill afford to purchase and that benefit companies paying taxes overseas? To my mind, there remains a very real risk that the UK, as my noble friend Lady Kidron rightly stated, will inadvertently give away potentially valuable digital assets without there being appropriate safeguards in place. I therefore welcome the intent of Amendment 211 to put that right in the public interest.
Grand Committee
My Lords, I have tabled Amendment 60 to add to our discussion and establish some further clarity from the Minister on the impact of widening the scope of the interpretation of scientific research to include commercial and private activities. I thank her for her letter of 27 November to all noble Lords who spoke at Second Reading, a copy of which was placed in the Lords Library; it provides some reassurance that scientific research activities must still pass a reasonableness test. However, I move this probing amendment out of concern that the change in definition may have unintended consequences for copyright law. It is vital that we do not just look at this Bill in isolation but consider the wider impact that changing definitions and interpretations will have on other aspects of legislation.
Research activities are identified under the Copyright, Designs and Patents Act 1988. Some researchers require access to and reproduction of data and copyright-protected material for research purposes. Under Section 29A, researchers can avail themselves of an exemption from copyright which allows data mining and analysis of copyright-protected works for non-commercial research only, without permission from the copyright holder. The UK copyright framework is popularly known as the “gold standard” internationally, as it carefully balances the rights of copyright holders with the need for certain uses to take place, such as non-commercial research, educational uses and those that protect free speech. That balance is fragile, and we must be very careful not to disrupt it unintentionally.
The previous Government sought to widen Section 29A of the Act by allowing text and data mining of copyright-protected works for commercial purposes, but this recommendation was quickly reversed when the Government considered that the decision was made without appropriate evidence. That was a sensible move. The current Government are still due to consult with stakeholders on the exemption to the law, against the backdrop of AI companies using copyright-protected works for training large language models without permission or fair pay. Given the global presence of AI, it is expected that this consultation will consider how the UK policy on copyright works within an international context. Therefore, while the Government are carefully considering this, we must ensure that we do not fast forward to a conclusion before that important work has taken place.
If the Minister can confirm that this definition has no impact on existing copyright law, I will happily withdraw this amendment. However, if there are potential implications for the Copyright, Designs and Patents Act 1988, I would urge the Minister to table her own amendment to explicitly preserve the current definition of “scientific research” within that Act. This would ensure that we maintain legal clarity while the broader international considerations are fully examined. I beg to move.
I advise the Committee that, if this amendment is agreed, I cannot call Amendment 61 by reason of pre-emption.
Let me put it this way: other things may be coming before it. I think I promised at the last debate that we would have something on copyright in the very, very, very near future. This may not be as very, very, very near future as that. We will tie ourselves in knots if we carry on pursuing this discussion.
On that basis, I hope that this provides noble Lords with sufficient reassurance not to press their amendments.
I thank your Lordships for this interesting debate. I apologise to the Committee for degrouping the amendment on copyright, but I thought it was important to establish from the Minister that there really was no effect on the copyright Act. I am very reassured that she has said that. It is also reassuring to hear that there will be more of an opportunity to look at this issue in greater detail. On that basis, I beg leave to withdraw the amendment.
Lords Chamber
My Lords, it is a great pleasure to follow the noble Lord, Lord Davies, and what he had to say on health data, much of which I agree with entirely. The public demand that we get this right, and we really must endeavour to do all we can to reassure the public in this area.
I speak as someone deeply rooted in the visual arts and as an artist member of DACS—the Design and Artists Copyright Society. In declaring my interests, I also express gratitude for the helpful briefing provided by DACS.
The former Data Protection and Digital Information Bill returns to this House after its journey was interrupted by July’s general election. While this renewed Bill’s core provisions remain largely unchanged, the context in which we examine them has shifted significantly. The rapid advancements in artificial intelligence compel us to scrutinise this legislation not just for its immediate impact but for its long-term consequences. Our choices today will shape how effectively we safeguard the rights and interests of our citizens in an increasingly digital society. For this reason, the Bill demands meticulous and thorough examination to ensure that it establishes a robust governance framework capable of meeting present and future challenges.
Over the past year, Members of this House have carefully considered the opportunities and risks of large language models which power artificial intelligence applications—work that is still ongoing. I note that even today, the Lords Communications and Digital Committee, chaired by the noble Baroness, Lady Stowell of Beeston, is holding an evidence session on the role of AI in creative tech.
The committee’s previous inquiry into large language models stressed a need for cautious action. Drawing on expert testimony, its recommendations highlighted critical gaps in our current approach, particularly in addressing immediate risks in areas such as cybersecurity, counterterrorism, child protection, and disinformation. The committee rightly stressed the need for stronger assessments and guardrails to mitigate these harms, including in the area of data protection.
Regrettably, however, this Bill moves in the opposite direction, and instead seeks to lighten the regulatory governance of data processing and relaxes rules around automated decision-making, as other noble Lords have referred to. Such an approach risks leaving our legislative framework ill prepared to address the potential risks that our own committee has so carefully documented.
The creative industries, which contribute £126 billion annually to the UK economy, stand particularly exposed. Evidence submitted to the committee documented systematic unauthorised use of copyrighted works by large language models, which harvest content across the internet while circumventing established licensing frameworks and creator permissions.
This threat particularly impacts visual artists—photographers, illustrators, designers, et cetera—many of whom already earn far below the minimum wage, as others, including the noble Baroness, Lady Kidron, and the noble Lords, Lord Bassam and Lord Holmes, have already highlighted. These creators now confront a stark reality: AI systems can instantaneously generate derivative works that mimic their distinctive styles and techniques, all without attribution or compensation. This is not merely a theoretical concern; this technological displacement is actively eroding creative professionals’ livelihoods, with documented impacts on commission rates and licensing revenues.
Furthermore, the unauthorised use of reliable, trusted data, whether from reputable news outlets or authoritative individuals, fuels the spread of disinformation. These challenges require a solution that enables individuals and entities, such as news publishers, to meaningfully authorise and license their works for a fair fee.
This Bill not only fails to address these fundamental challenges but actively weakens existing protections. Most alarmingly, it removes vital transparency requirements for personal data, including data relating to individual creators, when used for research, archival and statistical purposes. Simultaneously, it broadens the definition of research to encompass “commercial” activities, effectively creating a loophole ripe for exploitation by profit-driven entities at the expense of individual privacy and creative rights.
Finally, a particularly troubling aspect of the Bill is its proposal to dissolve the Information Commissioner’s Office in favour of an information commission—a change that goes far beyond mere restructuring. Although I heard what the Minister said on this, by vesting the Secretary of State with sweeping powers to appoint key commission members, the Bill threatens to compromise the fundamental independence that has long characterised our data protection oversight. Such centralised political influence could severely undermine the commission’s ability to make impartial, evidence-based decisions, particularly when regulating AI companies with close government ties or addressing sensitive matters of national interest. This erosion of regulatory independence should concern us all.
In summary, the cumulative effect of this Bill’s provisions exposes a profound mismatch between the protections our society urgently needs and those this legislation would actually deliver. At a time when artificial intelligence poses unprecedented challenges to personal privacy and creative rights, this legislation, although positive on many fronts, appears worryingly inadequate.