Data (Use and Access) Bill [HL] Debate
Lords Chamber

Baroness Kidron (Crossbench)
My Lords, I declare my interests as chair of the 5Rights Foundation and as an adviser to the Institute for Ethics in AI at Oxford.
I start by expressing my support for the removal of some of the egregious aspects of the last Bill that we battled over, and by welcoming the inclusion of access to data for researchers—although I believe there are some details to discuss. I am extremely pleased finally to see provisions for the coroner’s access to data in cases where a child has died. On that alone, I wish the Bill swift passage.
However, a Bill that is merely less egregious is not sufficient on a subject so fundamental to the future of UK society and its prosperity. I want to use my time this afternoon to ask how the Government will not simply make available, but secure the full value of, the UK’s unique datasets; why the Bill does not fully make the UK AI-ready; and why proposals that the Government did not support in opposition have been included, while amendments that they did support have been left out.
We have a unique opportunity, as the noble Lord, Lord Markham, just described, with unique publicly held datasets, such as the NHS’s. At a moment when the large language and multimodal models (LLMs and LMMs) that will power our global future are being built and trained, these datasets hold significant value. Just as Britain’s coal reserves fuelled global industrial transformation, our data reserves could have a significant role to play in powering the AI transformation.
However, we are already giving away access to national data assets, primarily to a handful of US-based tech companies that will make billions selling the products and services built upon them. That creates the spectre of having to buy back drugs and medical innovations that simply would not have been possible without the incalculably valuable data. Reimagining and reframing publicly held data as a sovereign asset accessed under licence, protected and managed by the Government acting as custodian on behalf of UK citizens, could provide direct financial participation for the UK in the products and services built and trained on its data. It could give UK-headquartered innovators and researchers privileged access to nationally held datasets, or support investment in small and medium-sized specialist LLMs, which we will debate later in the week. Importantly, it would not simply monetise UK data but give the UK a seat at the table when setting the conditions for use of that data. What plans do the Government have to protect and value publicly held data in a way that maximises its long-term value and the values of the UK?
Similarly, the smart data schemes in the Bill do not appear to extend the rights of individual data holders to use their data in productive and creative ways. The Minister will recall an amendment to the previous data Bill, based on the work of Associate Professor Reuben Binns, that sought to give individuals the ability to assign their data rights to a third party for agreed purposes. The power of data is fully realised only when it is combined. Creating communal rights for UK data subjects could create social and economic opportunities for communities and smaller challenger businesses. Again, this is a missed opportunity to support the Government’s growth agenda.
My second point is that the Bill fails to tackle present-day or anticipated uses of data by AI. My understanding is that the AI Bill is to be delayed until the Government understand the requirements of the new American Administration. That is concerning on many levels, so perhaps the Minister can say something about it when she winds up. Whatever the timing, since data is, as the Minister said, in the DNA of AI infrastructure, why does the Bill so spectacularly fail to ensure that our data laws are AI-ready? As the News Media Association says, the Bill is silent on the most pressing data policy issue of our time: namely, that the unlicensed use by AI developers of data created by the media and the broader creative industries represents IP theft on a mass scale.
Meanwhile, a single-sentence petition that says,
“The unlicensed use of creative works for training generative AI is a major, unjust threat to the livelihoods of the people behind those works, and must not be permitted”,
has been signed by nearly 36,000 organisations and individuals from the creative community. This issue was the subject of a cross-party amendment to which Labour put its name, which would have put the voluntary web standards represented by the robots.txt protocol on a mandatory opt-in basis—likely only one of several amendments needed to ensure that web indexing does not become a proxy for theft. In 2022, it was estimated that the UK creative industries generated £126 billion in gross value added to the economy and employed 2.4 million people. Given their importance to our economy, our sense of identity and our soft power, why do we have a data Bill that is silent on data scraping?
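To make plain what that voluntary standard amounts to, the following is a minimal sketch, using only the Python standard library, of how a well-behaved crawler consults a site’s robots.txt before copying its pages. The site and crawler names are illustrative, not taken from the amendment, and nothing in the protocol compels a crawler to ask at all:

    # A minimal sketch of the voluntary robots.txt standard; the site and
    # crawler names below are illustrative assumptions.
    from urllib.robotparser import RobotFileParser

    # A hypothetical publisher opts one named AI crawler out; all others remain free.
    robots_txt = """
    User-agent: GPTBot
    Disallow: /

    User-agent: *
    Allow: /
    """

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # A well-behaved crawler asks permission before fetching a page...
    print(parser.can_fetch("GPTBot", "https://example.com/article"))       # False
    print(parser.can_fetch("SomeOtherBot", "https://example.com/article")) # True

    # ...but compliance is voluntary: a crawler that never calls can_fetch
    # faces no technical barrier at all.

In other words, the standard works only if the crawler chooses to ask, which is precisely the gap that a mandatory opt-in would have closed.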
In my own area of particular concern, the Bill does not address the impact of generative AI on the lives and rights of children. For example, instead of continuing to allow tech companies to use pupil data to build unproven edtech products based on drill-and-practice learning models—which in any other form would be a return to Victorian rote learning, but with better graphics—the Bill could and should introduce a requirement for evidence-based, pedagogically sound paradigms that support teachers and pupils. In the recently announced scheme to give edtech companies access to pupil data, I could not see details about privacy, quality assurance or how the DfE intends to benefit from these commercial ventures, which could, as in my previous NHS example, end with schools or the DfE having to buy back access to products built on UK pupil data. There is a quality issue, a safety issue and an ongoing privacy issue in our schools, and yet there is nothing in the Bill.
The noble Baroness and I met to discuss the need to tackle AI-generated sexual abuse, so I will say only that each day that it is legal to train AI models to create child sexual abuse material brings incalculable harm. On 22 May, specialist enforcement officers and I, along with the noble Viscount, Lord Camrose, were promised that the ink was almost dry on a new criminal offence. It cannot be that what was possible on that day now needs many months of further drafting. The Government must bring forward in this Bill the offence of possessing, sharing, creating or distributing an AI file that is trained on or trained to create CSAM, because this Bill is the first possible vehicle to do so. Getting this on the books is a question of conscience.
My third and final point is that the Bill retains some of the deregulatory aspects of its predecessor, while simultaneously missing the opportunity of updating data law to be fit for today. For example, the Bill extends research exemptions in the GDPR to
“any research that can reasonably be described as scientific”,
including commercial research. The Oxford English Dictionary says that “science” is
“The systematic study of the structure and behaviour of the physical and natural world through observation, experimentation, and the testing of theories against the evidence obtained”.
Could the Minister tell the House what is excluded? If a company instructs its data scientists and computing engineers to develop a new AI system of any kind, whether a tracking app for sport or a bot for an airline, is that scientific research? If their behavioural scientists are testing children’s response to persuasive design strategies to extend the stickiness of their products, is that scientific research? If the answer to these questions is yes, then this is simply an invitation to tech companies to circumvent privacy protections at scale.
I hope the noble Baroness will forgive me for saying that it will be insufficient to suggest that this is just tidying up the recitals of the GDPR. Recital 159 was deemed so inadequate that the European Data Protection Supervisor formally published the following opinion:
“the special data protection regime for scientific research is understood to apply where … the research is carried out with the aim of growing society’s collective knowledge and wellbeing, as opposed to serving primarily one or several private interests”.
I have yet to see that the Government’s proposal reflects this critical clarification, so I ask for some reassurance, and I query how the Government intend to account for the fact that putting a recital on the face of the Bill changes its status.
In the interests of time, I will put on the record that I have a similar set of issues about secondary processing, recognised legitimate interests, the weakening of purpose limitation, automated decision-making protections and the Secretary of State’s power to add to the list of special category data under Clause 74. These concerns are shared variously by the ODI, the Ada Lovelace Institute, the Law Society, Big Brother Watch, Defend Digital Me, 5Rights, Connected by Data and others. Collectively, these measures suggest that the Government are paving a runway for tech access to the private data of UK citizens or, as the Secretary of State for DSIT suggested in his interview in the Times last Tuesday, that they no longer think it is possible to regulate tech giants at all.
I note the inclusion of a general duty on the ICO to consider the needs of children, but it is a poor substitute for giving children wholesale protection from any downgrading of their existing data rights and protections, especially given the unprecedented obligations on the ICO to support innovation and stimulate growth. As the Ada Lovelace Institute said,
“writing additional pro-innovation duties into the face of the law … places them on an almost equivalent footing to protecting data subjects”.
I am not sure who thinks that tech needs protection from individual data rights holders, particularly children, but unlike my earlier suggestion that we protect our sovereign data assets for the benefit of UK plc, the potential riches of these deregulation measures disproportionately accrue to Silicon Valley. Why not use the Bill to identify and fix the barriers the ICO faces in enforcing the age-appropriate design code (AADC)? Why not use it to extend existing children’s privacy rights into educational settings, as many have campaigned for? Why not allow data subjects more freedom to share their data in creative ways? The Data (Use and Access) Bill has little in it for citizens and children.
Finally, and by no means least importantly, there is the question of the reliability of computers. At col. GC 576 of Hansard on 24 April 2024, the full tragedy of the postmasters was set out by the noble Lord, Lord Arbuthnot, who is in his place and will say more. The presumption that computers are reliable has devastated the lives of postmasters wrongly accused of fraud. The Minister yesterday, in answer to a question from the noble Lord, Lord Holmes, suggested that we should all be “more sceptical” in the face of computer evidence, but scepticism is not legally binding. The previous Government agreed to find a solution, albeit not a return to 1999. If the current Government fail to accept that challenge, they must shoulder responsibility for the further miscarriages of justice which will inevitably follow. I hope the noble Baroness will not simply say that the reliability of computers and the other issues raised are not for this Bill. If they are not, why not, when Labour supported them in opposition? And if not here, where and how will these urgent issues be addressed?
As I said at the outset, a better Bill is not necessarily a good Bill. I question why the Government did not wait a little longer to bring forward a Bill that made the UK AI-ready, understood data as critical infrastructure and valued the UK’s sovereign data assets. It could have been a Bill that did more to reach out to the public and secure their consent to, and understanding of, positive use cases for publicly held data, while protecting their interests—whether as IP holders, communities that want to share data for their own public good or children who continue to suffer at the hands of corporate greed. My hope is that, as we go to Committee, the Government will come forward with the missing pieces. I believe there is a much more positive and productive piece of legislation to be had.
With respect, it is the narrow question that a number of us have raised. Training the new AI systems is entirely dependent on their being fed vast amounts of material, which they can absorb, process and reshape in order to answer the questions asked of them. That information is, to all intents and purposes, somebody else’s property. What will happen to resolve that barrier? At the moment, the developers are not paying for it but simply taking it—scraping it.
Perhaps I may come in too. Specifically, how does the data protection framework change this? We have had the ICO suggesting that the current framework works perfectly well and that it is the responsibility of the scrapers to let the IP holders know, while the IP holders have not a clue that their work is being scraped. It has already been scraped, and there is no mechanism for redress. I think we are a little confused about what the plan is.
I can certainly write to noble Lords setting out more details on this. I said in response to an Oral Question a few days ago that my honourable friend Minister Clark in DSIT and Chris Bryant, whom the noble Lord, Lord Russell, mentioned, are working jointly on this. They are looking in more detail at a proposal on intellectual property that can be brought forward, and I will write to noble Lords to set that out.
On the question of the Horizon scandal and the validity of computer evidence, raised, quite rightly, by the noble Lords, Lord Arbuthnot and Lord Holmes, and the noble Baroness, Lady Kidron, I think we all understand that the Horizon scandal was a terrible miscarriage of justice, and the convictions of postmasters who were wrongly prosecuted have rightly been overturned. Those Post Office prosecutions relied on assertions that the Horizon system was accurate and reliable, which the Post Office knew to be wrong. This was supported by expert evidence, which it knew to be misleading. The issue was not, therefore, purely about the reliability of the computer-generated evidence. Almost all criminal cases rely to some extent on computer evidence, so the implications of amending the law in this area are far-reaching, a point made by several noble Lords. The Government are aware that this is an issue, are considering the matter very carefully and will announce next steps in due course.
Many noble Lords, including the noble Lords, Lord Clement-Jones, Lord Vaux and Lord Holmes of Richmond, and the noble and learned Lord, Lord Thomas, raised automated decision-making. I noted in my opening speech how the restored accountability framework gives us greater confidence in ADM, so I will not go over that again in detail. But to explain the Bill’s drafting, I want to reassure noble Lords that, under the Bill, the organisation must first inform individuals if a legal or significant decision has been taken in relation to them based solely on automated processing; it must then give individuals the opportunity to challenge such decisions, to obtain human intervention and to make representations to the controller.
The regulation-making powers will future-proof the ADM reforms in the Bill, ensuring that the Government will have the powers to bring greater legal certainty, where necessary and proportionate, in the light of constantly evolving technology. I reiterate that there will be the right to human intervention, and it will be on a personal basis.
The noble Baroness, Lady Kidron, and the noble Lords, Lord Russell of Liverpool and Lord Clement-Jones, raised concerns about edtech. The Government recognise the concerns that have been raised about the amount of personal data collected by education technology used in schools, and about whether this is fully transparent to children and parents. The Department for Education is committed to improving guidance and support for schools to help them better navigate this market. For example, its Get Help with Data Protection in Schools project was established to develop guidance and tools that help schools understand and comply with data protection legislation. Separately, the ICO has carried out a series of audits of edtech service providers, assessing privacy risks and potential non-compliance with data protection regulations in the development, deployment and use of edtech solutions in schools.
The creation of child sexual abuse material, CSAM, through any medium, including AI, whether offline or online, is and will remain illegal. This is at the forefront of this Government’s priorities, and we are considering all levers that can be used to fight child sexual abuse. Responsibility for the law in this area rests with the Home Office; I know it is actively and sympathetically looking at this matter, and I understand that my colleague the Safeguarding Minister will be in touch with the noble Baroness, Lady Kidron, and the noble Lord, Lord Bethell, ahead of Committee.
I can see that I am running out of time so, rather than test noble Lords’ patience, I will draw my comments to a close. I have not picked up all the comments that colleagues made, but I thank everybody for their excellent contributions. This is the beginning of a much longer conversation, which I am very much looking forward to, as I am to hearing from all those who have promised to participate in Committee. I am sure we will have a rich and interesting discussion then.
I hope I have persuaded some noble Lords that the Bill is not only wide-ranging but has a clear and simple focus: growing the economy, creating a modern digital government and, most importantly, improving people’s lives, all underpinned by robust personal data protection. I will not say any more at this stage. We will follow up but, in the meantime, I beg to move.