Data (Use and Access) Bill [Lords] Debate
David Davis (Conservative, Goole and Pocklington)
Department for Science, Innovation & Technology
Commons Chamber

I would like to thank colleagues in the other place and in this House who have worked so hard to improve the Bill. By modernising data infrastructure and governance, this Bill seeks to unlock the secure, efficient use of data while promoting innovation across sectors. As a tech evangelist, as well as the Chair of the Science, Innovation and Technology Committee, I welcome it, and I am pleased to see colleagues from the Select Committee, my hon. Friend the Member for Stoke-on-Trent South (Dr Gardner) and the right hon. Member for North West Hampshire (Kit Malthouse), here for this debate.
Having spent many unhappy hours when working for Ofcom trying to find out where British Telecom’s ducts were actually buried, I offer a very personal welcome to the national underground asset register, and I thank the Minister for his work on this Bill as well as for his opening comments.
I agree with the Minister that there is much to welcome in this Bill, but much of the Second Reading debate was consumed by discussion on AI and copyright. I know many Members intend to speak on that today, so I will just briefly set out my view.
The problem with the Government’s proposals on AI and copyright is that they give all the power to the tech platforms, which—let us be frank—have a great deal of power already, as well as trillions of dollars in stock market capitalisation and a determination to return value to their shareholders. What they do not have is an incentive to design appropriate technology for transparency and rights reservation if they believe that in its absence they will have free access to our fantastic creators’ ingenuity. It is essential that the Minister convinces them that if they do not deliver this technology—I agree with him that it is highly possible to do so—then he will impose it.
Perhaps the Minister could announce an open competition, with a supplier contract as the prize, for whichever innovative company designs something. The Science, Innovation and Technology Committee, sitting with the Culture, Media and Sport Committee, heard from small companies that can do just that. The tech giants might not like it, but I often say that the opposite of regulation is not no regulation—it is bad regulation. If the tech platforms do not lead, they will be obliged to follow because the House will not allow the copyright of our fantastic creators to be put at risk. The Minister knows that I think him extremely charismatic and always have done, but I do not believe that “Chris from DSIT” can prevail against the combined forces of Björn from Abba and Paul from The Beatles.
The prospects for human advancement opened by using data for scientific research are immense. As a world-leading science powerhouse, the UK must take advantage of them. That is why, despite being a strong advocate of personal data rights, I welcome the Bill’s proposals to allow the reuse of data without consent for the purposes of scientific research. I am concerned, however, that the exemption is too broad and that it will be taken advantage of by data-hungry tech companies using the exemption even if they are not truly advancing the cause of scientific progress but simply, as with copyright, training their AI models.
Huge amounts of data are already collected by platforms, such as direct messages on Instagram, or via web-scraping of any website that contains an individual’s personal data, such as published records or people’s public LinkedIn pages. We know it can be misused because it has been, most recently with Meta’s controversial decision to use Instagram-user data to train AI models, triggering an Information Commissioner’s Office response because of the difficulty users encountered in objecting to it. Then there is the risk of data collected via tracking cookies or the profiling of browsing behaviour, which companies such as Meta use to fingerprint people’s devices and track their browsing habits. Could the data used to create ads also be freely reusable under this exemption? The US tech firm Palantir has the contract for the NHS federated data platform. Amnesty International has already raised concerns about the potential for patients’ data being mishandled. Does the Bill mean that our health data could be reused by Palantir for what it calls research purposes?
Before the hon. Lady moves on from Palantir, I think the House should know that it is an organisation with its origins in the American security state—the National Security Agency and the Central Intelligence Agency—and I cannot understand for the life of me why we are willing to commit the data of our citizens to an organisation like that.
That is exactly what is at the heart of this matter—the data that drives that addictiveness and commercialises our children’s attention is not the way forward.
Many amazing organisations have gathered evidence in this area, and it is abundantly clear that the overuse of children’s data increases their risk of harm. It powers toxic algorithms that trap children in cycles of harmful content, recommender systems that connect them with predators, and discriminatory AI systems that are used to make decisions about them that carry lifelong consequences. Health Professionals for Safer Screens—a coalition of child psychiatrists, paediatricians and GPs—is pleading for immediate legislative action.
This is not a partisan issue. So many of us adults can relate to the feeling of being drawn into endless scrolling on our devices—I will not look around the Chamber too much. Imagine how much more difficult it is for developing minds. This is a cross-party problem, and it should not be political, but we need action now.
Let me be absolutely clear: this change is not about restricting young people’s digital access or opposing technology and innovation; it is about requiring platforms to design their services with children’s safety as the default, not as an afterthought. For years we have watched as our children’s wellbeing has been compromised by big tech companies and their profits. Our call for action is supported by the National Society for the Prevention of Cruelty to Children, 5Rights, Health Professionals for Safer Screens, Girlguiding, Mumsnet and the Online Safety Act network. This is our chance to protect our children. The time to act is not 18 months down the line, as the Conservatives suggest, but now. I urge Members to support new clause 1 and take the crucial steps towards creating a digital world where children can truly thrive.
To protect our children, I have also tabled amendment 45 to clause 80, which seeks to ensure that automated decision-making systems cannot be used to make impactful decisions about children without robust safeguards. The Bill must place a child’s best interests at the heart of any such system, especially where education or healthcare are concerned.
We must protect the foundational rights of our creators in this new technological landscape, which is why I have tabled new clause 2. The UK’s creative industries contribute £126 billion annually to our economy and employ more than 2.3 million people—they are vital to our economy and our cultural identity. These are the artists, musicians, writers and creators who inspire us, define us and proudly carry British creativity on to the global stage. Yet today, creative professionals across the UK watch with mounting alarm as AI models trained on their life’s work generate imitations without permission, payment or even acknowledgment.
New clause 2 would ensure that operators of web crawlers and AI models comply with existing UK copyright law, regardless of where they are based. This is not about stifling innovation; it is about ensuring that innovation respects established rights and is good for everyone. Currently, AI companies are scraping creative works at an industrial scale. A single AI model may be trained on thousands of copyrighted works without permission or compensation.
The UK company Polaron is a fantastic example, creating AI technology to help engineers to characterise materials, quantify microstructural variation and optimise microstructural designs faster than ever before. Why do I bring up Polaron? It is training an AI model built from scratch without using copyright materials.
I am emphatically on the hon. Lady’s side in her intent to protect British creativity, but how does she respond to the implicit threat from artificial intelligence providers to this and other elements of the Bill to effectively deny AI to the UK if they find the regulations too difficult to deal with?
We have a thriving innovation sector in the UK, so those companies are not going anywhere—they want to work with the UK. We actually have a system now that has a fantastic creative industry and we have innovation and business coming in. There are many ways to incentivise that. I talk a lot about money, skills and infrastructure—that is what these innovative companies are looking for. We can make sure the guardrails are right so that it works for everyone.
By ensuring that operators of web crawlers and AI models comply with existing UK copyright law, we are simply upholding established rights in a new technological context. The UK led the world in establishing trustworthy financial and legal services, creating one of the largest economies by taking a long-term view, and we can do the same with technology. By supporting new clause 2, we could establish the UK as a base for trustworthy technology while protecting our creative industries.
Finally, I will touch on new clause 4, which would address the critical gap in our approach to AI regulation: the lack of transparency regarding training data. Right now, creators have no way of knowing if their work has been used to train AI models. Transparency is the foundation of trust. Without it, we risk not only exploiting creators, but undermining public confidence in these powerful new technologies. The principle is simple: if an AI system is trained using someone’s creative work, they deserve to know about it and to have a say in how it is used. That is not just fair to creators, but essential for building an AI ecosystem that the public trust. By supporting new clause 4, we would ensure that the development of AI happens in the open, allowing for proper compensation, attribution and accountability. That is how we will build responsible AI that serves everyone, not just the tech companies.
On the point of transparency, I will touch briefly on a couple of other amendments. We must go further in algorithmic decision making. That is why I have tabled amendment 46, which would ensure that individuals receive personalised explanations in plain language when an automated decision system affects them. We cannot allow generic justifications to stand in for accountability.
I thank the Minister for that reassurance. I did take part in a Westminster Hall debate on this matter a couple of weeks ago, but one of his colleagues was responding. I made the same point then. Quite often in the media or more generally, AI seems to be pitted against our creative industries, which should not be the case, because we know that our creative industries embrace technology more than virtually any other sector. They want to use AI responsibly. They do not want to be replaced by it. The question before us is how lawmakers can ensure that AI is used ethically without this large-scale theft of IP. We are today discussing amendments that go somewhere towards providing an answer to that question.
On this issue of Luddites, surely one of the problems for English language creators is that what they create is of more value because of the reach of the English language over others. Therefore, they are more likely to have their product scraped and have more damage done to them.
My right hon. Friend makes a very good observation, but the fact is that so much content has already been scraped. Crawlers are all over the intellectual property of so many of our creators, writers and publishers—so much so that we are almost in a position where we are shutting the gate after the horse has bolted. Nevertheless, we need to do what we can legislatively to get to a better place on this issue.
New clause 2 would simply require anyone operating web crawlers for training and developing AI models to comply with copyright law. It is self-evident and incontrovertible that AI developers looking to deploy their systems in the UK should comply with UK law, but they often claim that copyright is not very clear. I would argue that it is perfectly clear; it is just that sometimes they do not like it. It is a failure to abide by the law that is creating lawsuits around the world. The new clause would require all those marketing their AI models in the UK to abide by our gold-standard copyright regime, which is the basis that underpins our thriving creative industries.
New clause 3 would require web crawler operators and AI developers to disclose who is operating crawlers, what they are collecting, why, and when. It would also require them to use different crawlers for different purposes and to ensure that rights holders are not punished for blocking them. A joint hearing of the Culture, Media and Sport Committee and the Science, Innovation and Technology Committee heard how publishers are being targeted by thousands of web crawlers with the intention of scraping content to sell to AI developers. We heard that many, if not most, web crawlers are not abiding by current opt-out protocols—robots.txt, for example. To put it another way, some developers of large language models are buying data scraped by third-party tech companies, in contravention of robots.txt protocols, to evade accusations of foul play. All this does is undermine existing licensing and divert revenues that should be returning to our creative industries and news media sector. New clause 3 would provide transparency over who is scraping copyrighted works and give creators the ability to assert and enforce their rights.
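[For readers unfamiliar with the robots.txt opt-out protocol mentioned above, a brief sketch of how it is meant to work may help. A publisher lists rules per crawler user-agent; a compliant crawler checks those rules before fetching a page. The user-agent name “GPTBot” and the example URL below are illustrative, and the point of the debate is precisely that compliance is voluntary—nothing in the protocol itself forces a crawler to honour the rules.]

```python
# Sketch of how a *compliant* crawler consults robots.txt before scraping.
# Uses Python's standard-library parser; the rules and URL are illustrative.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler identifying itself as GPTBot is asked to stay out entirely,
# while other user-agents remain free to fetch the same page.
print(parser.can_fetch("GPTBot", "https://example.com/articles/1"))       # False
print(parser.can_fetch("Mozilla/5.0", "https://example.com/articles/1"))  # True
```

A crawler that simply never performs this check—or that buys the scraped data from a third party, as described above—faces no technical barrier, which is why the new clause pairs transparency with enforcement.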
New clause 4 would require AI developers to be transparent about what data is going into their AI models. Transparency is fundamental to this debate. It is what we should all be focusing on. We are already behind the drag curve on this. California has introduced transparency requirements, and no one can say that the developers are fleeing Silicon Valley just yet.
New clause 20, tabled by the official Opposition, also addresses transparency. It would protect the AI sector from legal action by enabling both sides to come to the table and get a fair deal. A core part of this new clause is the requirement on the Secretary of State to commit to a plan to help support creators where their copyright has been used in AI by requiring a degree of transparency.
New clause 5 would provide the means by which we could enforce the rules. It would give the Information Commissioner the power to investigate, assess and sanction bad actors. It would also entitle rights holders to recover damages for any losses suffered, and to injunctive relief. Part of the reason why rights holders are so concerned is that the vast majority of creators do not have deep enough pockets to take on AI developers. How can they take on billion-dollar big tech companies when those companies have the best lawyers that money can buy, who can bog cases down in litigation and red tape? Rights holders need a way of enforcing their rights that is accessible, practical and fair.
The Government’s AI and copyright consultation says that the Government wants to ensure
“a clear legal basis for AI training with copyright material”.
That is what the new clauses that I have spoken to would deliver. Together they refute the tech sector’s claims of legal uncertainty, while providing transparency and enforcement capabilities for creators.
Ultimately, transparency is the main barrier to greater collaboration between AI developers and creators. Notwithstanding some of the unambitious Government amendments, the Opposition’s amendments would provide the long-overdue redress to protect our creative industries by requiring transparency and a widening of the scope of those who are subject to copyright laws.
The amendments would protect our professional creators and journalists, preserve the pipeline of young people looking to make a career in these sectors themselves, and cement the UK as a genuine creative industries superpower, maintaining our advantage in the field of monetising intellectual property. One day we may make a commercial advantage out of the fact that we are the place where companies can set up ethical AI companies—we could be the envy of the world.
My right hon. Friend makes a formidably important point. The amendment highlights one of the extraordinary weaknesses of the Bill, which is that it in effect reverses GDPR on a large number of citizen protections. To reiterate the point he gently made, that enormous fine will not stop TikTok, because it operates under legal compulsion. Even though it paid £450 million, it will continue to commit the criminal offence for which it has just been convicted.
I agree with my right hon. Friend: that is the peculiarity. The Minister knows only too well about the nature of what goes on in countries such as China. Chinese companies are frankly scared stiff of cutting across what their Government tell them they have to do, because what happens is quite brutal.
We have to figure out how we protect data from ill use by bad regimes. I use China as an example because it is simply the most powerful of those bad regimes, but many others do not observe data protection in the way that we would assume under contract law. For example, BGI’s harnessing of the data it has gleaned from covid tests, and its dominance in the pregnancy test market, is staggering. It has been officially allowed to take 15% of the data, but it has taken considerably more, and that is just one area.
Genomics is a huge and vital area right now, because it will dominate everything in our lives, and it populates AI with an ability to describe and recreate the whole essence of individuals, so this is not a casual or small matter. We talk about AI being used in the creative industries—I have a vested interest, because my son is in the creative industries and would support what has been said by many others about protecting them—but this area goes a whole quantum leap in advance of that. We may not even know in the future, from the nature of who they are, who we are talking to and what their vital statistics are.
This amendment is not about one country; it is about providing a yardstick against which all third countries should be measured. If we are to maintain the UK’s standing as a nation that upholds privacy, the rule of law, democracy and accountability, we must not allow data to be transferred to regimes that fundamentally do not share those values. It is high time that we did this, and I am glad to see the Minister nodding. I hope therefore that he might look again at the amendment. Out of old involvement in an organisation that he knows I am still part of, he might think to himself that maybe this is worth doing or finding some way through.
I thank the hon. Member for making that important point, and of course she is right.
I go back to this question of the threats to the database, which are not simply the product of my imagination; they are real. First, all data can be monetised, but this database is so large that huge commercial interests are now trying to get access to that health data. I do not want to cause offence to any hon. Members, all of whom I know follow the rules, but it is interesting that nearly £3 million from the private health sector was made available to over 150 different Members of Parliament. I do not suggest that any Member has done anything inappropriate—that would be wrong of me—but one wonders how almost £3 million was found by a private sector that has no commercial interest in pursuing those investments.
Secondly, on commercial interests, will the Minister confirm that at no stage will any data or any other aspect of the NHS be up for sale as part of negotiations with the United States on a trade deal? Will the Government provide some guidance on that? If the House reflects on private sector interests—which are not necessarily in the best interests of humanity—and how they make money, there is an interesting thought about health insurance. A party represented in the House is led by an individual who has suggested that we should end the way that we fund the NHS and replace it with an insurance system. If the insurance industry got access to the data held on all of us by the NHS, they would be able to see the genome of each person or of groups of people, and provide differential rates of insurance according to people’s genetic make-up. That is a serious threat. I do not think the party that has recently entered the House has thought that through, but companies providing insurance could commercialise that data. That is one reason we must never follow the track towards a national insurance system to replace the NHS.
Yesterday, the Secretary of State for Health and Social Care told the House that we will not be privatising the NHS, and I welcome that statement. Reference has already been made to Palantir—the right hon. Member for Goole and Pocklington (David Davis) mentioned it earlier—and the contract that we inherited from the previous Government. It is extraordinary that Palantir, a company that has deep roots in the United States defence establishment, should be handling the data of millions of people, when its chair has said that he is completely opposed to the central principle of the NHS and that he effectively wants a private health system in the UK. How could a £500 million contract to handle our personal data have been handed over to such a company, led by a person whose purpose seems to be to destroy the principles of our NHS? How our data is handled should be our decision, in the United Kingdom.
The Information Commissioner says that it is important that this precious and vital data, which is personal to each of us, should be protected against any possibility of cyber-attacks. However, there has already been a cyber-attack. Qilin—the way I am pronouncing it makes it sound as if someone is trying to commit murder, but there may be another way of saying it—is a Russian cyber-criminal group that obtained access to 400 GB of private information held by a company dealing with pathology testing. That is an enormous amount of data. Qilin attempted to extort a ransom from the company that held the data. I do not know whether enough provision is made in the Bill for the protection of our data, so I suggest that there should be a new public interest test, with a report to Parliament within six months, which we can all debate and then examine whether the legislation has gone far enough.
Finally, the Information Commissioner says three things. First, the database must retain public confidence. Media discussions and opinion polling show that people are losing confidence that their personal data is secure, and I understand why that should be the case. Secondly, data should be properly protected and built from the beginning with proper safeguards against cyber-attacks. Thirdly, and perhaps most importantly, the Bill refers to an effective exemption for scientific research. As my hon. Friend the Member for Newcastle upon Tyne Central and West (Chi Onwurah) said, private companies, and perhaps US companies, might use the idea of promoting scientific research as a fig leaf to hide their search for profit from the precious commodity—data—that we have because we created our NHS. That is a very dangerous thought, and the Information Commissioner says he is not convinced that the definition of scientific research in the Bill is sufficiently strong to protect us from predatory activity by other state actors or private companies.
The hon. Gentleman is making an excellent speech and some very perceptive points. I remind him that previous attempts by the NHS to create a single data standard have all failed, because the GPs did not believe that the security levels were sufficient. It is not just the Information Commissioner; the GPs refused to co-operate, which highlights the powerful point that the hon. Gentleman is making.
I am grateful to the right hon. Gentleman for making that very serious point. When the clinicians—whose duty is to protect their patients—say they are not convinced about the safety of data being handed over to a central database, we have to listen to their reactions.
I do not intend to press my new clause to the vote, but it is important that we continue to debate this matter, because this enormous database—which can contribute to the general welfare of all humanity—must be protected in such a way that it retains confidence and ensures the security of the whole system. With that, I leave the discussion to continue on other matters.