Data (Use and Access) Bill [HL] Debate
Viscount Colville of Culross (Crossbench - Excepted Hereditary)
Grand Committee

My Lords, I have tabled Amendments 92, 93, 101 and 105, and I thank the noble Lord, Lord Clement-Jones, for adding his name to them. I also support Amendment 137 in the name of my noble friend Lady Kidron.
Clause 77 grants an exemption from the rights of data subjects under Articles 13 and 14 to be told within a set timeframe that their data will be reused for scientific research, if doing so would be impossible or involve disproportionate effort. These amendments complement those I proposed to Clause 67. They aim to ensure that “scientific research” is narrowly defined, so that large language AI developers cannot claim both that they are doing scientific research and that the GDPR requirements involve too much effort for them to contact data subjects before reusing their data.
It costs AI developers time and money to identify data subjects, so this exemption is obviously very valuable to them and they will use it if possible. They will claim that processing and notifying data subjects from such a huge collection of data is a disproportionate effort, as it is hard to extract the identity of data subjects from the original AI model.
Up to 5 million data subjects could be involved in reusing data to train a large language model. However, the ICO requires data controllers to inform subjects that their data could be reused even if it involves contacting 5 million data subjects. The criteria set out in proposed new subsection (6) in Clause 77 play straight into the hands of ruthless AI companies that want to take advantage of this exemption.
Amendments 92 and 101 would ensure that the disproportionate effort excuse is not used if the number of data subjects is mentioned as a reason for deploying the excuse. Amendments 93 and 105 would clarify the practices and facts that would not qualify for the disproportionate effort exemption—namely,
“the fact the personal data was not collected from the data subject, or any processing undertaken by the controller that makes the effort involved greater”.
Without this wording, the Bill will mean that a data controller wanting to reuse data for training another large language model could process the personal data in the original model and then reuse it without asking permission from the original subjects. The AI developer could say, “I don’t have the original details of the data subjects, as they were deleted when the original model was trained. There was no identification of the original data subjects; only the model weights remain”. I fear that many companies will use this excuse to get around GDPR notification expectations.
Noble Lords should recognise that these provisions affect only AI developers seeking to reuse data under the scientific research provisions. These will mainly be the very large AI developers, which tend to use scraped data to train their general-purpose models. Controllers will still be able to use personal data to train AI systems when they have lawful grounds to do so—either the consent of the data subject or a legitimate interest—but I want to make it clear that these provisions will not inhibit the legitimate training of AI models.
These amendments would ensure that organisations, especially large language AI developers, are not able to reuse data at scale, in contradiction to the expectations and intentions of data subjects. Failure to get this right risks setting off a public backlash against the use of personal data for AI, which would impede this Government’s aim of making this country an AI superpower. I beg to move.
Discussions with the ICO are taking place at the moment about the scope and intention of a number of issues around AI, and this issue would be included in that. However, I cannot say at the moment that that intention is specifically spelled out in the way that the noble Baroness is asking.
This has been a wide-ranging debate, with important contributions from across the Committee. I take some comfort from the Minister’s declaration that the exemptions will not be used for web crawling, but I want to make sure that they are not used at the expense of the privacy and control of personal data belonging to the people of Britain.
That seems particularly so for Amendment 137 in the name of the noble Baroness, Lady Kidron. I was particularly taken by her pointing out that children’s data privacy had not been taken into account when it came to AI, a point reinforced by the noble Baroness, Lady Harding, who told us of the importance of the Bill. She said it was paramount to protect children in the digital age and reminded us that this is the biggest breakthrough of our lifetime and that children need protecting from it. I hope very much that there will be some successful meetings, and maybe a government amendment on Report, responding to these passionate and heartfelt demands. I sincerely hope the Minister will meet us all and other noble Lords to discuss these matters of data privacy further. On that basis, I beg leave to withdraw my amendment.
My Lords, Amendment 119 is in my name, and I thank the noble Lord, Lord Knight, for adding his name to it. I am pleased to add my name to Amendment 115A in the name of the noble Viscount, Lord Camrose.
Transparency is key to ensuring that the rollout of ADM brings the public and, most importantly, public trust with it. I give the Committee an example of how a lack of transparency can erode that trust. The DWP is using a machine learning model to analyse all applications for a loan, as an advance on a benefit to pay bills and other costs, while a recipient waits for their first universal credit payment. The DWP’s own analysis of the model found disparities, for all the protected characteristics analysed, including age, marital status and disability, in who was most likely to be incorrectly referred by the model.
It is difficult to assess whether the model is discriminatory, effective or even lawful. When the DWP rolled it out, it was unable to reassure the Comptroller and Auditor General that its anti-fraud models treated all customer groups fairly. The rollout continues despite these concerns. The DWP maintains that the analysis does not present
“any immediate concerns of discrimination, unfair treatment or detrimental impact on customers”.
However, because so little information is available about the model, this claim cannot be independently verified to give the public confidence. Civil rights organisations, including the Public Law Project, are currently working on a potential claim against the DWP, including in relation to this model, on the basis that it may be unlawful.
The Government’s commitment to rolling out ADM was accompanied in November by a statement in the other place from the AI Minister, Feryal Clark, that the mandatory requirement to use the ATRS marked a significant acceleration towards adopting the standard. In response to a Written Question, the Secretary of State confirmed that, as part of phase 1 of the ADM rollout to the 16 largest ministerial departments plus HMRC, there is a deadline for them to publish their first ATRS records by the end of July 2024. Despite the Government’s statement, only eight ATRS reports have been published on the hub. The Public Law Project’s TAG project has discovered at least 74 areas in which ADM is being used, and those are only the ones it has been able to uncover through freedom of information requests and tip-offs from affected people. There is clearly a shortfall in the implementation and rollout of the ATRS across government departments.