Data (Use and Access) Bill [HL]

Viscount Colville of Culross Excerpts
Tuesday 19th November 2024

Lords Chamber
Viscount Colville of Culross (CB)

I, too, thank the Minister for her introduction to this welcome Bill. I feel that most noble Lords have an encyclopaedic knowledge of this subject, having been around the course not just once but several times. As a newcomer to this Bill, I am aware that I have plenty to learn from their experience. I would like to thank the Ada Lovelace Institute, the Public Law Project, Connected by Data and the Open Data Institute, among others, which have helped me get to grips with this complicated Bill.

Data is the oil of the 21st century. It is the commodity which drives our great tech companies and the material on which the large language models of AI are trained. We are seeing an exponential growth in the training and deployment of AI models. As many noble Lords have said, it has never been more important than now to protect personal data from being ruthlessly exploited by these companies, often without the approval of either the data owners or the creators. It is also important that, as we roll out algorithmic use of data, we ensure adequate protections for people’s data. I, too, hope this Bill will soon be followed by another regulating the development of AI.

I would like to draw noble Lords’ attention to a few areas of the Bill which cause me concern. During the debates over the last data protection Bill, I know there were worries over the weakening of data subjects’ protection and the loosening of processing of their data. The Government must be praised for losing many of these clauses, but I am concerned, like some other noble Lords, to ensure adequate safeguards for the new “recognised legitimate interests” power given to data processors. I support the Government’s growth agenda and understand that this power will create less friction for companies when using data for their businesses, but I hope that we will have time later in the passage of the Bill to scrutinise the exemption from the three tests for processing data, particularly the balancing test, which are so important in forcing companies to consider the data rights of individuals. This is especially so when safeguarding children and vulnerable people. The test must not be dropped at the cost of the rights of people whose data is being used.

This concern is reinforced by the ICO stating in its guidance that this test is valuable in ensuring companies do not use data in a way that data subjects would not reasonably expect it to be used. It would be useful in the Explanatory Notes to the Bill to state explicitly that when a data processor uses “recognised legitimate interests”, their assessment includes the consideration of proportionality of the processing activity. Does the Minister agree with this suggestion?

The list of four areas for this exemption has been carefully thought through, and I am glad that the category of democratic engagement has been removed. However, the clause does give future Ministers a Henry VIII power to extend the list. I am worried; I have heard some noble Lords say that they are as well, and that the clause’s inclusion in the previous Bill also concerned other noble Lords. It could allow future Ministers to succumb to commercial interests and add new categories, which might be to the cost of data subjects. The Minister, when debating this power in the previous data Bill, reminded the House that the Delegated Powers and Regulatory Reform Committee said of these changes:

“The grounds for lawful processing of personal data go to the heart of the data processing legislation and therefore in our view should not be capable of being changed by subordinate legislation”.


The Constitution Committee’s report called for the Secretary of State’s powers in this area to be subject to primary and not secondary legislation. Why do these concerns not apply to Clause 70 in this Bill?

I welcome the Government responding to the scientific community’s demand that they should be able to reuse data for scientific, historic or statistical research. There will be many occasions when data was collected for the study of a specific disease and the researchers want to reuse it years later for further study, but they have been restricted by the narrow distinctions between the original and the new purpose. The Government have incorporated recitals from the original GDPR in the Bill, but the changes in Clause 67 must be read against the developments taking place in AI and the way in which it is being deployed.

I understand that the Government have gone to great efforts to set out a clear definition of scientific research in this clause. One criterion is the

“processing for the purposes of technological development or demonstration … so far as those activities can reasonably be described as scientific”,

and another is the publication of scientific papers from the study. But my fear is that AI companies, in their urgent need to scrape datasets for training large language models, will go beyond the policy intention in this clause. They might posit that their endeavours are scientific and may even be supported by academic papers, but when this is combined with the inclusion of commercial activities in the Bill, it opens the way for data reuses in creating AI data-driven products which claim they are for scientific research. The line between product development and scientific research is blurred because of how little is understood about these emerging technologies. Maybe it would help if the Bill set out what areas of commercial activity should not be considered scientific research. Can the Minister share with the House how the clause will stop attempts by AI developers to claim they are doing scientific research when they are reusing data to increase model efficiency and capabilities, or studying their risks? They might even be producing scientific papers in the process.

I have attended a forum with scientists and policymakers from tech companies using the training data for AI who admitted that it is sometimes difficult to define the meaning of scientific research in this context. This concern is compounded by Clause 77, which provides an exemption from the requirement in Article 13 of the UK GDPR for researchers and archivists to provide additional information to a data subject when reusing their data for different purposes, if it requires disproportionate effort to obtain the required information. I understand these provisions are drawn to help reuse medical data, but they could also be used by AI developers to say that contacting people for the reuse of datasets from an already trained AI model requires disproportionate effort. I understand there are caveats around this exemption. However, in an era when AI companies are scraping millions of pieces of data to train their models, noble Lords need to bear in mind it is often difficult for them to get permission from the data subjects before reusing the information for AI purposes.

I am impressed by the safeguards for the exemption for medical research set out in Clause 85. The clause says that medical research should be supervised by a research ethics committee to assess the ethical reuse of the data. Maybe the Government should think about using some kind of independent research committee with standards set by UKRI before commercial researchers are allowed to reuse data.

Like many other noble Lords, I am concerned about the changes to Article 22 of the UK GDPR put forward in Clause 80. I quite understand why the Government want to expand solely automated decision-making in order for decisions to be made quickly and efficiently. However, these changes need to be carefully scrutinised. The clause removes the burden on the data controller to overcome tests before implementing ADM, outside of the use of sensitive information. The new position requires the data subject to proactively ask if they would like a human to be involved in the decision made about them. Surely the original Article 22 was correct in making the processor think hard before making a decision to use ADM, rather than putting the burden on the data subject. That must be the right way round.

There are other examples, which do not include sensitive data, where ADM decisions have been problematic. Noble Lords will know that, during Covid, algorithms were used to predict A-level results which, in many cases, were flawed. None of that information would have been classified as sensitive, yet the decisions made were wrong in too many cases.

Once again, I am concerned about the Henry VIII powers which have been granted to the Secretary of State in new Article 22D(1) and (2). This clause is already extending the use of ADM, but it gives Secretaries of State in the future the power to change by regulation the definition of “meaningful human involvement”. This potentially allows for an expansion of the use of ADM; they could water down the effectiveness of human involvement needed to be considered meaningful.

Likewise, I am worried by the potential for regulations to be used to change the definition of a decision having a “significant adverse effect” on a data subject. The risk is that this could be used to exclude them from the relevant protection, but the decision could nevertheless still have a significant harmful effect on the individual. An example would be if the Secretary of State decided to exclude from the scope of a “significant decision” interim, rather than final, decisions. This could result in the exclusion of a decision taken entirely on the basis of a machine learning predictive tool, without human involvement, to suspend somebody’s universal credit pending an investigation and final decision of whether fraud had actually been committed. Surely some of the anxiety about this potential extension of ADMs would be assuaged by increased transparency around how they are used. The Bill is a chance for the Government to give greater transparency to how ADMs process our information. The result would be to greatly increase public trust.

The Algorithmic Transparency Recording Standard delivers greater understanding about the nature of tools being used in the public sector. However, of the 55 ADM tools in operation, only 9 reports have currently been subject to the ATRS. In contrast, the Public Law Project’s Tracking Automated Government register has identified at least 55 additional tools, with many others still to be uncovered. I suggest that the Government make it mandatory for public bodies to publish information about the ADM systems that they are using on the ATRS hub.

Just as importantly, this is a chance for people to obtain personal information about how an automated decision is made. The result would be that, if somebody is subject to a decision made or supported by AI or an algorithmic tool, they should be notified at the time of the decision and provided with a personalised explanation of how and why it was reached.

Finally, I will look at the new digital verification services trust framework being set up in Part 2. The Government must be praised for setting up digital IDs, which will be so useful in the online world. My life, and I am sure that of many others, is plagued by the vagaries of getting access to the various websites we need to run our lives, and I include the secondary security on our phones, which so often does not work. The effectiveness of this ID will depend on the trust framework that is created and on who is involved in building it.

At the moment, in Clause 28, the Secretary of State must consult the Information Commissioner and such other persons as the Secretary of State sees appropriate. It seems to me that the DVS will be useful only if it can be used across national boundaries. Interoperability must be crucial in a digital world without frontiers. I suggest that an international standards body should be included in the Bill. The most obvious would be W3C, the World Wide Web Consortium, which is the standards body for web technology. It was founded by Sir Tim Berners-Lee and is already responsible for the development of a range of web standards, from HTML to CSS. More than that, it is used in the beta version of the UK digital identity and attributes trust framework and has played a role in both the EU and the Australian digital identity services frameworks. I know that the Government want the Secretary of State to have flexibility in drawing up this framework, but the inclusion of an international standards body in the Bill would ensure that the Minister has them in the forefront of their mind when drawing up this much-needed framework.

The Bill is a wonderful opportunity for our country to build public trust in data-driven businesses and their development. It is a huge improvement on its predecessor; it goes a long way to ensure that the law has protections for data subjects and sets out how companies can lawfully use and reuse data. It is just as crucial in the era of AI that, during the passage of the Bill through the House, we do not leave the door open for personal data to be ruthlessly exploited by the big tech companies. We would all be damaged if that was allowed to happen.