(3 days, 20 hours ago)
Lords Chamber
Lord Vallance of Balham (Lab)
I thank the noble Lord for his questions, and I echo his points about the Front-Bench spokesmen on the other side. It is very clear that everyone has the same intent here, which is to try to sort this out, and I welcome those inputs.
I agree that this is cultural as well as technical; those points need to be looked at and will be as part of the review. There is an unwavering commitment to UK Biobank; it is an extraordinarily important resource for the future health of the country and for ensuring that new discoveries are made. We will continue to support UK Biobank.
Lord Tarassenko (CB)
I declare two conflicts of interest. First, I was a participant in UK Biobank, having been recruited in Oxford in 2007. My wife is also a participant and has been far more assiduous—for example, she underwent whole-body imaging about two years ago, which took a whole day. We are both very glad to be participants, but I do not think that the risk of reidentification is that high. It is not zero, but it is a very low probability. That is why I am happy to declare that conflict of interest.
Secondly, I was a UK Biobank principal investigator in a study carried out about 10 years ago, which was to develop AI algorithms for the automated identification of atrial fibrillation and which had about 100,000 participants, who undertook a test on an exercise bike as part of the UK Biobank dataset acquisition. In those days—before 2024, as the Minister said—data would be transferred and held securely on our servers under a material transfer agreement. When the paper was eventually published in 2020, we deleted all data on our servers, as all principal investigators are meant to do. As we have already heard, UK Biobank shifted its policy to a cloud-first model to enhance security, so what happened in the study I was involved in no longer takes place.
My question is about legacy data from before 2024. Does UK Biobank have any estimate, even a semi-accurate estimate, of non-compliant pre-2024 principal investigators? Does the Minister agree with me that UK Biobank should work with data privacy researchers (for example, those from the Oxford Internet Institute, as was mentioned by the noble Lord, Lord Clement-Jones) to be much more proactive at identifying non-compliance, as part of these investigations?
Lord Vallance of Balham (Lab)
Let me thank the noble Lord, Lord Tarassenko, and his wife for participating in UK Biobank, because the whole thing depends on that. He is quite right that, before 2024, the data was downloaded and people did their research on downloaded data. I have had this discussion with the chair of UK Biobank, which is going through a process of recontacting all the institutions—because this is an institutional agreement—to confirm that the data that was downloaded has been deleted. No further access will be granted until that is proven. That process is important, because that residual downloaded data is most vulnerable.
Lord Vallance of Balham (Lab)
UK Biobank, for all the reasons stated, is expensive to run, and it is run with a mix of funding from government, charities and industry, with the major funders being the UK Government and the Wellcome Trust over many years. The principle of it has been to give access to people; therefore, there is not a big cost put on its users. On our approach, we knew that the leak was in China, and we therefore immediately asked the embassy in China to link to the Government there to see if they could help us get these taken off the website. We did not make any conclusion about where they had come from; we just thought that that would probably be the fastest way to get these removed.
Lord Tarassenko (CB)
May I ask the Minister a follow-up question to the one I asked previously? He is absolutely right that UK Biobank should get in touch with the institutions where the principal investigators are based, but a lot of inadvertent leakage, if you will, of the data occurs from the researchers themselves—the principal investigators—who, believe it or not, will put the data on GitHub. They may leave the institution and go and work somewhere else while the data remains on their GitHub. That is why I asked whether the UK Biobank board could be a little more proactive and ask researchers from the Oxford Internet Institute, for example, who are very capable at looking at those types of issues, to look at individual GitHub sites and other sites where the data may still be, even though the institutions which those principal investigators were at would not be aware of it.
Lord Vallance of Balham (Lab)
Yes, we are very aware of the possibility that there are things on GitHub. There has been a GitHub issue related to this, which was identified earlier this year, and that will be part of what UK Biobank looks at. Going forward, that will not be possible because of the inability to download.
(10 months, 3 weeks ago)
Lords Chamber
Lord Vallance of Balham (Lab)
Oddly enough, I am aware of how difficult this is and how much work is needed. The requirements range from data to the ability to get it into a form that can be read and be interoperable; that is behind the national data library and the health data research service which we have announced. There are skills issues right across the Civil Service and elsewhere which need to be addressed, with skills increased, along with the application uptake of AI by businesses across the UK. All those are part of the AI Opportunities Action Plan, and there are things under way in each of those areas. I do not think this is straightforward. It will require some experimentation. There will be some things that will not work as well as expected and others that we will need to move faster on. I expect this to be a very dynamic field over the next few years.
Lord Tarassenko (CB)
My Lords, a strong emphasis in the AI Opportunities Action Plan is the development of human capital to maintain the UK as one of the leading countries in the world for AI. In the last four years for which the figures are available there has been a decrease of 39% in the percentage of UK-domiciled computer science graduates undertaking doctorates. This year, the situation is likely to be even worse, as for the first time EU students finishing a four-year undergraduate course in the UK will no longer count as home students. Is the Minister as concerned as I am by the sharp decrease in home students undertaking PhDs in computer science and AI? Are the Government considering any measures to reverse this trend, perhaps by reducing the interest rate on undergraduate loans to zero while graduate students are doing their PhDs?
Lord Vallance of Balham (Lab)
As the noble Lord points out, there has been a decrease in PhD funding through UKRI from 2018 to 2022. The overall number of PhD students has not gone down, but the sources of funding have become more diversified. It is an important issue for the UK to be good and capable in the numbers of PhD students we have. Two new programmes are being developed as part of the AI opportunities plan: the AI fellowship programme and the AI scholarship programme. Both will be important to ensure that we have the skills we need to deliver on the plan. I take the point about the number of students who have gone from computer science into PhDs. That is an area that we need to look at and understand. As the noble Lord is aware, some of it is a classification question, in relation to EU students, but there is no doubt that we need to keep the number of students doing PhDs up.