Lords Chamber

My Lords, I speak in support of the noble Baroness, Lady Kidron, on Amendment 58, to which I have also put my name. Given the time, I will speak only about NHS datasets.
There have been three important developments since the Committee stage of this Bill in mid-December: the 43rd annual JP Morgan healthcare conference in San Francisco in mid-January, the launch of the AI Opportunities Action Plan by the Prime Minister on Monday 13 January and the announcement of the Stargate project in the White House the day after President Trump’s inauguration.
Taking these in reverse chronological order, it is not clear exactly how the Stargate project will be funded, but several US big tech companies and SoftBank have pledged tens of billions of dollars. At least $100 billion will be available to build the infrastructure for next-generation AI, and it may even rise to $500 billion over the next four years.
The UK cannot match these sums. The AI Opportunities Action Plan instead lays out how the UK can compete by using its own advantages: a long track record of world-leading AI research in our universities and some unique, hugely valuable datasets.
At the JP Morgan conference in San Francisco, senior NHS management had more than 40 meetings with AI companies. These companies all wanted to know one thing: how and when they could access NHS datasets.
It is not surprising, therefore, that it was reported in November that the national federated data platform would soon be used to train different types of AI models. The two models mentioned were OpenAI’s proprietary ChatGPT and Google’s medical AI, Med-Gemini, based on Google’s proprietary large language model, Gemini. Presumably, these models will be fine-tuned using the data stored in the federated data platform.
Amendment 58 is not about restricting access to UK datasets by OpenAI, Google or any other US big tech company. Instead, it seeks to maximise their long-term value, driven by strategic goals rather than short-term, opportunistic gains. By classifying valuable public sector datasets as sovereign data assets, we can ensure that the data is made available under controlled conditions, not only to public sector employees and researchers but to industry, including US big tech companies.
We should expect a financial return when industry is given access to a sovereign dataset. The first condition is a business model under which income is generated for the relevant public body, in this case the NHS, from the access fees paid by the companies holding authorised licences.
A second condition is signposted in the AI Opportunities Action Plan, whose recommendations have all been accepted by the Government. In the third section of the action plan, “Secure our future with homegrown AI”, Matt Clifford, the author of the plan, writes that
“we must be an AI maker, not just an AI taker: we need companies … that will be our UK national champions … Generating national champions will require a more activist approach”.
Part of this activist approach should be to give companies and organisations headquartered in the UK preferential terms of access to our sovereign data assets.
These datasets already exist in the NHS as minimum viable products, so we cannot afford to delay. AI companies are keen to access data in the federated data platform, which is NHS England’s responsibility, or in the secure data environments set up by the National Institute for Health and Care Research, NIHR.
I urge the Government to accept the principles of this amendment as they will provide the framework needed now to support NHS England and NIHR in their negotiations with AI companies.
I have signed Amendment 58. I also support the other amendment spoken to by the noble Baroness, although I did not get around to signing it. They both speak to the same questions, some of which have been touched on by both previous speakers.
My route into this was perhaps a little less analytic. I used to worry about the remark, made wittily by many, that data was the new oil, without really thinking about what it meant or could mean. The question began to settle in my mind: if data is indeed an asset, why is it not carried on balance sheets? Why does data held by companies, or even by the Government, not feature in some sort of valuation? Like oil held by a company or privately, it will eventually be used in some way; that use releases revenue which has to be accounted for, and there will be an accounting treatment. Yet as an accountant I have never seen a company’s accounts put a value on data. That is where I came from.
A sovereign data approach, which labels as national assets data of value to the economy held by the country rather than by a company, seems a way of putting into language what is essentially an accounting question, though perhaps not one we need to spend much time on in this debate. The noble Baroness, Lady Kidron, has gone through the amendment in a way that explains the process, the protections and the idea that the asset should be valued regularly and should account for any returns it makes. We have also heard about the way it features in other publications.
I want to take a slightly different part of the AI Opportunities Action Plan, which talks about data and states:
“We should seek to responsibly unlock both public and private data sets to enable innovation by UK startups and researchers and to attract international talent and capital. As part of this, government needs to develop a more sophisticated understanding of the value of the data it holds, how this value can be responsibly realised, and how to ensure the preservation of public trust across all its work to unlock its data assets”.
These are very wise words.
I end by saying that I was very struck by the figures released recently about the number of people who have opted out of the NHS’s data collection. I suspect there are Members present who may well have done so. I am, of course, happy to have my data used in ways that provide benefit, but I recognise the risks if it is not properly documented and if people are not aware of what they are giving up, or offering, in return for the value that will be extracted from it.
I am sure we all want more research and better research. We want research that will yield results. We also want value and to be sure that the data we have given up, which is held on our behalf by various agencies, is properly managed. These amendments seem to provide a way forward and I recommend them.
Lords Chamber

My Lords, I draw the House’s attention to my registered interests as a director of Oxehealth, a University of Oxford spin-out that uses AI for healthcare applications.
It is a great pleasure to follow the noble Lord, Lord Ranger, in this debate. Like him, in the time available I will speak mostly about the opportunities for the UK, more specifically in one sector. I congratulate the noble Baroness, Lady Stowell, on her excellent report, which she would probably like to know has been discussed positively in the common rooms in Oxford with the inquiry’s expert, Professor Mike Wooldridge.
It is not even 10 months since the report was published, but we already have new data points on the likely trajectories of large language models. The focus is shifting away from the pre-training of LLMs on ever-bigger datasets to what happens at inference time, when these models are used to generate answers to users’ queries. We are seeing the introduction of chain-of-reasoning techniques, for example in OpenAI’s o1 model, to encourage models to reason in a structured, logical and interpretable way. This approach may help users to understand the LLM’s reasoning process and increase their trust in the answers.
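To make that idea concrete, the following is a minimal illustrative sketch, in Python, of prompting a model to expose numbered reasoning steps before its final answer; query_llm is a hypothetical placeholder for whatever model interface is used, not any particular vendor’s API.

```python
# A minimal sketch of chain-of-reasoning prompting (illustrative only).
# `query_llm` is a hypothetical stand-in for whatever model interface is used;
# the prompt asks for numbered reasoning steps that a reviewer can inspect
# before trusting the final answer.

def build_reasoning_prompt(question: str) -> str:
    return (
        "Answer the question below.\n"
        "First list your reasoning as numbered steps, then give the final "
        "answer on a line starting with 'ANSWER:'.\n\n"
        f"Question: {question}"
    )

def answer_with_visible_reasoning(question: str, query_llm) -> tuple[str, str]:
    """Return (reasoning, answer) so the reasoning can be inspected by the user."""
    raw = query_llm(build_reasoning_prompt(question))
    reasoning, _, answer = raw.partition("ANSWER:")
    return reasoning.strip(), answer.strip()
```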
We are also seeing the emphasis shifting from text-only inputs into LLMs to multimodal inputs. Models are now being trained with images, videos and audio content; in fact, we should no longer call them large language models but large multimodal models—LMMs.
We are still awaiting the report on the AI Opportunities Action Plan, written by Matt Clifford, the chair of ARIA, but we already know that the UK has some extraordinary datasets and a strong tradition of trusted governance, which together represent a unique opportunity for the application of generative AI.
The Sudlow review, Uniting the UK’s Health Data: A Huge Opportunity for Society, published two weeks ago tomorrow, hints at what could be achieved through linking multiple NHS data sources. The review stresses the need to recognise that national health data is part of the critical national infrastructure; we should go beyond this by identifying it as a sovereign data asset for the UK. As 98% of the 67 million UK citizens receive most of their healthcare from the NHS, this data is the most comprehensive large-scale healthcare dataset worldwide.
Generative AI has the potential to extract the full value from this unique, multimodal dataset and deliver a step change in disease prevention, diagnosis and treatment. To unlock insights from the UK’s health data, we need to build a sovereign AI capability that is pre-trained on the linked NHS datasets and does not rely on closed, proprietary models such as OpenAI’s GPT-4 or Google’s Gemini, which are pre-trained on the entire content of the internet. This sovereign AI capability will be a suite of medium-scale sovereign LMMs, or HealthGPTs if you will, applied to different combinations of de-identified vital-sign data, laboratory tests, diagnostic tests, CT scans, MR scans, pathology images, discharge summaries and outcomes data, all available from the secure data environments—SDEs—currently being assembled within the NHS.
Linked datasets enable the learning of new knowledge within a large multimodal model; for example, an LMM pre-trained on linked digital pathology data and CT scans will be able to learn how different pathologies appear on those CT scans. Of course, very few patients will have a complete dataset, but generative AI algorithms can naturally handle the variability of each linked record. In addition, each LMM dataset can be augmented by text from medical textbooks, research papers and content from trusted websites such as those maintained by, for example, NHS England or Diabetes UK.
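By way of illustration only, the sketch below shows one way a linked, de-identified record with optional modalities might be represented; the field names are hypothetical and are not the schema of any NHS secure data environment. The optional fields capture the point that few patients have every modality, yet partial records can still contribute to pre-training.

```python
# Illustrative sketch of a linked, de-identified multimodal record.
# Field names are hypothetical, not an NHS secure data environment schema;
# Optional fields reflect the fact that few patients have every modality.

from dataclasses import dataclass
from typing import Optional

@dataclass
class LinkedPatientRecord:
    pseudonymous_id: str
    vital_signs: Optional[list[dict]] = None      # time-stamped observations
    lab_results: Optional[list[dict]] = None      # laboratory test results
    ct_scan_refs: Optional[list[str]] = None      # references to imaging objects
    pathology_refs: Optional[list[str]] = None    # digital pathology slides
    discharge_summary: Optional[str] = None       # free-text summary
    outcome: Optional[str] = None                 # coded outcome, where known

    def available_modalities(self) -> list[str]:
        """List which modalities this record actually contains."""
        return [name for name, value in vars(self).items()
                if name != "pseudonymous_id" and value is not None]
```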
A simple example of such an LMM will help to illustrate the power of this approach for decision support—not decision-making—in healthcare. Imagine a patient turning up at her GP practice with a hard-to-diagnose autoimmune disease. With a description of her symptoms, together with the results of lab tests and her electronic patient record—EPR—data, DrugGPT, which is currently under development, will not only suggest a diagnosis to the GP but also recommend the right drugs and the appropriate dosage for that patient. It will also highlight any possible drug-drug interactions from knowledge of the patient’s existing medications in her EPR.
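A rough sketch of how such a decision-support query might be assembled follows; the names suggest_treatment and DrugSuggestion are invented for illustration, and the interaction check is reduced to a toy lookup table, so that the flow of information from symptoms, laboratory results and existing medications to a suggestion the GP can review is explicit.

```python
# Sketch of a decision-support request of the kind described above.
# All names are hypothetical; the model suggests, a clinician decides.
# A real system would use a curated formulary for interaction checking.

from dataclasses import dataclass

@dataclass
class DrugSuggestion:
    diagnosis: str
    drug: str
    dose: str
    interaction_warnings: list[str]

# Toy interaction table for illustration only.
KNOWN_INTERACTIONS = {("warfarin", "aspirin"): "increased bleeding risk"}

def check_interactions(proposed_drug: str, current_medications: list[str]) -> list[str]:
    """Flag any known interaction between the proposed drug and existing medications."""
    warnings = []
    for existing in current_medications:
        note = KNOWN_INTERACTIONS.get((existing.lower(), proposed_drug.lower()))
        if note:
            warnings.append(f"{proposed_drug} with {existing}: {note}")
    return warnings

def suggest_treatment(symptoms: str, lab_results: dict,
                      current_medications: list[str], model) -> DrugSuggestion:
    """Assemble the inputs, ask the model, then flag interactions for the GP to review."""
    response = model(symptoms=symptoms, labs=lab_results)  # hypothetical model call
    return DrugSuggestion(
        diagnosis=response["diagnosis"],
        drug=response["drug"],
        dose=response["dose"],
        interaction_warnings=check_interactions(response["drug"], current_medications),
    )
```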
Of course, to build this sovereign LMM capability, the suite of HealthGPTs such as DrugGPT will require initial investment, but within five years such a capability should be self-funding. Access to any HealthGPT from an NHS log-in would be free; academic researchers, UK SMEs and multinationals would pay to access it through a suitable API, with a variable tariff according to the type of user. This income could be used to fund the HealthGPT lab, as well as the data wrangling and data curation activities of the teams maintaining the NHS’s secure data environments, with the surplus going to NHS trusts and GP practices in proportion to the amount of de-identified data which they will have supplied to pre-train the HealthGPTs.
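The funding arithmetic can be illustrated with a short worked sketch; the tariff figures and provider names are invented, but the structure, tiered access fees, running costs deducted first and the surplus shared in proportion to the records supplied, follows the description above.

```python
# Worked sketch of the funding model described above: tiered access fees,
# fixed running costs, and the surplus shared among data providers in
# proportion to the de-identified data they supplied. All figures invented.

ANNUAL_FEE_BY_USER_TYPE = {          # hypothetical tariff, in pounds
    "academic": 1_000,
    "uk_sme": 10_000,
    "multinational": 250_000,
}

def distribute_surplus(subscribers: dict[str, int],
                       running_costs: float,
                       records_supplied: dict[str, int]) -> dict[str, float]:
    """Return each provider's share of whatever income is left after costs."""
    income = sum(ANNUAL_FEE_BY_USER_TYPE[user_type] * count
                 for user_type, count in subscribers.items())
    surplus = max(income - running_costs, 0.0)
    total_records = sum(records_supplied.values())
    return {provider: surplus * n / total_records
            for provider, n in records_supplied.items()}

# Example: 200 academic groups, 50 SMEs and 10 multinationals generate
# £3.2m of income; after £2.5m of running costs, £0.7m is shared pro rata.
shares = distribute_surplus(
    subscribers={"academic": 200, "uk_sme": 50, "multinational": 10},
    running_costs=2_500_000,
    records_supplied={"Trust A": 400_000, "Trust B": 250_000, "GP federation C": 350_000},
)
```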
For the general public to understand the value of the insights generated by these sovereign LMMs, a quarterly report on key insights would be sent out through the NHS app to all its 34 million users—three-quarters of the adult population in the UK—except to those who have opted out of having their data used for research.
The time is right to build a sovereign AI capability for health based on a suite of large multimodal models, which will improve the health of the nation, delivering more accurate diagnoses and better-targeted treatments while maximising the value of our NHS sovereign data asset. I hope the Minister will agree with me that this is an opportunity which the UK cannot afford to miss.
Lords Chamber

My Lords, I declare an interest as a director of Oxford University Innovation. I thank the noble Viscount, Lord Stansgate, for securing this debate and for his thorough review of UK science and technology.
I was surprised, however, that there was no mention of the 2024 Nobel Prizes for Chemistry and Physics. Earlier this month, we celebrated the award of the Nobel Prize for Physics to Professor Geoff Hinton. The next day, we celebrated the award of the Nobel Prize for Chemistry to Sir Demis Hassabis. Both were home PhD students at British universities—Edinburgh and UCL. Both ended up working for the same US big-tech company, a company currently valued at $2 trillion.
What would it take for a deep-tech company to emerge from one of our research-intensive universities and become the UK’s first $1 trillion company? First, it would need a good supply of home PhD students. Unfortunately, there is mounting evidence that an unintended consequence of the rise in the undergraduate tuition fee to £9,000 is a loss of home PhD students in STEM subjects.
For example, between 2019 and 2022 there was a decrease of 39% in the number of UK-domiciled computer science graduates in doctoral study 15 months after graduation. We know that the least well-off students graduate with the most debt; now it looks as though many students from disadvantaged backgrounds will no longer consider PhD study a financially viable option. This trend must be reversed.
Secondly, the spinout ecosystem that exists around our research-intensive universities must be nurtured further, not only with human capital but with finance to start new companies, scale them and grow them. The £40 million over five years of proof-of-concept funding to seed new spinouts announced in yesterday’s Budget is welcome, but it should be targeted where it is needed—outside the golden triangle.
Thirdly, a coherent industrial strategy makes it more likely that the UK’s first trillion-dollar company will emerge by 2035. The UK on its own is not able to invest in all possible areas of science and technology. Even in my area, described as “digital and technologies” in the industrial strategy Green Paper, further choices will have to be made.
The House of Lords Select Committee report on large language models called for a “sovereign LLM capability”, but it would be pointless, because of the prohibitive costs, to compete with US big tech companies in training hyperscale LLMs such as GPT-4 or Gemini—other LLMs are also available. Instead, and this chimes in with the excellent maiden speech of the noble Baroness, Lady Freeman, we should be backing UK companies developing trustworthy AI using medium-scale LLMs and proprietary datasets, giving them privileged access to our sovereign data assets.
The final piece of the jigsaw is the scale-up funding available to British tech firms, which is still mostly missing. The target set by the Mansion House compact to have 10 of the UK’s largest pension funds invest 5% of their assets in private ventures needs to be met by 2030. If progress is also made on sorting out the issues highlighted in the Harrington report on foreign direct investment into the UK, there is a fighting chance that a British tech company, with its roots in one of our universities, will reach a trillion-dollar valuation within the next decade.