Large Language Models and Generative AI (Communications and Digital Committee Report)

Lord Tarassenko Excerpts
Thursday 21st November 2024

Lords Chamber
Lord Tarassenko (CB)

My Lords, I draw the House’s attention to my registered interests as a director of Oxehealth, a University of Oxford spin-out that uses AI for healthcare applications.

It is a great pleasure to follow the noble Lord, Lord Ranger, in this debate. Like him, in the time available I will speak mostly about the opportunities for the UK, more specifically in one sector. I congratulate the noble Baroness, Lady Stowell, on her excellent report, which she would probably like to know has been discussed positively in the common rooms in Oxford with the inquiry’s expert, Professor Mike Wooldridge.

It is not even 10 months since the report was published, but we already have new data points on the likely trajectories of large language models. The focus is shifting away from the pre-training of LLMs on ever-bigger datasets to what happens at inference time, when these models are used to generate answers to users’ queries. We are seeing the introduction of chain-of-reasoning techniques, for example in OpenAI’s o1, to encourage models to reason in a structured, logical and interpretable way. This approach may help users to understand the LLM’s reasoning process and increase their trust in the answers.

We are also seeing the emphasis shifting from text-only inputs into LLMs to multimodal inputs. Models are now being trained with images, videos and audio content; in fact, we should no longer call them large language models but large multimodal models—LMMs.

We are still awaiting the report on the AI opportunities action plan, written by Matt Clifford, the chair of ARIA, but we already know that the UK has some extraordinary datasets and a strong tradition of trusted governance, which together represent a unique opportunity for the application of generative AI.

The Sudlow review, Uniting the UK’s Health Data: A Huge Opportunity for Society, published two weeks ago tomorrow, hints at what could be achieved through linking multiple NHS data sources. The review stresses the need to recognise that national health data is part of the critical national infrastructure; we should go beyond this by identifying it as a sovereign data asset for the UK. As 98% of the 67 million UK citizens receive most of their healthcare from the NHS, this data is the most comprehensive large-scale healthcare dataset worldwide.

Generative AI has the potential to extract the full value from this unique, multimodal dataset and deliver a step change in disease prevention, diagnosis and treatment. To unlock insights from the UK’s health data, we need to build a sovereign AI capability that is pre-trained on the linked NHS datasets and does not rely on closed, proprietary models such as OpenAI’s GPT-4 or Google’s Gemini, which are pre-trained on the entire content of the internet. This sovereign AI capability will be a suite of medium-scale sovereign LMMs, or HealthGPTs if you will, applied to different combinations of de-identified vital-sign data, laboratory tests, diagnostic tests, CT scans, MR scans, pathology images, discharge summaries and outcomes data, all available from the secure data environments (SDEs) currently being assembled within the NHS.

Linked datasets enable the learning of new knowledge within a large multimodal model; for example, an LMM pre-trained on linked digital pathology data and CT scans will be able to learn how different pathologies appear on those CT scans. Of course, very few patients will have a complete dataset, but generative AI algorithms can naturally handle the variability of each linked record. In addition, each LMM dataset can be augmented by text from medical textbooks, research papers and content from trusted websites such as those maintained by, for example, NHS England or Diabetes UK.

A simple example of such an LMM will help to illustrate the power of this approach for decision support—not decision-making—in healthcare. Imagine a patient turning up at her GP practice with a hard-to-diagnose autoimmune disease. With a description of her symptoms, together with the results of lab tests and her electronic patient record—EPR—data, DrugGPT, which is currently under development, will not only suggest a diagnosis to the GP but also recommend the right drugs and the appropriate dosage for that patient. It will also highlight any possible drug-drug interactions from knowledge of the patient’s existing medications in her EPR.

Of course, to build this sovereign LMM capability, the suite of HealthGPTs such as DrugGPT will require initial investment, but within five years such a capability should be self-funding. Access to any HealthGPT from an NHS log-in would be free; academic researchers, UK SMEs and multinationals would pay to access it through a suitable API, with a variable tariff according to the type of user. This income could be used to fund the HealthGPT lab, as well as the data wrangling and data curation activities of the teams maintaining the NHS’s secure data environments, with the surplus going to NHS trusts and GP practices in proportion to the amount of de-identified data which they will have supplied to pre-train the HealthGPTs.

For the general public to understand the value of the insights generated by these sovereign LMMs, a quarterly report on key insights would be sent out through the NHS app to all its 34 million users—three-quarters of the adult population in the UK—except to those who have opted out of having their data used for research.

The time is right to build a sovereign AI capability for health based on a suite of large multimodal models, which will improve the health of the nation, delivering more accurate diagnoses and better-targeted treatments while maximising the value of our NHS sovereign data asset. I hope the Minister will agree with me that this is an opportunity which the UK cannot afford to miss.

Science and Technology: Economy

Lord Tarassenko Excerpts
Thursday 31st October 2024

Lords Chamber
Lord Tarassenko (CB)

My Lords, I declare an interest as a director of Oxford University Innovation. I thank the noble Viscount, Lord Stansgate, for securing this debate and for his thorough review of UK science and technology.

I was surprised, however, that there was no mention of the 2024 Nobel Prizes for Chemistry and Physics. Earlier this month, we celebrated the award of the Nobel Prize for Physics to Professor Geoff Hinton. The next day, we celebrated the award of the Nobel Prize for Chemistry to Sir Demis Hassabis. Both were home PhD students at British universities, Edinburgh and UCL. Both ended up working for the same US big-tech company, a company currently valued at $2 trillion.

What would it take for a deep-tech company to emerge from one of our research-intensive universities and become the UK’s first $1 trillion company? First, it would need a good supply of home PhD students. Unfortunately, there is mounting evidence that an unintended consequence of the rise in the undergraduate tuition fee to £9,000 is a loss of home PhD students in STEM subjects.

For example, from 2019 to 2022, there has been a decrease of 39% in the number of UK-domiciled computer science graduates in doctoral study 15 months after graduation. We know that the least well-off students graduate with the most debt; now it looks as though many students from disadvantaged backgrounds will no longer consider PhD study as a financially viable option. This trend must be reversed.

Secondly, the spinout ecosystem that exists around our research-intensive universities must be nurtured further, not only with human capital but with finance to start new companies, scale them and grow them. The £40 million over five years of proof-of-concept funding to seed new spinouts announced in yesterday’s Budget is welcome, but it should be targeted where it is needed—outside the golden triangle.

Thirdly, a coherent industrial strategy makes it more likely that the UK’s first trillion-dollar company will emerge by 2035. The UK on its own is not able to invest in all possible areas of science and technology. Even in my area, described as “digital and technologies” in the industrial strategy Green Paper, further choices will have to be made.

The House of Lords Select Committee report on large language models called for a “sovereign LLM capability”, but it would be pointless, because of the prohibitive costs, to compete with US big tech companies in training hyperscale LLMs such as GPT-4 or Gemini—other LLMs are also available. Instead, and this chimes in with the excellent maiden speech of the noble Baroness, Lady Freeman, we should be backing UK companies developing trustworthy AI using medium-scale LLMs and proprietary datasets, giving them privileged access to our sovereign data assets.

The final piece of the jigsaw is the scale-up funding available to British tech firms, which is still mostly missing. The target set by the Mansion House compact to have 10 of the UK’s largest pension funds invest 5% of their assets in private ventures needs to be met by 2030. If progress is also made on sorting out the issues highlighted in the Harrington report on foreign direct investment into the UK, there is a fighting chance that a British tech company, with its roots in one of our universities, will reach a trillion-dollar valuation within the next decade.