Protection of Freedoms Bill

Debate between Lord Lucas and Baroness O'Neill of Bengarve
Thursday 12th January 2012

Grand Committee
Baroness O'Neill of Bengarve

My Lords, I want to speak to Amendments 147A, 147B, 148A, 148C and 148D. I will also comment, but much more briefly, on the more comprehensive Amendment 151, in the names of the noble Baronesses, Lady Brinton, Lady Warwick and Lady Benjamin, which I support, and I will comment very briefly on one of the amendments in the name of the noble Lord, Lord Lucas. Before doing so, I would like very much to thank the Minister and the Bill team for their exemplary courtesy and helpfulness in explicating their thinking on Clause 100—not, I think, the simplest clause of the Bill. If we have not reached agreement, it is not for lack of effort on their part.

Secondly, I would like to make it entirely clear that I am in favour of making scientific data more open. Science needs openness for its own purposes; it needs to have open data so that it is possible for others to check and challenge, and openness allows data to be put to unanticipated uses. Therefore, I am in much sympathy with the overall purpose of this part of the Bill. Of course, it used to be feasible—and it was standard practice—to publish data within articles in scientific journals. That is no longer feasible because of the size and complexity of many scientific data sets, so openness now has to be sought in other ways.

However, I believe that the Bill is based on too confident a view of the effectiveness and adequacy of the system of exemptions established in the Freedom of Information Act 2000 and of their capacity to avoid undesirable and unintended effects—particularly in this area, which is essentially that of scientific databases. Clause 100 proposes a seemingly minor, but in fact very substantial, change in the application of the freedom of information requirements to the release of data sets by public authorities. I will not at this stage say anything further about the use of the term “public authority”, as I think that we all understand that this means a publicly funded authority, which may, however, be a research institution or university that also has charitable status.

On the surface, Clause 100 simply requires the release of data sets in reusable electronic form, but I believe that in practice its demands will create a number of risks and problems. Let me therefore begin with Amendment 147A. The present drafting of the clause is, I believe, ambiguous, in that it requires data to be released upon request if the data are, or form part of, a data set held by a public authority. Amendment 147A seeks to restrict that requirement to “completed” parts of a data set held by a public authority. While it is reasonable to require that completed parts of still incomplete data sets be disclosed if requested—for example, the data pertaining to a past year in a continuously updated series—there is no benefit to anybody in disclosing an incomplete part of a data set. Indeed, requiring disclosure of incomplete parts of data sets could be misleading as well as damaging to research projects and to those provided with the incomplete, and perhaps misleading, data.

The clause would currently require disclosure of data sets while data were still being entered and had not yet been checked. At that stage, the incomplete part of the data set might be misleading. To take the example of a multi-centre clinical trial, requests for disclosure of incomplete parts of the data set could lead to the release of data that related only to a distinctive subset of patients whose data happened to become available at an earlier stage than those of other subsets of patients whose results might differ—that is, after all, the reason why the structure of clinical trials is quite elaborate. Such misleading releases might, I fear, falsely raise or dash the hopes of patients suffering from a serious condition, who would read the incomplete data set released as indicating that they had grounds for hope or despair.

I think that this issue arises because the drafting actually conflates two very different types of incompleteness in data sets. A data set may be incomplete because it relates to an ongoing project. In this case, completed parts of that data set relating, for example, to completed periods or phases in the project may indeed be available and could be released upon request.

In the second case, a data set or parts of a data set may be incomplete because the data are not yet fully available for entry, have not yet been entered or have not yet been checked. It could be highly misleading to require disclosure in the second case. Amendment 147A seeks to limit such requirements to disclose to the completed parts of data sets, where the danger of misleading is less.

Secondly, Amendment 147B requires that access is provided on request to data sets in reusable electronic form. Again, I stress that this is in principle an admirable thought. Where a data set is, for example, a relatively simple spreadsheet, this requirement would create no more difficulty for research databases than it does for government data sets. However, some scientific data sets are orders of magnitude larger and do not use standard software; even if it is feasible, it may be extremely costly to render them usable by others or, indeed, reusable even by others with technical skills. We have to remember that those of whom data are requested will not know the skills of those who request them. In such cases it may be necessary to provide metadata or to process data further in order to make access to them more feasible even for competent others. It is more usual to make research data available by archiving data sets or by setting out a publication or so-called data sharing scheme that will provide access for others and also secure the crucial benefits of professional data curation and data security.

Amendment 148B will permit holders of research data to undertake to provide those data using these normal and reliable routes. At present, the Freedom of Information Act grants an exemption once data sets have already been placed in the public domain in this way, such as in a data archive or through a data sharing scheme. This amendment seeks to postpone access where such archiving is not merely foreseen but is something that data holders have undertaken to provide. In effect, it would create a temporary exemption for the data concerned. The Minister might see this as an opening for procrastination. However, if he is sympathetic to the realities of the problem, he might perhaps wish to consider at least a version of the amendment that offers a limited time for this exemption—for example, six months after the completion of the relevant research project or phase of the research project. It is a question of trading off quality for instant gratification, I suppose.

Amendment 148A concerns the charging of fees. It seeks to address the real financial implications of seeking to make large and complex data sets available for reuse. The Bill provides for the charging of fees but does not allow public authorities to take account of the real costs of making data available to others. These costs may include not only additional checking and making metadata available but above all—and this is the main concern in the scientific community—the diversion of highly skilled and specialised time from research projects to the satisfaction of freedom of information requests. I have drafted the amendment to make it clear that it is the real costs of disclosure that matter. As noble Lords will have noted from the very helpful briefing provided for this section of the Bill by Universities UK, these costs can be very significant. It would not be reasonable, in my view, to require research projects or universities to bear these costs, which they cannot in principle have known about when seeking and obtaining the funding to do the research.

The last two amendments to which I shall speak very briefly are Amendments 148C and 148E, which are relatively uncontroversial. At present, the Bill restricts the operations that may be performed on data sets prior to required disclosure to calculation. That is just unrealistic. Those who compile data sets also need to check the data, which will be done using a variety of methods, and take steps to ensure data integrity and security, particularly at the point at which data are to be disclosed on request. Amendment 148C provides for this; Amendment 148E is consequential on Amendment 148C.

On Amendment 148, tabled by the noble Lord, Lord Lucas, from what I have already said and what the UUK briefing—now supported by the Academy of Medical Sciences, the Wellcome Trust and other scientific and medical bodies—has documented, the complexity of scientific databases rules out a solution along these lines. It would be very nice if it were feasible, but I believe that it is not feasible.

Amendment 151, tabled by the noble Baronesses, Lady Brinton, Lady Benjamin and Lady Warwick, is a substantial amendment. It takes the more radical step of seeking to define an additional exemption to freedom of information requirements and in the process achieves a number of the specific objectives that I have tried to achieve by more economical means in the amendments that I have tabled. However, their approach has one great advantage, which I believe—although I have racked my brains on this one—cannot be achieved by the more modest approach that I have taken. It recognises the risks to UK science and business and to the personal safety of researchers in certain fields—for example, involving work with animals—and to research subjects that will be created by Clause 100 if it is not amended. We are simply being naive if we imagine that we can rely on all those who request data respecting the intellectual property of those whose efforts produce data sets. We no longer live in a world where that is true, and we can all imagine many scenarios in which data disclosure is sought on behalf of others who work in jurisdictions where intellectual property is widely disrespected, with the aim of getting a free ride on the basis of work done by others without the payment of any fees. In those jurisdictions, legal remedies are not effective. I look forward to hearing a great deal more about Amendment 151. I beg to move.

Lord Lucas

My Lords, I have a clutch of amendments in this group. I will not at this moment comment on those proposed by the noble Baroness, Lady O’Neill, although I am looking forward to listening to others’ contributions on that subject. But it is very important that when a group of scientists ask us as a Government or community to take action based on results that they have published, the data underlying those results must be open to scrutiny. I understand that that has a difficult interaction with the questions raised by the noble Baroness, but I look forward to others’ contribution on how to solve that.

The first amendment that I have in the group is Amendment 148. I should declare that I am an extensive user of freedom of information legislation, particularly as regards universities, which I have found unutterably tiresome and difficult to deal with. One of their more tiresome habits is to refuse to provide information in anything other than PDF format. They get it in Excel, or whatever form, and translate it into PDF to provide it to me, merely to cause me extra work. I have to buy a program to suck it out of the PDF again. PDF is not a transmissible format, as it were, and they are merely trying to make life difficult by putting it in that format. So I would like to be sure that when data are provided they are provided in a properly reusable format. I have never come across a data set that cannot be reduced to tabbed, delimited text. Maybe that happens in a collection of tables, but data are essentially a simple thing. Although the data may be held in an immensely complex form in the program that the scientists are using, in any program that I have come across it should be easy—if only for the purposes of sharing with other people—to drop out at least the base data into relatively simple form.