A group of researchers publishes an article based on their taxpayer-supported research. The article is full of analysis, graphs, charts and statistical comparisons. But none of the underlying “raw” data is in the report or anywhere in the public domain.

An interested and informed researcher thinks something is odd in the observations or the analysis, or sees a way to possibly extract more results from the data. He emails the researchers and asks to see the data. The request is refused.

What next? Well, in Australia, it’s the end of the story. The national policy guideline only “recommends” data generated with public funds and used in a publication be given to anyone making a reasonable request. In Britain, Canada and the US, it is the “expectation” — and, in some cases, the obligation — that researchers meet the request.

Open access to already-used data has two benefits. First, it allows any reader to check the calculations made by researchers in a published article. Although scientific articles are reviewed by other knowledgeable scientists prior to publication, these peer reviewers almost never ask for the original data to see if the data support the reported calculations and relationships. They just take the calculations on faith.

This state of affairs might suit some researchers, as it works to hide errors. Other researchers in the field might sense something is wrong and simply avoid giving too much credence to the result. But this leaves researchers outside the field and the general public none the wiser. The publication on the CV continues to look as robust as any other.

Without open access to the data used in a publication there is, in effect, no way to verify the researchers’ claims, and they can use their unchecked and effectively uncheckable claims to advance their careers and seek additional public funding.

The second major benefit of open access to used data is that a reader may see a better way to do the analysis or see another interesting consequence of the data. A further insight may embarrass the original researchers, but the protection of a few egos hardly outweighs the benefits to society of another researcher squeezing additional value out of the data society paid for in the first place.

Occasionally, there may be a legitimate need to withhold the data used in an article. Perhaps a second article is planned or a patent is pending. But, if so, this could be stated in the article and an embargo period — say, no more than a year — specified. In reality, however, second articles are rarely based exclusively on the original data set (why not let it all hang out at first go?) and, when blended with new data, are protected from pre-emption by the rightly privileged new data.

There is now much talk about encouraging innovation in Australia. A policy of open access to already published data would increase the quality and productivity of Australian science and lead to greater innovative use of that science.

*Allen Greer is a biologist who writes about nature and science. He became interested in the issue discussed here when he was refused a small data set from a published article based on research supported by private donations and two Australian Research Council grants worth a total of nearly $1 million.
