By Andrea Chiarelli
I have just returned from the 12th International Digital Curation Conference (IDCC), which took place in Edinburgh between February 20th and February 23rd, 2017. It was a great chance to catch up with some colleagues, and we presented a poster on where and how researchers currently share research data, the outcome of a Digital Assessment Framework survey we ran for Jisc in 2016.
In this post, I would like to take a moment to reflect upon the main topics that were discussed at the 2017 IDCC and share some thoughts with you. This year’s theme was “Upstream, downstream: embedding digital curation workflows for data science, scholarship and society” and the stars of the conference were Research Data Management (RDM) and digital curation. While the former has been around for a while, it is still far from being a mainstream practice and this is closely related to the low penetration of the latter in academic workflows. To put it in layman’s terms, the lack of digital curation leads to the equivalent of going to a library where the books haven’t been edited and properly sorted. Would you be able to easily find what you’re after and re-use it as a starting point for further work? Probably not!
Making the case for RDM
It seems that, at this point, the benefits of RDM should be clear. Nonetheless, they remain largely qualitative and little factual evidence is available. At the IDCC, Paul Stokes (senior co-design manager at Jisc) presented an interesting poster, partly informed by some work we did for Jisc last year, on the cost and value of RDM. The main feeling is that a robust case for RDM still needs to be made, especially considering that the UK government will invest £26 billion in research and scientific infrastructure over the next five years (see Paul’s poster for more details). I strongly believe that RDM should be pursued for a mix of somewhat idealised reasons such as data reproducibility, re-use, and long-term access. However, it feels like so far nobody has successfully managed to quantify the impact this would have on research and society.
Integrating the RDM culture within institutions
The question that follows is: How can institutions effectively advocate for RDM and data curation without resorting to qualitative and generic messages?
The University of Delft built an interesting representation of the advocacy and integration process by describing academic institutions as intricate machines, with the research community separated from the library services by a long series of interconnected gears. These represent, among others, the data itself, finance, legal services, human resources, ICT services, senior management, and academic leadership. This made me fully realise how the theme of the 2017 IDCC, embedding digital curation in academic workflows, is a very ambitious endeavour. Its success will depend not only on great technical and organisational skills, but also on wise stakeholder management.
The implementation of RDM and data curation practices in institutions is not as advanced as we’d like, as highlighted by Iain Hrynaszkiewicz (Head of Data Publishing at Springer Nature). In his presentation, he said that 88% of funder policies motivate researchers to share data but 54% of researchers find complying with policies difficult. If I may re-use the University of Delft’s metaphor, I would say that the gears in the RDM and data curation landscape need some lubricating, as there is a proven need for support on the researchers’ end! Thus, the next question is: How do we know whether we are going in the right direction?
One of the main steps in any long-term, impactful project is monitoring progress, and I would add that this is particularly important considering the multi-stakeholder nature of the RDM and data curation landscape. Every good manager knows that it is essential to periodically check how plans are changing and whether the actual achievements match the expected progress (see, e.g., the PRINCE2 concept of progress). In this field, initiatives such as the DCC Research Infrastructure Self-Evaluation (RISE) framework can help institutions critically review their RDM and data curation practices over time. During his presentation, William Michener (Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico) highlighted that the training offered on these topics should be assessed, too. This can be done, e.g., with the DataONE EEVA survey tool, which gives actionable feedback on how to improve training efforts.
Where does this leave us?
At the end of the 2017 IDCC, I felt that the most important topics related to RDM and data curation had been discussed. The birds of a feather sessions effectively covered the gaps in the programmed sessions, discussing how to motivate researchers and engage with senior management, but also very recent topics, such as rescuing data in a hostile political environment.
I think that, at this point, the supply side of the RDM landscape is becoming more mature every day. On the other hand, the demand side, represented by researchers and end users, still seems to lack motivation and commitment. Perhaps a targeted policy intervention could be the tipping point. However, this doesn’t seem achievable in the short term, because the problems at stake are inherently complex and variable.
In conclusion, I remain convinced that we need better and stronger evidence on why RDM and data curation constitute good practice. As a matter of fact, I am actively looking for it, so please do get in touch if you have any information on case studies of the benefits yielded by RDM and data curation.