Data repository selection: criteria that matter

Matt Cannon, Head of Open Research, Taylor & Francis, and Nick Everitt, Head of Digital Production, Taylor & Francis

Update, Oct 2020: Preprint published

Thank you to everyone who submitted feedback for this important project. A new preprint is now available, addressing over 100 comments received on the early draft.

Original post:

Since rolling out our suite of data policies in 2018, the Taylor & Francis team have been working on materials and services to help authors who are interested in sharing their data, or required to do so.

As well as developing workflows to support authors and an inbox to field queries, we have been contributing to a project looking at the repositories we suggest to authors. The project has been led through a working group of the Research Data Alliance, in collaboration with FAIRsharing and DataCite. It has recently released its first output and is looking for comments from the wider community. More details are below.

A request for comments

Publishers and journals are developing data policies to ensure that datasets, as well as other digital products associated with articles, are deposited and made accessible via appropriate repositories, in line with the FAIR Principles.

With thousands of options available, however, the lists of deposition repositories recommended by publishers often differ (ref1, ref2), and consequently the guidance provided to authors may vary from journal to journal. This is due to a lack of common criteria for selecting data repositories, but also to the fact that there is still no consensus on what constitutes a good data repository.

To tackle this, FAIRsharing and DataCite have joined forces with a group of publisher representatives (authors of this work – see below) who are actively implementing data policies and recommending data repositories to researchers. The result of our work is a set of proposed criteria that journals and publishers believe are important for the identification and selection of data repositories, which can be recommended to researchers when they are preparing to publish the data underlying their findings.

Our work intends to:

  • reduce complexity for researchers when preparing their submissions to journals,
  • increase efficiency for data repositories that currently have to work with all individual publishers, and
  • simplify the process of recommending data repositories for publishers.

Our work will make the implementation of research data policies more efficient and consistent, which may help to improve approaches to data sharing by promoting the use of reliable data repositories.

Although we recognize that researchers and other stakeholders play a role in the research data life cycle, in this first instance the target audience for our work is other journals and publishers, repository developers and maintainers, certification and other evaluation initiatives, and other policy makers.

These proposed criteria are intended to:

  • guide journals and publishers in providing authors with consistent recommendations and guidance on data deposition, and improve authors’ data sharing practices;
  • reduce the potential for confusion among researchers and support staff, and reduce duplication of effort by different publishers and data repositories;
  • inform data repository developers and managers of the features believed to be important by journals and publishers;
  • apprise certification and other evaluation initiatives, serving as a reference and perspective from journals and publishers;
  • drive the curation of data repository descriptions in FAIRsharing, which will enable display, filtering and searching based on these criteria.

We invite you to read the preprint article, which describes the work, its motivation, and its relation to other initiatives, and to provide us with feedback via this form.

Authors

Susanna-Assunta Sansone, FAIRsharing, University of Oxford, Oxford, OX1 3QG, UK
Peter McQuilton, FAIRsharing, University of Oxford, Oxford, OX1 3QG, UK
Helena Cousijn, DataCite, Welfengarten 1b, 30167 Hannover, Germany
Matthew Cannon, Taylor & Francis, Park Square, Milton Park, Abingdon, OX14 4RN, UK
Wei Mun Chan, eLife Sciences Publications, Ltd, Westbrook Centre, Milton Road, Cambridge, CB4 1YG, UK
Ilaria Carnevale, Elsevier, Radarweg 29, 1043NX, Amsterdam, The Netherlands
Imogen Cranston, F1000, Middlesex House, 34-42 Cleveland St, Fitzrovia, London W1T 4LB, UK
Scott Edmunds, GigaScience, BGI Hong Kong Tech Ltd., 26F A Kings Wing Plaza, 1 On Kwan St, Shek Mun, N.T., Hong Kong, China
Nicholas Everitt, Taylor & Francis, Park Square, Milton Park, Abingdon, OX14 4RN, UK
Emma Ganley, PLOS (Public Library of Science), Carlyle House, Carlyle Road, Cambridge CB4 3DN, UK. Current position: independent expert
Chris Graf, Wiley, 9600 Garsington Road, Oxford, OX4 2DQ, UK
Iain Hrynaszkiewicz, PLOS (Public Library of Science), Carlyle House, Carlyle Road, Cambridge CB4 3DN, UK
Varsha K. Khodiyar, Springer Nature, 4 Crinan Street, London, N1 9XW, UK
Thomas Lemberger, EMBO Press, Meyerhofstrasse 1, 69117 Heidelberg, Germany
Catriona J. MacCallum, Hindawi Ltd, 1 Fitzroy Square, London, W1T 5HF, UK
Kiera McNeice, Cambridge University Press, Shaftesbury Rd, Cambridge, CB2 8BS, UK
Hollydawn Murray, F1000, Middlesex House, 34-42 Cleveland St, Fitzrovia, London W1T 4LB, UK
Philippe Rocca-Serra, FAIRsharing, University of Oxford, Oxford, OX1 3QG, UK
Kathryn Sharples, Wiley, 9600 Garsington Road, Oxford, OX4 2DQ, UK
Marina Soares E Silva, Elsevier, Radarweg 29, 1043NX, Amsterdam, The Netherlands
Jonathan Threlfall, F1000, Middlesex House, 34-42 Cleveland St, Fitzrovia, London W1T 4LB, UK