In August 2022, the United States Office of Science and Technology Policy (OSTP) issued a memo (PDF) on ensuring free, immediate, and equitable access to federally funded research (a.k.a. the “Nelson memo”). Crossref is particularly interested in and relevant for the areas of this guidance that cover metadata and persistent identifiers—and the infrastructure and services that make them useful.
Funding bodies worldwide are increasingly involved in research infrastructure for dissemination and discovery.
Preprints have become an important tool for rapidly communicating and iterating on research outputs. There is now a range of preprint servers, some subject-specific, some based on a particular geographical area, and others linked to publishers or individual journals in addition to generalist platforms. In 2016 the Crossref schema started to support preprints and since then the number of metadata records has grown to around 16,000 new preprint DOIs per month.
TL;DR One of the things that makes me glad to work at Crossref is the principles to which we hold ourselves, and the most public and measurable of those must be the Principles of Open Scholarly Infrastructure, or POSI, for short. These ambitions lay out how we want to operate - to be open in our governance, in our membership and also in our source code and data. And it’s that openness of source code that’s the reason for my post today - on 26th September 2022, our first collaboration with the JSON Forms open-source project was released into the wild.
Ans: metadata and services are all underpinned by POSI.
Leading into a blog post with a question always makes my brain jump ahead to answer that question with the simplest answer possible. I was a nightmare English Literature student. ‘Was Macbeth purely a villain?’ ‘No’. *leaves exam*
Just like not giving one-word answers to exam questions, playing our role in the integrity of the scholarly record and helping our members enhance theirs takes thought, explanation, transparency, and work.
Having joined the Crossref team merely a week previously, the mid-year community update on June 14th was a fantastic opportunity to learn about the Research Nexus vision. We explored its building blocks and practical implementation steps within our reach, and within our imagination of the future.
Read on (or watch the recording) for a whistlestop tour of everything – from what on Earth is Research Nexus, through to how it’s taking shape at Crossref, to how you are involved, and finally – to what concerns the community surrounding the vision and how we’re going to address that.
Summary of presentations
The idea is simple in principle: scholarly records ought to be transparent – available to examine and learn from for all. Much of scientific production and communication these days has a heavy digital footprint so the Nexus is nothing but simply connecting the loose strands, right? Yet, as the scholarly record is a reflection of the continuous progress made by multiple actors within the context of scientific structures and processes, bringing the Nexus to life is a little short of simple.
“What we think of as metadata is expanding, and the notion of ‘content types’ is changing” – said Ginny Hendricks. A great majority of scholarly ‘objects’, whether they are data sets, research articles, monographs, or others, undergo many processes (including review, publication, licensing, correction, derivation) and influence knowledge and practice over time.
Making that progress visible and discoverable will allow for tracing the development of ideas and changes in our thinking over time. Transparency of the complete scholarly records will help to understand the impact of science funding and changing policies. It can support a more robust and comprehensive assessment of research, and contribute to improving integrity within as well as public trust in sciences.
Patricia Feeney has given us reasons for optimism in building a robust Nexus. She’s shown areas of greatest growth in metadata reported to Crossref and shared a public roadmap of types of information we’re asked to enable in the future. We’re seeing a true boom of datasets and peer review reports registrations, and the relationship metadata for our records is improving too. At the dawn of defaulting to open references, 44% of records we hold have associated references and that is growing. Provision of the newly enabled affiliation information (ROR IDs) is on the rise, as is the funder information. Some conversations and questions followed highlighting the need for further guidance in these areas.
To make a case for enriching metadata records, Martyn Rittman demonstrated examples of traceability of research influence on realities outside academia. He captured recent examples of data citations and other references present not just between scholarly papers, but also in policy documents and popular media. These allow for greater discoverability of literature – but also show the public influence and impact of the research and the work’s context in our wider society.
While Martyn shared our blue-skies aspiration to streamline Crossref’s APIs to offer insight to all these relationships with a single service, Joe Wass grounded those ambitions in the reality of technical work underway. His team’s attention is divided between three main areas. They continue to maintain and de-bug our existing infrastructure. They are developing self-service solutions for members. Finally, they are mapping and planning improved infrastructure, evaluating technology against the Research Nexus vision.
Bringing it back to the source (of metadata), Rachael Lammey offered a very practical guide to key activities enabling Research Nexus that all members can take on now. She highlighted the benefits of collecting and registering data citations, ROR IDs, and grant funding information. She went on to talk about challenges of subject classification (at a journal level) that our research and development efforts are focusing on at the moment.
Summary of discussions
Publishing has changed dramatically and our members recognise increasing opportunities for transparency of the scholarly record. Breaking the distant vision of Research Nexus down into actionable chunks made it more relatable for call participants. Many reflected on seeing their place in it properly for the first time. Yet, challenges remain and many were brought to the fore in the discussions.
The reliability and usability of the technology for registering metadata with Crossref needs to improve. We need to do better in supporting multi-language and multi-alphabet information. Not just developing systems anew, but also streamline the way content is registered and annotated, and continue to disambiguate the competing identifiers. Different content types, chiefly books, present specific challenges in this regard. Finally, making all that metadata accessible and usable is key to enabling insights from the rich data we collectively make available.
Technology is important, but won’t overcome the barriers that exist in the mindsets. Siloed thinking means that publishers may not be sensitive to benefits that improved relationship metadata could have for colleagues working on assessment, even within the same institutions. Greater guidance or best practices for new identifiers, such as ORCID, ROR, grants, would allow more publishers to get on board with the changes. Researchers often don’t help the cause either – many don’t realise the role and benefits of metadata for their work and are reluctant to provide rich information related to it, perceiving it as a bureaucratic burden.
In a nutshell, I learnt that – while the concept of Research Nexus is pretty complex – we’re all already participating in making it a reality. I’m grateful to the call participants for sharing their challenges and ideas so generously. It means we can work to address those. I’ll be sure to follow-up on requests for support and clearer guidelines about citing data, recording ROR IDs and grants information in the metadata, and we’ll engage our community on complex topics of record updates (corrections, retractions and versions). Be sure to keep in touch with the conversations on the Community Forum. I’ll see you there!