Blog

Subject codes, incomplete and unreliable, have got to go

Patrick Polischuk – 2024 March 13

In Metadata, APIs

Subject classifications have been available via the REST API for many years, but they have never been complete or reliable and will soon be deprecated. The subject metadata element was born out of a Labs experiment intended to enrich the metadata returned via Crossref Metadata Search with All Science Journal Classification (ASJC) codes from Scopus. This feature was developed when the REST API was still fairly new, and we now recognize that the initial implementation worked its way into the service prematurely.

Metadata Retrieval

Analyse Crossref metadata to inform and understand research

Crossref is the sustainable source of community-owned scholarly metadata and is relied upon by thousands of systems across the research ecosystem and the globe.

[Figure: Some of the typical users (outer) and uses (inner) of Crossref metadata]

People using Crossref metadata need it for all sorts of reasons including metaresearch (researchers studying research itself such as through bibliometric analyses), publishing trends (such as finding works from an individual author or reviewer), or incorporation into specific databases (such as for discovery and search or in subject-specific repositories), and many more detailed use cases.

Increasing Crossref Data Reusability With Format Experiments

Martin Eve – 2024 January 19

In Metadata, Community, APIs

Every year, Crossref releases a full public data file of all of our metadata. This is partly a commitment to POSI and partly just what we do. We want the community to re-use our metadata and to find interesting ends to which they can be put! However, we have also recognized, for some time, that 170GB of compressed .tar.gz files, spread over 27,000 items, is not the easiest of formats with which to work.

2023 public data file now available with new and improved retrieval options

We have some exciting news for fans of big batches of metadata: this year’s public data file is now available. As in years past, we’ve wrapped up all of our metadata records into a single download for those who want to get started using all Crossref metadata records. We’ve once again made this year’s public data file available via Academic Torrents, and in response to some feedback we’ve received from public data file users, we’ve taken a few additional steps to make accessing this 185 GB file a little easier.

2022 public data file of more than 134 million metadata records now available

In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent like in years past. We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. The reason for this delay was that we wanted to make critical new metadata fields available, including resource URLs and titles with markup.

With a little help from your Crossref friends: Better metadata

We talk so much about more and better metadata that a reasonable question might be: what is Crossref doing to help? Members and their service partners do the heavy lifting to provide Crossref with metadata and we don’t change what is supplied to us. One reason we don’t is because members can and often do change their records (important note: updated records do not incur fees!). However, we do a fair amount of behind the scenes work to check and report on the metadata as well as to add context and relationships.

A ROR-some update to our API

Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁 The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants.

New public data file: 120+ million metadata records

Jennifer Kemp – 2021 January 19

In Metadata, Community, APIs

2020 wasn’t all bad. In April of last year, we released our first public data file. Though Crossref metadata is always openly available––and our board recently cemented this by voting to adopt the Principles of Open Scholarly Infrastructure (POSI)––we’ve decided to release an updated file. This will provide a more efficient way to get such a large volume of records. The file (JSON records, 102.6GB) is now available, with thanks once again to Academic Torrents.

Come for a swim in our new pool of Education materials

After 20 years in operation, and as our system matures from experimental to foundational infrastructure, it’s time to review our documentation. Having a solid core of education materials about the why and the how of Crossref is essential in making participation possible, easy, and equitable. As our system has evolved, our membership has grown and diversified, and so have our tools - both for depositing metadata with Crossref, and for retrieving and making use of it.

Helping researchers identify content they can text mine

TL;DR: Many organizations are doing what they can to aid in the response to the COVID-19 pandemic. Crossref members can make it easier for researchers to identify, locate, and access content for text mining. In order to do this, members must include elements in their metadata that: (1) point to the full text of the content; (2) indicate that the content is available under an open access license or that it is being made available for free (gratis).