RISM Data, Big Data

A peek inside the data

The RISM online catalog currently contains over 1,052,000 records. When you count these along with 101,000 authority files for people, 63,000 for institutions, and 32,000 for secondary literature, that's a lot of data!

All of it is freely available as linked open data under a Creative Commons license, but what can you do with it?

Our colleague Sandra Tuppen from the British Library and RISM UK, along with Stephen Rose and Loukia Drosopoulou, included RISM data in their project "A Big Data History of Music." Insights gained from their project were published last year in Early Music, and this summer an article appeared in Fontes Artis Musicae that took a look at the role that bibliographic datasets – the basis for their data – played in their project:

Sandra Tuppen, Stephen Rose, and Loukia Drosopoulou, "Library Catalogue Records as a Research Resource: Introducing ‘A Big Data History of Music.’" Fontes Artis Musicae 63, no. 2 (April-June 2016): 67-88. DOI: 10.1353/fam.2016.0011

The datasets used were RISM's data on printed music (series A/I and B/I) and music manuscripts (series A/II) and data from the British Library's electronic and print catalogs (including Early Music Online). Combining RISM's data, which the researchers called "the most comprehensive body of information on musical sources between ca. 1500 and 1800" (p. 70), with the British Library's own extensive holdings of music published in Britain, Ireland, and abroad resulted in a dataset of over two million records.

The researchers describe the analyses they performed on the data and show how working with such a large dataset opens up interesting ways of looking at music history. For example, they compared the number of publications of Palestrina's sacred music in Counter-Reformation cities with the number of publications in Rome and Venice. Using a network graph, they could also show the relationship between RISM's genre terms and composers, as evidenced by surviving printed music from before 1800.

The RISM data are now available as open data and linked open data, and the British Library has made its data available through its Free Data Services page. The Fontes article describes how the data from these bibliographic records had to be cleaned up and unified. Microsoft Excel was the main tool used to manipulate the data, though some visualizations were produced with tools developed by our colleagues at RISM Switzerland.
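For readers who would rather script this kind of cleanup than work in a spreadsheet, here is a minimal Python sketch of what "cleaning up and unifying" catalog records might look like. It is not the researchers' method (they mainly used Excel). The sample records, the field names (`composer`, `place`, `year`), and the `PLACE_ALIASES` table are all invented for illustration and do not reflect RISM's or the British Library's actual data format.

```python
import unicodedata
from collections import Counter

# Hypothetical sample records; field names are assumptions, not a real schema.
records = [
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Venezia", "year": "1590"},
    {"composer": "PALESTRINA, Giovanni Pierluigi da ", "place": "Venice", "year": "[1594]"},
    {"composer": "Palestrina, Giovanni Pierluigi da", "place": "Roma", "year": "1567"},
]

# Hypothetical alias table mapping variant place names to one preferred form.
PLACE_ALIASES = {"venezia": "Venice", "roma": "Rome"}


def normalize_text(value):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    value = unicodedata.normalize("NFC", value)
    return " ".join(value.split())


def normalize_place(value):
    """Map known variant spellings of a place to a single preferred form."""
    cleaned = normalize_text(value)
    return PLACE_ALIASES.get(cleaned.casefold(), cleaned)


def normalize_year(value):
    """Extract a four-digit year from strings such as '[1594]', else None."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return int(digits) if len(digits) == 4 else None


def clean(record):
    """Return a copy of the record with unified composer, place, and year."""
    return {
        "composer": normalize_text(record["composer"]).casefold(),
        "place": normalize_place(record["place"]),
        "year": normalize_year(record["year"]),
    }


cleaned = [clean(r) for r in records]

# Once the records are unified, counting publications per place is a one-liner.
counts = Counter(r["place"] for r in cleaned)
print(counts.most_common())
```

With the variant spellings unified, "Venezia" and "Venice" are counted as one place, which is the kind of step that has to happen before publication counts in different cities can be compared.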

Anyone is welcome to take our data out for a spin. If you do so, we'd love to hear about it!

 

Image: From the British Library's open data

 

Category: New publications




COPYRIGHT

All news entries are by the RISM Central Office staff unless otherwise noted. Reuse of RISM's own texts is permitted under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License—though please note that image credits and permissions are usually separate and noted at the bottom of each post. If authorship is attributed to someone else (indicated at the start of an entry and/or by a name following the word "Contact"), please contact the individual authors.