Last month on 17 July, the Commission convened its annual business meeting and held a small conference on “New Directions in Digital History of Science” at the Max Planck Institute for the History of Science, Berlin (MPIWG). The one-day conference included papers by four members of the governing board of the Commission, a joint talk by two scholars from the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW), and a talk by a scholar at MPIWG. The six papers were followed by an open discussion on collaboration in digital history of science.
Below, I provide a brief discussion of the conference, highlighting a few of the key points that were mentioned and debated. For a fuller treatment of these topics, the Commission will be publishing the conference talks through the MPIWG preprint series later this year.
The day began with the Commission’s business meeting, where various projects were discussed. In addition to ongoing projects, such as the World History of Science Online, the Commission turned its focus to the digital preservation and documentation of archives related to the institutions and conferences organized by historians of science. The Commission’s first goal is to find and list the locations of archival records related to all of the IUHPS and other major international meetings in history of science during the 20th and 21st centuries. With appropriate funding, we hope eventually to digitize many of these records and make them available. By focusing on archival material and its digital preservation, we look forward to producing a history of the discipline’s international development.
A second, related project was also proposed, focusing on dissertation records. The president of the Commission has found that doing research on existing digital dissertation records is currently frustrating, because the records are frequently poorly entered and inadequate. Since dissertations reveal a lot about the formation of a discipline through mentor-mentee relationships, the Commission agreed to encourage the building of a robust database of dissertations in the field of history of science. Because the IsisCB dataset contains much of this information, the project will begin by studying these data and building on them.
The conference that followed the business meeting explored the ways that current digital projects are being developed. Birute Railiene’s paper explored the problems and possibilities of doing research on digital dissertation data using the European NDLTD (Networked Digital Library of Theses and Dissertations). It was this research that demonstrated to her the need for a much more serious effort at cataloguing that would include all relevant fields—such as institution and dissertation advisor, both lacking in many records—and she also outlined other problems dealing with vocabulary, classification, and language.
The paper presented by Silvia Waisse, and written by her, Ana Maria Alfonso-Goldfarb, and Marcia H. M. Ferraz, explored the history of the CESIMA institute’s digital library project and the difficulties of classifying this library. She emphasized both the intellectual and technological hurdles that they have had to surmount. She began by pointing out how vastly knowledge systems have changed over the centuries, and why this has made it very difficult to build a coherent classification schema to cover topics in the history of science. New computational tools combined with human effort, she said, now make it possible to create a faceted system that will make the classification project more adequate.
Elise Hanrahan and Marcus Schnopf of the BBAW presented a fascinating paper on the work that their institution has done on digitizing and indexing archival resources, raising questions about current practices. One of the problems that Hanrahan pointed out is that scholars working on digital critical editions tend to vastly over-tag sources. When page images are readily available to everyone, for example, there is no need to describe every mark on a page, yet many scholars still do this. Over-tagging, she argued, is wasteful and time-consuming. Schnopf’s questions were directed toward issues surrounding the interlinking of data between authority records and archival texts. Among the many projects that BBAW has developed, the Scalable Architecture for Digital Editions is one that helps scholars do very useful interlinking of resources.
Dirk Wintergrün, head of IT at MPIWG, discussed in detail the various aspects of digital development, preservation, and publication for the Institute. Because the Institute is so heavily invested in disseminating the scholarship produced there, it has created many digital tools that give scholars open access to its resources. One theme that emerged from his talk was the need for institutions of this sort to adhere to open access and strict coding standards for sharing data. This, he explained, was the thinking behind MPIWG’s development of new publication tools for the future of scholarship.
My paper raised questions about how and when to expand bibliographic citations to include internet resources. The problems are quite complex. What kinds of documents and resources are worthwhile to include: do we index websites, blogs, and Twitter feeds just as we do books and articles? If we do include electronic media of this sort, at what level do we index them: is it worthwhile indexing individual posts for some blogs, for example, treating them, in essence, as academic journal articles? These questions raise the need to reexamine the purpose of bibliographies like the IsisCB in a digital age. What do we want them to do from now on?
The last paper, by Gavan McCarthy, director of the e-Scholarship Research Centre, discussed the ways in which his team has been able to display data using on-the-fly visualizations. These visualizations reveal types of information about the resource that have been very difficult to extract until now. When dealing with big data of this sort, the ability to create visualizations is often hampered by the sheer size of the dataset, so bounding the set to a manageable size is essential. McCarthy made clear that this kind of visualization is in its infancy, and that the future is bright for this work.
During the discussion periods, a number of important points came up. Urs Schoepflin, head of the MPIWG library, argued that classification has entered a new era with the advent of electronic resources. Libraries should not invest hours in classification systems, he contended, because this is the job of scholars, who are better prepared to understand how and where objects should be placed. Moreover, different collections require different schemes to address their topics adequately. Furthermore, as scholarship changes, so must ways of access. The most useful approach to classification, Schoepflin contended, is to create very flexible and open systems that can be pulled together in different ways.
Following the conference, three members of the Committee traveled to Wolfenbüttel, Germany, to see the Herzog August Bibliothek, which is one of the most significant rare book libraries in Europe, having lost virtually none of its holdings over its entire 500-year history. The pace at which the library is digitizing records and making them available to the public is phenomenal, and we were impressed by the extensive resources that Germany is devoting to cultural preservation and transmission.