Technical Issues by the Expert Committee

Comments from the members of the Expert Committee regarding technical issues

Alfredo Tolmasquim (November 14) on the importance of standardized forms as they pertain to technical matters:

I completely agree with Peter on using the ICA’s standard for archival description, and on the importance of a standardized form. When I send information on an archival file belonging to the Museum of Astronomy to the AIP database, for example, Joe (Anderson), or another person, has to migrate this information to their database. This is excellent, since they can make whatever corrections are necessary and adapt it to their guidelines.

But, on the other hand, although this works for a limited number of files, I understand that the AIP (American Institute of Physics), to use the same example, would need a large staff to deal with the information they would receive from many archives in many knowledge areas and countries.

So we need a kind of open database, where each archive or Institution can enter the data itself. In return, however, we need very strict guidelines so that the archives or Institutions fill in the form in the same way, using the same vocabulary, the same fields, and so on. Of course, one Institution should host and maintain the database, but it will have no responsibility for data entry. At present, it is not difficult to build a database like this, to be fed from different places.
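A minimal sketch of how such an open database could enforce "the same vocabulary, the same fields" at the moment of entry. The field names, the vocabulary terms, and the `OpenDatabase` class are all hypothetical and are not taken from any existing system; real guidelines would be set by the committee.

```python
# Hypothetical shared guidelines: required fields and a controlled vocabulary.
REQUIRED_FIELDS = ("title", "creator", "dates", "holding_institution", "subject_area")
SUBJECT_VOCABULARY = {"astronomy", "physics", "chemistry", "medicine", "technology"}


def validate_entry(entry):
    """Return a list of guideline violations; an empty list means the entry passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(entry.get(field, "")).strip():
            problems.append("missing field: " + field)
    subject = str(entry.get("subject_area", "")).strip().lower()
    if subject and subject not in SUBJECT_VOCABULARY:
        problems.append("subject_area not in shared vocabulary: " + subject)
    return problems


class OpenDatabase:
    """Hosted by one institution, fed by many (an in-memory stand-in)."""

    def __init__(self):
        self.records = []

    def insert(self, contributor, entry):
        # The host does not correct the data; it only rejects entries
        # that do not follow the shared guidelines.
        problems = validate_entry(entry)
        if problems:
            raise ValueError("; ".join(problems))
        self.records.append({**entry, "contributed_by": contributor})
```

With this design the host institution carries no editorial burden: the contributor receives the list of violations and corrects the entry before resubmitting it.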

Peter Harper (November 14) on the importance of standardized forms (in archival databases), as it relates to technical matters.

The importance of the form in the creation of an archives database is that it controls the presentation of information on any archival collection in accordance with the ICA (International Council on Archives) international standard for archival description. The form can be used to send information about such archival collections to the ‘owner’ of a global database (e.g. the AIP and the ICOS), but of course it can also be used to create a number of local databases (e.g. national or subject databases) that might be merged at a later date to form a global IUHPS/DHS online archival sources database. In either case the form structures the information with respect to a standard that had international exchange of information by electronic means very much in mind when it was being created. Furthermore, to a considerable degree guidelines are implicit in the structure of the form itself, which can therefore be used with a limited amount of additional information, i.e. without a detailed mastery of a substantial technical standard.
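As an illustration of why a shared form makes later merging straightforward, here is a minimal sketch in which every local database holds records of the same shape. The field names are an assumption, loosely modeled on the elements that the ICA standard (ISAD(G)) treats as essential for exchange; the `DescriptionForm` class, the `merge_databases` function, and the sample reference codes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionForm:
    # Assumed field set, loosely following ISAD(G)'s essential elements.
    reference_code: str
    title: str
    creator: str
    dates: str
    extent: str
    level_of_description: str


def merge_databases(*local_databases):
    """Merge local databases into one, keyed by reference code.

    When the same reference code appears more than once, the first
    occurrence is kept; a real merge would need an agreed conflict rule.
    """
    merged = {}
    for database in local_databases:
        for form in database:
            merged.setdefault(form.reference_code, form)
    return merged


# Hypothetical local databases built from the same form.
national = [DescriptionForm("XX-0001", "Papers of a physicist", "A. Example",
                            "1920-1960", "12 boxes", "fonds")]
subject = [DescriptionForm("XX-0001", "Papers of a physicist", "A. Example",
                           "1920-1960", "12 boxes", "fonds"),
           DescriptionForm("YY-0042", "Observatory records", "An Observatory",
                           "1850-1900", "3 metres", "fonds")]
global_database = merge_databases(national, subject)
```

Because the form fixes the fields in advance, the merge needs no per-source conversion step.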

Alfredo Tolmasquim (September 10) on the possible types of databases:

I agree with Peter about the separate databases: one for archival and the other for bibliographical sources. I think we first need to make some decisions about the structure of the database, since it will influence the guidelines we produce. I imagine three basic types of databases:

The first one is like the AIP History Center database of archival sources. We fill in a form with the information on our archive and send it to a group or person tasked with including it in the database. This is the easiest way, since this person can check and correct the information before it is inserted into the database. In this case the guidelines can be more open and free, but the approach is not practical for a large amount of information. It might be used for the database of archival sources, but surely not for the bibliographical one. Even for the archival sources, though, we would need an Institution or person responsible for this. We could solve this problem by using a single database, with different Institutions or persons having permission to enter the information. I already do this with the Brazilian Bibliography of the History of Science, since I or any other person in the project can insert information into the database from any computer connected to the internet.

The second model is the one used by RLIN (Research Libraries Information Network). In this case, each participant produces its own database, using any kind of information system, just following the guidelines. RLIN receives the database and develops a program to migrate the data from that database to their own. It already works, and very well. However, in this case we would concentrate the information in one single Institution, such as RLIN. And in the case of RLIN, they charge high prices for access to the database.
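The migration step in this second model can be sketched as a per-source field mapping. This is only an illustration of the idea, not a description of how RLIN actually works; the local field names and the `migrate_record` function are hypothetical.

```python
# Hypothetical mapping from one contributor's local field names
# to the field names of the central database.
FIELD_MAP = {"titulo": "title", "produtor": "creator", "datas": "dates"}


def migrate_record(local_record, field_map):
    """Translate one local record into the central schema.

    Returns the migrated record plus any fields that had no mapping,
    so that a person can review them instead of losing them silently.
    """
    migrated = {}
    unmapped = {}
    for name, value in local_record.items():
        if name in field_map:
            migrated[field_map[name]] = value
        else:
            unmapped[name] = value
    return migrated, unmapped


record, leftovers = migrate_record(
    {"titulo": "Correspondence", "produtor": "An Institute", "notas": "fragile"},
    FIELD_MAP,
)
```

One such mapping has to be written and maintained for every contributing database, which is where the cost of this model is concentrated.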

The third model is the most complex, but also the most democratic and the richest. It uses the system suggested by Roberto Martins. If I understood correctly, each group constructs its own database, following very specific guidelines, and a search program looks for the required information in all the databases. I suppose (but I am not sure) that this is the system used by the Karlsruhe Virtual Catalog. If you don’t know this system, I would like to kindly invite you to visit it at the following address: http://www.ubka.uni-karlsruhe.de/hylib/en/kvk.html
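A minimal sketch of the third model, in which one search program queries several independently maintained databases and combines the answers. The in-memory lists stand in for remote databases, and every name here is hypothetical; this does not describe the actual implementation of the Karlsruhe Virtual Catalog.

```python
def make_local_search(records):
    """Build a search function over one group's own database (a list of dicts)."""
    def search(term):
        needle = term.lower()
        return [r for r in records if needle in r["title"].lower()]
    return search


def federated_search(term, sources):
    """Send the same query to every database and label each hit with its origin."""
    results = []
    for name, search in sources.items():
        try:
            hits = list(search(term))
        except Exception:
            # One unavailable database should not make the whole query fail.
            continue
        results.extend({**hit, "source": name} for hit in hits)
    return results


def unavailable(term):
    raise ConnectionError("database offline")


# Hypothetical participating databases.
sources = {
    "group_a": make_local_search([{"title": "Letters on optics"}]),
    "group_b": make_local_search([{"title": "Optics lecture notes"},
                                  {"title": "Chemistry notebooks"}]),
    "group_c": unavailable,
}
hits = federated_search("optics", sources)
```

The search only works if every group exposes the same fields (here, `title`), which is why this model depends on very specific guidelines.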

The last two models have the problem posed by Peter of compatibility with the databases already built, and we need to think how to deal with this problem.

Maybe our first step should be to create a small group of database experts to propose the system, the platform, and the other technical points we should adopt in the project. The decision, however, is not just technical, but also political.

At the same time, we could create two different groups to start thinking about the guidelines for the two different databases. These two groups could send their requirements to the technical one, and they would elaborate the guidelines in accordance with the information system that will be used.

Peter Harper (September 4) on the need for close collaboration between a technical committee and an advising committee for bibliographies and archives.

When I was discussing the project with colleagues here they were very concerned that problems might arise from the divorce (separation in space) between those managing the database at UNAM and the bibliographical expertise (this concern can apply to archival expertise as well). I would hope that in setting up any advisory or supervisory group of experts it would be possible to include someone close at hand at UNAM.

Peter Harper (September 2) on the need for an initial website.

I think this is very important because I should like to see a basic framework on the project website for the survey of existing bibliographical and archival sources, with a number of examples in the various categories so that members of our international history of science community can be challenged to fill in gaps from their specialist areas of knowledge.

In this context I have been considering very carefully Roberto de Andrade Martins’s document on Strategies for the Development of Databases. It would be very good to have a link from the project website to this document as a reference and/or discussion document, but I think in the first instance at least it might be better to adopt a simpler framework for the survey.

Roberto Martins (May 5) on the general structure of the project, as it pertains to certain technical matters.

Following the main ideas of the initial document of the project, I suggest that the next steps of the experts committee should be:

  1. The building of the initial Internet site, containing as a minimum: the aims of the DHS project, links to some existing databases (or projects), and some new proposals. I don’t know if Prof. Saldaña already has sufficient information for building this initial site (first front, or phase one), or if he needs some input from the committee. I have collected some useful information at this address:

    http://www.ifi.unicamp.br/~ghtc/sources/sources1.htm

    This is just a preliminary document, and it should be improved and complemented.

  2. To produce a preliminary analysis of all types of databases (on history of science primary sources) that exist and that could be produced, in principle (a description grid) – so that it will be possible to search for existing databases and to discuss the existing gaps.

    I suppose that the scope of each database can be well described by specifying the kind of sources it intends to include (printed books, articles published in periodicals, maps, scientific instruments, historical scientific collections, manuscript books, archival collections, iconography, archaeological monuments of scientific relevance, etc.), the scientific fields covered (natural sciences, technology, medicine, social sciences, etc.), the period covered (Antiquity, Middle Ages, from the Xth to the Yth century, etc.), and possibly language and/or nationality constraints. I wonder if other descriptive criteria (besides those cited above) should be introduced.

  3. Suggestion of names of other experts, institutions and commissions (including those belonging to specific societies) that could be contacted to ask for co-operation, and the possible formation of sub-committees. Those sub-committees could also provide the stimulus for new projects.
  4. A careful appraisal of available resources and projects, and an assessment of desiderata to be filled in the future, as regards database content. This will lead to a second version of the central Internet site.
  5. An analysis of desirable database instrumentality – what kind of information historians of science would like to obtain, and what types of search strategies the databases should permit. I have presented some suggestions in the Internet document cited above.
  6. Only after that, in my opinion, should we ask the help of technical assessors (archivists, librarians, computer analysts, etc.) concerning the structure of the several types of desirable databases. This should lead to the technical guidelines of the project.
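The description grid proposed in step 2 can be sketched as a small data structure: each database is described by the kinds of sources, the scientific fields, and the periods it covers, and the gaps are the grid cells that no database covers. The class, the function, and the sample entries below are hypothetical, and the three criteria are only those named in the text; language or nationality constraints could be added as further axes.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DatabaseScope:
    """Scope of one database, described along three axes of the grid."""
    name: str
    source_kinds: frozenset
    fields: frozenset
    periods: frozenset


def find_gaps(scopes, source_kinds, fields, periods):
    """Return the (source kind, field, period) cells covered by no database."""
    covered = set()
    for scope in scopes:
        covered.update(product(scope.source_kinds, scope.fields, scope.periods))
    return sorted(set(product(source_kinds, fields, periods)) - covered)


# Hypothetical survey results.
scopes = [
    DatabaseScope("db_one", frozenset({"archival collections"}),
                  frozenset({"natural sciences"}), frozenset({"20th century"})),
    DatabaseScope("db_two", frozenset({"printed books"}),
                  frozenset({"natural sciences", "medicine"}),
                  frozenset({"20th century"})),
]
gaps = find_gaps(
    scopes,
    source_kinds={"archival collections", "printed books"},
    fields={"natural sciences", "medicine"},
    periods={"20th century"},
)
```

In this sample the only uncovered cell is archival collections on medicine in the 20th century, which is the kind of gap the survey in step 4 would then report.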

Parallel to this work of the experts committee, I think that after launching the initial site the DHS could already begin to publicize the project and to ask all interested groups to submit proposals and suggestions.