CAMENA - Latin Texts of Early Modern Europe: the XML files
Background
CAMENA (Corpus Automatum Multiplex Electorum Neolatinitatis Auctorum) was a DFG-funded
research project, active from 1999 to 2013. It was carried out at the Chair of
German Literature (Modern Period) of the German Department of Heidelberg
University, in cooperation with the Information Technology Center and the
Library of the University of Mannheim, and was led by Prof. Dr. Wilhelm Kühlmann.
We particularly thank Wolfgang Schibel, the spiritus movens of the project, as well as Reinhard Gruhl, Emir Zuljevic, Heinz Kredel, and other members of the team.
In our opinion, CAMENA was one of the most important Neo-Latin
digital initiatives. Since its machine-readable texts were made
available under the Creative Commons Attribution / Share Alike license,
we are republishing the XML files of all the CAMENA collections here as
a GitHub repository, with all the caveats of the original project
regarding citation and reliability, and with the intent of enabling further
digital experiments with CAMENA Neo-Latin material.
Again, sincere gratitude goes to colleagues involved in CAMENA for all their efforts, and for making this possible. Sumus nani gigantum humeris insidentes.
Contents
In CAMENA, the texts are divided into five collections: POEMATA, Neo-Latin poetry composed by German authors; HISTORICA & POLITICA, Latin historical and political writing; THESAURUS ERUDITIONIS, a reference collection of dictionaries and handbooks from the period 1500-1750; CERA,
printed Latin letters, mostly by German scholars, from the period
1530-1770; and ITALI, works by Italian Renaissance humanists born before
1500. The collection ITALI has no XML files, so it was not included in
this repository.
We were not able to find information on the exact number of XML files produced by CAMENA. This repository contains 949 XML files in POEMATA, 382 files in HISTORICA & POLITICA, 296 files in THESAURUS ERUDITIONIS, and 124 files in CERA, for a total of 1751 files. These files contain 50,458,045 words (tokens) below the text element (more on this in Word count).
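To make the idea of counting "words below the text element" concrete, here is a minimal sketch in Python. It is not the procedure used to obtain the figure above. It assumes that a token is any whitespace-separated string, that the first element named text marks the start of the counted content, and that each collection sits in its own directory of .xml files. The function names are hypothetical. Real CAMENA files may rely on DTDs or character entities that the standard-library parser cannot resolve, so a more tolerant parser may be needed.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# File counts as reported in this post; the sum should be 1751.
REPORTED_FILE_COUNTS = {
    "POEMATA": 949,
    "HISTORICA & POLITICA": 382,
    "THESAURUS ERUDITIONIS": 296,
    "CERA": 124,
}


def local_name(tag):
    """Drop a namespace prefix such as '{http://www.tei-c.org/ns/1.0}text'."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def count_text_tokens(xml_data):
    """Count whitespace-separated tokens below the first <text> element.

    Accepts str or bytes. Header content outside <text> is ignored.
    Returns 0 if the document has no <text> element.
    """
    root = ET.fromstring(xml_data)
    for element in root.iter():
        if local_name(element.tag) == "text":
            # Assumption: text chunks are joined with a space, so adjacent
            # elements never fuse into one token (a word split by inline
            # markup would therefore count as two tokens).
            return len(" ".join(element.itertext()).split())
    return 0


def count_collection(directory):
    """Return (number of XML files, total tokens) for one directory."""
    files = sorted(Path(directory).glob("*.xml"))
    tokens = sum(count_text_tokens(path.read_bytes()) for path in files)
    return len(files), tokens
```

Calling count_collection on a local copy of one collection directory would return the number of files and the token total for that collection. Because the tokenization rule here is an assumption, the result need not match the figure reported above.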
Not all CAMENA XML files provide the full text of the digitized source. For example, the file Arenhold_conspectus_index_II.xml in CERA offers only the table of contents of the digitized volume of Arenhold, Silvester Johannes: Conspectus
Bibliothecae Universalis Historico-Literario-Criticae Epistolarum :
Typis Expressarum Et M[anu]S[crip]tarum, Illustrium Omnis Aevi Et
Eruditissimorum Auctorum. - Hanoverae : Sumptibus Hereduum [!] Foersterianorum, 1746. In the CAMENA-CERA version,
the table of contents contains links to the respective page images of the
digitized book. We did not try to exclude such partial XML publications
from this repository.