Tuesday, November 10, 2020

Callimachus: A Regest of Greek Papyri


Callimachus is an automated regest of published papyri and ostraka, i.e., a processed extract of the formal contents of the texts hosted on the Papyri.info site. Additional information about the date, origin, material, etc., of each papyrus (from the HGV database) is included to enrich the queries. Callimachus is currently in beta and contains data on documentary papyri only; literary papyri will be added in the coming weeks. Lexical information about the papyri is held in the sibling Anagnostes database, soon to be released here.

Callimachus contains three kinds of information.

The first kind covers several countable features of the text, as encoded by the Papyri.info project: for instance, how many words, letters, gaps, letters per line, scribal hands, etc., can be found in each document. These data were extracted while parsing the documents from the Integrating Digital Papyrology (Papyri.info) GitHub repository. Lexical information belongs to another project, Anagnostes, soon to appear here.
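As an illustration of what counting such features might look like, here is a minimal Python sketch. It is not the parser used by Callimachus. It assumes the documents are EpiDoc (TEI) XML in which `<lb/>` marks a line beginning, `<gap/>` a lacuna and `<handShift/>` a change of scribal hand; the sample document, the `count_features` function and the rule "hands = 1 + number of hand shifts" are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by EpiDoc files.
TEI = "{http://www.tei-c.org/ns/1.0}"

# Tiny invented sample, NOT a real papyrus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="edition"><ab>
<lb n="1"/>alpha beta <gap reason="lost" quantity="3" unit="character"/>
<lb n="2"/><handShift new="m2"/>gamma delta
</ab></div></body></text></TEI>"""


def count_features(xml_text):
    """Count a few formal features of the edition division (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    edition = root.find(f".//{TEI}div[@type='edition']")
    if edition is None:
        raise ValueError("no edition division found")

    lines = len(edition.findall(f".//{TEI}lb"))
    gaps = len(edition.findall(f".//{TEI}gap"))
    # Assumption: one initial hand plus one for every recorded hand shift.
    hands = 1 + len(edition.findall(f".//{TEI}handShift"))

    # Words are whitespace-separated runs of the plain text content.
    words = "".join(edition.itertext()).split()
    letters = sum(len(word) for word in words)

    return {
        "words": len(words),
        "letters": letters,
        "lines": lines,
        "gaps": gaps,
        "hands": hands,
        "letters_per_line": letters / lines if lines else 0.0,
    }


if __name__ == "__main__":
    print(count_features(SAMPLE))
```

A real document would need more care (abbreviations, supplied text, words broken across lines), which this sketch deliberately ignores.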

The second kind of information is an automated calculation of the state of the papyrus' text (Callimachus' number): in other words, how much (and how well) of the original text of the papyrus can be read in the edition used by Papyri.info. This calculation is given as two decimal numbers (CRN and CCN) from 0 to 1, where 1 means that all the text is perfectly readable.

Callimachus Readability Number (CRN) is a measure of the readability of the part of the text that was edited (to what extent the editor was able to read or conjecture the papyrus' text).
Callimachus Conservation Number (CCN) is a measure of the conservation of the papyrus' text. CRN (center) and CCN (center) refer only to the "center" of the papyrus, defined as the part of the text after the first fully preserved word and before the last fully preserved word. Here you may find how this number is obtained.

There is a further variant of the CRN and CCN (namely CRN2 and CCN2), which amplifies the differences between states of preservation: it is obtained by squaring the value of each letter and then taking the square root of the total. Whether this variant or the simple number is more useful remains to be settled.
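To make the difference between the simple and the squared variant concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not the actual Callimachus calculation: the per-letter values (1.0 for a clearly read letter, 0.5 for an uncertain one, 0.0 for a lost one) are invented, and both functions divide by the number of letters so that the result stays between 0 and 1, a normalization the description above does not spell out.

```python
import math


def simple_number(values):
    """Plain average of the per-letter values (assumed form of the simple number)."""
    if not values:
        return 0.0
    return sum(values) / len(values)


def squared_number(values):
    """Square each value, average, then take the square root.

    Assumption: the total of the squares is divided by the number of
    letters before the root is taken, so the result stays in [0, 1].
    """
    if not values:
        return 0.0
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    # Hypothetical letter values: two clear, one uncertain, one lost.
    letters = [1.0, 1.0, 0.5, 0.0]
    print(simple_number(letters))   # 0.625
    print(squared_number(letters))  # 0.75
```

Under these assumptions the two numbers agree for a perfectly preserved text (all values 1.0) and for a completely lost one (all values 0.0), and differ in between, because squaring reduces the weight of partially read letters relative to fully read ones.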

The third kind of data mainly describes the papyrus (or ostrakon) itself, as provided by the Papyri.info project: date, origin, material, content, etc. This information comes from the metadata included in the XML documents or from the HGV database. All of it (and much more) can also be consulted on the Papyri.info site.

