Thursday, November 27, 2014

CDLI search

CDLI has recently implemented at <http://cdli.ucla.edu/search/> a search functionality, similar to that for transliterations, for the lines of translation and comment that form a part of our core text annotation files. While this is only a fraction of the number of translations available through the Oracc consortium, CDLI files currently contain some 54,000 lines of translated cuneiform text, mostly in English, but including some instances in German, French, and even Catalan; 14,700 lines of interlinear annotation, ranging from comments on sign preservation to the calculations that underlie numbers in accounts and metrological-mathematical texts; and 88,000 lines of (usually formulaic) comment on text structure.
The bulk of current CDLI translations consists of those created by Dan Foxvog for the Mesopotamian Royal Inscriptions component of the website (nearly 30,000 lines in 1550 texts; see <http://tinyurl.com/mdhzlrg> and <http://cdli.ucla.edu/projects/royal/royal.html>), and we anticipate more translation content of Sumerian literary texts as ETCSL migrates to CDLI; but 13,600 lines in 1530 administrative texts are also now in some form of translation (<http://tinyurl.com/kjkcut4>). For the record, CDLI restricts translation of texts liable to appear in multiple witness artifacts to their artificial composite entries.
As with transliteration search, the exact string of searched characters in translations and comments is highlighted in blue to facilitate its discovery within the displayed texts. Exact string in these instances means that, for example, a search for “pig” will find that string as a discrete word, but also within “pigs,” “pigherder,” and so on. In each case only the characters “pig” will be highlighted. Please note that the search engine results pages report only the number of texts found, not the individual occurrences of a given search string. Thus a search for “calculation:” in comments results in 228 texts found, although the string occurs 1026 times altogether.
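
The matching behavior described here can be illustrated with a minimal Python sketch. It is not CDLI's actual implementation: the function `find_matches`, the sample texts, and their identifiers are all hypothetical, and the sketch only assumes plain, case-insensitive substring matching. It shows why a query for “pig” also hits “pigs” and “pigherder,” and why the number of texts found differs from the total number of occurrences.

```python
import re

def find_matches(texts, query):
    """Return {text_id: [(start, end), ...]} for every occurrence of
    `query` as a plain substring (hypothetical helper, not CDLI code)."""
    # re.escape treats the query as literal characters, not as a regex.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = {}
    for text_id, body in texts.items():
        spans = [m.span() for m in pattern.finditer(body)]
        if spans:
            hits[text_id] = spans  # spans mark what would be highlighted
    return hits

# Invented sample lines, for illustration only.
sample_texts = {
    "T1": "1 pig, fattened; 2 pigs, from the pigherder",
    "T2": "calculation: 3 sheep",
    "T3": "Pig delivered to the palace",
}

hits = find_matches(sample_texts, "pig")
texts_found = len(hits)                                  # what a results page would report
total_uses = sum(len(spans) for spans in hits.values())  # individual occurrences
print(texts_found, total_uses)
```

With these invented lines, two texts are found but the string occurs four times, the same kind of gap as between 228 texts and 1026 uses of “calculation:”.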
As with transliteration search, users can enter multiple character strings in a field, separated by commas, for instance “lukalla,account” in translation (currently just six hits, at <http://tinyurl.com/pegtatb>). Unlike transliteration searches, however, these searches always run over full texts and cannot be restricted to a single line, and they are not case sensitive; neither line restriction nor case sensitivity seemed to us to contribute materially to search strategies.
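
A multi-string query of this kind might be modeled as in the following sketch. It is only an illustration under stated assumptions: the function `search_all_terms` and the sample texts are hypothetical, and the sketch assumes that a text counts as a hit only when every comma-separated string occurs somewhere in its full text, which is an inference from the small number of hits rather than documented behavior.

```python
def search_all_terms(texts, query):
    """Return ids of texts containing every comma-separated term in `query`.

    Assumptions (not confirmed by CDLI): terms are combined with AND,
    matched as substrings anywhere in the full text, ignoring case.
    """
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return []
    return [
        text_id
        for text_id, body in texts.items()
        if all(term in body.lower() for term in terms)
    ]

# Invented sample translations, for illustration only.
sample_translations = {
    "A": "Account of Lukalla: 3 oxen",
    "B": "sealed tablet of Lukalla",
    "C": "silver account of the merchants",
}

result = search_all_terms(sample_translations, "lukalla,account")
print(result)
```

Only text “A” contains both strings, so it is the only hit; “B” and “C” each contain just one of them.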

Bob Englund
UCLA
