Monday, January 22, 2024

THOT - Thesauri & Ontology for documenting Ancient Egyptian Resources

Thanks to the Anneliese Maier Research Award granted to Prof. Jean Winand by the Humboldt Foundation, the Department of Egyptology of the University of Liège, in collaboration with the Berlin-Brandenburg Academy of Sciences and Humanities and the Saxon Academy of Sciences and Humanities in Leipzig (Thesaurus Linguae Aegyptiae), is now developing Thot, a set of resources for documenting and encoding ancient Egyptian resources in a shared, interoperable way. This effort sprang from the wish of several projects dealing with ancient Egyptian textual corpora to become more interoperable, and it follows the path of previous attempts such as the Multilingual Egyptological Thesaurus (MET). Moving towards the Semantic Web and Linked Open Data, the resources, which will be published online progressively, will consist of a set of thesauri covering most of the metadata relating to ancient Egyptian texts and monuments, as well as a proposal for a TEI interchange format that enables the exchange and sharing of textual data.


Following the definition set by the International Organization for Standardization, a thesaurus is a ‘controlled and structured vocabulary in which concepts are represented by terms, organised so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms’. In Thot, each concept is represented by one or several terms expressed in Arabic, English, French and German, and is identified by a unique identifier (the ‘thot-number’). A supplementary ‘language’ has also been included, the so-called ‘Thot-xml’, in order to provide terms that can be used as values of XML attributes in digital documents such as TEI files.

Thesauri covered by the project

As a first step, the Thot project aims to compile a wide range of thesauri relating to artifactual and textual metadata. These range from quite small thesauri made up of a handful of terms to rather large ones populated by several hundred concepts. Subject headings pertaining to general and abstract concepts will be included in a second stage.

Main thesauri

The main Thot thesauri include:

  • Dating: List of periods, reigns (from prehistoric times down to the Islamic period), and chronological entities such as 'first quarter of the 5th century BC'. Each concept is attached to one or more chronological range(s), provided as a support for date-related questions rather than as absolute datings (see also here).
  • Languages: List of languages in use in Egypt from the beginning of the pharaonic period until the Islamic period. This thesaurus also includes ancient and modern 'foreign' languages, in order to cover all the textual material found and/or produced in Egypt, as well as modern languages used in digital files.
  • Material: List of material types used by the ancient Egyptians or their contemporaries in architecture and object production.
  • Types of object: List of concepts for the characterization of objects produced and/or found in Egypt.
  • Types of preservation state: Vocabulary providing terms for describing the state of preservation of inscriptions and of text supports.
  • Repositories: List of present and past locations where ancient Egyptian objects are or were stored.
  • Scripts: List of scripts attested in documents produced and/or found in Egypt.
  • Text types: List of terms for the characterization of texts in terms of genre and register.
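
The way the Dating thesaurus supports date-related questions can be illustrated with a minimal Python sketch. It is not the project's implementation: the identifiers (thot-ex-1, etc.), the DateRange class and the concepts_overlapping function are hypothetical names made up for the illustration, and negative integers stand for years BC as a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DateRange:
    """Inclusive range of years; negative values stand for years BC (assumption)."""
    start: int
    end: int

    def overlaps(self, other: "DateRange") -> bool:
        # Two inclusive ranges overlap when each one starts before the other ends.
        return self.start <= other.end and other.start <= self.end


# Hypothetical identifiers; the ranges simply follow the definition of a century.
DATING = {
    "thot-ex-1": ("first quarter of the 5th century BC", [DateRange(-500, -476)]),
    "thot-ex-2": ("5th century BC", [DateRange(-500, -401)]),
    "thot-ex-3": ("4th century BC", [DateRange(-400, -301)]),
}


def concepts_overlapping(query: DateRange) -> list[str]:
    """Return the identifiers of all concepts with at least one range overlapping the query."""
    return sorted(
        concept_id
        for concept_id, (_label, ranges) in DATING.items()
        if any(r.overlaps(query) for r in ranges)
    )


if __name__ == "__main__":
    # Which chronological concepts are compatible with a text dated 480-470 BC?
    print(concepts_overlapping(DateRange(-480, -470)))
```

Because a concept may carry several ranges, the lookup tests each of them, which matches the idea of ranges as a support for queries rather than as absolute datings.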

Subsidiary thesauri

A set of small-scale thesauri will also be developed, in order to meet all the needs in terms of controlled vocabularies for the creation of TEI XML files (see also here). These thesauri include:

  • Concepts relating to Bibliography
      Bibliographical types and subtypes, types of reference, of titles, of title levels, and of bibliographical scopes;
  • Concepts relating to Digital Edition
      Types of hand styles and of roles, types of lists of people for editorial and prosopographical purposes, types of revision statuses, types of text sections, text line orientations, text transcription features.

Concept overview and data-model

The Thot thesauri comply with the ISO standard 25964-1 relating to thesauri development, and are implemented in the Simple Knowledge Organization System (SKOS). Identified and accessible through Uniform Resource Identifiers (URIs), each Thot concept is defined within a skos:Concept element and is characterized by:

  • Preferred terms (skos:prefLabel) in English, French, German and Arabic. As a complementary piece of information, a preferred term is also provided for using concepts as XML attribute values, since those must comply with XML-specific naming requirements (no spaces, no special characters, etc.).
  • Alternative terms for each concept, when necessary (skos:altLabel). Including non-preferred terms in the four languages will enhance information retrieval.
  • A scope note (skos:scopeNote), providing a clear and distinctive definition of the concept.
  • Reference(s) to broader and/or narrower term(s) (skos:broader and skos:narrower), if applicable, providing the concept’s context. It is worth noting that the Thot thesauri will be poly-hierarchical, which means that a concept may appear in several places in a thesaurus tree.
  • Reference(s) to related concepts (skos:related), i.e. concepts that can be associated with one another on semantic grounds other than hierarchical ones.
  • A scheme note (based on various elements from skos:schemeNote), which is intended to provide information about the sources used and the collaborators involved in the collection of terms.
  • Mappings to other thesauri (skos:exactMatch), such as MET or the Europeana EAGLE project (see Roadmap, below).
  • One or several date ranges (time:temporalEntity), if applicable (e.g. pharaohs' reigns). This is a project-specific piece of information that is not suggested by the ISO standard, but is required so that the full temporal dimension of the text production involved in the digital text corpora can be taken into account.
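
As a rough illustration of this data model, the Python sketch below represents a concept as a plain record, checks that a Thot-xml term is usable as an XML attribute value, and lists every position of a concept in a poly-hierarchical tree. Everything in it is an assumption made for the example: the Concept class, the ex: URIs, the sample hierarchy and the deliberately strict naming rule are hypothetical and do not reproduce the actual Thot data or its SKOS serialization.

```python
import re
from dataclasses import dataclass, field

# Simplified rule (assumption): ASCII letters, digits, '_', '-' and '.',
# starting with a letter or '_'. Stricter than XML itself, on purpose.
XML_SAFE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


@dataclass
class Concept:
    uri: str
    pref_labels: dict[str, str]                        # language code -> preferred term
    alt_labels: dict[str, list[str]] = field(default_factory=dict)
    scope_note: str = ""
    broader: list[str] = field(default_factory=list)   # several parents are allowed
    xml_label: str = ""                                # the 'Thot-xml' term


def is_xml_safe(term: str) -> bool:
    """True when the term contains no spaces or special characters."""
    return XML_SAFE.fullmatch(term) is not None


def paths_to_root(uri: str, concepts: dict[str, Concept]) -> list[list[str]]:
    """List every path from a top concept down to `uri` (one per position in the tree)."""
    concept = concepts[uri]
    if not concept.broader:
        return [[uri]]
    return [
        path + [uri]
        for parent in concept.broader
        for path in paths_to_root(parent, concepts)
    ]


# Hypothetical sample data, invented for this sketch.
CONCEPTS = {
    c.uri: c
    for c in [
        Concept("ex:scripts", {"en": "Scripts"}, xml_label="scripts"),
        Concept("ex:egyptian-scripts", {"en": "Egyptian scripts"},
                broader=["ex:scripts"], xml_label="egyptian-scripts"),
        Concept("ex:cursive-scripts", {"en": "Cursive scripts"},
                broader=["ex:scripts"], xml_label="cursive-scripts"),
        Concept("ex:hieratic",
                {"en": "Hieratic", "fr": "Hiératique", "de": "Hieratisch"},
                broader=["ex:egyptian-scripts", "ex:cursive-scripts"],
                xml_label="hieratic"),
    ]
}

if __name__ == "__main__":
    for path in paths_to_root("ex:hieratic", CONCEPTS):
        print(" > ".join(path))
    print(is_xml_safe("hieratic"), is_xml_safe("Egyptian scripts"))
```

In this sketch the concept with two broader terms yields two paths, which is what appearing in several places in a thesaurus tree amounts to in practice.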

Hierarchical structure and facets

Concepts are bound together through Narrower and Broader Term relationships and form a hierarchical network, which is the basis for the left-hand browsing tree menu.

Facet analysis (skos:Collection and skos:OrderedCollection) is also implemented and is marked by the use of '<' and '>' signs around the facet label: e.g., '<Scripts by types>'.

As a rule, concepts are arranged in alphabetical order, according to the language in use (language switching is possible through the icons above the left-hand menu). For some sets of concepts, such as those related to chronology, a logical order can be preferable and thus prevails over the alphabetical one (e.g. concepts under Pharaonic period). Please note that concepts with children (i.e. 'narrower terms') organised in logical order are considered as skos:Concept in their own right, although a full SKOS approach would require treating them as skos:OrderedCollection.
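
The ordering rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the order_children function, the short identifiers and the dictionary layout are hypothetical, and the case-insensitive string comparison is a simplification of real language-aware collation.

```python
def order_children(children, lang, logical_order=None):
    """Sort narrower concepts for display in the browsing tree.

    children: list of dicts with an 'id' and a 'labels' mapping (language -> term).
    logical_order: optional list of ids; when given, it prevails over alphabetical order.
    """
    if logical_order is not None:
        rank = {concept_id: i for i, concept_id in enumerate(logical_order)}
        # Concepts missing from the explicit order are placed at the end.
        return sorted(children, key=lambda c: rank.get(c["id"], len(rank)))
    # Default rule: alphabetical order in the language currently in use.
    return sorted(children, key=lambda c: c["labels"][lang].casefold())


# Hypothetical ids; the labels are common period names used here as an example.
PERIODS = [
    {"id": "nk", "labels": {"en": "New Kingdom", "fr": "Nouvel Empire"}},
    {"id": "ok", "labels": {"en": "Old Kingdom", "fr": "Ancien Empire"}},
    {"id": "mk", "labels": {"en": "Middle Kingdom", "fr": "Moyen Empire"}},
]

if __name__ == "__main__":
    # Alphabetical in English: Middle, New, Old.
    print([c["labels"]["en"] for c in order_children(PERIODS, "en")])
    # Chronological order supplied explicitly: Old, Middle, New.
    print([c["labels"]["en"]
           for c in order_children(PERIODS, "en", logical_order=["ok", "mk", "nk"])])
```

The example shows why chronology is a natural case for logical order: the alphabetical result depends on the display language, whereas the explicit sequence stays the same in every language.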

Mapping Thot to other thesauri

In order to fully implement a Semantic Web approach, the plan is to map Thot to current and past thesauri relating to Egyptian cultural heritage. Although it is currently not maintained, the Multilingual Egyptological Thesaurus is the obvious candidate, as a substantial number of projects have been using it. Large thesauri such as those curated by the British Museum, the Getty Research Institute or the Deutsches Archäologisches Institut will also be taken into account progressively.

A preliminary list of these thesauri includes:

