Saturday, December 7, 2024

NIKAW: exploring networks of ideas and knowledge in the Ancient World using a bilingual language model

Type: Project
Project code: 3H230094
Period: February 15, 2023 - February 15, 2027
Key area(s):
Discipline(s): 06020113 Latin language; 06020303 Classical literature

Project summary

The NIKAW project aims to exploit textual information from the ancient world to reconstruct the transmission of knowledge across multilingual, geographically and chronologically extended communities. The period in focus runs from the eighth century BCE to the fourth century CE, encompassing the advent of Christianity. In the context of this project, we will attempt to create a bilingual, state-of-the-art NLP pipeline for extracting and examining the mentions of authors in the Ancient Greek GLAUx corpus and the Corpus Latin antiquité et antiquité tardive lemmatisé.

As a first step, we will extract a list of named entities from these two corpora, using existing manually annotated data (e.g. Trismegistos names). Next, in order to improve, structure and disambiguate the data, we will build tailored NER and NEL models, also taking advantage of existing models and corpora (Latin BERT, Greek ELECTRA…). To better access the context of these mentions, an NLP model will be trained from scratch on the two corpora, creating interoperable context representations for the two languages.

The data will be represented in a multimodal social network graph, on which various further improvements and additions not part of this PhD will be made. We will then attempt to enrich the graph with textual information (e.g. sentiment analysis). As a final part of this PhD, and with the help of a postdoctoral colleague, we will extract relational information from the graph and attempt novel ways to integrate it into the bilingual language model. Afterwards, we will test whether this improves performance on downstream tasks.
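To make the graph-building step more concrete, the following is a minimal sketch of how author co-mentions could be turned into a weighted network. The input format and author names are hypothetical toy data standing in for the output of the NER/NEL steps over the two corpora; the real pipeline would of course work at corpus scale and with disambiguated entity IDs.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy input: sentences already reduced to the set of author
# names they mention (in the real pipeline these would come from the
# NER/NEL steps over the Greek and Latin corpora).
tagged_sentences = [
    ["Homerus", "Hesiodus"],
    ["Homerus", "Vergilius"],
    ["Homerus", "Hesiodus"],
]

def build_cooccurrence_graph(sentences):
    """Return edge weights counting how often two authors are
    mentioned together in the same sentence."""
    edges = defaultdict(int)
    for mentions in sentences:
        # Each unordered pair of distinct authors in a sentence
        # contributes one co-mention; sorting normalises the pair key.
        for a, b in combinations(sorted(set(mentions)), 2):
            edges[(a, b)] += 1
    return dict(edges)

graph = build_cooccurrence_graph(tagged_sentences)
# → {("Hesiodus", "Homerus"): 2, ("Homerus", "Vergilius"): 1}
```

Such an edge list could then be loaded into any graph library or database, and the counts later enriched with edge attributes such as sentiment scores, as described above.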

 
