Type: Project
Project code: 3H230094
Period: February 15, 2023 - February 15, 2027
Discipline(s): 06020113 Latin language; 06020303 Classical literature
Project summary
The NIKAW project aims to exploit textual information from the ancient world to reconstruct the transmission of knowledge across multilingual, geographically and chronologically extended communities. The period in focus spans the VIII century BCE to the IV century CE, up to the advent of Christianity. In the context of this project, we will attempt to create a bilingual, state-of-the-art NLP pipeline for extracting and examining the mentions of authors in the ancient Greek GLAUx corpus and the Corpus Latin antiquité et antiquité tardive lemmatisé. As a first step, we will extract a list of named entities from these two corpora, using existing manually annotated data (e.g. Trismegistos names). Next, in order to improve, structure and disambiguate the data, we will build tailored named entity recognition (NER) and named entity linking (NEL) models, also taking advantage of existing models and corpora (Latin BERT, Greek ELECTRA…). To better access the context of these mentions, an NLP model will then be trained from scratch on the two corpora, creating interoperable context representations for the two languages. The data will be represented in a multimodal social network graph, on which various further improvements and additions not part of this PhD will be made. We will then attempt to enrich the graph with textual information (e.g. sentiment analysis). As a final part of this PhD, and with the help of a postdoctoral co-worker, we will extract relational information from the graph and attempt novel ways to integrate it into the bilingual language model. Afterwards, we will test whether this improves performance on the downstream tasks.
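The first step described above, finding author mentions with an existing name list and recording who is mentioned where, can be sketched roughly as follows. This is only an illustrative sketch and not the project's actual pipeline: the gazetteer entries, the document ids, the toy sentences, and the function names `extract_mentions` and `build_mention_graph` are all hypothetical, and exact matching on surface forms stands in for the lemma-based lookup and the trained NER/NEL models the summary mentions.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer: attested name form -> canonical author label.
# A real list (e.g. derived from annotated name data) would be far larger.
GAZETTEER = {
    "Homerus": "Homer",
    "Vergilius": "Virgil",
    "Plato": "Plato",
}


def extract_mentions(text, gazetteer):
    """Return (surface form, canonical label) pairs, in text order.

    Simplification: exact token match only, with no lemmatisation
    and no disambiguation of homonymous names.
    """
    tokens = re.findall(r"\w+", text)
    return [(tok, gazetteer[tok]) for tok in tokens if tok in gazetteer]


def build_mention_graph(documents, gazetteer):
    """Map each document id to the set of authors it mentions.

    This is a bipartite 'document mentions author' edge list,
    the simplest form of the network graph described in the summary.
    """
    graph = defaultdict(set)
    for doc_id, text in documents.items():
        for _, author in extract_mentions(text, gazetteer):
            graph[doc_id].add(author)
    return dict(graph)


# Toy input, invented purely for illustration.
documents = {
    "doc_a": "Homerus et Vergilius poetae sunt",
    "doc_b": "Plato philosophus est",
}
graph = build_mention_graph(documents, GAZETTEER)
```

A plain dictionary lookup misses inflected forms and cannot tell apart two people with the same name, which is presumably why the summary plans tailored NER and NEL models on top of the initial name list.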
Saturday, December 7, 2024
NIKAW: exploring networks of ideas and knowledge in the Ancient World using a bilingual language model