A Full Morphosyntactic Annotation of the State Archives of Assyria Letter Corpus
The dataset consists of a full morphosyntactic annotation of the normalized
letter corpus of the State Archives of Assyria online (SAAo), plus
associated metadata regarding sender, recipient, estimated date of
composition, script, and dialect of Akkadian (if determinable). This
corpus comprises ten of the twenty-one current volumes of SAAo and
contains approximately 2600 letters from the royal archives of the late
Neo-Assyrian kings. Each letter features morphosyntactic annotations
specifying part of speech, lemma, morphological decomposition, and
syntactic dependencies of all relevant tokens in the text. The
annotations were produced with the help of a spaCy language model and
then checked and completed by hand. They are available both as a set of
CoNLL-U files (one per text) and as linked open data in a single TTL
(Turtle) file. The associated metadata is available as a CSV file and
as a TTL file. Because the letters share a format, topics of concern,
and historical period, the corpus forms a natural object of study from
both linguistic and social-historical perspectives. It is hoped that
this data will be of use to researchers wishing to do linguistic and
sociolinguistic corpus research on these texts.
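
As a rough illustration of how the per-text annotation files might be read, the sketch below parses the standard ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) using only the Python standard library. The sample sentence and all of its annotation values are a hand-made toy, not taken from the dataset, and the names `Token`, `parse_feats`, and `parse_conllu` are hypothetical helpers. How the dataset itself encodes the morphological decomposition is not stated here, so the FEATS handling is an assumption based on the generic CoNLL-U convention.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One token line of a CoNLL-U sentence (subset of the ten columns)."""
    id: str
    form: str
    lemma: str
    upos: str
    feats: dict
    head: str
    deprel: str


def parse_feats(raw):
    """Turn 'Case=Gen|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_conllu(text):
    """Return a list of sentences, each a list of Token objects."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. '# text = ...') are skipped.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns: {line!r}")
        current.append(Token(
            id=cols[0], form=cols[1], lemma=cols[2], upos=cols[3],
            feats=parse_feats(cols[5]), head=cols[6], deprel=cols[7],
        ))
    if current:
        sentences.append(current)
    return sentences


# Toy sample, NOT from the dataset; annotation values are illustrative only.
_rows = [
    ("1", "ana", "ana", "ADP", "_", "_", "2", "case", "_", "_"),
    ("2", "šarri", "šarru", "NOUN", "_", "Case=Gen|Number=Sing", "0", "root", "_", "_"),
    ("3", "bēlīya", "bēlu", "NOUN", "_", "Case=Gen|Number=Sing", "2", "appos", "_", "_"),
]
SAMPLE = "# text = ana šarri bēlīya\n" + "\n".join("\t".join(r) for r in _rows) + "\n"

sentences = parse_conllu(SAMPLE)
for tok in sentences[0]:
    print(tok.id, tok.form, tok.lemma, tok.upos, tok.head, tok.deprel)
```

To run this over a real file, one would read it with `open(path, encoding="utf-8").read()` and pass the result to `parse_conllu`. A dedicated CoNLL-U library may be preferable for production use; the hand-rolled parser is only meant to make the column layout visible.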
md5:f2cf59bf53bb96f9dc5121a20acc04ae | 597.4 kB
md5:02f85a8615c1f3828a86c0d6257038e5 | 7.6 MB
md5:d0f18d5d7f18ed62790cf7695e034473 | 1.1 kB
md5:859f24c630baa9bb3a8c8a5e9cf3dae5 | 6.6 MB
md5:a2ed5f7f412364d92dd1545b34505efd | 5.7 MB
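
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_chunks` and `md5_of_file` are hypothetical, and because the file names are not given in the listing, the path in the usage comment is a placeholder rather than an actual file name from the dataset.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    def read_chunks():
        with open(path, "rb") as handle:
            while True:
                block = handle.read(chunk_size)
                if not block:
                    break
                yield block
    return md5_of_chunks(read_chunks())


# Usage (placeholder path; substitute the file you actually downloaded):
#   expected = "f2cf59bf53bb96f9dc5121a20acc04ae"
#   if md5_of_file("downloaded_file") != expected:
#       raise SystemExit("checksum mismatch: download may be corrupted")
```

MD5 is adequate for detecting accidental corruption during transfer, which is the purpose of a listing like this one; it should not be relied on as protection against deliberate tampering.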