Automatic linguistic analysis and Entity Linking from I Samuel 25
It is our pleasure to announce the latest data release from Coptic
Scriptorium, version 4.2.0. This release contains both new Coptic
material and additions to older datasets, as well as expanding our
entity annotations and named-entity linking to all of our data,
including the semi-automatically annotated Old Testament. This also means
automatic updates to all of our interfaces, such as the recently added
example usage functionality in the Coptic Dictionary Online, which is linked to the corpora.
The new material, which includes more digitized data courtesy of the Marcion project as well as manually digitized and corrected OCR data from out-of-print editions, comprises:
More Apophthegmata Patrum (work by Christine Luckritz Marquis, So Miyagawa, Caroline T. Schroeder and Amir Zeldes)
Further material from Shenoute’s works:
God Says Through Those Who Are His (including parallel witnesses and new material, data courtesy of David Brakke, annotations by Rebecca Krawiec, Lance Martin, Dana Robinson, Caroline T. Schroeder)
Acephalous Work 22 (data courtesy of David Brakke, annotations by Elizabeth Davidson, Rebecca Krawiec, Elizabeth Platte, Caroline T. Schroeder, Amir Zeldes)
More syntactically annotated gold treebanked data in the Coptic Treebank
Completely re-annotated Old Testament corpus, based on the base text courtesy of the Digital Edition of the Coptic Old Testament (CoptOT) project – with improved segmentation and parsing, now complete with semi-automatic entity recognition and linking to Wikipedia entries for people and places
With this new release, the semi-automatically annotated data
(excluding automatically processed Bible materials) in the project
covers close to 300,000 words of Sahidic Coptic annotated for entities.
This release represents a tremendous amount of work over the past few
months by the Coptic Scriptorium team. We would also like to thank
individual contributors (whom you can always find in the ‘annotation’
metadata for each document), and specifically So Miyagawa for help with
Coptic OCR models, as well as the Marcion and CoptOT projects for sharing
their data with us, and the National Endowment for the Humanities for
supporting us. We are continuing to work on more data, links to other
resources and new kinds of annotations and tools. Please let us know if
you have any feedback!
The AWOL Index: The bibliographic data presented herein has been programmatically extracted from the content of AWOL - The Ancient World Online (ISSN 2156-2253) and formatted in accordance with a structured data model.
AWOL is a project of Charles E. Jones, Tombros Librarian for Classics and Humanities at the Pattee Library, Penn State University.
AWOL began with a series of entries under the heading AWOL on the Ancient World Bloggers Group Blog. I moved it to its own space here beginning in 2009.
The primary focus of the project is notice and comment on open access material relating to the ancient world, but I will also include other kinds of networked information as it becomes available.
The ancient world is conceived here as it is at the Institute for the Study of the Ancient World at New York University, my academic home at the time AWOL was launched. That is, from the Pillars of Hercules to the Pacific, from the beginnings of human habitation to the late antique / early Islamic period.
AWOL is the successor to Abzu, a guide to networked open access data relevant to the study and public presentation of the Ancient Near East and the Ancient Mediterranean world, founded at the Oriental Institute, University of Chicago in 1994. Together they represent the longest sustained effort to map the development of open digital scholarship in any discipline.