Iliados: Structural Search: Perform grammatical and syntactical searches on the Perseus Greek Treebank
This is a brief overview of the query language for searching the Perseus Treebank data,
which consists of syntactically annotated ancient texts, such as Homer's Iliad.
Each sentence in the texts is turned into a tree, much like a sentence
diagram, in a format called a dependency tree. The query language for searching these trees is simply the CSS3 selector syntax, with some custom additions.
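As a rough illustration of the tree format, here is a minimal Python sketch of a dependency tree stored as a flat list of words, each pointing at its parent. The Token class, its field names (parent_id mirrors the parentId attribute mentioned later in this overview), and the two-word annotation are assumptions made for illustration; they are not the treebank's actual schema or data.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int         # 1-based position of the word in the sentence
    form: str       # inflected form as it appears in the text
    lemma: str      # dictionary form
    relation: str   # dependency label, e.g. "OBJ"
    parent_id: int  # id of the head word; 0 means "no parent"


# Hand-made toy annotation (illustrative only, not treebank output).
sentence = [
    Token(1, "μῆνιν", "μῆνις", "OBJ", 2),
    Token(2, "ἄειδε", "ἀείδω", "PRED", 0),
]


def children(tokens, head):
    """Return the words whose parent is `head`."""
    return [t for t in tokens if t.parent_id == head.id]


verb = sentence[1]
print([t.id for t in children(sentence, verb)])  # [1]
```

Storing only a parent id per word is enough to recover the whole tree, which is why the queries below can be phrased in terms of parents and children.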
- To begin, you can search for any lemma by simply typing the lemma directly:
φθογγή
- To search for all words in the accusative case, simply precede the case name with a colon (:):
:accusative
- A term preceded by a colon is called a "pseudo-selector". Any
part-of-speech, tense, gender, person, number, case, voice, mood, or
degree can be searched with this pseudo-selector syntax, as in
:accusative
:optative
:imperative
:dual
:verb
- To search for words that have multiple features, such as a
third-person singular verb, concatenate the pseudo-selectors:
:third:singular:verb
- To search for occurrences of a specific lemma with certain attributes, append the pseudo-selectors to the lemma:
αἰτέω:first:singular:present
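One way to picture how concatenated pseudo-selectors behave is as a subset test: a word matches when it carries every requested feature (and the requested lemma, if one is given). The sketch below is a simplified Python model under that assumption. The dictionary layout and the function names parse_simple and matches are hypothetical, and the model ignores attribute-selectors and operators.

```python
def parse_simple(selector):
    """Split 'lemma:feat1:feat2' into (lemma or None, set of features)."""
    lemma, *features = selector.split(":")
    return (lemma or None), set(features)


def matches(token, selector):
    """True if the token has the lemma (when given) and every feature."""
    lemma, features = parse_simple(selector)
    if lemma is not None and token["lemma"] != lemma:
        return False
    return features <= token["features"]


# Toy word record; the feature names follow the pseudo-selectors above.
token = {
    "lemma": "αἰτέω",
    "features": {"verb", "first", "singular", "present", "indicative", "active"},
}

print(matches(token, ":first:singular:verb"))                      # True
print(matches(token, token["lemma"] + ":first:singular:present"))  # True
print(matches(token, ":third:singular:verb"))                      # False
```

Because the test is a subset check, the order in which the pseudo-selectors are written does not matter in this model.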
- Another pseudo-selector worth knowing is the :root selector. It searches for all "root" words in a sentence (those at the top of the main clause), i.e., the words with parentId=0. It's worth noting that, in the dependency tree format, final punctuation (".", ";", and so forth) also has parentId=0. The :root selector, however, excludes punctuation.
- To search for a specific inflected form (i.e., a declined or conjugated word), use a selector like [form=?], as in
[form=φθογγὴν]
- Selectors like [a=?] are called "attribute-selectors". In addition to the form attribute-selector, the relation selector is very useful. Here is a search for all conditionals:
εἰ[relation=AuxC]
- With everything we've learned so far, it's easy to find substantival infinitives used as subjects, isn't it?
:infinitive:verb[relation=SBJ]
- Or we can search for genitive absolutes, my least favorite feature of Greek!
:genitive[relation=SBJ]
- So far we've seen how to search for individual words that have
certain inflectional and syntactic features. But we can also search for
the relationships between words, or in the jargon of
dependency-trees, the dependency relationship between terms. In a
dependency tree, a word that depends upon another, or modifies another,
has a parent-child relationship. For example, when an adjective modifies
a noun, the adjective is the child, the noun is the parent, and the
relationship between them is ATR. Searching for parent-child relationships uses the greater-than (>) operator, with the parent on the left, as in
:noun > :adjective
- As a more concrete example, to find all adjectives modifying μῆνις, do this:
μῆνις > :adjective[relation=ATR]
- It's worth noting that adjectives aren't the only things that
can modify nouns. Certain genitives do this too, as in "Διὸς μῆνις". Here is
a search for anything modifying "μῆνις":
μῆνις > [relation=ATR]
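The parent-child operator can be pictured as a join between each word and its head. The following Python sketch returns (parent id, child id) pairs for a query of the shape parent > child. The Word tuple, the function name parent_child, and the three-word annotation are made up for illustration and are not taken from the treebank.

```python
from collections import namedtuple

# id, lemma, part of speech, dependency relation, id of the head (0 = none)
Word = namedtuple("Word", "id lemma pos relation parent_id")

# Hand-made toy annotation, for illustration only.
words = [
    Word(1, "Ζεύς", "noun", "ATR", 2),    # genitive noun modifying word 2
    Word(2, "μῆνις", "noun", "SBJ", 3),
    Word(3, "εἰμί", "verb", "PRED", 0),
]


def parent_child(words, parent_pred, child_pred):
    """Return (parent id, child id) pairs where both predicates hold."""
    by_id = {w.id: w for w in words}
    pairs = []
    for child in words:
        parent = by_id.get(child.parent_id)
        if parent is not None and parent_pred(parent) and child_pred(child):
            pairs.append((parent.id, child.id))
    return pairs


def is_noun(w):
    return w.pos == "noun"


# Roughly ":noun > [relation=ATR]": any attributive modifier of a noun.
any_modifier = parent_child(words, is_noun, lambda w: w.relation == "ATR")

# Roughly ":noun > :adjective[relation=ATR]": adjectives only.
adjectives = parent_child(
    words, is_noun, lambda w: w.pos == "adjective" and w.relation == "ATR"
)

print(any_modifier)  # [(2, 1)]
print(adjectives)    # []
```

In this toy sentence the genitive is found only by the broader query, which is the reason for dropping the :adjective restriction.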
- At this point, we know enough to search for indicative verbs with accusative objects where the verb is that of a main clause:
:verb:indicative:root > :accusative[relation=OBJ]
- But we should note that the query above is incomplete! In fact, some
sentences begin with a coordinating conjunction ("but", "and", etc.).
In the dependency-tree scheme, these are the parents of the main verb.
So now we must express a pattern 3 levels deep:
[relation=COORD]:root > :verb:indicative > :accusative[relation=OBJ]
- Fortunately, it's possible to combine two independent queries by using the comma (,) operator, as in selector1, selector2. So we can merge the previous two queries like this:
:verb:indicative:root > :accusative[relation=OBJ], [relation=COORD]:root > :verb:indicative > :accusative[relation=OBJ]
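To make the combined query more concrete, the Python sketch below models a chain of > steps as a walk up the tree from each candidate word, and models the comma operator as a set union of the two result sets. Everything here is an assumption for illustration: the Word class, the match_chain helper, the predicates (which omit the mood check for brevity), and the three-word toy sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    id: int
    pos: str
    case: str
    relation: str
    parent_id: int  # 0 means the word has no parent


def match_chain(words, predicates):
    """Ids of words matching the last predicate whose parent, grandparent,
    and so on match the earlier predicates (a chain of '>' steps)."""
    by_id = {w.id: w for w in words}
    hits = set()
    for word in words:
        node = word
        for pred in reversed(predicates):
            if node is None or not pred(node):
                break
            node = by_id.get(node.parent_id)
        else:  # no break: every step of the chain matched
            hits.add(word.id)
    return hits


def root_verb(w):
    return w.pos == "verb" and w.parent_id == 0


def root_coord(w):
    return w.relation == "COORD" and w.parent_id == 0


def any_verb(w):
    return w.pos == "verb"


def acc_obj(w):
    return w.case == "accusative" and w.relation == "OBJ"


# Toy sentence: conjunction -> verb -> accusative object (labels are toy).
words = [
    Word(1, "conjunction", "", "COORD", 0),
    Word(2, "verb", "", "PRED", 1),
    Word(3, "noun", "accusative", "OBJ", 2),
]

first = match_chain(words, [root_verb, acc_obj])              # verb is not a root here
second = match_chain(words, [root_coord, any_verb, acc_obj])  # three levels deep
combined = first | second                                     # the comma operator

print(first, second, combined)  # set() {3} {3}
```

The two-level pattern misses this sentence because the verb hangs under the conjunction; the union picks it up through the three-level pattern.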
- At this point, we also should know how to search for future-less-vivid conditionals, which could be my favorite!
:optative > εἰ[relation=AuxC] > :optative
- Another class of problems relates to the order of words
in sentences. Since the trees themselves express only syntactic
relationships, there are special operators to look for ordinal
relationships between words. For example, to search for
"subject-verb-object" word order, you can use the :before and :after pseudo-selectors:
:verb:before([relation=SBJ]):after([relation=OBJ])
In this example, the before and after pseudo-selectors are functions that take arguments. Their arguments are themselves selectors (e.g., [relation=SBJ]), and they are evaluated relative to the parent selector. That sounds a bit complicated, but it means that :verb:before([relation=SBJ]) will only look for words with [relation=SBJ] that are also children of the verb in the dependency tree.
- Sometimes, however, it's simpler or more useful to search only for
word order and ignore any syntactic relationship between the words.
There are two ways to do this. The first uses the plus (+) operator, which looks for immediately adjacent words within a sentence, as in
φίλος + γάρ + εἰμί
- The second is the tilde (~) operator, which looks only at word order in a sentence, ignoring whether the terms are right next to each other:
φίλος ~ εἰμί
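The difference between the two word-order operators can be sketched in Python over a plain list of lemmas in sentence order. This is only a model of the behavior described above; the function names adjacent and follows and the two toy lemma sequences are hypothetical.

```python
def adjacent(lemmas, pattern):
    """Start indexes where `pattern` occurs as consecutive words ('+')."""
    n = len(pattern)
    return [i for i in range(len(lemmas) - n + 1) if lemmas[i:i + n] == pattern]


def follows(lemmas, first, second):
    """True if `second` occurs anywhere after `first` ('~')."""
    if first not in lemmas:
        return False
    return second in lemmas[lemmas.index(first) + 1:]


# Toy lemma sequences (illustrative, not quoted from any text).
a, b, c, d = "φίλος", "γάρ", "εἰμί", "δέ"
tight = [a, b, c]   # the three words are next to each other
loose = [a, d, c]   # a different word sits in between

print(adjacent(tight, [a, b, c]))  # [0]
print(adjacent(loose, [a, b, c]))  # []
print(follows(loose, a, c))        # True
print(follows(loose, c, a))        # False
```

Neither function consults the dependency tree, which matches the point that these operators ignore syntactic relationships.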