regex ?* {mente} "+Adv":0;
This script provides a series of fallback transducers that try to tag unknown entities in the input.
If a word in Spanish ends in mente, it’s almost certainly an adverb Adv
.
regex ?* {mente} "+Adv":0;
A lot of adjectives (Adj
) related to a process end in an l, and it’s not a common
ending for other types of words.
regex ?* l "+Adj":0;
Noun (sustantive, Sust
) morphology in Spanish is quite regular, so o and
a endings usually denote gender, and an s is often a plural marker.
regex ?* [o "+Masc":0 |a "+Fem":0] ["+Sing":0|"+Plural":s] "+Sust":0;
Another option is a noun that ends in a consonant, then it’s usually pluralized with an es morpheme.
define Cons [b|c|d|f|g|h|j|k|l|m|n|ñ|p|q|r|s|t|v|w|x|y|z];
regex ?* Cons 0:e "+Plural":s "+Sust":0;
If we can’t detect the gender and number of the word, just mark it as a plain noun.
regex ?* "+Sust":0;