6.6 Conclusion

SignWriting is a featureful writing system used to transcribe sign languages. Its power as a faithful phonetic representation of signs, both in the lexical and the spatial domains, means it can be a useful tool for empirical linguistic research. For this to work, examples and empirical processing methodologies need to be developed and tested.

To this end, we have built the VisSE Corpus of Spanish SignWriting, a collection of handwritten SignWriting logograms which can be used for automatic processing or linguistic analysis of the data. The corpus is freely available online. In this article we have presented its annotation schema and construction, and have performed a brief analysis of the data it includes. We have seen that processing SignWriting requires careful analysis, and we believe our work may be useful for the research community both in the methodology we have followed and in the end result, the published corpus.

As an example of the insight that can be gained from collecting and annotating data, we have observed in our research some similarities between SignWriting and the writing systems of oral languages, as well as some meaningful differences. These similarities and differences parallel the relation between sign languages and oral languages: where sign languages are similar to oral ones, SignWriting resembles oral writing. Sign languages and SignWriting also differ from their oral counterparts in the same way: in the use of space to convey meaning and syntactic relations.

Another insight concerns how to think of SignWriting. Due to the complexity of sign languages at the phonetic level, SignWriting can be a powerful and flexible tool for dealing with them, thanks to its very detailed and phonetically precise nature. However, it requires a mental and computational model more complex than that of other writing systems. To deal with this complexity, we find it useful to think of SignWriting as a language in itself, even if a graphical one.

This mental model can be seen in our hierarchical classification of graphemes. If we think of SignWriting as a language, the CLASS tags might be seen as parallel to parts of speech, and the SHAPE tags to particular words. The rest of the annotation, which uses sets of features, might somewhat resemble the morphological features of words. In the corpus and in this article, not much attention has been paid to these semantic properties of graphemes, since we have focused on their descriptive annotation. We want to note, however, that the similarity of SignWriting elements to words in an oral language, with their syntactic and morphological properties, is found not only in their appearance but also in their interpretation. Properly extracting this interpretation remains as future work.

On the other hand, just as space and movement are still unresolved issues in the theoretical description of sign languages, the annotation of spatial properties of SignWriting graphemes has been a challenge in itself. We could not rely on any widespread linguistic standard or established practice, since these characteristics are not found in oral languages. We have seen that some of the graphical attributes of graphemes are similar to morphological derivational processes, manifesting as rotations and reflections of characters rather than as their insertion or deletion. Others, however, are intrinsically locative, and require numerical annotation of positions and regions, and subdivision of paths into smaller elements.
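As an illustration only, the two kinds of attributes described above could be sketched as a small data structure. This is not the actual VisSE annotation schema: the field names, the example tag values "HAND" and "INDEX", the eight rotation steps, and the bounding-box convention are all assumptions made for the sake of the example. The sketch shows rotation and reflection acting like morphological features that leave CLASS and SHAPE untouched, while the locative information is kept as plain numbers.

```python
from dataclasses import dataclass, replace

# Hypothetical record for one annotated grapheme. This is NOT the corpus's
# real schema; every field name and value convention here is an assumption.
@dataclass(frozen=True)
class Grapheme:
    cls: str                 # CLASS tag, loosely analogous to a part of speech
    shape: str               # SHAPE tag, loosely analogous to a particular word
    rotation: int = 0        # assumed: number of 45-degree steps, 0..7
    reflected: bool = False  # assumed: whether the glyph is mirrored
    # Assumed locative annotation: (x, y, width, height) within the logogram.
    box: tuple[float, float, float, float] = (0.0, 0.0, 0.0, 0.0)


def rotate(g: Grapheme, steps: int) -> Grapheme:
    """Return a copy rotated by `steps` 45-degree steps (wraps around at 8)."""
    return replace(g, rotation=(g.rotation + steps) % 8)


def reflect(g: Grapheme) -> Grapheme:
    """Return a mirrored copy; CLASS, SHAPE and position are unchanged."""
    return replace(g, reflected=not g.reflected)


# "HAND" and "INDEX" are placeholder tag values, not taken from the corpus.
hand = Grapheme(cls="HAND", shape="INDEX", box=(0.25, 0.5, 0.125, 0.125))
variant = reflect(rotate(hand, 3))
```

In this sketch, `variant` keeps the same `cls`, `shape` and `box` as `hand` and differs only in its feature values, which is the sense in which rotations and reflections resemble derivational processes rather than insertion or deletion of characters.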

We believe this insight might be useful not only for the computational treatment of SignWriting, but may also mirror some of the problems in the computational treatment of sign language per se, and may inspire solutions or ideas in that space.

Moreover, as SignWriting elements map very well to the phonetic features of sign languages, especially when annotated in detail as is done in the VisSE corpus, we think that the study of a corpus of SignWriting can be useful for drawing conclusions about the sign language it transcribes. This can help make data-based linguistic research on sign languages less costly, as the collection of video corpora requires much time, face-to-face collaboration with native informants, and attention to issues such as privacy and distribution of the videos. Linguistic annotation of video is difficult and essentially a manual process, while annotating SignWriting can be faster with the right tools, and can even be done with handwritten samples, as we show in this corpus, using the machine learning algorithms trained on it.