3.3.7 Closed word-classes: Adverbs, Conjunctions, Particles, Interjections and Adpositions
3.3.8 Abbreviations
3.4 Summary
References
4 Testing and Evaluation
4.1 Introduction
4.2 Rule integrity
4.3 Consistency and order of tags
4.4 Language coverage test: wordlists and corpus data
4.4.1 Corpus compilation
4.4.2 Corpus processing and markup
4.4.3 Language coverage
4.5 Summary
References
5 APPENDICES
5.1 Appendix A. Tagsets
5.2 Appendix B. Flag diacritics
5.3 Appendix C. Structural markup
INDEXES
Name index
Subject index
Irina Lobzhanidze is Professor of Linguistics at Ilia State University, Georgia, where she is also the Director of the MA program in Applied Linguistics. She received her Ph.D. in Linguistics from Ilia State University, Georgia. She held a visiting Georgian Studies fellow position at the Oxford School of Global and Area Studies (2019-2020). Her main research interests lie in the areas of morphology and syntax, and their interface. She has worked extensively on developing language processing tools and resources for Georgian, including the morphological analyzer and generator of Georgian. She is the Linguistic Coordinator of the Georgian Language Corpus (GLC), a co-author of the Dictionary of Idioms (2014-2017) and the Principal Researcher in the construction of the Wardrops’ Collection Online (WCO). Previously, Irina conducted research on various aspects of Georgian idioms and the degree of their “frozenness”.
This handbook provides a comprehensive account of current research on the finite-state morphology of Georgian and gives the reader a quick entry into Georgian morphosyntax and its computational processing. It combines linguistic analysis with the application of finite-state technology to the processing of the language. The book opens with the author’s synoptic overview of the main lines of research, covers the properties of the word and its components, then moves on to the description of Georgian morphosyntax and the morphological analyzer and generator of Georgian.
The book comprises three chapters and accompanying appendices. The aim of the first chapter is to describe the morphosyntactic structure of Georgian, focusing on differences between Old and Modern Georgian. The second chapter focuses on the application of finite-state technology to the processing of Georgian and on the compilation of a tokenizer, a morphological analyzer and a generator for Georgian. The third chapter discusses the testing and evaluation of the analyzer’s output and the compilation of the Georgian Language Corpus (GLC), which is now accessible online and freely available to the research community.
Since the development of the analyzer, the field of computational linguistics has advanced in several ways, but the majority of new approaches to language processing have not been tested on Georgian. The organization of the book therefore makes it easier to accommodate new developments from both a theoretical and a practical viewpoint.
The book includes a detailed index and references as well as the full list of morphosyntactic tags. It will be of interest and practical use to a wide range of linguists and advanced students interested in Georgian morphosyntax in general, as well as to researchers in computational linguistics who focus on how languages with complicated morphosyntax can be handled through finite-state approaches.