Chapter 1: Traditions in Corpus Design
Chapter 2: Current Welsh-language Corpora
Chapter 3: Visions and Aims of CorCenCC
Chapter 4: Work-flow: The Processes of Corpus Design and Construction
Chapter 5: Copyright, Permissions and Licenses
Chapter 6: Spoken Data
Chapter 7: Written Data
Chapter 8: E-Language Data
Chapter 9: Challenges and Solutions
Chapter 10: Summary and Guidelines for Good Practice
Dawn Knight is a Reader in Applied Linguistics at Cardiff University, UK, and Chair of the British Association for Applied Linguistics (BAAL). She was the Principal Investigator (PI) of the CorCenCC (National Corpus of Contemporary Welsh) project.
Steve Morris is an Honorary Research Fellow in Applied Linguistics at Swansea University, UK. He was a co-investigator on the CorCenCC (National Corpus of Contemporary Welsh) project.
Laura Arman is a Research Associate at Cardiff University, UK. Her research is centred on the linguistics of minoritised languages, with a focus on her native language of Welsh.
Jennifer Needs works for the UK Civil Service as a Welsh language translator. She previously worked at Swansea University and Cardiff University as a Welsh-medium researcher.
Mair Rees gained her PhD in Welsh literature from Cardiff University, UK in 2012. She was a creative editor for a major Welsh-language publisher for four years before joining the CorCenCC team as a research assistant. She currently works part-time as a translator in addition to running her own small business.
This book provides a micro-level working model of a methodological approach to building a corpus, together with practical guidelines, informed by the work on the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes - the National Corpus of Contemporary Welsh). It focuses specifically on the development of detailed design frames for corpora across communicative modes (spoken, written and e-language), and on the practical processes involved in the planning, collection, transcription, collation and (re)presentation of language data. The book is designed to be of value and relevance to those interested in critically engaging with corpus methodology. Although Welsh is the language under discussion, the processes and approaches used in building CorCenCC can be applied, to a greater or lesser extent, to other language contexts. The account of how the corpus dataset was built offers a basis from which step-by-step guidelines for creating linguistic corpora in other languages can be extrapolated. It is aimed at students and scholars of minority languages and corpus linguistics.