Article published in:
Compilation, transcription, markup and annotation of spoken corpora
Edited by John M. Kirk and Gisle Andersen
[International Journal of Corpus Linguistics 21:3] 2016
pp. 419–438
References (50)
Anderson, A.H., Bader, M., Gurman Bard, E., Boyle, E., Doherty, G., Garrod, S., Isard, S., Kowtko, J., McAllister, J., Miller, J., Sotillo, C., Thompson, H.S., & Weinert, R. (1991). The HCRC Map Task Corpus. Language and Speech, 34(4), 351–366.
Belz, M. (2013). Disfluencies und Reparaturen bei Muttersprachlern und Lernern: Eine kontrastive Analyse. Humboldt-Universität zu Berlin. Retrieved from [URL] (last accessed March 2014).
BeMaTaC. (2014). BeMaTaC: A Deeply Annotated Multimodal Map-task Corpus of Spoken Learner and Native German. Retrieved from [URL] (last accessed March 2014).
Boersma, P. (2010). Praat: A system for doing phonetics by computer. Glot International, 5(9/10), 341–345.
Brinckmann, C., Kleiner, S., Knöbl, R., & Berend, N. (2008). German today: An areally extensive corpus of spoken Standard German. In N. Calzolari, Kh. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis & D. Tapias (Eds.), Proceedings of the Sixth International Conference on Language Resources and Evaluation (pp. 3185–3191). Paris: ELRA.
Buchholz, S., & Marsi, E. (2006). CoNLL-X shared task on multilingual dependency parsing. In L. Màrquez & D. Klein (Eds.), Proceedings of the 10th Conference on Computational Natural Language Learning (pp. 149–164). Stroudsburg, PA: Association for Computational Linguistics.
Burnard, L. (Ed.). (2007). Reference Guide for the British National Corpus (XML Edition). Oxford: Research Technologies Service. Retrieved from [URL] (last accessed March 2014).
Carletta, J., Evert, S., Heid, U., Kilgour, J., Robertson, J., & Voormann, H. (2003). The NITE XML Toolkit: Flexible annotation for multi-modal language data. Behavior Research Methods, Instruments, & Computers, 35(3), 353–363.
Carletta, J., Evert, S., Heid, U., & Kilgour, J. (2005). The NITE XML Toolkit: Data model and query. Language Resources and Evaluation, 39(4), 313–334.
Chiarcos, C., Dipper, S., Götze, M., Leser, U., Lüdeling, A., Ritz, J., & Stede, M. (2009). A flexible framework for integrating annotations from different tools and tagsets. Traitement Automatique des Langues, 49(2), 271–291.
Creative Commons. (2014). About the Licenses - Creative Commons. Retrieved from [URL] (last accessed March 2014).
Dipper, S. (2005). XML-based stand-off representation and exploitation of multi-level linguistic annotation. In R. Eckstein & R. Tolksdorf (Eds.), Proceedings of Berliner XML Tage 2005 (pp. 39–50). Berlin: Humboldt-Universität zu Berlin.
Dipper, S., Lüdeling, A., & Reznicek, M. (2013). NoSta-D: A corpus of German non-standard varieties. In M. Zampieri & S. Diwersy (Eds.), Non-Standard Data Sources in Corpus-Based Research (pp. 69–76). Aachen: Shaker.
Druskat, S., Bierkandt, L., Gast, V., Rzymski, C., & Zipser, F. (2014). Atomic: An open-source software platform for multi-level corpus annotation. In J. Ruppenhofer & G. Faaß (Eds.), Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014) (pp. 228–234). Retrieved from [URL] (last accessed May 2015).
Gerdes, K. (2014). Arborator [Computer software]. Retrieved from [URL] (last accessed March 2014).
Giesel, L., Klapi, M., Krüger, D., Nunberger, I., Rasskazova, O., & Sauer, S. (2013). Berlin Map Task Corpus: A deeply annotated multimodal map-task corpus of spoken learner and native German. Poster presented at the 35. Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft, Potsdam, Germany. Retrieved from [URL] (last accessed March 2014).
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I.H. (2009). The WEKA data mining software: An update. In O.R. Zaiane (Ed.), SIGKDD Explorations, 11(1), 10–18.
Hanke, T., & Storz, J. (2008). iLex: A database tool for integrating sign language corpus linguistics and sign language lexicography. In O. Crasborn, E. Efthimiou, T. Hanke, E. Thoutenhoofd & I. Zwitserlood (Eds.), LREC 2008 Workshop, Proceedings, W 25: 3rd Workshop on the Representation and Processing of Sign Languages: Construction and Exploitation of Sign Language Corpora (pp. 64–67). Paris: ELRA.
Himmelmann, N.P. (2012). Linguistic data types and the interface between language documentation and description. Language Documentation & Conservation, 61, 187–207.
Hinrichs, E.W., Hinrichs, M., & Zastrow, T. (2010). WebLicht: Web-Based LRT services for German. In ACL 2010 System Demonstrations, Proceedings (pp. 25–29). Stroudsburg, PA: Association for Computational Linguistics.
Ide, N., & Suderman, K. (2007). GrAF: A graph-based format for linguistic annotations. In B. Boguraev, N. Ide, A. Meyers, Sh. Nariyama, M. Stede, J. Wiebe & G. Wilcock (Eds.), ACL 2007 Workshop, Proceedings, Linguistic Annotation Workshop (pp. 25–29). Stroudsburg, PA: Association for Computational Linguistics.
Kirk, J.M. (this volume). The pragmatic annotation scheme of the SPICE-Ireland corpus.
Krause, T., Lüdeling, A., Odebrecht, C., & Zeldes, A. (2012). Multiple tokenization in a diachronic corpus. Paper presented at Exploring Ancient Languages through Corpora Conference 2012, Oslo. Retrieved from [URL] (last accessed March 2014).
Krause, T., & Zeldes, A. (2014). ANNIS3: A new architecture for generic corpus query and visualization. Digital Scholarship in the Humanities. Retrieved from [URL] (last accessed May 2015).
Lüdeling, A. (2011). Corpora in linguistics: Sampling and annotation. In K. Grandin (Ed.), Going Digital. Evolutionary and Revolutionary Aspects of Digitization (pp. 220–243). New York, NY: Science History Publications.
Max Planck Society. (2014). Max Planck Open Access: Berlin Declaration. Retrieved from [URL] (last accessed March 2014).
Müller, C., & Strube, M. (2006). Multi-level annotation of linguistic data with MMAX2. In S. Braun, K. Kohn & J. Mukherjee (Eds.), Corpus Technology and Language Pedagogy (pp. 197–214). Frankfurt am Main: Peter Lang.
Nivre, J. (2008). Treebanks. In A. Lüdeling & M. Kytö (Eds.), Corpus Linguistics. An International Handbook (pp. 225–241). Berlin: Mouton de Gruyter.
Pajas, P., & Stepanek, J. (2008). Recent advances in a feature-rich framework for treebank annotation. In Proceedings of the 22nd International Conference on Computational Linguistics (pp. 673–680). Stroudsburg, PA: Association for Computational Linguistics.
R Core Team. (2013). R: A Language and Environment for Statistical Computing [Computer software]. Retrieved from [URL] (last accessed March 2014).
Sauer, S., & Rasskazova, O. (2014). BeMaTaC: Eine digitale multimodale Ressource für Sprach- und Dialogforschung. Poster presented at the workshop Grenzen überschreiten – Digitale Geisteswissenschaft heute und morgen, Berlin, Germany. Retrieved from [URL] (last accessed March 2014).
Schiel, F., Draxler, C., & Harrington, J. (2011). Phonemic segmentation and labelling using the MAUS technique. Workshop New Tools and Methods for Very-Large-Scale Phonetics Research. Retrieved from [URL] (last accessed April 2016).
Schiller, A., Teufel, S., Stöckert, C., & Thielen, C. (1999). Guidelines für das Tagging deutscher Textcorpora mit STTS (Kleines und großes Tagset). Retrieved from [URL] (last accessed March 2014).
Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of International Conference on New Methods in Language Processing. Retrieved from [URL] (last accessed November 2014).
Schmid, H. (2008). Tokenizing and part-of-speech tagging. In A. Lüdeling & M. Kytö (Eds.), Corpus Linguistics. An International Handbook (pp. 527–551). Berlin: Mouton de Gruyter.
Schmidt, T. (2004). Transcribing and annotating spoken language with EXMARaLDA. In A. Witt, U. Heid, H.S. Thompson, J. Carletta & P. Wittenburg (Eds.), LREC 2004 Workshop, Proceedings, XML-based Richly Annotated Corpora (pp. 69–74). Paris: ELRA.
Schmidt, T., & Wörner, K. (2009). EXMARaLDA: Creating, analysing and sharing spoken language corpora for pragmatic research. Pragmatics, 19(4), 565–582.
Schmidt, T., Hedeland, H., Lehmberg, T., & Wörner, K. (2010). HAMATAC: The Hamburg MapTask Corpus. Retrieved from [URL] (last accessed March 2014).
Sloetjes, H., & Wittenburg, P. (2008). Annotation by category: ELAN and ISO DCR. In N. Calzolari, Kh. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis & D. Tapias (Eds.), Proceedings of the Sixth International Conference on Language Resources and Evaluation (pp. 816–820). Paris: ELRA.
Stede, M. (2011). Discourse Processing. San Rafael, CA: Morgan & Claypool.
Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., & Tsujii, J. (2012). Brat: A web-based tool for NLP-assisted text annotation. In F. Segond (Ed.), Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp. 102–107). Stroudsburg, PA: Association for Computational Linguistics.
Stührenberg, M. (2012). The TEI and current standards for structuring linguistic data. In P. Bański, E. Litta Modignani Picozzi & A. Witt (Eds.), Journal of the Text Encoding Initiative, 31. Retrieved from [URL] (last accessed March 2014).
TEI Consortium. (2014). TEI: Text Encoding Initiative. Retrieved from [URL] (last accessed March 2014).
Thompson, P. (2005). Spoken language corpora. In M. Wynne (Ed.), Developing Linguistic Corpora: A Guide to Good Practice (pp. 59–70). Oxford: Oxbow Books. Retrieved from [URL] (last accessed March 2014).
Wichmann, A. (2008). Speech corpora and spoken corpora. In A. Lüdeling & M. Kytö (Eds.), Corpus Linguistics. An International Handbook (pp. 187–207). Berlin: Mouton de Gruyter.
Wörner, K. (2009). Werkzeuge zur flachen Annotation von Transkriptionen gesprochener Sprache. Bielefeld: Bielefeld University. Retrieved from [URL] (last accessed April 2016).
Wynne, M. (2008). Searching and concordancing. In A. Lüdeling & M. Kytö (Eds.), Corpus Linguistics. An International Handbook (pp. 706–737). Berlin: Mouton de Gruyter.
Yimam, S.M., Gurevych, I., Eckart de Castilho, R., & Biemann, C. (2013). WebAnno: A flexible, web-based and visually supported system for distributed annotations. In M. Butt & S. Hussain (Eds.), 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference System Demonstration (pp. 1–6). Stroudsburg, PA: Association for Computational Linguistics.
Zeldes, A., Ritz, J., Lüdeling, A., & Chiarcos, C. (2009). ANNIS: A search tool for multi-layer annotated corpora. In M. Mahlberg, V. González-Díaz & C. Smith (Eds.), Proceedings of Corpus Linguistics 2009. Retrieved from [URL] (last accessed March 2014).
Zipser, F., & Romary, L. (2010). A model oriented approach to the mapping of annotation formats using standards. In G. Budin, L. Romary, T. Declerck & P. Wittenburg (Eds.), LREC 2010 Workshop, Proceedings, W4: Language Resource and Language Technology Standards. Paris: ELRA. Retrieved from [URL] (last accessed November 2014).
Cited by (9)

Lemmenmeier-Batinić, Dolores, Josip Batinić & Anastasia Escher
2023. Map Task Corpus of Heritage BCMS spoken by second-generation speakers in Switzerland. Language Resources and Evaluation 57:4 pp. 1607 ff.
Hirschmann, Hagen & Thomas Schmidt
2022. Gesprochene Lernerkorpora: Methodisch-technische Aspekte der Erhebung, Erschließung und Nutzung. Zeitschrift für germanistische Linguistik 50:1 pp. 36 ff.
Wisniewski, Katrin
2022. Gesprochene Lernerkorpora des Deutschen: Eine Bestandsaufnahme. Zeitschrift für germanistische Linguistik 50:1 pp. 1 ff.
Põldvere, Nele, Johan Frid, Victoria Johansson & Carita Paradis
2021. Challenges of releasing audio material for spoken data: The case of the London-Lund Corpus 2. Research in Corpus Linguistics 9:1 pp. 35 ff.
Weise, Andreas, Vered Silber-Varod, Anat Lerner, Julia Hirschberg & Rivka Levitan
2020. Entrainment in spoken Hebrew dialogues. Journal of Phonetics 83 pp. 101005 ff.
Zeldes, Amir
2020. Corpus Architecture. In A Practical Handbook of Corpus Linguistics, pp. 49 ff.
Belz, Malte, Simon Sauer, Anke Lüdeling & Christine Mooshammer
2017. Fluently disfluent? International Journal of Learner Corpus Research 3:2 pp. 118 ff.
Kirk, John M.
2017. Developments in the spoken component of ICE corpora. World Englishes 36:3 pp. 371 ff.
Diemer, Stefan, Marie-Louise Brunner & Selina Schmidt
2016. Compiling computer-mediated spoken language corpora. International Journal of Corpus Linguistics 21:3 pp. 348 ff.

This list is based on CrossRef data as of 5 August 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.