Corpuscle is a new corpus query engine and Web-based corpus management system. The main design goals were the ability to handle very large corpora, support for structured data (XML), and seamless integration of manual corpus annotation and editing. New algorithms have been developed, among them a technique for running finite state automata from edges with lowest corpus counts, and an implementation of regular expressions on suffix arrays for fast reverse index lookup. These algorithms allow for a clean and elegant implementation of multi-valued and set-valued attributes. The web interface offers rich functionality for concordancing, collocations, distribution statistics, and more. Queries can be input in a graphical, menu-driven way, freeing the user from dealing with the complexities of the query language.
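The idea of answering regular-expression queries through a suffix array and a reverse index can be illustrated with a minimal sketch. This is not Corpuscle's actual implementation, which the abstract does not spell out. The function names (`build_index`, `build_suffix_array`, `types_containing`, `query`) are hypothetical, and the literal substring used for pre-filtering is supplied by hand rather than derived from the regular expression. The sketch assumes a small in-memory token list and stores suffixes as plain strings for readability; a real engine would store offsets into the lexicon instead.

```python
import bisect
import re
from collections import defaultdict


def build_index(tokens):
    """Reverse index: map each word type to the corpus positions where it occurs."""
    index = defaultdict(list)
    for pos, tok in enumerate(tokens):
        index[tok].append(pos)
    return index


def build_suffix_array(types):
    """Sorted (suffix, type_id) pairs over all word types (illustrative layout)."""
    suffixes = []
    for tid, word in enumerate(types):
        for i in range(len(word)):
            suffixes.append((word[i:], tid))
    suffixes.sort()
    return suffixes


def types_containing(suffixes, literal):
    """Ids of types that contain `literal`, found by binary search on the suffixes."""
    start = bisect.bisect_left(suffixes, (literal, -1))
    ids = set()
    # All suffixes sharing the prefix `literal` are contiguous in sorted order.
    for suffix, tid in suffixes[start:]:
        if not suffix.startswith(literal):
            break
        ids.add(tid)
    return ids


def query(tokens, pattern, literal):
    """Corpus positions of tokens fully matching `pattern`.

    `literal` must be a substring that every match contains (an assumption of
    this sketch); it narrows the candidate types before the regex is run.
    """
    types = sorted(set(tokens))
    index = build_index(tokens)
    suffixes = build_suffix_array(types)
    regex = re.compile(pattern)
    hits = []
    for tid in types_containing(suffixes, literal):
        if regex.fullmatch(types[tid]):
            hits.extend(index[types[tid]])
    return sorted(hits)


tokens = "the king was singing and the ring was ringing".split()
print(query(tokens, r".*ing", "ing"))
print(query(tokens, r"r.*ing", "ing"))
```

The regular expression is run only against the word types that survive the suffix-array lookup, and corpus positions are then read off the reverse index, so the cost depends on the size of the lexicon rather than on the size of the corpus.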