Designing CoSIH: The Corpus of Spoken Israeli Hebrew

Izre'el, Shlomo; Hary, Benjamin; Rahav, Giora

doi:10.1075/ijcl.6.2.01izr

Article published in:

International Journal of Corpus Linguistics
Vol. 6:2 (2001) ► pp. 171–197

Shlomo Izre'el | Tel Aviv University

Benjamin Hary | Emory University

Giora Rahav | Tel Aviv University

This paper describes the initial design of the Corpus of Spoken Israeli Hebrew (CoSIH). CoSIH will attempt to include a representation of most varieties of spoken Hebrew as it is used in Israel today. CoSIH is designed to consist of two complementary corpora: a main corpus and a supplementary corpus. The main corpus, which will comprise about 90% of the entire collection, will be sampled statistically. For analytical purposes it will use a conceptual tool in the form of a multidimensional matrix combining demographic and contextual tiers. The combined demographic and contextual design will be capable of showing the distribution of speech types in various subgroups of the population. The supplementary corpus will include about 10% of the collected data, and will add to the statistically sampled corpus some targeted demographically sampled texts and a contextually designed collection. This design is culturally dependent: it is tailored to the special structure of the Israeli Hebrew speech community and thus includes both native and non-native speakers of Hebrew. Nonetheless, the principles governing this design are such that they would serve the study of many other speech communities, to the extent that the design itself may be employed for other corpora with only slight modifications.

Keywords: Israeli Hebrew, corpus design, spoken corpus

Published online: 8 August 2002

Cited by (14)

Raso, Tommaso, Bruno Neves Rati de Melo Rocha, João Vinícius Salgado, Breno Fiuza Cruz, Lucas Machado Mantovani & Heliana Mello

2024. The C-ORAL-ESQ project: a corpus for the study of spontaneous speech of individuals with schizophrenia. Language Resources and Evaluation 58:3 ► pp. 903 ff.

Shirtz, Shahar

2023. Siuslaw final-consonant reduplication and the anti-mirative domain. STUF - Language Typology and Universals 76:4 ► pp. 471 ff.

Shirtz, Shahar

2024. Discourse markers as the locus of signaling the main-event line in Alsea narratives. Linguistics 62:1 ► pp. 229 ff.

Dash, Niladri Sekhar & L. Ramamoorthy

2019. Corpus and Future Indian Needs. In Utility and Application of Language Corpora, ► pp. 251 ff.

Ozerov, Pavel

2019. This is not an interrogative: the prosody of “wh-questions” in Hebrew and the sources of their questioning and rhetorical interpretations. Language Sciences 72 ► pp. 13 ff.

Faust, Noam

2014. Where it's [at]: A phonological effect of phasal boundaries in the construct state of Modern Hebrew. Lingua 150 ► pp. 315 ff.

Ribeiro De Mello, Heliana

2014. Methodological issues for spontaneous speech corpora compilation. In Spoken Corpora and Linguistic Studies [Studies in Corpus Linguistics, 61], ► pp. 27 ff.

Verdonik, Darinka, Iztok Kosem, Ana Zwitter Vitez, Simon Krek & Marko Stabej

2013. Compilation, transcription and usage of a reference speech corpus: the case of the Slovene corpus GOS. Language Resources and Evaluation 47:4 ► pp. 1031 ff.

Moneglia, Massimo

2011. Spoken corpora and pragmatics. Revista Brasileira de Linguística Aplicada 11:2 ► pp. 479 ff.

Moneglia, Massimo

2014. The variation of action verbs in multilingual spontaneous speech corpora. In Spoken Corpora and Linguistic Studies [Studies in Corpus Linguistics, 61], ► pp. 152 ff.

Green, Hila

2009. Intonation in Hebrew-Speaking Children with High Functioning Autism. Asia Pacific Journal of Speech, Language and Hearing 12:2 ► pp. 187 ff.

Green, Hila & Yishai Tobin

2009. Prosodic analysis is difficult … but worth it: A study in high functioning autism. International Journal of Speech-Language Pathology 11:4 ► pp. 308 ff.

Conrad, Susan M. & Kimberly R. LeVelle

2008. Corpus Linguistics and Second Language Instruction. In The Handbook of Educational Linguistics, ► pp. 539 ff.

Izre’el, Shlomo

2005. Transcribing Spoken Israeli Hebrew: Preliminary Notes. In Perspectives on Language and Language Development, ► pp. 61 ff.

This list is based on CrossRef data as of 17 October 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.