A Comparison of Discussion Tasks and the Guided Test
At higher levels of second-language proficiency, speaking skill is frequently measured in interviews, but guided tests and group discussions are also common and can improve efficiency. A guided test and a discussion create rather different speech situations and may well elicit dissimilar kinds of oral production: the informal setting of a group discussion is certainly less 'anxiety-provoking' and elicits more natural speech, but it may also lead to elementary, unmonitored, minimally correct speech.
In this article we report on a small-scale empirical investigation intended to identify differences between the language used in a guided test and in a group discussion by first-year university students of French.
Although the discussion subject was defined so that students could discuss a rather wide range of aspects, even a superficial analysis of the guided test and the discussion subject suggested that the former was more content valid: this showed in the higher proportion of different words used in the guided test answers. In other respects there were no systematic differences between test and discussion: the proportions of unique words and of less frequent words, relative to the number of different words, were nearly the same in the two kinds of speech production. Contrary to intuitive expectation, the number of lexical and grammatical errors was greater in test production (i.e. in the formal setting) than in discussion, perhaps because the guided test is the more demanding task. The possible conclusion that the overall quality of second-language use was poorer in the guided test was not, however, supported by other findings: the mean scores of the three raters did not show systematic differences between test and discussion.
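The lexical measures mentioned above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: that "different words" means word types, that "unique words" means types occurring exactly once, and that tokenisation is simple whitespace splitting. The function name `lexical_profile` and the sample sentence are hypothetical.

```python
from collections import Counter


def lexical_profile(tokens):
    """Return two lexical ratios for a list of word tokens.

    Assumptions (not from the study): 'different words' are word types,
    and 'unique words' are types that occur exactly once.
    """
    counts = Counter(token.lower() for token in tokens)
    n_tokens = sum(counts.values())   # running words
    n_types = len(counts)             # different words
    n_unique = sum(1 for c in counts.values() if c == 1)  # types used once
    return {
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        "unique_per_type": n_unique / n_types if n_types else 0.0,
    }


# Hypothetical learner utterance, split on whitespace.
sample = "je pense que le film est bon mais le livre est meilleur".split()
profile = lexical_profile(sample)
```

Comparing such profiles for the test answers and the discussion turns of the same students would give the kind of contrast reported here, provided the two samples are of similar length, since type-token ratios are known to fall as samples grow longer.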
Correlations between test scores and discussion scores were about .77, suggesting that, as tests of speaking proficiency with correctness as the criterion, the guided test and the discussion are not as different as they may seem.
The main difference between the two lies in rater reliability: interrater correlations for the test were about .82, whereas for the discussion the mean of the three correlations was .52, although two of them approximated .60. One of the problems in rating discussions may be the rather unequal participation of the group members.
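With three raters there are three rater pairs, and the reported figure is the mean of the three pairwise correlations. The sketch below shows one way to compute this, assuming Pearson correlations (the abstract does not name the coefficient); the rater labels and scores are invented for illustration and are not the study's data. The last line works out what the third discussion correlation would be if the other two were exactly .60, which is only an approximation of the reported values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_interrater(scores_by_rater):
    """Return all pairwise correlations between raters and their mean."""
    pairs = combinations(sorted(scores_by_rater), 2)
    pairwise = {
        (a, b): pearson(scores_by_rater[a], scores_by_rater[b])
        for a, b in pairs
    }
    return pairwise, sum(pairwise.values()) / len(pairwise)


# Hypothetical scores from three raters for the same five students.
ratings = {
    "rater_a": [6, 7, 5, 8, 4],
    "rater_b": [6, 8, 5, 7, 4],
    "rater_c": [5, 7, 6, 8, 5],
}
pairwise, mean_r = mean_interrater(ratings)

# If the mean of three correlations is .52 and two of them are exactly .60,
# the third follows by arithmetic: 3 * .52 - 2 * .60.
implied_third = 3 * 0.52 - 2 * 0.60
```

Under that approximation the third correlation comes out near .36, which suggests that the low mean for the discussion may be driven largely by one rater pair rather than by uniformly weak agreement.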
The quality of the discussion as a speaking proficiency test can, in our opinion, be improved in two ways: by defining its subject so that the aspects discussed are sufficiently diverse, and by training or instructing students to participate actively and to observe regular turn-taking.
Article language: Dutch