From real-life situated discourse to video-stream data-mining
An argument for agent-oriented modeling for multimodal corpus compilation
This paper presents an argument for agent-oriented modeling (AOM) as a research methodology and a metalanguage for corpus linguistics. The argument is motivated by three closely related issues that arise in compiling multimodal corpora such as the Spoken Chinese Corpora of Situated Discourse (SCCSD). Given a real-life situation, there are three types of representation: (i) the Written Word representation, (ii) audio recording, and (iii) video recording. All three types are shown to be data-transformative, to involve data loss, and to be intrinsically flawed. The current multiple-layered approach to data integration is also shown to be inadequate. AOM is proposed as a potential solution to these problems. A modeling decision tree, levels of modeling, and a modeling schema written in XML are demonstrated. The philosophical basis of AOM and its theoretical implications are also discussed.
Keywords: real-life situated discourse, multimodal corpus, corpus quality, video data-mining, agent-oriented modeling
Published online: 15 December 2009
Cited by 5 other publications (based on CrossRef data as of 15 April 2022; the count may not be complete).