The art of wrangling: Working with web-based visual world paradigm eye-tracking data in language research
Article published in: Linguistic Approaches to Bilingualism (Online-First Articles)
Web-based eye-tracking is more accessible than ever. Researchers can now carry out visual world paradigm studies remotely and reach never-before-tested multilingual populations via the internet, all without the need for an expensive eye-tracker. Web-based eye-tracking, however, requires careful experimental design and extensive data wrangling skills. In this paper, we provide a framework for reproducible, open-science visual world paradigm studies using online experiments. We give step-by-step instructions for building a typical visual world paradigm psycholinguistics study and walk the reader through the data wrangling steps needed to prepare the data for visualization and analysis in the open-source software environment R. Importantly, we highlight the key decisions researchers need to make and report so that an analysis can be reproduced. We demonstrate our approach by carrying out a single-change replication of an in-person eye-tracking study by Porretta et al. (2020). We conclude with best practices and recommendations for researchers carrying out web-based visual world paradigm studies on bi-/multilingualism.
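As a flavor of the R-based wrangling the paper walks through, the sketch below shows one common step: collapsing raw webcam gaze samples into time bins and computing the proportion of looks to each screen quadrant. This is an illustrative sketch only, not the authors' pipeline; the column names (`subject`, `trial`, `time_ms`, `quadrant`), the 50 ms bin size, and the helper name `bin_fixations()` are assumptions made for the example.

```r
# Illustrative sketch (not the authors' code): bin webcam gaze samples and
# compute fixation proportions per screen quadrant. Column names are
# hypothetical placeholders.
library(dplyr)

bin_fixations <- function(samples, bin_size = 50) {
  samples %>%
    mutate(bin = floor(time_ms / bin_size) * bin_size) %>%        # assign each sample to a time bin
    count(subject, trial, bin, quadrant, name = "n_samples") %>%  # looks per quadrant within each bin
    group_by(subject, trial, bin) %>%
    mutate(prop = n_samples / sum(n_samples)) %>%                 # proportion of looks within the bin
    ungroup()
}
```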
Article outline
- 1. Introduction
- 1.1 The visual world paradigm
- 1.2 The core four constructs of a VWP experiment
- Time
- Audio stimuli
- Visual stimuli
- Eye-fixations
- 2. Building a web-based visual world paradigm experiment
- 2.1 VWP raw data and tidy data
- 3. Replication of Porretta et al. (2020)
- 3.1 Background and motivation
- 3.2 Methods
- 3.2.1 Participants
- 3.2.2 Materials
- 3.2.3 Procedure
- 3.3 Data analysis
- 3.3.1 Questionnaire wrangling
- 3.3.2 Behavioral-task wrangling
- 3.3.3 Eye-tracking wrangling
- 4. Modeling ET data
- 4.1 GLMMs
- 4.1.1 GLMMs: Coding
- 4.1.2 GLMMs: Models
- 4.2 GAMMs
- 4.3 Results
- 4.3.1 GLMM results
- 4.3.2 GAMM results
- 5. Discussion
- 5.1 Web-based eye-tracking may provide access to unique populations
- 5.2 Best practices for web-based visual world paradigm eye-tracking research
- Set clear exclusion criteria for participants prior to data collection
- Include and report behavioral/attention task checks
- Report accuracy cutoffs for participant background information
- Include and report eye-calibration
- Require a minimum median frame-rate greater than 5 Hz
- Identify a quadrant classification method
- Report all time adjustments
- Use a meaningful eye-fixation bin size given the research question
- 6. Conclusion
- Data availability statement
- Competing interests declaration
- Notes
- References
Published online: 13 August 2024
https://doi.org/10.1075/lab.23071.bra
References (38)
Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419–439.
Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2019). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407.
Apfelbaum, K. S., Klein-Packard, J., & McMurray, B. (2021). The pictures who shall not be named: Empirical support for benefits of preview in the visual world paradigm. Journal of Memory and Language, 121, 104279.
Barr, D. J. (2008). Analyzing ‘visual world’ eyetracking data using multilevel logistic regression. Journal of Memory and Language, 59(4), 457–474.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2014). Fitting linear mixed-effects models using lme4.
Bolibaugh, C., Vanek, N., & Marsden, E. J. (2021). Towards a credibility revolution in bilingualism research: Open data and materials as stepping stones to more reproducible and replicable research. Bilingualism: Language and Cognition, 24(5), 801–806.
Brown, B., Tusmagambet, B., Rahming, V., Tu, C.-Y., DeSalvo, M. B., & Wiener, S. (2023). Searching for the “native” speaker: A preregistered conceptual replication and extension of Reid, Trofimovich, and O’Brien (2019). Applied Psycholinguistics, 44(4), 475–494.
Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1(1), 9.
Chen, M. C., Anderson, J. R., & Sohn, M. H. (2001). What can a mouse cursor tell us more? CHI ’01 Extended Abstracts on Human Factors in Computing Systems.
Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language. Cognitive Psychology, 6(1), 84–107.
Coretta, S., Casillas, J. V., Roessig, S., Franke, M., Ahn, B., Al-Hoorie, A. H., Al-Tamimi, J., Alotaibi, N. E., AlShakhori, M. K., Altmiller, R. M., et al. (2023). Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human-speech analyses. Advances in Methods and Practices in Psychological Science, 6(3).
Cunnings, I., & Fujita, H. (2021). Quantifying individual differences in native and nonnative sentence processing. Applied Psycholinguistics, 42(3), 579–599.
Foster, E. D., & Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association, 105(2).
Han, J., Kim, J., & Tsukada, K. (2023). Foreign accent in L1 (first language). Linguistic Approaches to Bilingualism.
Huettig, F., & McQueen, J. M. (2007). The tug of war between phonological, semantic and shape information in language-mediated visual search. Journal of Memory and Language, 57(4), 460–482.
Ito, A., & Knoeferle, P. (2022). Analysing data from the psycholinguistic visual-world paradigm: Comparison of different analysis methods. Behavior Research Methods. [URL]
Kidd, E., Donnelly, S., & Christiansen, M. H. (2018). Individual differences in language acquisition and processing. Trends in Cognitive Sciences, 22(2), 154–169.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766.
Marsden, E., Morgan-Short, K., Thompson, S., & Abugaber, D. (2018). Replication in second language research: Narrative and systematic reviews and recommendations for the field. Language Learning, 68(2), 321–391.
Matin, E., Shao, K. C., & Boff, K. R. (1993). Saccadic overhead: Information-processing time with and without saccades. Perception & Psychophysics, 53(4), 372–380.
McMurray, B. (2023). I’m not sure that curve means what you think it means: Toward a [more] realistic understanding of the role of eye-movement generation in the visual world paradigm. Psychonomic Bulletin & Review, 30(1), 102–146.
Milne, A. E., Bianco, R., Poole, K. C., Zhao, S., Oxenham, A. J., Billig, A. J., & Chait, M. (2021). An online headphone screening test based on dichotic pitch. Behavior Research Methods, 53(4), 1551–1562.
Mirman, D., Dixon, J. A., & Magnuson, J. S. (2008). Statistical and computational models of the visual world paradigm: Growth curves and individual differences. Journal of Memory and Language, 59(4), 475–494.
Palan, S., & Schitter, C. (2018). Prolific.ac – A subject pool for online experiments. Journal of Behavioral and Experimental Finance, 17, 22–27.
Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J., & Hays, J. (2016). WebGazer: Scalable webcam eye tracking using user interactions. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 3839–3845.
Perpiñán, S., & Montrul, S. (2023). Does your regional variety help you acquire an additional language? Linguistic Approaches to Bilingualism, 13(5), 663–692.
Porretta, V., Buchanan, L., & Järvikivi, J. (2020). When processing costs impact predictive processing: The case of foreign-accented speech and accent experience. Attention, Perception, & Psychophysics, 82(4), 1558–1565.
Prystauka, Y., Altmann, G. T., & Rothman, J. (2023). Online eye tracking and real-time sentence processing: On opportunities and efficacy for capturing psycholinguistic effects of different magnitudes and diversity. Behavior Research Methods.
R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria.
Rodd, J. M. (in press). Moving experimental psychology online: How to obtain high quality data when we can’t see our participants. Journal of Memory and Language, 134, 104472.
Rothman, J., Bayram, F., DeLuca, V., Di Pisa, G., Duñabeitia, J. A., Gharibi, K., … Wulff, S. (2023). Monolingual comparative normativity in bilingualism research is out of “control”: Arguments and alternatives. Applied Psycholinguistics, 44(3), 316–329.
Semmelmann, K., & Weigelt, S. (2017). Online webcam-based eye tracking in cognitive science: A first look. Behavior Research Methods, 50(2), 451–465.
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632–1634.
Vos, M., Minor, S., & Ramchand, G. C. (2022). Comparing infrared and webcam eye tracking in the visual world paradigm. Glossa Psycholinguistics, 1(1).
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data (1st ed.). O’Reilly Media. [URL]