From manual to machine
Evaluating automated ear–voice span measurement in simultaneous interpreting
This study introduces an automated methodology for measuring ear–voice span (EVS) in simultaneous
interpreting (SI). Assessing EVS, a critical temporal metric in SI, has traditionally relied on manual methods that are
labour-intensive, time-consuming and prone to inconsistency. To overcome these challenges, our research uses
state-of-the-art natural language processing (NLP) technologies, including automatic speech recognition (ASR), sentence boundary
detection (SBD) and cross-lingual alignment, to automate EVS measurement. We deployed a range of NLP models and
evaluated the automated pipelines on a 20-hour English-to-Portuguese SI corpus comprising 57 varied audio pairings. The
findings are encouraging: the most effective model combination achieved a median EVS error of less than 0.1 seconds across the
corpus. Moreover, the automated pipelines exhibited a high level of accuracy, strong correlation and substantial agreement with
manual measurements when assessing median EVS for individual audio pairs. Despite these satisfactory results, challenges
persist with some NLP models, indicating avenues for future research. The study thus offers an approach to
large-scale EVS measurement and advances the automation of process analysis in Interpreting Studies.
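As a rough illustration of the quantity being compared, the sketch below computes a median EVS from aligned onset times and the gap between a manual and an automated median. It assumes EVS is taken as the interpreter's onset minus the speaker's onset for each aligned segment pair. The function names and onset values are hypothetical and are not taken from the study, and the study's own definition of median EVS error may differ.

```python
from statistics import median


def evs_values(aligned_onsets):
    """Return one EVS value (seconds) per aligned (source_onset, target_onset) pair."""
    return [target - source for source, target in aligned_onsets]


def median_evs(aligned_onsets):
    """Median EVS (seconds) over all aligned pairs of one audio pairing."""
    return median(evs_values(aligned_onsets))


# Hypothetical onset times in seconds: (speaker onset, interpreter onset).
# These are illustrative numbers only, not data from the corpus.
manual_onsets = [(0.0, 2.5), (4.0, 7.0), (9.0, 11.0)]
automated_onsets = [(0.25, 2.5), (4.0, 7.5), (9.0, 11.25)]

manual_median = median_evs(manual_onsets)
automated_median = median_evs(automated_onsets)

# One possible reading of "median EVS error": the absolute difference
# between the automated and the manual median for the same audio pair.
median_error = abs(automated_median - manual_median)

print(f"manual median EVS:    {manual_median:.2f} s")
print(f"automated median EVS: {automated_median:.2f} s")
print(f"absolute difference:  {median_error:.2f} s")
```

In a full pipeline, the onset pairs would come from the timestamps produced by ASR, SBD and cross-lingual alignment rather than from hand-written lists.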
Article outline
- Introduction
- 1. Ear–voice span measurement in simultaneous interpreting
- 1.1 Methods and tools for ear–voice span measurement
- 1.2 Statistical techniques in ear–voice span measurement
- 1.3 Innovations in ear–voice span measurement
- 2. Natural language processing technologies for ear–voice span measurement
- 2.1 Automatic speech recognition models
- 2.2 Sentence boundary detection models
- 2.3 Cross-lingual alignment models
- 3. Data collection and preparation
- 3.1 Compilation of the simultaneous interpreting corpus focused on ear–voice span
- 3.2 Stratified corpus sampling for manual validation
- 4. Methodology
- 4.1 Automated pipeline for ear–voice span measurement
- 4.2 Manual annotation of ear–voice span
- 4.3 Manual validation of pipeline components of natural language processing
- 4.4 Data-preprocessing and -analysis techniques
- 5. Results
- 5.1 Comparative analysis of manual and automated ear–voice span measurement approaches
- 5.2 Evaluation of automatic speech recognition, sentence boundary detection and cross-lingual alignment
- 6. Discussion
- 6.1 Performance of automated pipelines for ear–voice span measurement
- 6.2 Performance of pipeline components
- 6.3 Implications and limitations
- 7. Conclusion
- Acknowledgements
- Notes
- References