Chapter published in: Language and Text: Data, models, information and applications
Edited by Adam Pawłowski, Jan Mačutek, Sheila Embleton and George Mikros
[Current Issues in Linguistic Theory 356] 2021
► pp. 163–176
A Modern Greek readability tool
Development of evaluation methods
This paper develops an automatic readability analysis tool for Modern Greek as a foreign language. Building on previous work at the Centre for the Greek Language (CGL), we offer an enhanced methodology for predicting the readability of Modern Greek texts, matching each text to a proficiency level (A1 to C2) of the Common European Framework of Reference for Languages. The tool relies on several stylometric indices inspired by work in quantitative linguistics. The resulting feature vectors are used to train a Random Forest, a robust and accurate machine learning algorithm, which predicts readability levels in our test dataset with an accuracy of 0.943, surpassing all previous readability tools for Modern Greek. Further analysis of the results with advanced visualization methods reveals the complex and fluid dynamics between the features used and the readability predictions.
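As a rough illustration of what a stylometric feature vector for a Greek text might look like, the sketch below computes a handful of common quantitative-linguistics indices using only the Python standard library. The abstract does not list the indices the tool actually uses, so the choice here (type-token ratio, hapax ratio, mean word length, and a simplified integer h-point) is an assumption, and the function names `tokenize`, `h_point` and `stylometric_features` are hypothetical.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into lowercased word tokens.

    NFC normalization keeps accented Greek letters as single code points,
    so that the \\w pattern does not split words at combining accents.
    """
    normalized = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", normalized)


def h_point(frequencies):
    """Simplified integer h-point (assumption: no interpolation).

    Returns the largest rank r such that the r-th most frequent word
    occurs at least r times.
    """
    h = 0
    for rank, freq in enumerate(sorted(frequencies, reverse=True), start=1):
        if freq >= rank:
            h = rank
        else:
            break
    return h


def stylometric_features(text):
    """Return a small dictionary of illustrative stylometric indices."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    n_hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "ttr": n_types / n_tokens,            # type-token ratio
        "hapax_ratio": n_hapax / n_types,     # share of types occurring once
        "mean_word_length": sum(len(t) for t in tokens) / n_tokens,
        "h_point": h_point(counts.values()),
    }


if __name__ == "__main__":
    sample = "το σπίτι είναι μεγάλο και το σπίτι είναι ωραίο"
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value}")
```

Vectors of this kind, one per text and each labelled with its level from A1 to C2, could then be passed to any Random Forest implementation, for example scikit-learn's `RandomForestClassifier`. The classifier settings and the real feature set would have to come from the chapter itself.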
- 2. Readability analysis: A short literature review
- 3.3 Machine learning algorithm: Random Forest
Published online: 22 December 2021