A Modern Greek readability tool
Development of evaluation methods
The aim of this paper is to develop an automatic readability analysis tool that focusses on Modern Greek as a foreign language. Building on previous work at the Centre for the Greek Language (CGL), we offer an enhanced methodology for readability prediction that assigns Modern Greek texts to a proficiency level (A1 to C2) of the Common European Framework of Reference for Languages (CEFR). The proposed tool is based on several stylometric indices inspired by work in quantitative linguistics. The resulting feature vectors are used to train a Random Forest, a robust and accurate machine learning algorithm, which predicts the readability level of the texts in our testing dataset with an accuracy of 0.943, surpassing all previous readability tools for Modern Greek. Further analysis of the results with advanced visualization methods reveals the complex and fluid dynamics between the features used and the readability predictions.
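To make the idea of a stylometric feature vector concrete, the following is a minimal sketch, not the feature set actually used in the paper. The function name `stylometric_features` and the four indices it computes (average sentence length, average word length, type-token ratio, hapax legomena ratio) are illustrative assumptions drawn from common practice in quantitative linguistics.

```python
import re
import unicodedata


def stylometric_features(text: str) -> dict:
    """Compute a few simple stylometric indices for one text.

    Hypothetical illustration: the indices chosen here are assumptions,
    not the paper's actual feature set.
    """
    # Normalise to NFC so accented Greek letters are single code points.
    text = unicodedata.normalize("NFC", text)

    # Split on sentence-final punctuation. In Greek, ";" (or U+037E)
    # serves as the question mark.
    sentences = [s for s in re.split(r"[.!;\u037e]+", text) if s.strip()]

    # \w matches Unicode letters in Python 3, so Greek words are captured.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens or not sentences:
        return {
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "type_token_ratio": 0.0,
            "hapax_ratio": 0.0,
        }

    # Count how often each word form occurs.
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    hapaxes = sum(1 for c in counts.values() if c == 1)

    return {
        "avg_sentence_length": len(tokens) / len(sentences),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(counts) / len(tokens),
        "hapax_ratio": hapaxes / len(tokens),
    }


if __name__ == "__main__":
    sample = "Η γάτα κοιμάται. Η γάτα τρώει."
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.3f}")
```

One such vector per text, paired with its CEFR label, could then be passed to any Random Forest implementation (for example scikit-learn's `RandomForestClassifier`); that choice of library is likewise an assumption, since the abstract does not name one.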
Article outline
- 1. Introduction
- 2. Readability analysis: A short literature review
- 3. Methodology
- 3.1 Corpus
- 3.2 Features
- 3.3 Machine learning algorithm: Random Forest
- 4. Results
- 5. Conclusion
- Note
- References