Chapter 5
Primitive exploration of variants using comparable corpora
Article outline
- 5.1 Comparable corpora
- 5.1.1 Corpus
- 5.1.2 Properties
- Genre
- Domain
- Communication level
- 5.1.3 Collecting comparable corpora
- 5.1.4 Comparability
- 5.2 Comparable corpora used in this study
- Breast cancer
- Diabetes
- Renewable energy
- 5.3 Looking for variants
- 5.3.1 Implementation
- 5.3.2 N-gram massive data
- 5.3.3 Unigrams
- Derivations and compounds
- Word splitting characters
- Reliability of unigrams
- Analysis of Fr unigrams
- Analysis of En unigrams
- Analysis of Es unigrams
- Analysis of De unigrams
- Analysis of Ru unigrams
- 5.3.4 Skip-grams
- 5.3.5 Categories of variants facing data
- 5.4 Comparison according to communication levels
- 5.4.1 Unigrams
- Corpus Diabetes in Fr
- Corpus Breast cancer in Fr and En
- 5.4.2 Skip-grams
- Corpus Diabetes in Fr
- Corpus Breast cancer in Fr and En
- Notes