Tracing semantic change with distributional methods
The contexts of algo
This paper uses the tools of distributional semantics to investigate the semantic change of algo from a noun meaning ‘goods, possessions’ and an indefinite pronoun ‘something’ in the Medieval/Classical period of Spanish to an indefinite pronoun and degree adverb ‘a bit’ in contemporary Spanish. We compare the results of a previous corpus-based study (Amaral 2016) on the semantic change of algo with an analysis using word embedding models, with two goals: (i) to show how word embeddings can help identify different synchronic values of a word, and (ii) to provide measures of change through distributional semantic methods. We discuss the challenges of applying this methodology to limited data from older periods of a language, bringing into focus the decisions that have to be made and their implications for the analysis. In this way, we hope to contribute to a fruitful integration of more traditional studies in diachronic semantics with the methodology of word embeddings.
Article outline
- 1. Introduction
- 2. Background and rationale
- 2.1 Previous work on semantic change
- 2.2 Studying meaning change with distributional methods
- 2.3 Previous research on algo
- 3. Corpora and processing of the data
- 3.1 Modern Spanish: Spanish Billion Word Corpus (SBW)
- 3.2 Medieval and Classical Spanish: Chronicles Corpus
- 3.3 Processing of the data
- 4. Methodology
- 4.1 Representing meaning in word embedding models
- 4.1.1 Embedding models used in this study
- Singular Value Decomposition
- Skip-Gram with Negative Sampling
- Global Vectors for Word Representation
- 4.1.2 Using word embeddings to determine semantic neighbors
- 4.2 Visualizations with t-SNE
- 5. Results
- 5.1 Neighbors of algo in Chronicles and SBW
- 5.2 Comparison between algo and two nouns, using the t-SNE visualization
- 6. Analysis of nearest neighbors
- 6.1 Studying algo with word embeddings
- 6.2 Comparison with previous work
- 7. Conclusion
- Notes
- Abbreviations
- References