On the validity of crowdsourced data
This chapter demonstrates the validity of
crowdsourced data by comparing data collected through crowdsourcing in the VinKo
project with traditionally collected data from the AThEME project.
Both datasets target non-standard language varieties of the South
Tyrol, Trentino, and Veneto regions in north-eastern Italy. Three
different morphosyntactic phenomena are discussed, each relating to
a particular language variety; together they provide evidence that the
crowdsourced data is of comparable quality to the traditionally
gathered data and has the added advantage of yielding a larger
overall dataset covering a denser location network.
Article outline
- 1. Introduction
- 2. The AThEME and VinKo projects
- 3. VinKo platform design and data collection methods
- 3.1 Data collection
- 3.2 Representation
- 3.3 Technical aspects
- 4. VinKo and AThEME data in comparison: three case studies
- 4.1 Tyrolean dialects: pronominal case patterns
- 4.2 Trentino dialects: agreement with a postverbal
- 4.3 Venetan dialects: obligatory and optional subject
- 5. Conclusions
