Chapter 7
Live exploration of Wikipedia editing dynamics with visual analytics
WhoColor and Interactive Wikipedia Article Analysis Notebooks
The revision histories of Wikipedia
articles are a rich source of data about the interactions of editors with each other and with the content, yet they
are not straightforward to mine or understand. We describe two tools for visual analytics that support this effort:
(i) an interactive browser extension that overlays live Wikipedia articles to study word authorship, age, and conflict dynamics;
and (ii) a novel interactive Jupyter Notebook package that allows us to run analyses of editorial dynamics
out-of-the-box and is easily modifiable. Both retrieve live data for any article on demand from several Web APIs,
centering on our own WikiWho service, which provides the most accurate mining of live word-level changes currently
available. We show how these tools enable the exploration of the survival of content, productivity of editors,
conflict dynamics, and other metrics through low-barrier interfaces while providing the opportunity for more
quantitative investigations via access to the notebooks’ underlying data structures.
Article outline
- 1. Introduction
- 1.1 Related work
- 1.2 What is in this chapter?
- 2. Analyzing articles “in situ”: WhoColor
- 3. IWAAN: Visual analytics with Jupyter Notebooks in the cloud
- 3.1 Jupyter Notebooks as a versatile analysis, sharing and documentation tool
- 3.2 IWAAN content structure
- 4. Use case: “Genetically modified organism”
- 4.1 Authorship distribution and conflict with WhoColor
- 4.2 IWAANs — Templates and protection
- 4.3 IWAANs — Actions over time
- 4.4 IWAANs — Talk topics
- 4.5 IWAANs — Impactful editors and token ownership
- 4.6 IWAANs — Editor activity drilldown
- 4.7 IWAANs — Text change overall
- 4.8 Focus on the “volatile phase”
- 4.9 Further large-scale changes and conflicts
- 4.10 Beyond predesigned interface modules
- 4.11 A note on privacy
- 5. Conclusion
- Notes
- References